Get variable names matched with TR. item names.

Hello,

I'm curious if it's possible to use ek.get_data or rdp.get_data functions in a way that returns data items based on the specified fields query (ie starting with 'TR.'), rather than using the default column names.

To provide a concrete example, let's say I have a fields query like this:

df, err = ek.get_data(
    instruments = ['AAPL.O'],
    fields = [ 
    'TR.ISIN',
    'TR.PriceClose',
    'TR.PriceClose.date',
    'TR.EnvironmentPillarScore',
    "TR.AnalyticRenewEnergyUse",
  "TR.PercentageGreenProducts",
  "TR.BlackorAfricanAmerican-MinoritiesEmployeesPercentage",
  "TR.HispanicorLatino-MinoritiesEmployeesPercentage",
  "TR.OtherMinoritiesEmployeesPercentage"]
)

I would like to have the variables as they appear in fields, instead of these column names.

df.columns
Index(['Instrument', 'ISIN', 'Price Close', 'Date',
'Environmental Pillar Score', 'Renewable Energy Use Ratio',
'Percentage of Green Products',
'Other - Ethnic Minorities Employees Percentage'],
dtype='object')

The central issue revolves around certain variables not being retrieved as expected. Consequently, the approach of renaming df.columns using the fields list is not a feasible solution.

Is there a method or option that allows me to obtain the data in this format? Your insights would be greatly appreciated. Thank you!

Best Answer

  • pf
    pf
    Answer ✓

    Hi @ricardo.henriquez ,

    If it could help, check field_name parameter in ek.get_data() function:
    get_data(instruments, fields, parameters=None, field_name=False, raw_output=False, debug=False)

    Set it to True, you'll retrieve a DataFrame with following column headers:

    df, err = ek.get_data(
        instruments = ['AAPL.O'],
        fields = [ 
        'TR.ISIN',
        'TR.PriceClose',
        'TR.PriceClose.date',
        'TR.EnvironmentPillarScore',
        "TR.AnalyticRenewEnergyUse",
        "TR.PercentageGreenProducts",
        "TR.BlackorAfricanAmerican-MinoritiesEmployeesPercentage",
        "TR.HispanicorLatino-MinoritiesEmployeesPercentage",
        "TR.OtherMinoritiesEmployeesPercentage"],
        field_name=True
    )
    df.columns
    Index([
    'Instrument',
    'TR.ISIN',
    'TR.PRICECLOSE',
    'TR.PRICECLOSE.DATE',
    'TR.ENVIRONMENTPILLARSCORE',
    'TR.ANALYTICRENEWENERGYUSE',
    'TR.PERCENTAGEGREENPRODUCTS',
    'TR.OTHERMINORITIESEMPLOYEESPERCENTAGE'],
    dtype='object')


Answers

  • @ricardo.henriquez

    Thank you for reaching out to us.

    I can't any parameter that supports this feature.

    You can get the raw output and then rename the field names. After that, re-create a new dataframe from the raw output.

    raw_output  = ek.get_data(
        instruments = ['AAPL.O'],
        fields = [ 
        'TR.ISIN',
        'TR.PriceClose',
        'TR.PriceClose.date',
        'TR.EnvironmentPillarScore',
        "TR.AnalyticRenewEnergyUse",
        "TR.PercentageGreenProducts",
        "TR.BlackorAfricanAmerican-MinoritiesEmployeesPercentage",
        "TR.HispanicorLatino-MinoritiesEmployeesPercentage",
        "TR.OtherMinoritiesEmployeesPercentage"],
        raw_output=True
    )

    The output looks like this:

    {'columnHeadersCount': 1,
    'data': [['AAPL.O',
    'US0378331005',
    175.01,
    '2023-09-15T00:00:00Z',
    65.3695219368318,
    0.062223362244837,
    '',
    4.1]],
    'headerOrientation': 'horizontal',
    'headers': [[{'displayName': 'Instrument'},
     {'displayName': 'ISIN', 'field': 'TR.ISIN'},
    {'displayName': 'Price Close', 'field': 'TR.PRICECLOSE'},
    {'displayName': 'Date', 'field': 'TR.PRICECLOSE.DATE'},
    {'displayName': 'Environmental Pillar Score',
    'field': 'TR.ENVIRONMENTPILLARSCORE'},
    {'displayName': 'Renewable Energy Use Ratio',
    'field': 'TR.ANALYTICRENEWENERGYUSE'},
    {'displayName': 'Percentage of Green Products',
    'field': 'TR.PERCENTAGEGREENPRODUCTS'},
    {'displayName': 'Other - Ethnic Minorities Employees Percentage',
    'field': 'TR.OTHERMINORITIESEMPLOYEESPERCENTAGE'}]],
    'rowHeadersCount': 1,
    'totalColumnsCount': 8,
    'totalRowsCount': 2}

    You can map the TR fields to the new names.