Extracting and Matching Data from an Excel file with Eikon API

I have an excel file containing a companies/employee names, their patent applications and year of filing. However, now I need the financial information for these firms for the given year (from eikon), as listed in the excel sheet. The main problem i have is that the names in the excel sheet are not an exact match with Eikon (no RIC, PermID, or any other corresponding code). How do I work around this.



Here is an extract of my current code:


import eikon as tr

import pandas as pd

resulttable_df = pd.read_excel(r'C:\xxxxx\resulttable.xlsx')

def get_financial_data(company, year):

# Example: fetching market cap for a given company (RIC) and year

fields = ['TR.MktCap']

response = tr.get_data([company], fields=fields, raw_output=True)


if 'data' in response and response['data']:

for item in response['data']:

if 'data' in item:

financial_data_list = item['data']

for data_item in financial_data_list:

if 'field' in data_item and data_item['field'] == 'TR.MktCap':

return data_item.get('value', None)


return None

# Iterate over rows in the "resulttable" dataset and fetch financial data

for index, row in resulttable_df.iterrows():

company = row['person_name']

year = row['appln_filing_year']

financial_data = get_financial_data(company, year)

# Print or process the financial data as needed

if financial_data is not None:

print(f"For {company} in {year}, Market Cap: {financial_data}")

else:

print(f"No financial data found for {company} in {year}")


@nick.zincone

Best Answer

  • Hi @Ollivier Taramasco,


    I believe the Search API (more on it here, in Nick's article) is what you're after. Having had a look myself, I found that 'EQUITY_QUOTES' is the view you're after; for e.g., with an excel workbook 'resulttable.xlsx' like this:
    1699868118646.png


    I wrote the below with the RD Lib for Python (which is the new version of Eikon's Data API) (more on the RD Lib for Python can be found here):


    import refinitiv.data as rd
    import pandas as pd
    try: # The following libraries are not available in Codebook, thus this try loop
    rd.open_session(
    config_name="C:\\Example.DataLibrary.Python-main\\Configuration\\refinitiv-data.config.json", # this is where my config file is located
    name="desktop.workspace")
    except:
    rd.open_session()

    resulttable_df = pd.read_excel("resulttable.xlsx")

    response=rd.content.search.Definition(
    view=rd.content.search.Views.EQUITY_QUOTES,
    query=resulttable_df.Company.iloc[0],
    filter = "RIC ne null",
    select = "RIC, _, IndustrySectorDescription",
    top = 20,
    ).get_data()
    display(response.data.df)

    1699868360476.png


    I assumed, above, that you were after floated companies; therefore the EQUITY_QUOTES was the best `view`; but you can choose another from the list found in `help(rd.content.search.Views)`; in line with this assumption - and to make it more likely that the correct company comes up 1st in `response.data.df`, I added "Ordinary Share" at the end of what could be seen as the comany name.

    I would advise using the `filter` more, to make it more likely that the correct company comes up 1st in `response.data.df`. E.g.: filter by company country, market cap, industry, ... The list of filter fields can be found with this browser or with:

    response_EQ_Q = rd.content.search.metadata.Definition(
    view = rd.content.search.Views.EQUITY_QUOTES).get_data()
    response_EQ_Q.data.df.to_excel("EQUITY_QUOTES.xlsx")

    The 'grammar' you can use these fields in is described in Nick's article as well as on the Playgound Reference.

Answers

  • P.S.: I tried implementing the above with your workflow, but found it dificult to decrypt it without tabs; when inserting code on this Developer Q&A Forum, please use the code button:

    1699869000107.png

  • Thank you for your response @jonathan.legrand. I appreciate your time and effort! Yes, I am interested in floated companies and the EQUITY_QUOTES is the best view option. However, the output shows that no RIC was found for any of the companies. I made modifications to the view, query, etc but had the same issue.

    One probable cause which i suspect is that some of the companies no longer exist. However, I am interested in their financial data, eg MarketCap, based on the year a particular patent application was filed, as given in the resulttable.xlsx, before the firm was dissolved. Can I, and how do I, account for this in the code?

    Also, from the company name search, I am interested in getting the first, and probably the likeliest result from which I can extract a RIC.

    Here are snippets of my current workflow:

    import eikon as tr
    import pandas as pd
    import refinitiv.data as rdp
    from refinitiv.data.content import search

    tr.set_app_key('xxx')

    rdp.open_session()


    resulttable_df = pd.read_excel(r'C:\xxx\resulttable.xlsx')

    for company_name in resulttable_df['person_name'].unique():
    # Converting the company_name to a string
    company_name = str(company_name)
       # Use RDP for searching the company and getting its RIC
    search_results = rdp.content.search.Definition(
    view=rdp.content.search.Views.EQUITY_QUOTES,
    query=resulttable_df.person_name.iloc[0],
    filter="RIC ne null",
    select="RIC, _, IndustrySectorDescription",
    top=1,
    )
    # Extract the RIC from the search results
    ric = None
    if not search_results.empty:
    first_result = search_results.iloc[0]
    ric = first_result.get('RIC')
    # Call the function to get financial data if RIC is not None
    if ric is not None:
    financial_data = get_financial_data(ric)

    # Print or process the financial data as needed
    if financial_data is not None:
    print(f"For {company_name}, Market Cap: {financial_data}")
    else:
    print(f"No financial data found for {company_name}")
    else:
    print(f"No RIC found for {company_name}")

    @nick.zincone

  • Hi @Ollivier Taramasco, I tried having a look again, and I'm afraid that without the resulttable.xlsx fiole, it will be dificult for me to investigate any further. Would you mind sharing that document here? Or maybe a curated version of that document with examples of companies causing trouble?