How to efficiently and quickly retrieve data points for all S&P 500 RICs
Hi everyone,
I am trying to retrieve some Fundamental and Reference data for all of the S&P 500 stocks. These data points include values such as Income before Extraordinary Items, EBIT, Capital, etc. I would like to retrieve all of this data asynchronously. At first, I looped through the list of S&P 500 RICs with a for loop and retrieved the data for each RIC one at a time. However, this was inefficient and took about 9 minutes to run.
Next, I split the calls into separate tasks and used asyncio.gather() to run all of the requests concurrently:
import asyncio
import pandas as pd
from datetime import datetime

async def fetch_ebit_data(ric, start_year, current_year):
    try:
        ebit_datapoints = await content.fundamental_and_reference.Definition(
            universe=[f'{ric}'],
            fields=["TR.F.EBIT.date", "TR.F.EBIT"],
            parameters={"SDate": f"{start_year}-01-01", "EDate": f"{current_year}-12-31", "Frq": "Y"}
        ).get_data_async(closure=f'')
        # Check if the response contains data
        if ebit_datapoints and not ebit_datapoints.data.df.empty:
            return ebit_datapoints.data.df
        else:
            print(f"No data returned for {ric}")
            return None
    except Exception as e:
        print(f"Error retrieving data for {ric}: {str(e)}")
        return None

async def retrieve_ebit_data(list_of_snp_rics, start_year, current_year):
    tasks = [fetch_ebit_data(ric, start_year, current_year) for ric in list_of_snp_rics]
    ebit_values_list = await asyncio.gather(*tasks)
    # Filter out None values (i.e., failed or empty responses)
    ebit_values_list = [df for df in ebit_values_list if df is not None]
    combined_ebit_df = pd.concat(ebit_values_list, ignore_index=True)
    display(combined_ebit_df)
    return combined_ebit_df

# Setup
current_year = datetime.today().year
start_year = current_year - 8
list_of_snp_rics = await retrieve_ric_snp_stocks()
# Retrieve and combine EBIT data
combined_ebit_df = await retrieve_ebit_data(list_of_snp_rics, start_year, current_year)
The problem with this method is that I couldn't get any values for the majority of the tickers / RICs. I am unsure of a workaround in this situation. One thing I can think of is to use a ThreadPoolExecutor with one thread per RIC (about 503 threads), but this might be a really CPU-intensive process. I need some guidance on how to approach this, so that I can then apply it to retrieving multiple data points (EBIT and Capital, for example).
Thanks!
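If the asyncio.gather() approach above is kept, one common way to avoid launching all ~500 requests at the same moment is to cap the number of in-flight calls with asyncio.Semaphore. The sketch below only illustrates that pattern: fetch_one is a hypothetical stand-in for the real per-RIC request, max_concurrent is an arbitrary value, and the thread does not confirm that too many simultaneous requests is what causes the empty responses.

```python
import asyncio


async def fetch_one(ric):
    """Hypothetical stand-in for the real per-RIC request (not a library call)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"ric": ric}


async def fetch_all(rics, max_concurrent=5):
    """Run fetch_one for every RIC, with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def limited(ric):
        # Each task waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await fetch_one(ric)

    # gather() returns results in the same order as the input list.
    # return_exceptions=True puts exceptions into the result list
    # instead of raising the first one.
    return await asyncio.gather(
        *(limited(ric) for ric in rics), return_exceptions=True
    )
```

In a notebook this could be called as results = await fetch_all(list_of_snp_rics, max_concurrent=10); in a script, wrap the call in asyncio.run(...).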
Best Answer
-
Hi @vishal.nanwani ,
I don't think you need asynchronous calls for the task you describe above. Passing all constituent RICs (or the index chain) to get_data or content.fundamental_and_reference.Definition will produce the result you are after in about a second. See the examples below:
1. With the index chain:
spx = rd.get_data("0#.SPX", fields=["TR.F.EBIT.date", "TR.F.EBIT"])
spx
2. By passing all RICs:
spx_constituents = rd.get_data("0#.SPX", fields=["TR.CompanyName"])
df = rd.get_data(universe=spx_constituents['Instrument'].to_list(), fields=["TR.F.EBIT.date", "TR.F.EBIT"])
df
If you are using other fields for which you don't get values, it would be best to reach the content team via HelpDesk in Workspace or via my.refinitiv.com. They will help you identify the correct fields, as I don't think the missing values are related to the API calls.
Best regards,
Haykaz
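The answer above sends the whole constituent list in a single call. If a much longer universe ever needed to be split into several requests (an assumption; the thread does not say this is necessary for the S&P 500), a minimal batching helper could look like the sketch below. fetch_batch is a hypothetical callable supplied by the caller, for example a small wrapper around rd.get_data, and batch_size=100 is an arbitrary choice.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each with at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_in_batches(rics, fetch_batch, batch_size=100):
    """Call fetch_batch once per chunk of RICs and collect the results in order.

    fetch_batch is a hypothetical caller-supplied function that takes a
    list of RICs and returns whatever the underlying request returns.
    """
    results = []
    for batch in chunked(rics, batch_size):
        results.append(fetch_batch(batch))
    return results
```

If each batch returns a DataFrame, the list could then be combined with pd.concat(results, ignore_index=True), as in the original question.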
Answers
-
Hi @aramyan.h ,
1. Is it possible to do this for the past 8 years? Something like this:
response = await content.fundamental_and_reference.Definition(
    universe=[f'{ric}'],
    fields=specific_fields,
    parameters={"SDate": f"{start_year}-01-01", "EDate": f"{current_year}-12-31", "Frq": "Y"}
).get_data_async(closure='')
2. With the method described above, I usually get 2-3 out of 10 RICs with empty values for some of the data I am trying to pull, for specific years. I am wondering:
a. Why does this happen in the first place, and is there a workaround if I want all valid annual values for different data points for the last eight years?
b. Can this be prevented with your solution?
Thanks!
-
Hi @vishal.nanwani ,
Yes, you can request the history, see below:
spx_constituents = rd.get_data("0#.SPX", fields=["TR.CompanyName"])
df = rd.get_data(universe=spx_constituents['Instrument'].to_list(), fields=["TR.F.EBIT.date", "TR.F.EBIT"], parameters={"SDate": "2018-01-01", "EDate": "2024-12-31", "Frq": "Y"})
df
The content should be the same for async and non-async calls. For empty values, however, you can raise a content question via my.refinitiv.com.
Best regards,
Haykaz