I am attempting to get data on a set of RICs and getting an error.

JoanneCamille.Andes · February 26

I am attempting to get data on a set of RICs for these fields:

fields = ['TR.InvestorFullName', 'TR.InvestorFullName.investorid','TR.InvestorFullName.investorpermid','TR.CategoryOwnershipPct','TR.InvestorType','TR.InvParentType', 'TR.PctOfSharesOutHeld', 'TR.PctofSharesOutHeld.Date', 'TR.InvAddrCountry', 'TR.OwnTrnverRating', 'TR.OwnTurnover','TR.OwnTurnover.date','TR.InvInvestmentStyleCode']

across 20 years, with this list of suffixes for the merged columns:

suffixes_list= [('_1999','_2000'),('_2001','_2001'),('_2001','_2002'),('_2003','_2003'),('_2003','_2004'),('_2005','_2005'),('_2005','_2006'),('_2007','_2007'), ('_2007','_2008'), ('_2009','_2009'),('_2009','_2010'),('_2011',"_2011"),('_2011',"_2012"),('_2013',"_2013"),('_2013',"_2014"),('_2015',"_2015"), ('_2015',"_2016"),('_2017',"_2017"),('_2017',"_2018"),('_2019',"_2019"),('_2019',"_2020"),('_2021',"_2021"),('_2021',"_2022")]

I run the chunk below once per RIC, changing which RIC is taken from my list each time.

# Chunk for RIC at index 238 (Number)
instruments = sortedrics[238]  # Adjust the index for each RIC
data_frames = []
for i in range(-24, -1):
    s_date = str(i)
    e_date = str(i)
    df, err = ek.get_data(instruments, fields, {'SDate': s_date, 'Edate': e_date, 'Frq':"Y"})
    keylist = df.keys()
    for item in keylist:
        df[item] = df[item].astype(str)
    print(s_date, e_date)
    data_frames.append(df)
merged_df = data_frames[0]
for i in range(1, len(data_frames)):
    merged_df = pd.merge(merged_df, data_frames[i], on='Investor Full Name', how='outer', suffixes=suffixes_list[i-1])
merged_df.to_csv(f'{instruments}.csv')

I encounter memory errors and am wondering if there are any suggestions on how to mitigate them. I tried looping this request over all RICs, but the API allows only 5 minutes per request, so that did not work well for me. Thanks.

The API request itself returns data; the problem occurs at the merge step, and it is not consistent. After I get the memory error, I run the same chunk again, and sometimes it works fine. The behavior changes from one RIC to another.
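One possible cause of a memory error at the merge step (an assumption, not something confirmed in this thread) is duplicate values in the merge key. When 'Investor Full Name' repeats within a yearly frame, or missing names all become the same string after astype(str), each outer merge multiplies the rows for that key, and the product grows quickly over 23 frames. The sketch below estimates the merged row count from the key columns alone, before any merge is attempted. The function name estimate_merged_rows is hypothetical.

```python
from collections import Counter
from math import prod


def estimate_merged_rows(key_columns):
    """Estimate the rows produced by chaining outer merges on one key.

    key_columns holds one list of key values per frame, for example
    df['Investor Full Name'].tolist() for each yearly frame.
    """
    counts = [Counter(keys) for keys in key_columns]
    all_keys = set().union(*counts)
    total = 0
    for key in all_keys:
        # In an outer merge, a key missing from a frame keeps its rows,
        # so a count of 0 is treated as 1. Otherwise the counts multiply.
        total += prod(max(c[key], 1) for c in counts)
    return total


# Unique keys: the result has one row per distinct key.
print(estimate_merged_rows([["A", "B"], ["A", "C"]]))

# A key that appears twice in each of 23 frames: 2 ** 23 rows.
print(estimate_merged_rows([["nan", "nan"]] * 23))
```

If the estimate is far larger than the number of distinct investors, removing or disambiguating the duplicated keys before merging (for example, merging on the investor ID field instead of the name) may be worth trying.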

raksina.samasiri · April 22

The answer has been provided in this thread

Jirapongse · February 28

@JoanneCamille.Andes

Please share runnable code and the error message so that we can run the code and verify the problem.

If the application requests a lot of data, the request can be timed out by the server. However, the application can catch the exception and re-run that request.
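The catch-and-re-run approach described above could look roughly like the following. This is a minimal sketch, not an official pattern: the helper name fetch_with_retry, the attempt count, and the backoff delays are all assumptions, and because the exact exception raised on a timeout is not shown in this thread, the sketch catches Exception broadly.

```python
import time


def fetch_with_retry(fetch, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached.

    fetch is any zero-argument callable, for example
    lambda: ek.get_data(instruments, fields, params).
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception as exc:  # assumption: narrow this to the timeout error you actually see
            last_error = exc
            if attempt < max_attempts:
                # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"request failed after {max_attempts} attempts") from last_error
```

In the loop from the question, the call would become something like df, err = fetch_with_retry(lambda: ek.get_data(instruments, fields, {'SDate': s_date, 'Edate': e_date, 'Frq': "Y"})), so that one timed-out year does not abort the whole RIC.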

JoanneCamille.Andes · February 28

@Jirapongse

Here is the code:

fields = ['TR.InvestorFullName', 'TR.InvestorFullName.investorid','TR.InvestorFullName.investorpermid','TR.CategoryOwnershipPct','TR.InvestorType','TR.InvParentType','TR.PctOfSharesOutHeld', 'TR.PctofSharesOutHeld.Date', 'TR.InvAddrCountry','TR.OwnTrnverRating', 'TR.OwnTurnover','TR.OwnTurnover.date','TR.InvInvestmentStyleCode']

suffixes_list= [('_1999','_2000'),('_2001','_2001'),('_2001','_2002'),('_2003','_2003'),('_2003','_2004'),('_2005','_2005'),('_2005','_2006'),('_2007','_2007'),
    ('_2007','_2008'), ('_2009','_2009'),('_2009','_2010'),('_2011',"_2011"),('_2011',"_2012"),('_2013',"_2013"),('_2013',"_2014"),('_2015',"_2015"),
    ('_2015',"_2016"),('_2017',"_2017"),('_2017',"_2018"),('_2019',"_2019"),('_2019',"_2020"),('_2021',"_2021"),('_2021',"_2022")]

# Chunk for RIC at index 282
instruments = sortedrics['BROG.OQ']  # Adjust the index for each RIC
data_frames = []
for i in range(-24,-1):
    s_date = str(i)
    e_date = str(i)
    df, err = ek.get_data(instruments,fields, {'SDate':s_date, 'Edate': e_date, 'Frq':"Y"})
    keylist = df.keys()
    for item in keylist:
        df[item]= df[item].astype(str)
    print(s_date,e_date)
    data_frames.append(df)
merged_df = data_frames[0]
for i in range(1, len(data_frames)):
    merged_df = pd.merge(merged_df,data_frames[i],on='Investor Full Name',how='outer',suffixes=suffixes_list[i-1])
merged_df.to_csv(f'{instruments}.csv')

mae.diaz · February 29

@Jirapongse please see attached file for the full code.

API Code.txt

When I tried running this in the Codebook app, it gave me this error:

  File "/tmp/ipykernel_213/1336849579.py", line 16
    = ek.get_timeseries('AAPL.O', # the RIC for Apple, Inc.
    ^
SyntaxError: invalid syntax

JoanneCamille.Andes · February 29

@Jirapongse any updates please?

mae.diaz · March 1

The main issue is that the API request returns data, but the merge step is not consistent: after the client gets the memory error, he runs the chunk again, and sometimes it works fine. The behavior changes from one RIC to another.

May I know whether this is a behavioral issue of the app, or whether it is caused by the amount of data the client wants to retrieve, that is, an API data limitation? The full code was attached to the comment above.

Jirapongse · March 4

@JoanneCamille.Andes

To replicate this issue, we need a cut-down, runnable version of the code that we can run without any modifications.

It would also help if you could narrow down this issue by finding the RICs and exact parameters that cause it.

Moreover, please paste the source code in the code block when sharing the code.

mae.diaz · March 5

Hi @Jirapongse, we have attached the file with the full code as we are getting the error below when using the <Code> option. API Code.txt

The main issue is that the API request returns data, but the merge step is not consistent: after the client gets the memory error, he runs the chunk again, and sometimes it works fine. The behavior changes from one RIC to another.

JoanneCamille.Andes · March 10

@Jirapongse may we please follow up on the above?

Jirapongse · March 22

The answer has been provided on this discussion.
