Python: Eikon is generating unrequested dates for some ID's, using get_data.

I have the following code that requests local returns for a list of companies.

import pandas as pd
import eikon as ek #import Eikon module
import datetime
import time

start_time=time.time()
ek.set_app_id('someId') #setting AppID
#Getting a list of dates
start=datetime.date(2017,11,27)
end=datetime.date.today() - datetime.timedelta(days=1)
row_dates=[x.strftime('%m/%d/%Y') for x in pd.bdate_range(start,end).tolist()]
#getting identifiers to be used on Eikon
csv_data=pd.read_csv('identifiers_test.csv', header=None)
identifiers=csv_data[0].tolist()

This is a list with 2500 identifiers or so (all SEDOLS).

To request the returns, I do this:

df=ek.get_data(identifiers,["TR.TotalReturn.Date","TR.TotalReturn"], {'SDate': row_dates[0], 'EDate': row_dates[len(row_dates)-1], 'Frq':'D'})[0]

To pivot the table, and re-arrange it in a way that I can have dates as column headers and identifiers as indexes without being repeated, I do this:

df=pd.crosstab(df.Instrument, df.Date,values=df['Total Return'], aggfunc='mean')

But the outcome is really strange, it places columns with dates that I didn't even requested, filled with NaN's, and I would like to know how to filter that info. I already tried a couple of approaches with dropna() and other Pandas functions but I can't seem to get them to work :(, I'll attach 2 photos so you can see what I'm talking about).

image

And it properly shows dates I've requested, like this (shown 28-Nov, 29-Nov and 30-Nov. Not displaying in the picture but appeared properly in the dataframe, are 27-Nov and 01-Dec).

image

How can I get rid of those dates? Obviously these dates were produced by Eikon's API, somehow. I printed the original dataframe into a CSV file, I should have only 5 dates, I obviously found more. I can't upload the screencapture for this since this forum doesn't allow more than 2 image uploads.


Why Eikon does this? Is there a way to ellude it? A way to fix it? I'd like to know.

Thanks in advance!

Best Answer

  • Ok, I was able to get the list of the instruments, that are generating incorrect dates, you can use the following code:

    instruments = {}


    for ix, row in df.iterrows():
    instruments.setdefault(row[1], set()).add(row[0])

    instrument

    A quick manual check shows that the incorrect date is when the asset traded last, for example:

    B2Q1MC6 (PGD.P) - 5th of September
    B4TDMX4 (SBV.P) - 19th of September

    Etc.

    So, when you are requesting the latest Total return value, what it can give you is the last possible estimate.

Answers

  • Can you share the list of 2500 identifiers? I have used you code with some instruments and it works fine. I am using Eikon 0.1.9.

    image

  • I can not reproduce the issue by requesting the same data for S&P 500 constituents, so there might be an issue with one of the instruments in your list

  • files.zipjirapongse.phuriphanvichai, here I uploaded two files. The first one has all the identifiers I use for this code. And the other file is where I saw which SEDOL's are giving me issues.

    Is there a way to code an exception for these type of errors?

  • Yes, I'm guessing that as well. But why? I mean, how does Eikon handle errors with identifiers? Is there a way to detect/handle that kind of error and delete it from the outcome?

  • Do you know a way to code a 'filter' for these unrequested dates? Since they are very specific, I didn't find a 'conventional way' that works.

    e.g: I cannot use dropna() pandas function because, even though these dates are mostly filled with NA's, a single return value is not a NaN, so I can't use that solution.

  • you can drop the rows that have the dates that are not equal to the requested range with the standard pandas function

  • I tried another approach that worked:

    added this after the crosstab line:

    df=df[row_dates]

    Before doing that, I had to convert the Date column to Datetime using:

    df['Date']=pd.to_datetime(df['Date']) 
    df['Date'] = df['Date'].dt.strftime('%m/%d/%Y')