Eikon 1.1.4 is not backward compatible with 1.1.2

Another upgrade has broken our existing scripts. We hit 2 problems with existing scripts after the 1.0.1 to 1.1.2 upgrade, and now we have another problem after upgrading from 1.1.2 to 1.1.4 (1.1.5).

There is nothing on the release notes page about the changes in 1.1.2, and there are no release notes at all for 1.1.4 and 1.1.5!

Issues like these make us reluctant to ever upgrade working modules.


Here is the problem I'm talking about:

dataFrame = ek.get_timeseries(rics='AC.MX', fields='*', start_date='2020-09-03T20:00:00', end_date='2020-09-03T20:10:00', interval='tick', normalize=False)


ricName = dataFrame.columns.name

ricName holds the RIC name in 1.1.2. In 1.1.4 it's nan.


The same request returns data in different formats under versions 1.1.2 and 1.1.4 of the Eikon Python module.

Best Answer

  • ts[ric] works only for a multi-RIC timeseries

    I tested different requests and checked the dataframe structure with the following code:

    def get_ts(rics, fields):
        # Request one minute bar for the given RICs and fields
        ts = rdp.get_timeseries(rics, fields,
                                start_date='2020-09-03T20:00:00',
                                end_date='2020-09-03T20:10:00',
                                interval='minute',
                                normalize=False, count=1)
        # Show what the columns index is named, then the frame itself
        if hasattr(ts.columns, "name"):
            print("ts.columns.name: ", ts.columns.name)
        else:
            print("ts.columns.name: <No name>")
        print(ts)

    1 RIC, N fields

    => ts.columns.name contains the ric name

    get_ts(["MSFT.O"], '*')

    Output:

    ts.columns.name:  MSFT.O
    MSFT.O                 HIGH     LOW    OPEN   CLOSE  COUNT  VOLUME
    Date                                                              
    2020-09-03 20:07:00  218.70  218.30  218.53  218.45    162   11654

    N RICs, 1 field

    => ts.columns.name contains the field name

    get_ts(["AAPL.O", "MSFT.O"], 'HIGH')

    Output:

    ts.columns.name:  HIGH
    HIGH                 AAPL.O  MSFT.O
    Date                               
    2020-09-03 20:07:00  127.67  218.70

    N RICs, N fields

    => ts.columns.name is None; ts.columns is a MultiIndex whose levels are labelled "Security" and "Field"

    get_ts(["AAPL.O", "MSFT.O"], ["HIGH", "LOW"])

    Output:

    ts.columns.name:  None 
    Security             AAPL.O          MSFT.O        
    Field                  HIGH     LOW    HIGH     LOW
    Date                                               
    2020-09-03 20:07:00  127.67  120.86  218.70  218.30

Answers

  • Hi,

    We apologize for any inconvenience these changes have caused.
    We have noted your remark about the release notes and will improve on this point.

    Rather than using the dataFrame.columns.name property, which is not standard in pandas (it was added by the eikon lib), we suggest relying on your rics request parameter.
    Column names should only refer to the related field names.

    In the latest versions, 1.1.4 and 1.1.5, changes were made to align with standard pandas output.

    • numpy.NaN values were replaced with pandas.NA
    • all values are converted to the most suitable type
      (in 1.1.2, all numeric values were converted to float64, while other values such as str kept the object type)

    As you'll see below, this impacts only empty data:

    ts = ek.get_timeseries(rics='AC.MX', fields='*', start_date='2020-09-03T20:00:00', end_date='2020-09-03T20:10:00', interval='tick', normalize=False)
    print(ts)
    print(ts.dtypes)

    1.1.2
    AC.MX                VALUE   VOLUME
    Date                              
    2020-09-03 20:00:01    NaN   4400.0
    2020-09-03 20:00:01    NaN   3720.0
    2020-09-03 20:00:01    NaN   5814.0
    2020-09-03 20:00:01    NaN   2494.0
    2020-09-03 20:00:01    NaN   2507.0
    2020-09-03 20:00:01    NaN    436.0
    2020-09-03 20:00:01    NaN    356.0
    2020-09-03 20:00:01    NaN  16933.0

    AC.MX
    VALUE     float64
    VOLUME    float64
    dtype: object

    1.1.4
                         VALUE  VOLUME
    Date                              
    2020-09-03 20:00:01   <NA>    4400
    2020-09-03 20:00:01   <NA>    3720
    2020-09-03 20:00:01   <NA>    5814
    2020-09-03 20:00:01   <NA>    2494
    2020-09-03 20:00:01   <NA>    2507
    2020-09-03 20:00:01   <NA>     436
    2020-09-03 20:00:01   <NA>     356
    2020-09-03 20:00:01   <NA>   16933

    VALUE     Int64
    VOLUME    Int64
    dtype: object

    1.1.5
                         VALUE  VOLUME
    Date                              
    2020-09-03 20:00:01   <NA>    4400
    2020-09-03 20:00:01   <NA>    3720
    2020-09-03 20:00:01   <NA>    5814
    2020-09-03 20:00:01   <NA>    2494
    2020-09-03 20:00:01   <NA>    2507
    2020-09-03 20:00:01   <NA>     436
    2020-09-03 20:00:01   <NA>     356
    2020-09-03 20:00:01   <NA>   16933

    VALUE     Int64
    VOLUME    Int64
    dtype: object
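
The dtype change described in this answer can be reproduced with plain pandas, without an Eikon session. The following is a minimal sketch, not the library's own code: the frame is built by hand to imitate the 1.1.4 output, and `to_legacy_float` is a hypothetical helper name. It casts nullable numeric columns back to float64, so code written against 1.1.2 sees NaN again.

```python
import pandas as pd


def to_legacy_float(ts: pd.DataFrame) -> pd.DataFrame:
    """Cast integer/float columns (including nullable ones) to float64.

    Hypothetical helper: pd.NA becomes NaN, other columns are left untouched.
    """
    out = ts.copy()
    for col in out.columns:
        dtype = out[col].dtype
        if pd.api.types.is_integer_dtype(dtype) or pd.api.types.is_float_dtype(dtype):
            out[col] = out[col].astype("float64")
    return out


# Hand-built frame imitating the 1.1.4 layout (nullable Int64 columns).
new_style = pd.DataFrame(
    {
        "VALUE": pd.array([pd.NA, pd.NA], dtype="Int64"),
        "VOLUME": pd.array([4400, 3720], dtype="Int64"),
    }
)

legacy = to_legacy_float(new_style)
print(legacy.dtypes)

# pd.isna handles both NaN and pd.NA, so it is safe under either version.
print(pd.isna(new_style["VALUE"]).all(), pd.isna(legacy["VALUE"]).all())
```

Checking for missing values with `pd.isna` rather than `numpy.isnan` avoids depending on which missing-value marker a given version returns.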

  • Do you see the 'AC.MX' in the 1.1.2 printed results? It's missing in 1.1.4 and 1.1.5.

  • Yes, I see.
    We're releasing a new version, 1.1.6, and will restore it for your convenience.

  • This sounds great. Do you have an ETA for 1.1.6?

  • It needs to be validated, but I'm confident it will be out next week.

  • Thanks. I will check it next week

  • About 'using dataFrame.columns.name': yes, we rely on this field to get the RIC name. It was a workaround for a more general issue:

    The response representation is not consistent. It differs depending on whether:

    - multiple RICs are received

    - 1 RIC is received

    - 1 field is requested and more than 1 RIC is returned


    The representations of the last 2 are similar, but the 3rd response has a field name instead of the RIC.

    I'm not a pro in Python/pandas, but using common logic I do not understand why the data structure of the result depends on the number of requested RICs and fields. I would use the same data structure in every case, the one used for multiple RICs and multiple fields.
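
One way to get a uniform structure on the client side is to rebuild the columns as a two-level (Security, Field) MultiIndex, whatever layout comes back. The sketch below is an illustration under stated assumptions, not part of the eikon library: `to_multiindex` is a hypothetical helper, the caller must pass the RICs (and, for the single-field case, the field) it requested, and the sample frames are built by hand to imitate the layouts shown in the best answer.

```python
import pandas as pd


def to_multiindex(ts, rics, fields=None):
    """Return a copy of ts whose columns are a (Security, Field) MultiIndex.

    rics   -- list of RICs that were requested
    fields -- list of requested fields; only needed for the
              "N RICs, 1 field" layout, where columns hold RIC names
    """
    out = ts.copy()
    if isinstance(out.columns, pd.MultiIndex):
        return out  # N RICs, N fields: already in the target layout
    if len(rics) == 1:
        # 1 RIC, N fields: the flat columns are field names
        pairs = [(rics[0], field) for field in out.columns]
    else:
        # N RICs, 1 field: the flat columns are RIC names
        if not fields or len(fields) != 1:
            raise ValueError("pass the single requested field explicitly")
        pairs = [(ric, fields[0]) for ric in out.columns]
    out.columns = pd.MultiIndex.from_tuples(pairs, names=["Security", "Field"])
    return out


idx = pd.to_datetime(["2020-09-03 20:07:00"])

# Hand-built stand-ins for the flat layouts (values copied from the best answer).
one_ric = pd.DataFrame({"HIGH": [218.70], "LOW": [218.30]}, index=idx)
one_field = pd.DataFrame({"AAPL.O": [127.67], "MSFT.O": [218.70]}, index=idx)

a = to_multiindex(one_ric, ["MSFT.O"])
b = to_multiindex(one_field, ["AAPL.O", "MSFT.O"], ["HIGH"])

# After normalisation, ts[ric] works for every layout.
print(a["MSFT.O"])
print(b["MSFT.O"])
```

Because the helper takes the RIC from the request rather than from `columns.name`, it does not depend on which library version produced the frame.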

  • Hi,
    eikon 1.1.6 was released on pypi.org

    It contains a fix for the issue raised, but as a reminder, the dataFrame.columns.name property is not standard pandas; it is 'added value' from the eikon lib for timeseries on 1 RIC.

    If you call any function that modifies the dataframe, you can lose the name property.

    If you request a timeseries for a list of rics, you don't have it, but you can extract the dataframe for each ric this way:

    ts = ek.get_timeseries(['AC.MX', 'MSFT.O'], ...)
    ac_mx = ts['AC.MX']
    msft = ts['MSFT.O']

    (ts.columns is a MultiIndex)

    More generally, to get the column names from a pandas.DataFrame:

    names = ts.columns
    print([name for name in names])
    ['VALUE', 'VOLUME']
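
The point about losing the `name` property is easy to reproduce with plain pandas. The sketch below is illustrative only: the frame is built by hand to imitate a single-RIC result, and keeping results in a dict keyed by the requested RIC is one possible pattern, not something the library prescribes.

```python
import pandas as pd

# Hand-built stand-in for a single-RIC timeseries whose columns carry the RIC name.
ts = pd.DataFrame({"VALUE": [1.0, 2.0], "VOLUME": [4400.0, 3720.0]})
ts.columns.name = "AC.MX"
print(ts.columns.name)  # AC.MX

# Reassigning the columns builds a new Index, so the name is gone.
renamed = ts.copy()
renamed.columns = [c.lower() for c in renamed.columns]
print(renamed.columns.name)  # None

# Safer: keep the RIC next to the frame, taken from the request itself.
requested_rics = ["AC.MX"]
frames = {ric: ts for ric in requested_rics}  # in real code, one request per RIC
for ric, frame in frames.items():
    print(ric, list(frame.columns))
```

With this pattern the RIC is always available as the dict key, regardless of what later transformations do to the column index.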


  • Hi,

    Thanks for the update. You also set that 'added field' when I request only 1 field. Was it removed as well? In that case the added value held a field name.

    Another question: why do you use a different dataframe structure when only 1 RIC is requested?

    This does not work:

    ts = ek.get_timeseries(['AC.MX'], ...)
    ac_mx = ts['AC.MX']