rd.get_history does not respect SDate and EDate with mixed timeseries and data fields

I noticed today that when a get_history request (in both CodeBook and PyCharm) mixes get_data (TR.*) fields with historical pricing fields, the dates given in SDate and EDate are not respected for the time-series fields.

As the output below shows, the dates are respected for the get_data part but not for the interday time series. In my judgement this is a bug.


import refinitiv.data as rd

rd.open_session()

test = rd.get_history(
    universe="AAPL.O",
    fields=[
        "TR.IssueMarketCap(Scale=6,ShType=FFL)",
        "TR.FreeFloatPct()/100/*FreefloatWeight*/",
        "TR.IssueSharesOutstanding(Scale=3)/*shares outstanding*/",
        "TR.CLOSEPRICE(Adjusted=0)/*close*/",
        "BID",
        "ASK",
    ],
    parameters={"Curn": "USD", "SDate": "2020-10-27", "EDate": "2020-12-01"},
)

test


AAPL.O
Date | Issue Market Cap | TR.FREEFLOATPCT()/100/*FREEFLOATWEIGHT*/ | Issue Default Shares Outstanding | Close Price | BID | ASK
2020-10-01 | <NA> | 0.999315 | <NA> | <NA> | <NA> | <NA>
2020-10-27 | 1981052.854803 | <NA> | 17001802 | 116.6 | <NA> | <NA>
2020-10-28 | 1889305.981596 | <NA> | 17001802 | 111.2 | <NA> | <NA>
2020-10-29 | 1959305.447821 | <NA> | 17001802 | 115.32 | <NA> | <NA>
2020-10-30 | 1849549.003206 | <NA> | 17001802 | 108.86 | <NA> | <NA>
2020-11-01 | <NA> | 0.999313 | <NA> | <NA> | <NA> | <NA>
2020-11-02 | 1848016.003824 | <NA> | 17001802 | 108.77 | <NA> | <NA>
2020-11-03 | 1876389.514225 | <NA> | 17001802 | 110.44 | <NA> | <NA>
2020-11-04 | 1953014.982436 | <NA> | 17001802 | 114.95 | <NA> | <NA>
2020-11-05 | 2022334.696471 | <NA> | 17001802 | 119.03 | <NA> | <NA>
2020-11-06 | 2016558.053634 | <NA> | 17001802 | 118.69 | <NA> | <NA>
2020-11-09 | 1976291.45504 | <NA> | 17001802 | 116.32 | <NA> | <NA>
2020-11-10 | 1970344.910944 | <NA> | 17001802 | 115.97 | <NA> | <NA>
2020-11-11 | 2030150.154426 | <NA> | 17001802 | 119.49 | <NA> | <NA>
2020-11-12 | 2025392.919149 | <NA> | 17001802 | 119.21 | <NA> | <NA>
2020-11-13 | 2026242.425448 | <NA> | 17001802 | 119.26 | <NA> | <NA>
2020-11-16 | 2043912.156477 | <NA> | 17001802 | 120.3 | <NA> | <NA>
2020-11-17 | 2028451.141827 | <NA> | 17001802 | 119.39 | <NA> | <NA>
2020-11-18 | 2005344.570482 | <NA> | 17001802 | 118.03 | <NA> | <NA>
2020-11-19 | 2015708.547335 | <NA> | 17001802 | 118.64 | <NA> | <NA>
2020-11-20 | 1993621.383549 | <NA> | 17001802 | 117.34 | <NA> | <NA>
2020-11-23 | 1934325.843848 | <NA> | 17001802 | 113.85 | <NA> | <NA>
2020-11-24 | 1956752.810153 | <NA> | 17001802 | 115.17 | <NA> | <NA>
2020-11-25 | 1971364.318504 | <NA> | 17001802 | 116.03 | <NA> | <NA>
2020-11-27 | 1980878.789058 | <NA> | 17001802 | 116.59 | <NA> | <NA>
2020-11-30 | 2022674.49899 | <NA> | 17001802 | 119.05 | <NA> | <NA>
2020-12-01 | 2062915.835082 | 0.999208 | 17001802 | 122.72 | <NA> | <NA>
2023-09-14 | <NA> | <NA> | <NA> | <NA> | 175.72 | 175.73
2023-09-15 | <NA> | <NA> | <NA> | <NA> | 174.93 | 174.96
2023-09-18 | <NA> | <NA> | <NA> | <NA> | 177.98 | 177.99
2023-09-19 | <NA> | <NA> | <NA> | <NA> | 179.09 | 179.1
2023-09-20 | <NA> | <NA> | <NA> | <NA> | 175.48 | 175.49
2023-09-21 | <NA> | <NA> | <NA> | <NA> | 173.92 | 173.93
2023-09-22 | <NA> | <NA> | <NA> | <NA> | 174.77 | 174.81
2023-09-25 | <NA> | <NA> | <NA> | <NA> | 176.08 | 176.1
2023-09-26 | <NA> | <NA> | <NA> | <NA> | 171.89 | 171.9
2023-09-27 | <NA> | <NA> | <NA> | <NA> | 170.42 | 170.43
2023-09-28 | <NA> | <NA> | <NA> | <NA> | 170.68 | 170.69
2023-09-29 | <NA> | <NA> | <NA> | <NA> | 171.18 | 171.22
2023-10-02 | <NA> | <NA> | <NA> | <NA> | 173.74 | 173.75
2023-10-03 | <NA> | <NA> | <NA> | <NA> | 172.46 | 172.47
2023-10-04 | <NA> | <NA> | <NA> | <NA> | 173.65 | 173.66
2023-10-05 | <NA> | <NA> | <NA> | <NA> | 174.97 | 174.98
2023-10-06 | <NA> | <NA> | <NA> | <NA> | 177.47 | 177.48
2023-10-09 | <NA> | <NA> | <NA> | <NA> | 178.99 | 179.0
2023-10-10 | <NA> | <NA> | <NA> | <NA> | 178.41 | 178.42
2023-10-11 | <NA> | <NA> | <NA> | <NA> | 179.73 | 179.75


Best Answer

  • @laurens

    Could you please be more specific?

    If we are talking about "interval":

    In Eikon, with get_timeseries:

    Possible values: 'tick', 'minute', 'hour', 'daily', 'weekly', 'monthly', 'quarterly', 'yearly' (Default 'daily')

    in RD with get_history:

    Date interval. Supported intervals are:
    tick, tas, taq, minute, 1min, 5min, 10min, 30min, 60min, hourly, 1h, daily,
    1d, 1D, 7D, 7d, weekly, 1W, monthly, 1M, quarterly, 3M, 6M, yearly, 1Y

    So RD fully covers the Eikon interval options.

    And since we use the historical pricing endpoint, which has its own "interval" requirements (see the refinitiv/data/content/historical_pricing/summaries.py module), you are right...

    Time:

    Backend will return complete N-minute summaries data.
    When the request start and/or end does not at the N minutes boundary,
    the response will be adjusted.

    MINUTE - return complete 1-minute
    ONE_MINUTE - return complete 1-minute
    FIVE_MINUTES - return complete 5-minutes
    TEN_MINUTES - return complete 10-minutes
    THIRTY_MINUTES - return complete 30-minutes
    SIXTY_MINUTES - return complete 60-minutes
    ONE_HOUR - return complete 1-hour
    HOURLY - return complete 1-hour

    Days:

    DAILY - This is end of day, daily data
    ONE_DAY - This is end of day, daily data
    SEVEN_DAYS - Weekly boundary based on the exchange's
    week summarization definition
    WEEKLY - Weekly boundary based on the exchange's
    week summarization definition
    ONE_WEEK - Weekly boundary based on the exchange's
    week summarization definition
    MONTHLY - Monthly boundary based on calendar month
    ONE_MONTH - Monthly boundary based on calendar month
    THREE_MONTHS - Quarterly boundary based on calendar quarter
    QUARTERLY - Quarterly boundary based on calendar quarter
    TWELVE_MONTHS - Yearly boundary based on calendar year
    YEARLY - Yearly boundary based on calendar year
    ONE_YEAR - Yearly boundary based on calendar year
    """

    MINUTE = "PT1M"
    ONE_MINUTE = "PT1M"
    FIVE_MINUTES = "PT5M"
    TEN_MINUTES = "PT10M"
    THIRTY_MINUTES = "PT30M"
    SIXTY_MINUTES = "PT60M"
    HOURLY = "PT1H"
    ONE_HOUR = "PT1H"
    DAILY = "P1D"
    ONE_DAY = "P1D"
    SEVEN_DAYS = "P7D"
    WEEKLY = "P1W"
    ONE_WEEK = "P1W"
    MONTHLY = "P1M"
    ONE_MONTH = "P1M"
    THREE_MONTHS = "P3M"
    QUARTERLY = "P3M"
    TWELVE_MONTHS = "P12M"
    YEARLY = "P1Y"
    ONE_YEAR = "P1Y"

    ... we need a conversion.
    And yes, we have one under the hood. You can find it in the library module refinitiv/data/_access_layer/_intervals_consts.py

    All you need to do is pass one of the supported intervals to the get_history function:

    tick, tas, taq, minute, 1min, 5min, 10min, 30min, 60min, hourly, 1h, daily,
    1d, 1D, 7D, 7d, weekly, 1W, monthly, 1M, quarterly, 3M, 6M, yearly, 1Y

    and the conversion for the historical pricing request is done automatically.
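
The conversion described above can be pictured as a simple lookup. This is a minimal sketch, not a copy of _intervals_consts.py: the table name `INTERVAL_TO_ISO` and the function `to_iso_interval` are hypothetical, and the durations are only the ones that appear in the listing above. `tick`, `tas`, `taq` and `6M` are left out because no value for them is shown in this thread.

```python
# Hypothetical lookup table: get_history interval strings on the left,
# ISO 8601 duration codes (as shown in the listing above) on the right.
INTERVAL_TO_ISO = {
    "minute": "PT1M", "1min": "PT1M",
    "5min": "PT5M",
    "10min": "PT10M",
    "30min": "PT30M",
    "60min": "PT60M",
    "hourly": "PT1H", "1h": "PT1H",
    "daily": "P1D", "1d": "P1D", "1D": "P1D",
    "7D": "P7D", "7d": "P7D",
    "weekly": "P1W", "1W": "P1W",
    "monthly": "P1M", "1M": "P1M",
    "quarterly": "P3M", "3M": "P3M",
    "yearly": "P1Y", "1Y": "P1Y",
}


def to_iso_interval(interval: str) -> str:
    """Return the ISO 8601 duration for a get_history-style interval string."""
    try:
        return INTERVAL_TO_ISO[interval]
    except KeyError:
        # Anything not in the table is rejected rather than guessed.
        raise ValueError(f"unsupported interval: {interval!r}") from None
```

With this table, `to_iso_interval("daily")` and `to_iso_interval("1d")` both give `"P1D"`, which is why either spelling can be passed to get_history.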


    Regarding the date format of the "start" and "end" parameters:

    get_history supports the same formats as Eikon:

    end_date: string or datetime.datetime or datetime.timedelta
    End date and time of the historical range.

    string format could be
    - '%Y-%m-%d' (e.g. '2017-01-20')
    - '%Y-%m-%dT%H:%M:%S' (e.g. '2017-01-20T15:04:05')
    datetime.timedelta is negative number of day relative to datetime.now().
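
To make the accepted formats concrete, here is a minimal sketch of how such a value could be normalised to a `datetime`. The helper `to_datetime` and its `now` argument are hypothetical and only follow the docstring quoted above; the library's own parsing may differ.

```python
from datetime import datetime, timedelta
from typing import Optional, Union

DateLike = Union[str, datetime, timedelta]


def to_datetime(value: DateLike, now: Optional[datetime] = None) -> datetime:
    """Normalise a start/end value to a datetime (hypothetical helper)."""
    if isinstance(value, datetime):
        return value
    if isinstance(value, timedelta):
        # A (negative) timedelta is taken relative to the current time.
        return (now or datetime.now()) + value
    # Try the two string formats named in the docstring.
    for fmt in ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised date string: {value!r}")
```

The `now` argument exists only so that the relative case can be checked against a fixed reference time.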


    I hope this information helps.

    Regarding the warning message: we are considering adding one. I will provide more information later next week.

    Thank you for reaching out to us.





Answers

  • Hi @laurens ,

    May I ask what you mean by 'The described dates are not respected for the timeseries fields.'?
    As an example, is there an issue with the date shown here?
    1697191587069.png

    Do you mean that you do not expect these <NA> values?

  • Please check the time-series fields BID and ASK. The requested "SDate": "2020-10-27" and "EDate": "2020-12-01" are not respected. Instead, the output shows dates for this year (2023).

  • See also this direct CodeBook snippet, which gives the same result as yesterday's initial post: 1697192292602.png

  • Hi @laurens,


    The `parameters` argument in `rd.get_history` and `rd.get_data` applies only to TR fields. This detail can be found in the docstrings:

    parameters: str | dict, optional
        Single global parameter key=value or dictionary
        of global parameters to request.
        Applies only to TR fields.


    The reason is that `parameters` is sent to the Datagrid server, the one that serves all TR fields. The `start` and `end` parameters, by contrast, apply dates both to ADC (another server/service/endpoint family) and to time series (a service accessible via the "historical-pricing" endpoint family). All of these servers/services/endpoint families are accessible via the Refinitiv Data Library that you are using here.
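
A rough picture of that routing, as a minimal sketch: it assumes that fields are told apart by their "TR." prefix, which matches the docstring above but is not confirmed to be the library's actual rule. The function name `split_fields` is hypothetical.

```python
from typing import List, Tuple


def split_fields(fields: List[str]) -> Tuple[List[str], List[str]]:
    """Split fields into (TR fields, pricing fields) by their 'TR.' prefix."""
    tr_fields = [f for f in fields if f.upper().startswith("TR.")]
    pricing_fields = [f for f in fields if not f.upper().startswith("TR.")]
    return tr_fields, pricing_fields


tr, pricing = split_fields(
    ["TR.IssueMarketCap(Scale=6,ShType=FFL)", "TR.CLOSEPRICE(Adjusted=0)", "BID", "ASK"]
)
print(tr)       # fields that `parameters` (SDate/EDate) applies to
print(pricing)  # fields that only `start`/`end` apply to
```

In the original call, BID and ASK fall into the second group, so the SDate and EDate in `parameters` never reach them.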

    The discrepancy you're experiencing is caused by the fact that "historical-pricing" has no global or local parameters (such as `parameters`) at all.

    We do not advise using `parameters` for historical usage.

    In your specific call, the first row returned by Datagrid includes data for the date "2020-10-01", but "historical-pricing" returns nothing for that date, hence the <NA>s.


    It seems as though the TR fields you are requesting do not change when <NA>s show up in the response. In this case, I would suggest using separate calls. I do this myself with fields that only return data when a change occurs, just like what we see here.



    import pandas as pd
    import refinitiv.data as rd

    rd.open_session(
        # name="platform.rdp",
        config_name="C:/Example.DataLibrary.Python-main/Configuration/refinitiv-data.config.json")

    # TR fields go to Datagrid, which honours SDate/EDate in `parameters`.
    TRflds: list[str] = [
        "TR.CLOSEPRICE(Adjusted=0)/*close*/.date",
        "TR.CLOSEPRICE(Adjusted=0)/*close*/",
        "TR.IssueMarketCap(Scale=6,ShType=FFL)",
        "TR.FreeFloatPct()/100/*FreefloatWeight*/",
        "TR.IssueSharesOutstanding(Scale=3)/*shares outstanding*/"]
    # Pricing fields go to historical pricing, which needs `start`/`end`.
    RTflds: list[str] = ["BID", "ASK"]

    test0: pd.DataFrame = rd.get_data(
        universe="AAPL.O", fields=TRflds,
        parameters={"Curn": "USD", "SDate": "2020-10-27", "EDate": "2020-12-01"})
    # Drop rows with missing values; copy so the column assignment below is safe.
    test: pd.DataFrame = test0.dropna().copy()

    test1: pd.DataFrame = rd.get_history(
        universe="AAPL.O", fields=RTflds,  # interval='daily',
        # Note that this is one day before the `get_data` call, as it is
        # exclusive with respect to its 1st index (in this case, date).
        start="2020-10-26",
        end="2020-12-01")

    # Positional assignment: assumes both frames hold the same dates in the same order.
    test[["BID", "ASK"]] = test1[["BID", "ASK"]].values

    rd.close_session()
    test



    1697457056582.png


    Please let me know whether this is a satisfactory resolution; otherwise I will go back to the product owners to see what could be done to update the RD Library for Python.

  • Hi Jonathan,


    Thanks for your message. I understand that the get_history function in essence makes two calls: a historical_pricing request and a get_data API request.


    The example you are giving in your answer can be executed by only using:


    rd.get_data()

    There is then no use case for the get_history function.


    The major reason for using get_history (and not making 2 separate requests) is that the dates are synchronized across all requested fields (historical pricing and get_data).


    In my first post this synchronization is gone. This is of course (despite being mentioned in the documentation) unexpected behaviour.


    The most logical fix is to check whether SDate and EDate are present in the parameters and, if so, pass them as the start and end parameters of the historical_pricing request.
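
A minimal sketch of what that fix could look like on the caller's side. The helper `promote_dates` is hypothetical and not part of the RD library; it only handles plain YYYY-MM-DD values and assumes the keys are spelled exactly `SDate` and `EDate`.

```python
import re
from typing import Dict, Optional, Tuple

ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def promote_dates(
    parameters: Dict[str, str],
) -> Tuple[Optional[str], Optional[str], Dict[str, str]]:
    """Move absolute SDate/EDate values out of `parameters`.

    Returns (start, end, remaining_parameters). Values that are not plain
    YYYY-MM-DD dates (for example "FY0" or "0D") are left untouched.
    """
    remaining = dict(parameters)  # do not mutate the caller's dict
    start = end = None
    if ISO_DATE.match(remaining.get("SDate", "")):
        start = remaining.pop("SDate")
    if ISO_DATE.match(remaining.get("EDate", "")):
        end = remaining.pop("EDate")
    return start, end, remaining


start, end, params = promote_dates(
    {"Curn": "USD", "SDate": "2020-10-27", "EDate": "2020-12-01"}
)
# The result could then be passed on as:
# rd.get_history(universe=..., fields=..., start=start, end=end, parameters=params)
```

Relative codes are deliberately not promoted here, since resolving something like "FY0" needs information that only the server has.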


    The other solution is to simply remove the get_history function from the package. Or, at the very least, issue a warning that parameters may not be respected when they are supplied.


    In its current state the get_history function is very confusing, in my opinion.

    I am currently transitioning a lot of my code from Eikon to the new RD library, and I would of course like to keep my formulas and parameters the same as far as possible.

  • Hi @laurens .

    I see your intentions and understand your issue. The get_history design lets you apply a date range to both requests under the hood by means of the "start" and "end" parameters. The other global "parameters" values apply exclusively to the Datagrid request.

    Let's take a look at the following example:

    flds: list[str] = [
        "TR.IssueMarketCap(Scale=6,ShType=FFL)",
        "TR.FreeFloatPct()/100/*FreefloatWeight*/",
        "TR.IssueSharesOutstanding(Scale=3)/*shares outstanding*/",
        "TR.CLOSEPRICE(Adjusted=0)/*close*/",
        "BID",
        "ASK",
    ]
    test = rd.get_history(
        universe="AAPL.O",
        fields=flds,
        start="2020-10-21",
        end="2020-11-17",
        parameters={"Curn": "USD"},
    )

    In this example, two requests are created under the hood:

    • A Datagrid request that includes the date range:
    {'Entity': {'E': 'DataGrid_StandardAsync', 'W': {'requests': [{'instruments': ['AAPL.O'], 'fields': [{'name': 'TR.ISSUEMARKETCAP(SCALE=6,SHTYPE=FFL)'}, {'name': 'TR.FREEFLOATPCT()/100/*FREEFLOATWEIGHT*/'}, {'name': 'TR.ISSUESHARESOUTSTANDING(SCALE=3)/*SHARES OUTSTANDING*/'}, {'name': 'TR.CLOSEPRICE(ADJUSTED=0)/*CLOSE*/'}], 'parameters': {'SDate': '2020-10-21', 'EDate': '2020-11-17', 'Curn': 'USD'}, 'layout': {'columns': [{'item': 'dataitem'}], 'rows': [{'item': 'instrument'}, {'item': 'date'}]}}]}}}
    • A Historical Pricing request that includes the same date range:
     http://localhost:9001/api/rdp/data/historical-pricing/v1/views/interday-summaries/AAPL.O?start=2020-10-21T00:00:00.000000000Z&end=2020-11-17T00:00:00.000000000Z&fields=BID,ASK,DATE
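
For illustration, a URL of that shape can be assembled from a start and end date as follows. This is a sketch only: the base URL is copied from the logged request, while the function `interday_url` is hypothetical and does not reflect how the library builds its requests internally.

```python
from datetime import date
from typing import List
from urllib.parse import urlencode

BASE = "http://localhost:9001/api/rdp/data/historical-pricing/v1/views/interday-summaries"


def interday_url(ric: str, start: date, end: date, fields: List[str]) -> str:
    """Build an interday-summaries URL shaped like the logged request."""
    query = {
        "start": f"{start.isoformat()}T00:00:00.000000000Z",
        "end": f"{end.isoformat()}T00:00:00.000000000Z",
        # DATE is appended to the requested fields, as in the logged request.
        "fields": ",".join(fields + ["DATE"]),
    }
    # Keep ':' and ',' unescaped so the result matches the logged form.
    return f"{BASE}/{ric}?{urlencode(query, safe=':,')}"


url = interday_url("AAPL.O", date(2020, 10, 21), date(2020, 11, 17), ["BID", "ASK"])
print(url)
```

The point is that `start` and `end` travel as query arguments of the pricing request, whereas nothing from `parameters` appears in it.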

    If the dates returned by the Datagrid and Historical Pricing endpoints do not intersect, we fill in the dataframe with N/A values.
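
The effect of that fill-in can be reproduced with a small date-keyed outer join. This is a minimal sketch using plain dictionaries rather than the library's dataframes; the function `outer_join_by_date` is hypothetical, and the sample values are taken from the output in the original question.

```python
from typing import Dict, List, Optional

Row = Dict[str, float]


def outer_join_by_date(
    datagrid: Dict[str, Row], pricing: Dict[str, Row], columns: List[str]
) -> Dict[str, Dict[str, Optional[float]]]:
    """Union the dates of both responses; missing cells become None (N/A)."""
    merged = {}
    for day in sorted(set(datagrid) | set(pricing)):
        row = {**datagrid.get(day, {}), **pricing.get(day, {})}
        merged[day] = {col: row.get(col) for col in columns}
    return merged


# Date ranges that do not overlap, as in the original question.
datagrid = {"2020-10-27": {"Close Price": 116.6}}
pricing = {"2023-09-14": {"BID": 175.72, "ASK": 175.73}}
table = outer_join_by_date(datagrid, pricing, ["Close Price", "BID", "ASK"])
for day, row in table.items():
    print(day, row)
```

Each date ends up with values from only one of the two sources, which is the pattern of <NA> blocks seen when the two requests cover different periods.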

    screenshot-2023-10-20-at-135120.png

    raw responses also attached

    responses.zip


    I agree that the information about `parameters` may not be easy to find. There is also a reference guide, https://doccloud.int.refinitiv.com/content/rdp-python-library/1.0.0-beta/rdp-python-library, with instructions that could help with the migration process.

  • Hi andrii.sidachenko01.

    Thanks for your answer. I agree that rewriting the request with start and end dates given as strings gives the correct answer.

    I am, however, of the opinion that get_history() should at least issue a warning when it detects SDate and EDate in `parameters` in combination with historical pricing fields.
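
Such a check could be as small as the sketch below. The function `check_history_request` is hypothetical, and it assumes that any field without a "TR." prefix is a historical pricing field.

```python
import warnings
from typing import Dict, List


def check_history_request(fields: List[str], parameters: Dict[str, str]) -> None:
    """Warn when SDate/EDate would be ignored by the pricing fields."""
    date_keys = [k for k in parameters if k.lower() in ("sdate", "edate")]
    pricing_fields = [f for f in fields if not f.upper().startswith("TR.")]
    if date_keys and pricing_fields:
        warnings.warn(
            f"{date_keys} in `parameters` apply only to TR fields; "
            f"{pricing_fields} will ignore them. Use `start`/`end` instead.",
            UserWarning,
            stacklevel=2,
        )
```

Because it looks only at the key names, the same check would also fire for relative values such as "FY0" or "0D".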

    I also notice that there are currently many different date formats:

    1. the old Eikon date format, e.g. "0D"
    2. the new Refinitiv date format, e.g. "P1D"
    3. dates as strings, "YYYY-MM-DD"

    Maybe it would be a good idea to have an under-the-hood conversion between these date formats, so that the underlying API calls can be made correctly.

  • 1. Regarding interval:

    You are right: "P1D" is the new interval format for RD, instead of "daily" for Eikon. I made a mistake there.

    2. However, you missed my point: I can still use other Eikon date formats. For example, the following query also gives a wrong result for the BID and ASK fields, without any warning:

    test = rd.get_history(
        universe="AAPL.O",
        fields=[
            "TR.IssueMarketCap(Scale=6,ShType=FFL)",
            "TR.FreeFloatPct()/100/*FreefloatWeight*/",
            "TR.IssueSharesOutstanding(Scale=3)/*shares outstanding*/",
            "TR.CLOSEPRICE(Adjusted=0)/*close*/",
            "BID",
            "ASK",
        ],
        parameters={"Curn": "USD", "SDate": "FY0", "EDate": "0D"},
    )

    test

    This uses SDate = "FY0" and EDate = "0D".

    The fact that Eikon input like "FY0" and "0D" is accepted and produces a result is already very confusing, especially if you are transitioning from Eikon to RD.

    3. If you ask me, the conversion should take place inside the function that accepts the arguments. Why should I, as an end user, be bothered with converting something that is already in a format Eikon accepts?

  • 1. You can still use interval="daily" in get_history.

    2, 3. As I mentioned, we are considering adding a warning for "parameters". That is reasonable.

    Could you please share an Eikon example in which you retrieve Datagrid and pricing data using "parameters" with "FY0" and "0D", and which you need to adapt to RD, so that we can see your use case more clearly?

    Thank you.