API Pagination in Python Using Refinitiv Data API

Hello,

I'm working on a Python project using the Refinitiv Data API to retrieve large sets of equity quote data. I need to fetch data in chunks of 10,000 records due to API limitations, but I'm struggling to implement pagination that neither overlaps nor skips records. The API doesn't seem to support direct offset management.

Here's what I'm trying to do:

  1. Retrieve 10,000 records at a time.
  2. Ensure the next batch of 10,000 starts right after the last record of the previous batch.
  3. I'm using the rd.discovery.search method but can't find a way to paginate properly.

Here is code using an offset (I know that search() doesn't have such an argument; it's just an example):


import refinitiv.data as rd
import pandas as pd

def retrieve_data():
    rd.open_session()
    offset = 0
    all_data = pd.DataFrame()

    while True:
        results = rd.discovery.search(
            view=rd.discovery.Views.EQUITY_QUOTES,
            top=10000,
            filter="(RCSAssetCategoryLeaf eq 'Ordinary Share')",
            select="RIC, DTSubjectName, RCSAssetCategoryLeaf, ExchangeName, ExchangeCountry",
            offset=offset  # Adjust the offset for each iteration
        )

        df = pd.DataFrame(results)
        if df.empty:
            break

        all_data = pd.concat([all_data, df], ignore_index=True)
        offset += 10000  # Increase the offset to get the next batch of records

        df.to_csv(f'./Retrieved_RICs_{offset}.csv')

    all_data.to_csv('./Retrieved_RICs_All.csv')
    print("Data retrieval complete.")

    rd.close_session()

retrieve_data()

Any advice on managing continuation tokens or other effective methods would be greatly appreciated!

Thank you in advance!

Best Answer

  • nick.zincone
    Answer ✓

    Hi @vitali

    I don't know the policies around rate limits and recovery. I did look through the API Playground reference documentation on this:

    [screenshot of the relevant API Playground reference documentation]

    I would suggest you open a support ticket; they will bring in the Search team, which manages the policies around these issues.

Answers

  • Hi @vitali

    Within the Building Search into your Application Workflow article, there is a section dedicated to Limits. Accompanying that section are a number of examples and ways to work around the challenges limits pose in your workflow.
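
    For illustration, here is a rough sketch of the partitioning pattern that article describes: first ask Search for navigator buckets only (top=0), then issue one request per bucket so each segment stays under the 10,000-document ceiling. The view, filter, and navigator below are borrowed from the code later in this thread; the bucket count is an arbitrary choice.

    import refinitiv.data as rd
    from refinitiv.data.content import search
    import pandas as pd

    rd.open_session()

    base_filter = "RCSAssetCategoryLeaf eq 'Ordinary Share'"

    # Step 1: request no documents (top=0), only the navigator buckets that
    # partition the result set into market-cap ranges.
    buckets = search.Definition(
        view=search.Views.EQUITY_QUOTES,
        filter=base_filter,
        top=0,
        navigators="MktCapTotal(type:range, buckets:10)"
    ).get_data().data.raw["Navigators"]["MktCapTotal"]["Buckets"]

    # Step 2: one request per bucket. If any bucket still holds more than
    # 10,000 documents, increase the bucket count above and rerun.
    frames = []
    for bucket in buckets:
        bucket_filter = bucket.get("Filter")
        if not bucket_filter:
            continue
        response = search.Definition(
            view=search.Views.EQUITY_QUOTES,
            filter=f"{base_filter} and {bucket_filter}",
            top=10000
        ).get_data()
        if response.data.df is not None:
            frames.append(response.data.df)

    all_data = pd.concat(frames, ignore_index=True)
    rd.close_session()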

  • Hi, @nick.zincone! Thank you for your response. I attempted to implement the code per the guidelines in Article.DataLibrary.Python.Search. However, I encountered Error code 429: {"message":"Too many requests, please try again later."}. Could you please advise on how to proceed?
    Code:

    import refinitiv.data as rd
    from refinitiv.data.content import search
    import pandas as pd


    rd.open_session()


    response = search.Definition(
        view=search.Views.EQUITY_QUOTES,
        filter="RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf eq 'Ordinary Share'",
        top=0,
        navigators="MktCapTotal(type:range, buckets:13)"
    ).get_data()



    market_cap_filter = response.data.raw["Navigators"]["MktCapTotal"]["Buckets"][1]["Filter"]
    full_filter = f"RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf eq 'Ordinary Share' and {market_cap_filter}"
    print(market_cap_filter)

    response = search.Definition(
        view=search.Views.EQUITY_QUOTES,
        filter=full_filter,
        top=10000
    ).get_data()


    print(f"Request resulted in a segment of {response.total} documents.")


    rd.close_session()


    Output:

    ---------------------------------------------------------------------------
    RDError                                   Traceback (most recent call last)
    /tmp/ipykernel_442/2001824659.py in <module>
          6 
          7 # Fetching market cap ranges to use for filters
    ----> 8 response = search.Definition(
          9     view=search.Views.EQUITY_QUOTES,
         10     filter="RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf eq 'Ordinary Share'",


    /opt/conda/lib/python3.8/site-packages/refinitiv/data/delivery/_data/_data_provider_layer.py in get_data(self, session, on_response)
        154             )
        155         on_response and emit_event(on_response, response, self, session)
    --> 156         self._check_response(response, session.config)
        157         return response
        158 


    /opt/conda/lib/python3.8/site-packages/refinitiv/data/delivery/_data/_data_provider_layer.py in _check_response(self, response, config)
        124 
        125     def _check_response(self, response: Response, config):
    --> 126         _check_response(response, config)
        127 
        128     def get_data(


    /opt/conda/lib/python3.8/site-packages/refinitiv/data/delivery/_data/_data_provider_layer.py in _check_response(response, config, response_class)
         31 
         32             error.response = response
    ---> 33             raise error
         34 
         35 


    RDError: Error code 429 | {"message":"Too many requests, please try again later."}
  • Hi again, @nick.zincone. Could you help me with the issue I described above?
  • Hi @vitali

    This is what I was referring to within the Limits section of the article I linked. The service has guardrails to prevent excessive load. One of the main reasons users run into this issue is requesting too much data continuously. I don't know your specific requirements or whether you need to request large amounts of data, but in most cases, placing filters around data requests to limit the load and the amount of data returned usually helps. What clients mistakenly do is request a massive amount of data and then filter it within their applications; this will ultimately lead to limits being hit.
    Unfortunately, the only thing I can suggest is that you wait, or temper/control your requests.
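
    For illustration, a minimal sketch of tempering requests with a simple exponential backoff. The helper name, retry count, and delays are arbitrary choices, not library features; it simply retries when the error message contains 429, as in the RDError traceback above.

    import time

    def get_data_with_backoff(definition, max_retries=5, base_delay=10.0):
        # Retry a Search request when the service answers with HTTP 429,
        # doubling the wait between attempts.
        for attempt in range(max_retries):
            try:
                return definition.get_data()
            except Exception as exc:  # the library raises RDError on a 429
                if "429" not in str(exc):
                    raise  # not a rate-limit error, surface it
                delay = base_delay * (2 ** attempt)
                print(f"Rate limited; waiting {delay:.0f}s before retrying...")
                time.sleep(delay)
        raise RuntimeError("Still rate limited after all retries.")

    # Usage: response = get_data_with_backoff(search.Definition(view=..., filter=..., top=10000))

    Note that if the 429 reflects a daily quota rather than a burst limit, backing off within a session will not help; the only option is to wait for the quota to reset.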

  • @nick.zincone, I am currently encountering an issue where I receive the following error message each time I attempt to execute any code:

    Error code 429 | {"message":"Too many requests, please try again later."}

    It appears that I have reached the daily limit for requests. Could you please confirm if this is the case and advise when the limit will reset, allowing me to submit requests again?

    Thank you!

  • Got it, thank you, @nick.zincone, for your time. I appreciate it.