How to correctly paginate through search results?
Hi,
I have a search that returns roughly 450,000 results, so I thought I could simply paginate through the results using the top and skip parameters, but that does not work.
The issue I get is:
Result is maxed at 10000 while the total is 453761 rows.
Requested - 100, skipped - 9900 rows.
The code I use:
search = rd.discovery.search(
    view = rd.discovery.Views.INDICATOR_QUOTES,
    top = 100,
    skip = 9900,
    filter = "( SearchAllCategoryv2 eq 'Economic Indicators')",
    select = "RIC, RCSCountryOfIndicatorLeaf, CommonName, Periodicity, StartDate, EndDate, ObservationDate, PreviousReleaseDate, NextRelease"
)
When I try to skip 10,000, I get this error:
RDError: Error code 400 | Invalid result window: (top + skip) must not exceed 10,000
So, how do I get the other 443,761 rows?
Thanks,
Andreas
Best Answer
Hi @andreas01
If you need to retrieve this significant number of RIC codes, you can take advantage of navigators, which can split your search request into reasonable buckets in which the number of results stays below the accepted maximum of 10k.
The original request returns around 450k results, but if you remove China (which you can later split into similar buckets) it drops to around 250k. The idea is to segment the response by the country property "RCSCountryOfIndicatorLeaf", which can then be used to build filters that narrow down the search results:
response = search.Definition(
    view = rd.discovery.Views.INDICATOR_QUOTES,
    filter = "SearchAllCategoryv2 eq 'Economic Indicators' and RCSCountryOfIndicatorLeaf ne 'China'",
    top = 0,
    navigators = "RCSCountryOfIndicatorLeaf"
).get_data()
# the format of the output
response.data.raw['Navigators']['RCSCountryOfIndicatorLeaf']['Buckets']

Looking at the output, you can see that the largest result set for a single country is 8,902 rows (United States).
Now you can create a loop that goes through each bucket, puts it into the search filter, and merges the hits into a final results list:

results = []
for i in response.data.raw['Navigators']['RCSCountryOfIndicatorLeaf']['Buckets']:
    response = search.Definition(
        view = rd.discovery.Views.INDICATOR_QUOTES,
        filter = f"SearchAllCategoryv2 eq 'Economic Indicators' and RCSCountryOfIndicatorLeaf xeq '{i}'",
        select = "RIC, RCSCountryOfIndicatorLeaf, CommonName, Periodicity, StartDate, EndDate, ObservationDate, PreviousReleaseDate, NextRelease",
        top = 10000,
    ).get_data()
    results.extend(response.data.raw['Hits'])
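For reference, the navigator payload nests each country in a 'Buckets' list of dicts, so the country name lives in each bucket's 'Label' key. A minimal mock of that shape (values are illustrative, not live results, except the United States count quoted above):

```python
# Minimal mock of the Search navigator payload shape described in this thread
# (the Japan count is made up for illustration).
raw = {
    "Navigators": {
        "RCSCountryOfIndicatorLeaf": {
            "Buckets": [
                {"Label": "United States", "Count": 8902},
                {"Label": "Japan", "Count": 4100},
            ]
        }
    }
}

# Each bucket is a dict, so the country name comes from its 'Label' key.
buckets = raw["Navigators"]["RCSCountryOfIndicatorLeaf"]["Buckets"]
labels = [b["Label"] for b in buckets]
print(labels)  # ['United States', 'Japan']
```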
Answers
Hi @andreas01,
As far as I know, there is unfortunately no way to paginate past that limit in Search. What you can do is provide additional filters that reduce the size of each request to below 10,000 and then extract everything by iterating over the filter criteria. One option is to play with the StartDate and EndDate parameters and/or RCSCountryOfIndicatorLeaf and Periodicity. See an example filter below using these:
filter = "( SearchAllCategoryv2 eq 'Economic Indicators') and StartDate ge 1996-01-01 and StartDate le 1997-01-01 and Periodicity eq 'Annual' and RCSCountryOfIndicatorLeaf eq 'China (Mainland)' "
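To make the iteration concrete, here is a small sketch (plain Python, no API calls) that generates yearly StartDate windows from the base filter in the question; the 1996-1998 year range is an illustrative assumption:

```python
# Sketch: build yearly StartDate windows so each request stays under the
# 10,000-row limit. The base filter comes from the thread; the year range
# is an illustrative assumption.
base = "( SearchAllCategoryv2 eq 'Economic Indicators')"
yearly_filters = [
    f"{base} and StartDate ge {y}-01-01 and StartDate lt {y + 1}-01-01"
    for y in range(1996, 1999)
]
for f in yearly_filters:
    print(f)
```

Using `lt` against the next year's start date avoids counting a boundary date in two adjacent windows, which the `ge`/`le` pair in the example filter would do.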
I hope this helps.
Best regards,
Haykaz
That's brilliant, thank you! I had to adjust it a little bit:
{i}
had to become
{i}['Label']
but other than that it works nicely.
Do you happen to have a suggestion for the China "problem"? What could be a good bucket for that one?
Thanks a lot, @m.bunkowski
Hi @andreas01
You can do it in a similar way, but with a different filter and navigator:
response = search.Definition(
    view = rd.discovery.Views.INDICATOR_QUOTES,
    filter = "SearchAllCategoryv2 eq 'Economic Indicators' and RCSCountryOfIndicatorLeaf eq 'China'",
    top = 0,
    navigators = "ObservationValue(buckets:50)"
).get_data()

and then:
results = []
for i in response.data.raw['Navigators']['ObservationValue']['Buckets']:
    response = search.Definition(
        view = rd.discovery.Views.INDICATOR_QUOTES,
        filter = f"SearchAllCategoryv2 eq 'Economic Indicators' and RCSCountryOfIndicatorLeaf eq 'China' and {i['Filter']}",
        select = "RIC, RCSCountryOfIndicatorLeaf, CommonName, Periodicity, StartDate, EndDate, ObservationDate, PreviousReleaseDate, NextRelease",
        top = 10000,
    ).get_data()
    results.extend(response.data.raw['Hits'])
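Since the value buckets are built on ObservationValue ranges, it may be worth de-duplicating the merged list defensively once the loop finishes. A plain-Python sketch, assuming each hit dict carries the 'RIC' key from the select list above (the sample hits are mock data for illustration only):

```python
# De-duplicate merged hits by RIC, keeping the first occurrence.
# Mock hits for illustration; real hits come from the loop above.
hits = [
    {"RIC": "aCNGDP", "CommonName": "China GDP"},
    {"RIC": "aCNCPI", "CommonName": "China CPI"},
    {"RIC": "aCNGDP", "CommonName": "China GDP"},  # duplicate
]

seen = set()
unique_hits = []
for h in hits:
    if h["RIC"] not in seen:
        seen.add(h["RIC"])
        unique_hits.append(h)

print(len(unique_hits))  # 2
```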
That is awesome, I'm learning some great new tricks here. Thank you @m.bunkowski