How do I clean the dataset?
Hi Dev Community,
Just a little background on my coding experience: I'm fairly green when it comes to coding in Python, picking things up and learning as I go along. I've only really got experience with VBA and am self-taught. Apologies in advance if the solution is pretty obvious.
I'm trying to pull some historical FX data to determine the average daily % change between close dates.
df = ek.get_timeseries(rics,
                       start_date='2021-03-01',
                       end_date='2022-06-30',
                       fields='Open Close',
                       interval='daily')
*** On a separate note, is there a limit to the amount of data I can pull on these queries?
I've exported the dataframe to Excel and noticed that some currencies include prices on weekends. I've handled these by converting the index to strings that include the day name and then excluding the weekend rows from the dataframe.
df.index = df.index.strftime('%a, %d-%b-%y')
df = df[~df.index.str.contains('Sat|Sun')]  # regex alternation; a second positional argument would be read as `case`
I've also noticed that some currencies have missing opening or closing prices. I'm struggling to figure out how to cycle through the dataframe (array?) to account for these missing prices.
- If there is a missing OPEN price, I want to populate the field with the prior day's CLOSE price
- If there is a missing CLOSE price, I want to populate the field with the next day's OPEN price.
Once this is all sorted, I'll use the following to calculate the daily % change and the average.
df_delta = df.pct_change()
df_mean = df_delta.mean()
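Since the goal stated above is the average daily % change between closes, one possible refinement is to select only the CLOSE columns before calling pct_change. The sketch below uses hypothetical mock data and assumes the columns form a (RIC, field) MultiIndex, as the indexing in the answer's code suggests; it is not the output of a real ek.get_timeseries call.

```python
import pandas as pd

# Hypothetical mock data standing in for the real time series.
# Assumption: columns are a (RIC, field) MultiIndex.
idx = pd.date_range("2021-03-01", periods=3, freq="D")
cols = pd.MultiIndex.from_product([["EUR=", "GBP="], ["OPEN", "CLOSE"]])
df = pd.DataFrame(
    [[1.00, 1.00, 2.00, 2.00],
     [1.00, 1.10, 2.00, 2.20],
     [1.10, 1.21, 2.20, 2.42]],
    index=idx, columns=cols,
)

# Keep only the CLOSE field; the result has one column per RIC.
close = df.xs("CLOSE", axis=1, level=1)

# Close-to-close change in percent, then the average per RIC.
daily_pct = close.pct_change() * 100
mean_pct = daily_pct.mean()
print(mean_pct)
```

Calling pct_change on the whole frame would also compute open-to-open changes, which may or may not be wanted.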
Thanks!
Best Answer
Hi @allatuw,
First, the limits are documented in the Eikon Data API Usage and Limits Guideline.
As for the missing prices, let's say the list of RICs is as follows:
rics = ['AUD=','EUR=','GBP=','CNY=','CHF=','NZD=']
df = ek.get_timeseries(rics,
                       start_date='2021-03-01',
                       end_date='2022-06-30',
                       fields='Open Close',
                       interval='daily')
df.index = df.index.strftime('%a, %d-%b-%y')
df = df.loc[~df.index.str.contains('Sat|Sun')]
(I've adjusted the code that filters out the Sat and Sun rows a bit.)
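As an alternative sketch, the weekend rows can be dropped while the index is still a DatetimeIndex, before any strftime conversion, which avoids string matching altogether. The frame below is hypothetical mock data, not real Eikon output, and it assumes the index returned by the API is a DatetimeIndex.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by ek.get_timeseries:
# a DatetimeIndex covering one full week, Monday 2021-03-01 to Sunday 2021-03-07.
idx = pd.date_range("2021-03-01", periods=7, freq="D")
df = pd.DataFrame({"CLOSE": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]}, index=idx)

# dayofweek is 0 for Monday through 6 for Sunday, so < 5 keeps weekdays only.
weekdays_only = df[df.index.dayofweek < 5]
print(weekdays_only)
```

Keeping the index as dates also means it still sorts chronologically, and it can be formatted with strftime later if the day name is needed for display.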
To replace the missing prices, the code below can be used:
for ric in rics:
    # fill a missing OPEN with the prior day's CLOSE
    df[(ric, 'OPEN')] = df[(ric, 'OPEN')].fillna(df[(ric, 'CLOSE')].shift(1))
    # fill a missing CLOSE with the next day's OPEN
    df[(ric, 'CLOSE')] = df[(ric, 'CLOSE')].fillna(df[(ric, 'OPEN')].shift(-1))
Here's an output example for the case where a missing OPEN price is populated with the prior day's CLOSE price.
Hope this helps. Please let me know if you have further questions.
Answers
Testing with the mock up data
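A minimal sketch of such a test is shown here, using hypothetical mock data for a single RIC. The (RIC, field) MultiIndex column layout is an assumption based on the indexing used in the fill code, and the prices are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical mock data: one RIC, with a missing OPEN on the second row
# and a missing CLOSE on the third row.
idx = pd.date_range("2021-03-01", periods=4, freq="D")
cols = pd.MultiIndex.from_product([["EUR="], ["OPEN", "CLOSE"]])
mock = pd.DataFrame(
    [[1.20, 1.21],
     [np.nan, 1.22],
     [1.22, np.nan],
     [1.23, 1.24]],
    index=idx, columns=cols,
)

for ric in mock.columns.get_level_values(0).unique():
    open_col, close_col = (ric, "OPEN"), (ric, "CLOSE")
    # Missing OPEN takes the previous row's CLOSE.
    mock[open_col] = mock[open_col].fillna(mock[close_col].shift(1))
    # Missing CLOSE takes the next row's OPEN.
    mock[close_col] = mock[close_col].fillna(mock[open_col].shift(-1))

print(mock)
```

Assigning the filled column back, rather than calling fillna with inplace=True on a selected column, is a deliberate choice here: it does not depend on whether the selection is a view or a copy. Note that shift works by row position, so "previous" and "next" mean the adjacent rows that remain after any weekend filtering.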
Brilliant, thanks for your help @raksina.samasiri !