A 'get_news_story' request into a dataframe
Hi, I am using ek.get_news_headlines to display a dataframe of 5 news articles for a particular company. i.e.
df = ek.get_news_headlines('GOOG.O AND Language:LEN', date_from='2021-01-01T09:00:00', date_to='2023-06-30T23:59:59', count = 5)
The above works fine and displays the last 5 storyIds, but I'd like to use the ek.get_news_story request to loop through the rows in the above df and pull the article for each storyId into another dataframe. When I try the snippet below, which I found in another post, I just get an HTML dump from the first storyId only.
for idx, storyId in enumerate(headlines['storyId'].values): #for each row in our df dataframe
    newsText = ek.get_news_story(storyId) #get the news story
    time.sleep(5) # sleep for 5 seconds
    print(newsText)
I'd ideally like to see 1 new dataframe containing 5 rows (one row for each news article), one column with the news article's title, another column containing just the text from each article (no HTML tags!), and then another column of the URL.
Any help would be greatly appreciated.
Thank you!
Best Answer
Thank you for reaching out to us.
To get the story text without HTML tags, you need to use the Refinitiv Data Library for Python. The example code is available on GitHub.
The code looks like this:
import time
import pandas as pd
import refinitiv.data as rd

rd.open_session()  # open a session before requesting any data

headlines = rd.news.get_headlines('GOOG.O AND Language:LEN',
                                  start='2021-01-01T09:00:00',
                                  end='2023-06-30T23:59:59',
                                  count=5)
rows = []  # DataFrame.append was removed in pandas 2.0, so collect rows in a list
for index, row in headlines.iterrows():
    # get the news story as plain text
    newsText = rd.news.get_story(row['storyId'], format=rd.news.Format.TEXT)
    rows.append({'headline': row['headline'], 'story': newsText, 'storyid': row['storyId']})
    time.sleep(5)  # pause between requests
df = pd.DataFrame(rows, columns=['headline', 'story', 'storyid'])
df
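If you stay with ek.get_news_story, which returns HTML as described in the question, the tags can also be stripped locally. The following is only a minimal sketch using the Python standard library; strip_html and _TextExtractor are hypothetical helper names, not part of either Refinitiv library, and the whitespace handling is an assumption that may need adjusting for real stories.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags, scripts and styles."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the text nodes with spaces and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html):
    """Return the visible text of an HTML fragment as a single string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Inside the loop from the question this would be used as strip_html(ek.get_news_story(storyId)). Because text nodes are joined with a space, text split by inline tags in the middle of a word would gain an extra space.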
Answers
Thank you, this worked. Any idea of how I can include a column for the timestamp of each article too?
Please try this one:
import time
import pandas as pd
import refinitiv.data as rd

rd.open_session()  # open a session before requesting any data

headlines = rd.news.get_headlines('GOOG.O AND Language:LEN',
                                  start='2021-01-01T09:00:00',
                                  end='2023-06-30T23:59:59',
                                  count=5)
headlines = headlines.reset_index()  # turn the versionCreated index into a column
rows = []  # DataFrame.append was removed in pandas 2.0, so collect rows in a list
for index, row in headlines.iterrows():
    # get the news story as plain text
    newsText = rd.news.get_story(row['storyId'], format=rd.news.Format.TEXT)
    rows.append({'timestamp': row['versionCreated'], 'headline': row['headline'],
                 'story': newsText, 'storyid': row['storyId']})
    time.sleep(5)  # pause between requests
df = pd.DataFrame(rows, columns=['timestamp', 'headline', 'story', 'storyid'])
df
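The row-collecting pattern can also be separated from the API call, which makes it easy to try out without a session. This is only a sketch: collect_stories, fetch_story and pause_seconds are hypothetical names introduced here, and the record keys (versionCreated, headline, storyId) are assumed to match the headline columns used in the answer above.

```python
import time


def collect_stories(headline_records, fetch_story, pause_seconds=0.0):
    """Build one dict per headline, fetching each story via fetch_story."""
    rows = []
    for record in headline_records:
        rows.append({
            "timestamp": record.get("versionCreated"),  # None if missing
            "headline": record["headline"],
            "story": fetch_story(record["storyId"]),
            "storyid": record["storyId"],
        })
        if pause_seconds:
            time.sleep(pause_seconds)  # pause between requests
    return rows


# Possible usage with the libraries from the answer (not run here):
# records = headlines.reset_index().to_dict("records")
# rows = collect_stories(
#     records,
#     lambda sid: rd.news.get_story(sid, format=rd.news.Format.TEXT),
#     pause_seconds=5,
# )
# df = pd.DataFrame(rows)
```

Passing the fetch function as an argument means the same loop works with either library's story request.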
Thank you @Jirapongse, this was exactly what I was looking for!
One last question on this topic, please:
Is it possible to do a free-form search as part of this news query? For example, if I wanted to pull news articles into a dataframe where "Elon Musk SpaceX" was my search term? Thank you!
Yes, you can use the free text search.
df = ek.get_news_headlines(query='\\"Elon Musk SpaceX\\"', count=100)
df
Hi @Jirapongse, another question please: how would I run the same query using the company's PermID instead of the "TSLA.O" code? Some of the companies in my search are not publicly traded. Thank you!