Getting a data frame directly using the news API

chrisma.salut · July 2021

In Eikon, news code [HNXB] contains headlines with links to download the file. Is there a way for us to download the data frame directly via Codebook? I'm only able to get the headlines but couldn't find a code that would extract the data directly.

ek.get_news_headlines(query = 'HNXB', count=10)

chavalit-jintamalit · July 2021

Hi @chrisma.salut

Step 1: get headline

Step 2: get story id

I can only add 2 pictures in an answer, so please see further comment on this answer for more detail.

chavalit-jintamalit · July 2021

Step 3: get Side by Side API token with Eikon Desktop

Step 4: prepare Side by Side API command

Step 5: open the URL using the prepared command

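import requests

# Side by Side API endpoint served locally by Eikon Desktop
url = 'http://127.0.0.1:9000/sxs/v1'
headers = {
    'Content-Type': 'application/json',
}

# commandString is the command prepared in Step 4
response = requests.request('POST', url, data=commandString, headers=headers)

print(response.text)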

jason.ramchandani01 · July 2021

@chrisma.salut I'm thinking that you are looking to retrieve the news stories. get_news_headlines returns a dataframe of at most 100 items per API call for the query you give it, in your case 'HNXB'.

df = ek.get_news_headlines(query = 'HNXB', count=10)
df

This returns a dataframe of 10 headline items. You then need to pass each storyId to a second API call, get_news_story, which returns the HTML-formatted story text. The following routine stores the formatted HTML news story in a new column called storytext.

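# Fetch the story for every headline and store the results in a new column.
# Building the list first avoids chained assignment (df['storytext'][i] = ...).
df['storytext'] = [ek.get_news_story(story_id) for story_id in df['storyId']]

df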

You can display the story using:

from IPython.display import HTML
# .iloc[0] picks the first row by position, whatever the dataframe index is
HTML(df['storytext'].iloc[0])

I have noticed that for your query the news story text is actually a download link. If you changed the query to, say, VOD.L and repeated the exercise, it would display a news item. It is working correctly. I hope this helps.
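If some stories fail to load, the loop above stops at the first error. A minimal sketch of a more tolerant version follows. fetch_stories is a hypothetical helper name, not part of the Eikon library, and the fetch function is passed in as an argument so the snippet runs without an Eikon session.

```python
def fetch_stories(story_ids, fetch_story):
    """Return one story text per id, in order.

    fetch_story is any callable that takes a story id and returns the
    story text (for example ek.get_news_story). A failed fetch yields
    None for that id instead of stopping the whole loop.
    """
    stories = []
    for story_id in story_ids:
        try:
            stories.append(fetch_story(story_id))
        except Exception as exc:  # deliberately broad: keep going on any failure
            print(f"could not fetch {story_id}: {exc}")
            stories.append(None)
    return stories


# Intended usage with the dataframe from above (needs a running Eikon session):
# df['storytext'] = fetch_stories(df['storyId'], ek.get_news_story)
```

Rows where the story could not be retrieved end up as None, so they are easy to filter out afterwards.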

chrisma.salut · July 2021

Hello Chavalit,

Thank you for your assistance on this query. Client can't get past step 3 as they are getting an error. Any idea what's causing this?

khangdiep · August 2021

Hi all, I didn't verify your answers because they don't match the requirement: downloading the file, not the news. I found my own way without using the Side by Side API:

###1. get the news table

###2. get the story id

###3. get the file (downloaded to the download folder)

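import re
import webbrowser

# df1 is the news table and a is the date to match, both from the steps above
df1Id = df1[df1['date'] == a]['storyId'].iloc[0]

# the story body is HTML that contains the download link
df1_url = ek.get_news_story(df1Id)

# non-greedy match so the capture stops at the first '" data-type'
link1 = re.search('href="(.*?)" data-type', df1_url)

webbrowser.open(link1.group(1))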
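The regular expression above relies on data-type coming right after href in the tag. A rough sketch of an alternative using Python's standard html.parser, which collects every link regardless of attribute order, is shown here. LinkCollector and extract_links are hypothetical names, and the actual story markup is an assumption since it is not shown in this thread.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; attribute order does not matter
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(story_html):
    """Return all link targets found in the story HTML, in document order."""
    parser = LinkCollector()
    parser.feed(story_html)
    parser.close()
    return parser.links


# Intended usage with the story text from above:
# links = extract_links(df1_url)
# if links:
#     webbrowser.open(links[0])
```

An empty list means the story holds no link at all, which is what you would expect for an ordinary news item rather than a file announcement.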

jason.ramchandani01 · August 2021

@khangdiep Sorry we did not hit the mark this time. Thanks for taking the trouble to post your solution here. I'm sure others in the community will benefit from this addition. I have verified that this does indeed open the link in the web browser; however, it does not download the file. From here you could try to scrape the rendered HTML, I suppose. Once again, thanks so much for contributing your solution.

khangdiep · August 2021

The file is downloaded to the "download" folder, the default location for website downloads. A download window pops up and asks you to choose the save folder, but you can skip this step and keep running the code to download the other news files until the code ends.
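If you would rather save the file to a folder of your choice without going through the browser, something like the sketch below might work. It assumes the extracted link can be fetched directly, without a browser session or login, which has not been verified for the HNXB links. download_file and filename_from_url are hypothetical helper names.

```python
import os
import shutil
import urllib.request
from urllib.parse import unquote, urlparse


def filename_from_url(url, default="download.bin"):
    """Use the last path segment of the URL as the local file name."""
    name = os.path.basename(unquote(urlparse(url).path))
    return name or default


def download_file(url, folder="."):
    """Stream the resource at url into folder and return the saved path."""
    os.makedirs(folder, exist_ok=True)
    target = os.path.join(folder, filename_from_url(url))
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)  # copies in chunks, not all in memory
    return target


# Intended usage with the link extracted from the story:
# saved_path = download_file(link1.group(1), folder="news_files")
```

If the server requires authentication or only serves the file to a logged-in browser, this call will fail or save an error page, and opening the link in the browser remains the fallback.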
