Getting a data frame directly using the news API

In Eikon, news code [HNXB] contains headlines with links to download a file. Is there a way for us to download the data frame directly via Codebook? I can only get the headlines and couldn't find code that extracts the data directly.

ek.get_news_headlines(query = 'HNXB', count=10)

Best Answer

  • chavalit-jintamalit
    Answer ✓

    Hi @chrisma.salut

    Step 1: get headline

    ahs1.png


    Step 2: get story id

    ahs2.png


    I can only add 2 pictures in an answer, so please see the further comments on this answer for more detail.


Answers

  • Step 3: get Side by Side API token with Eikon Desktop

    ahs3.png

    Step 4: prepare Side by Side API command

    ahs4.png


    Step 5: open the URL using the prepared command

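    import requests

    # Side by Side (SxS) API endpoint on the local machine
    url = 'http://127.0.0.1:9000/sxs/v1'
    headers = {
        'Content-Type': 'application/json',
    }

    # commandString is the command prepared in Step 4
    response = requests.request('POST', url, data=commandString, headers=headers)

    print(response.text)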
  • @chrisma.salut I think you are looking to retrieve the news stories. get_news_headlines returns a dataframe of at most 100 items per API call for the query you give it, in your case 'HNXB'.

    df = ek.get_news_headlines(query = 'HNXB', count=10)
    df

    1625126359346.png


    This returns a dataframe of 10 headline items. You then pass each storyId to a second API call, get_news_story, which returns the HTML-formatted story text. The following routine stores the formatted HTML news story in a new column called storytext.

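    # Fetch the story for each storyId and store it in a new column
    # (avoids chained assignment such as df['storytext'][i] = ...)
    df['storytext'] = [ek.get_news_story(story_id) for story_id in df['storyId']]

    df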


    1625127940211.png

    You can display the story using:

    from IPython.display import HTML
    HTML(df['storytext'].iloc[0])

    I noticed that for your query the news story text is actually a download link. If you change the query to, say, VOD.L and repeat the exercise, it displays a news item, so it is working correctly. I hope this helps.

  • Hello Chavalit,


    Thank you for your assistance on this query. The client can't get past Step 3 because they are getting an error. Any idea what's causing this?

    image001.jpg

  • Hi all, I didn't verify your answers because they don't match the requirement: downloading the file, not the news. I found my own way without using the Side by Side API:

    ###1. get the table news

    ###2. get the story id

    ###3. get the file (downloaded to the download folder)

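    import re
    import webbrowser

    # df1 and a come from the earlier steps (their code is not shown here)
    df1Id = df1[df1['date'] == a]['storyId'][0]
    df1_url = ek.get_news_story(df1Id)

    # Non-greedy match stops at the first '" data-type' after href
    link1 = re.search('href="(.*?)" data-type', df1_url)
    webbrowser.open(link1.group(1))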
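
The regex above depends on the exact `href="..." data-type` attribute order in the story HTML. As a minimal sketch of a less brittle alternative, the standard-library HTML parser can collect every link in the story text. The names `LinkCollector` and `extract_links` and the sample markup are hypothetical and used only for illustration; real HNXB stories may be structured differently.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(story_html):
    """Return all link targets found in the story HTML, in document order."""
    parser = LinkCollector()
    parser.feed(story_html)
    parser.close()
    return parser.links


# Hypothetical story markup, not taken from a real HNXB story
sample_story = (
    '<p>Report: <a href="https://example.com/report.xlsx" '
    'data-type="url">download</a></p>'
)
print(extract_links(sample_story))
```

In the thread's workflow, the string returned by `ek.get_news_story(...)` would be passed to `extract_links` in place of `sample_story`.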


  • @khangdiep Sorry we did not hit the mark this time. Thanks for taking the trouble to post your solution here. I'm sure others in the community will benefit from this addition. I have verified that this does indeed open the link in the web browser; however, it does not download the file. From here you could try to scrape the rendered HTML, I suppose. Once again, thanks so much for contributing your solution.
  • The file is downloaded to the "download" folder, the default location for website downloads. A download window pops up asking you to choose the save location, but you can skip this step and keep running the code to download the other news files until the code ends.
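
If the save dialog is a nuisance, one possible approach is to fetch the link from Python instead of opening it in the browser. The sketch below assumes the link can be retrieved without a logged-in browser or Eikon session, which this thread does not confirm; `download_to` is a hypothetical helper name, and the fallback file name is arbitrary.

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_to(url, folder):
    """Save the resource at `url` into `folder` and return the local path."""
    os.makedirs(folder, exist_ok=True)
    # Derive a file name from the URL path; fall back to a generic name
    name = os.path.basename(urlparse(url).path) or "download.bin"
    target = os.path.join(folder, name)
    # Assumption: the URL is reachable without extra authentication
    urllib.request.urlretrieve(url, target)
    return target
```

Once the file is on disk, and if it turns out to be a CSV or Excel file, `pandas.read_csv` or `pandas.read_excel` could load it into a data frame, which is what the original question asked for.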