News headline and story to CSV file

Koji.Miyamoto · January 2018

Hi,

I would like to create a CSV file of news headlines and stories.

For the headlines I'm using:

headlines = ek.get_news_headlines('JPY=')

For the stories I'm using:

for index, headline_row in headlines.iterrows():
    story = ek.get_news_story(headline_row['StoryId'])
    print(story)

Then I call:

df.to_csv('news.csv')

Does anyone know what I need to fix?

Regards

Jirapongse · January 2018

Do you mean adding a STORY column to the headlines data frame? If so, the code is:

headlines = ek.get_news_headlines("R:JPY= IN JAPANESE", count=100, date_from='2018-01-10T13:00:00', date_to='2018-01-10T15:00:00')
stories = pd.DataFrame(columns=['DATE','STORY'])
for index, headline_row in headlines.iterrows():   
    story = ek.get_news_story(headline_row['storyId'])
    stories = stories.append({'DATE':index,'STORY':story}, ignore_index=True)
stories = stories.set_index('DATE')
result = pd.concat([headlines, stories], axis=1)
result.to_csv("news.csv")
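Note for readers on newer pandas: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the loop above fails there. A minimal sketch of the same idea is to assign the stories as a new column instead. The headlines frame and fetch_story below are hypothetical stand-ins for ek.get_news_headlines and ek.get_news_story, used only so the snippet runs without an Eikon session; the column names follow the ones mentioned in this thread.

```python
import pandas as pd


def fetch_story(story_id):
    # Hypothetical stand-in for ek.get_news_story(story_id).
    return f"<p>story body for {story_id}</p>"


# Hypothetical stand-in for the frame returned by ek.get_news_headlines(...).
timestamps = pd.to_datetime(["2018-01-10 13:05:00", "2018-01-10 13:40:00"])
headlines = pd.DataFrame(
    {
        "versionCreated": timestamps,
        "text": ["headline A", "headline B"],
        "storyId": ["urn:newsml:example:1", "urn:newsml:example:2"],
        "sourceCode": ["NS:RTRS", "NS:RTRS"],
    },
    index=timestamps,
)

# Add the story text as one more column; no append or concat needed.
result = headlines.copy()
result["STORY"] = [fetch_story(story_id) for story_id in result["storyId"]]

# to_csv() without a path returns the CSV text; pass "news.csv" to write a file.
csv_text = result.to_csv()
```

With a live session, the stand-ins would be replaced by the real ek calls and the last line by result.to_csv("news.csv").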

pierre.faurel · January 2018

First, change StoryId to lower-case storyId in the code that requests a story:
story = ek.get_news_story(headline_row['storyId'])

Then, as I understand it, you want to save the stories together with their storyId in a CSV file.

If so, note that the to_csv function you're using is a method of the DataFrame class, so you first have to create a DataFrame from a list of stories.
Example:

headlines = ek.get_news_headlines('JPY=')
stories = [ (storyId,ek.get_news_story(storyId)) for storyId in headlines['storyId'].tolist()]
df = pd.DataFrame(stories, columns=['storyId', 'story'])
df.to_csv('news.csv', sep=',',index=False)

Koji.Miyamoto · January 2018

Thank you for your support.

I have one more question: the number of news items differs between DF and RESULT.

It's my understanding that RESULT includes DF, so I can get a wider range of news using RESULT than with DF. Is this correct?

Sorry, but I am very new to the Eikon APIs.

Thank you for your kind support.

Regards,

Koji

Jirapongse · January 2018

Could you please explain more about the question or share the code?

pierre.faurel · January 2018

If you're comparing the results of the following requests:
headlines = ek.get_news_headlines("R:JPY= IN JAPANESE",...
and
headlines = ek.get_news_headlines('JPY=')

The news parameters are different, so the number of headlines/stories can differ.

Koji.Miyamoto · January 2018

I meant that the former answer uses:

result = pd.concat([headlines, stories], axis=1)

result.to_csv("news.csv")

But the latter answer uses:

df = pd.DataFrame(stories, columns=['storyId', 'story'])
df.to_csv('news.csv', sep=',',index=False)

What is the difference between result= and df=?

Jirapongse · January 2018

As mentioned by pierre.faurel, the news parameters are different, so the number of headlines/stories can differ.

result uses headlines from ek.get_news_headlines("R:JPY= IN JAPANESE", count=100, date_from='2018-01-10T13:00:00', date_to='2018-01-10T15:00:00') while df uses headlines from ek.get_news_headlines('JPY=').

Koji.Miyamoto · January 2018

Sorry for not giving enough information.

I meant the definitions of result= and df=.

It's my understanding that if I want more than 2 columns, I should use result=,

and if I want just 2 columns, I should use df=.

Is this correct?

Regards,

Koji

Jirapongse · January 2018

Yes, you are correct.

result in the first sample uses concat to merge two data frames (headlines and stories) on the date, which is the index. The headlines data frame has the following 5 columns: DATE, versionCreated, text, storyId, and sourceCode, while the stories data frame has the following 2 columns: DATE and STORY. After merging, the result data frame has 6 columns, with DATE as the index.

df in the second sample creates a new data frame with two columns: storyId, and story.
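To make the difference concrete, here is a small sketch with made-up data. The two-row frames and their values are illustrative assumptions, not real Eikon output, and only two of the headline columns are included to keep it short.

```python
import pandas as pd

# Made-up headlines frame, indexed by date as in the first sample.
dates = pd.DatetimeIndex(
    ["2018-01-10 13:05:00", "2018-01-10 13:40:00"], name="DATE"
)
headlines = pd.DataFrame(
    {"text": ["headline A", "headline B"], "storyId": ["id-1", "id-2"]},
    index=dates,
)

# Made-up stories frame sharing the same DATE index.
stories = pd.DataFrame({"STORY": ["body 1", "body 2"]}, index=dates)

# First sample: concat with axis=1 places the frames side by side,
# matching rows by index, so every headline column is kept.
result = pd.concat([headlines, stories], axis=1)

# Second sample: a fresh frame built from (storyId, story) pairs,
# so it holds exactly those two columns and a default integer index.
pairs = list(zip(headlines["storyId"], stories["STORY"]))
df = pd.DataFrame(pairs, columns=["storyId", "story"])

print(result.columns.tolist())
print(df.columns.tolist())
```

So result is the choice when the headline metadata should travel with each story, and df is enough when only the identifier and the story text are needed.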

Koji.Miyamoto · January 2018

Thank you very much!

Your answer is very helpful.

Kind regards,

Koji
