Using for loop to get news headlines of multiple companies

I am trying to use for loop to get ESG news headlines for a list of 405 companies.

My code has proven succesfull when retrieving RICs based on ISINs, however, when I apply the same approach to retrieve the news headlines, I get the error "query must be string ". When I try to use "str()" to convert the values into string, and run the for loop, I get the error below:

EikonError: Error code 500 | Backend error. Failed to deserialize backend response. Expected valid JSON. Error: invalid character 'i' looking for beginning of value

See example below of the instance where my for loop code works with the "ek.get_symbology" function, and how it fails with the "ek.get_news_headlines" function.

Succesful example with ek.get_symbology:

## Ensuring that all ISIN ID's in the Brands data frame are strings (or otherwise converting them)

Brands["ID (ISIN)"] = Brands["ID (ISIN)"].astype(str)

## Creating a list of all the ISINs in order to make the RIC call on the API

ISIN_List = Brands["ID (ISIN)"].tolist()

## Dividing the list of 450 compaies into chunks for the API to be able to process it

chunklist = chunk(ISIN_List, 100)

##Calling RICs from API based on ISIN's (with a for loop to batch the request in chunks)

content_df = []

for subs in chunklist:

RICs = ek.get_symbology(subs, from_symbol_type='ISIN', to_symbol_type= 'RIC')

content_df.append(RICs)

content_df = pd.concat(content_df)

Failed attempt with the "ek.get_news_headlines" function.

##Adding the "R:" for insturment clarification + ESG as a topic for the headline retriever

Brands_M["H_Rqst"] = "R:"+ Brands_M["RIC"].map(str) + " AND Topic:ESG"

## Ensuring that all Headline Request in the Brands data frame are strings (or otherwise converting them)

Brands_M["H_Rqst"] = Brands_M["H_Rqst"].astype(str)

##Creating list of RICs codes with ESG topic to call on the API

H_Q_List = Brands_M["H_Rqst"].tolist()

##Dividing the list into chunks for the API to process it

H_Q_chunklist = chunk(H_Q_List, 50)

##Retreiving headlines data based on RICs + ESG as a Topic

Headlines_df = []

for H_Q_subs in H_Q_chunklist:

News = ek.get_news_headlines(query=H_Q_subs,count=10,date_from='2020-09-01T09:00:00',date_to='2020-09-25T18:00:00')

Headlines_df.append(News)

Headlines_df = pd.concat(Headlines_df)

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-13-dbae4b732937> in <module>
3 Headlines_df = []
4 for H_Q_subs in H_Q_chunklist:
----> 5 News = ek.get_news_headlines(query=H_Q_subs,count=10,date_from='2020-09-01T09:00:00',date_to='2020-09-25T18:00:00')
6 Headlines_df.append(News)
7

~/opt/anaconda3/lib/python3.7/site-packages/eikon/news_request.py in get_news_headlines(query, count, date_from, date_to, raw_output, debug)
97 error_msg = 'query must be a string'
98 logger.error(error_msg)
---> 99 raise ValueError(error_msg)
100
101 if type(count) is not int:

ValueError: query must be a string


Hope you can help, I suspect it has something to do with the addition of the topic.



Best Answer

  • @Erik77 does this work for you?

    df1 = pd.DataFrame()
    ricList = ["AAPL.O","IBM.N","TSLA.O"]
    for ric in ricList:
        q="R:" + ric + " AND Topic:ESG"
        print(q)
        df = ek.get_news_headlines(q)
        df1 = df1.append(df)

Answers

  • @Erik77 I think your issue may have something to do with proper escaping of the query string - please see my answer on this thread. Could you please post a completed example of the query string: eg print("R:"+ Brands_M["RIC"].map(str) + " AND Topic:ESG")

    this works for me:

    q="R:" + "IBM" + " AND Topic:ESG"
    df = ek.get_news_headlines(q)
    df 

    image

    I hope this can help.

  • Thanks for your reply, I am looking into the escaping. In the meantime, see below the print you requested:



    image



  • @Erik77 So this query looks fine it just needs to be escaped itself eg "R:MCD AND Topic:ESG"

    Can you try:

    Brands_M["H_Rqst"] = Brands_M["H_Rqst"].astype(str).values

  • This is already the case, when I pass the column as a list, and then into chunks. Yet I still seem to get the error:

    Or am I missing something?

    image

  • @Erik77 - they need to be escaped with " not ' i believe this is the issue.

  • @Erik77 actually I just tried with single quotes and it works for me:

    image


    Can you try removing all the other parts of the query eg count and start/end date and try again? Does the below work for you:

    q='R:MCD AND Topic:ESG'
    df = ek.get_news_headlines(q)
    df
  • Hi @jason.ramchandi this does work as a single query. But I don't seem to be able to loop over 405 companies.

    Any ideas on how to do that?

  • Yes it does, thank you!