I'd like to extract only "UPDATE" news headlines in English. Which query should I use?

I am trying to extract only "UPDATE" news in English, such as:

- UPDATE 1-Chilean lawmakers censure ex-interior minister over rights abuses - Reuters News

- UPDATE 3-U.S. House approves Space Force, family leave in $738 bln defense bill - Reuters News


Right now I am searching with the keyword "UPDATE", but I'm getting a bunch of unintended results like:

- Australia Stock Exchange release from OPENLEARNING OLL.AX: OpenLearning Commences Trading & Business Update


I'd like to know how to get a news feed that contains only UPDATE news.

Any advice would be greatly appreciated.

Best Answer

  • @yuri_k

    One method is to filter the headlines at the application level. For example, the code below calls the get_news_headlines method with the query string "len and update" to get English headlines containing the text "update". It then keeps only the headlines whose text starts with "UPDATE".


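    import eikon as ek  # assumes the app key was already set with ek.set_app_key(...)
    headlines = ek.get_news_headlines("len and update", count=100, raw_output=False)
    # Keep only the rows whose headline text starts with "UPDATE"
    headlines[headlines.text.str.match('^UPDATE', na=False)]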
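
    If it helps to try the filtering step without an Eikon connection, here is a minimal sketch that runs on a hand-built DataFrame using the headlines from the question. The helper name keep_update_headlines and the stricter pattern "UPDATE <number>-" are my own assumptions, not part of the Eikon API. The only thing taken from the answer above is that the headlines DataFrame has a "text" column.

```python
import pandas as pd


def keep_update_headlines(headlines: pd.DataFrame) -> pd.DataFrame:
    """Return only the rows whose 'text' starts with 'UPDATE <number>-'.

    Hypothetical helper for illustration. Series.str.match anchors the
    pattern at the start of each string, so no leading '^' is needed.
    """
    mask = headlines["text"].str.match(r"UPDATE \d+-", na=False)
    return headlines[mask]


# Sample data copied from the question, standing in for the API result.
sample = pd.DataFrame(
    {
        "text": [
            "UPDATE 1-Chilean lawmakers censure ex-interior minister over rights abuses",
            "UPDATE 3-U.S. House approves Space Force, family leave in $738 bln defense bill",
            "Australia Stock Exchange release from OPENLEARNING OLL.AX: "
            "OpenLearning Commences Trading & Business Update",
        ]
    }
)

filtered = keep_update_headlines(sample)
print(filtered["text"].tolist())
```

    Requiring the digit and the hyphen drops headlines that merely begin with the word "UPDATE" but do not follow the numbered "UPDATE 1-..." form. If that is too strict for your feed, plain '^UPDATE' as in the answer above works as well.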