How to clear special characters in news extracted from the Eikon API

Hi team, I have a question about retrieving news with the Eikon API. The news body contains too many special characters, hyperlinks, and delimiters. Is there any way to clean them up and keep only the raw text? I've attached my code below along with the original news from Workspace. Thanks for the help.

1700535282789.png

1700535364526.png

Best Answer

Answers

  • Hi Jira, thanks for the reply. The new command did help clean up the special characters, but it truncated quite a lot of the text.

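    import refinitiv.data as rd

    rd.open_session()  # assumes a session is already configured on this machine

    text = rd.news.get_story("urn:newsml:reuters.com:20231113:nL4S3CE14O:1", format=rd.news.Format.TEXT)

    print(text)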

    Original news:

    1700563399600.png

    News extracted:

    1700563414385.png

    Every line is truncated right before a hyperlink or RIC. Is this a bug, or is there some adjustment I need to make? Thank you.

  • @Julian.Bai

    This is what I get from the API.

    1700568715153.png


  • Thanks Jira, I'll try again later on my side.
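
If the plain-text format keeps cutting lines off at hyperlinks or RICs, one possible workaround is to request the story as HTML and strip the markup yourself, so the link text stays in the output. Below is a minimal sketch using only the Python standard library. It assumes you already have the story body as an HTML string (for example from the same get_story call with an HTML format, if your library version offers one). The helper name html_to_text and the sample string are made up for illustration and are not part of the API.

```python
from html.parser import HTMLParser
import re


class _TextExtractor(HTMLParser):
    """Collects visible text, including the text inside <a> links."""

    # Tags that should start or end a line in the plain-text output.
    BLOCK_TAGS = ("p", "br", "div", "li", "tr")

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes entities such as &amp;
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        # Link text and RICs arrive here like any other text, so they are kept.
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        raw = "".join(self._chunks)
        raw = re.sub(r"[ \t\r\f\v]+", " ", raw)  # collapse runs of spaces
        raw = re.sub(r" ?\n ?", "\n", raw)       # trim spaces around newlines
        raw = re.sub(r"\n{2,}", "\n", raw)       # collapse blank lines
        return raw.strip()


def html_to_text(html):
    """Hypothetical helper: return the visible text of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Made-up sample, not a real story body.
sample = (
    '<p>Shares of <a href="x">ABC.N</a> rose 2% &amp; closed higher.</p>'
    "<p>Second line.</p>"
)
print(html_to_text(sample))
```

Since the parser only drops the tags and keeps everything between them, the text of each link or RIC remains in place instead of ending the line.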