Python. Get_Story results - format differences.

For one story, I got different results for the TEXT and HTML formats. An example is below:

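import refinitiv.data as rd
from datetime import timedelta
from IPython.display import HTML

rd.open_session()

# Story ID of the affected news item
url = "urn:newsml:newswire.refinitiv.com:20230828:nL4N3A91S1:1"

# Fetch the same story in both formats so the outputs can be compared
story = rd.news.get_story(url, format=rd.news.Format.HTML)
story_text = rd.news.get_story(url, format=rd.news.Format.TEXT)

rd.close_session()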


The HTML format contains a subtitle that the TEXT format does not:

"Aug 28 (Reuters)</span><span class="tr-dl-sep"> - </span>U.S. oil refiners are expected to have about 912,000 barrels per day (bpd) of capacity offline for the week ending Sept. 1, decreasing available refining capacity by 410,000 bpd, research company IIR Energy said on Monday.</p>"


Is this a bug?

Answers

  • Hello @alexander.shkurin01

    I can replicate the issue with the "desktop.workspace" session. The "rd.news.Format.TEXT" output does not have the "U.S. oil refiners are expected to have about 912,000 barrels per day (bpd) of capacity offline for the week ending Sept. 1, decreasing available refining capacity by 410,000 bpd, research company IIR Energy said on Monday." text that the "rd.news.Format.HTML" output contains.

    However, I cannot replicate the issue when using the library's "platform.rdp" session. There, the content of the two formats is identical.

    html-rdp.png

    text-rdp.png

    I am contacting the Data Library team. In the meantime, I suggest you use the "platform.rdp" session if possible. Please note that:

    • The "platform.rdp" session needs RDP credentials
    • Some content sets, such as TR fields, are not available on the "platform.rdp" session
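
To check whether a given story is affected, one option is to strip the tags from the HTML output and look for paragraphs that do not appear in the TEXT output. The following is only a minimal sketch using the standard library: `extract_paragraphs` and `missing_paragraphs` are hypothetical helper names (not part of the Data Library), and the sample strings stand in for real `get_story` results. The opening `<p><span>` tags and the second paragraph are assumed for illustration.

```python
from html.parser import HTMLParser


class _ParagraphExtractor(HTMLParser):
    """Collects the text of each <p> element, ignoring nested tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self._current = None  # text pieces of the <p> being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current = []

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            # Join the pieces and collapse runs of whitespace
            text = " ".join("".join(self._current).split())
            if text:
                self.paragraphs.append(text)
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)


def extract_paragraphs(html_story):
    """Return the plain text of every <p> element in html_story."""
    parser = _ParagraphExtractor()
    parser.feed(html_story)
    parser.close()
    return parser.paragraphs


def missing_paragraphs(html_story, text_story):
    """Return the HTML paragraphs that do not occur in the TEXT story."""
    normalized = " ".join(text_story.split())
    return [p for p in extract_paragraphs(html_story) if p not in normalized]


# Illustrative stand-ins for the two get_story results (assumed markup)
sample_html = (
    '<p><span>Aug 28 (Reuters)</span><span class="tr-dl-sep"> - </span>'
    "U.S. oil refiners are expected to have about 912,000 barrels per day "
    "(bpd) of capacity offline for the week ending Sept. 1, decreasing "
    "available refining capacity by 410,000 bpd, research company IIR "
    "Energy said on Monday.</p>"
    "<p>Placeholder body paragraph.</p>"
)
sample_text = "Placeholder body paragraph."

for paragraph in missing_paragraphs(sample_html, sample_text):
    print("Missing from TEXT:", paragraph)
```

An empty result from `missing_paragraphs` would suggest the two formats carry the same content for that story, which is what I see with the "platform.rdp" session. The substring comparison is deliberately simple and may report false differences if the TEXT output wraps or punctuates a paragraph differently.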


  • @wasin.w Thank you for your quick response. I am trying to use the HTML format for further processing. Please let me know if you have any updates on this case.

  • Hello @wasin.w!

    It's working. Thank you!