Accessing HTML content of news using the RDP .NET API

Hello,

What is the best way of accessing the HTML content part of a news story? The IStoryData part of the returned IStoryResponse contains only the text/plain content of the story in the NewsStory field.

Many thanks,

Darko Roje
Spreadex Ltd

Best Answer

  • nick.zincone
    Answer ✓

    Hi @darko.roje,

    The Refinitiv Data Library for .NET doesn't make any attempt to manipulate the response from the platform. However, I would imagine the Playground takes the raw response and, once it is presented within the browser, likely removes <CR><LF> and other non-relevant formatting sequences, which would explain the differences. That being said, the critical content should not be affected.

Answers

  • To add, I am able to see that IStoryResponse.Data is a Refinitiv.Data.Content.News.StoryData object and has a Raw field which contains the whole response from the server. However, since Refinitiv.Data.Content.News.StoryData is internal, I am unable to cast to that type to get access to the Raw field.

    I see that there is an IStoryDefinition.HtmlFormat method, but calling it with true just seems to convert story.Data.NewsStory to some strange kind of text/plain document, even though it has an <html> header.

  • @darko.roje

    From my testing, when setting HtmlFormat(true), the ContentType is text/html.

    [Attached screenshot: 1625742866470.png]

  • Hello,

    Yes, you are right that the content type is text/html when using HtmlFormat(true), but even then the actual story returned is not the original HTML received from Refinitiv, but some kind of conversion to text.

  • @darko.roje

    I have verified the retrieved data from the API Playground.

    [Attached screenshot: 1625816056687.png]

    The HTML response from the RDP .NET API is similar to the one from the API Playground.

    [Attached screenshot: 1625816164459.png]

    You can refer to the Reference guide of the /data/news/v1/stories/{storyId} endpoint in the API Playground regarding the HTML view response.

  • Hello,

    Thanks for your response. For me, the HTML content retrieved via the RDP API is similar to, but not the same as, the one received by the API Playground, or indeed the one I can see in the C# debugger if I look at the returned Story. I have attached two files, "HTML from Playground.txt" and "HTML from API.txt", which demonstrate the difference.

  • Thank you. I have figured out that if HtmlFormat is on, it changes the Accept header, and the server sends back different HTML than the one sent when the Accept header is text/plain. This explains what I was seeing.
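
The effect described in the last reply can be illustrated outside the .NET library. The following is a minimal sketch in Python (standard library only), not the library's own implementation: it builds two requests for the same story against the /data/news/v1/stories/{storyId} endpoint mentioned above, differing only in the Accept header. The base URL, the function name build_story_request, the story ID, and the token are assumptions made for illustration.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: the RDP base URL; adjust it for your environment.
BASE_URL = "https://api.refinitiv.com"


def build_story_request(story_id: str, token: str, html: bool) -> Request:
    """Build (but do not send) a story request with the chosen Accept header.

    build_story_request is a hypothetical helper, not part of any Refinitiv library.
    """
    # Per the thread, the server returns different content depending on
    # whether the client asks for text/html or text/plain.
    accept = "text/html" if html else "text/plain"
    # Story IDs may contain characters such as ':' that need percent-encoding.
    url = f"{BASE_URL}/data/news/v1/stories/{quote(story_id, safe='')}"
    return Request(
        url,
        headers={
            "Accept": accept,
            "Authorization": f"Bearer {token}",  # placeholder access token
        },
    )


# Hypothetical story ID and token, for illustration only.
html_request = build_story_request("urn:example:1", "TOKEN", html=True)
plain_request = build_story_request("urn:example:1", "TOKEN", html=False)
```

Sending each request (for example with urllib.request.urlopen, given a valid token and story ID) and saving the two bodies makes it possible to compare the responses directly and confirm that the difference comes from the Accept header rather than from any post-processing on the client side.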