De-duplicating MRN_STORY messages received over two subscriptions

Hello,

(First post here).

For redundancy, we have two news handler processes (primary & secondary) subscribing to the MRN_STORY RIC over the ELEKTRON_DD service. In our backend, we would like to de-duplicate the messages received over these two feeds so that only a single "master" copy of a news item is kept in the database. This raises a few questions:

1. Is it possible for the "same" news item to be published over two different MRN_SRC systems? (If this is the case, it would mean we cannot use the MRN_SRC-GUID combination as a unique identifier in the backend for de-duplication of messages - as our primary and secondary news handler processes may receive the same message over different MRN_SRC systems).

2. If the answer to (1) is "Yes", what other fields should we be looking at for redundancy removal? Perhaps we can hash some combination of fields in the JSON body of the message? It's unclear from the documentation which fields should be used for this kind of setup.

3. Is there a correlation between the messages received over MRN_STORY RIC and the MRN_TRNA RIC? (I'm wondering if such a relationship can be exploited for redundancy removal).

Thanks!

/ Asiri

Best Answer

  • Duplicate news items can be identified by their GUID values alone.

    De-duplicating individual messages is also important: consumers may observe duplicate messages after a failover in the Thomson Reuters infrastructure. In this case, the duplicate messages share both GUID and FRAG_NUM values.

    The "id" field in the JSON links news items across MRN_STORY and MRN_TRNA.
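A minimal sketch of the de-duplication described above, assuming each handler has already parsed a message into a dict holding its GUID, FRAG_NUM and MRN_SRC fields. The class name `MrnDeduplicator` and its methods are hypothetical and not part of any Elektron API. MRN_SRC is deliberately left out of the key, following the answer above.

```python
from typing import Any, Dict, Set, Tuple


class MrnDeduplicator:
    """Hypothetical helper: remembers which (GUID, FRAG_NUM) pairs were accepted."""

    def __init__(self) -> None:
        # Keys are (GUID, FRAG_NUM); MRN_SRC is intentionally not part of the key.
        self._seen: Set[Tuple[str, int]] = set()

    def accept(self, fields: Dict[str, Any]) -> bool:
        """Return True the first time a fragment is seen, False for a duplicate."""
        key = (str(fields["GUID"]), int(fields["FRAG_NUM"]))
        if key in self._seen:
            return False
        self._seen.add(key)
        return True

    def story_count(self) -> int:
        """Number of distinct news items (distinct GUIDs) accepted so far."""
        return len({guid for guid, _ in self._seen})


if __name__ == "__main__":
    dedup = MrnDeduplicator()
    # Made-up sample values: the same fragment arriving via both handlers.
    primary = {"MRN_SRC": "SRC_A", "GUID": "guid-1", "FRAG_NUM": 1}
    secondary = {"MRN_SRC": "SRC_B", "GUID": "guid-1", "FRAG_NUM": 1}
    print(dedup.accept(primary))    # first copy is kept
    print(dedup.accept(secondary))  # second copy is dropped
```

In a real backend the in-memory set would probably be replaced by a unique constraint on (GUID, FRAG_NUM) in the database, so that both handler processes share the same view and the set does not grow without bound.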

Answers

  • Thank you for your quick response.

    Just to clarify: does a story have the same GUID no matter which MRN_SRC it comes from?

    After reading the spec, I thought that only the GUID+MRN_SRC combination uniquely identifies a story (which raises the question of why GUID is called GUID).

    MRN DATA MODELS AND ELEKTRON IMPLEMENTATION GUIDE DATA MODEL VERSION 2.10, NEWS ANALYTICS VERSION 3.1

    Section 3.2.1.2: "A single MRN data item publication is uniquely identified by the combination of RIC, MRN_SRC and GUID."