How to decode enum values enclosed in '#'?

In the enumtype.def file, some enum values are enclosed in '#' characters, e.g. #DE# (up tick or zero uptick) and #FE# (down tick or zero downtick). What encoding does this denote? I need to get these strings in UTF-8, but I don't know which encoding to convert from.

Best Answer

  • @Akechi Sato

    Enum values enclosed in “#” are hexadecimal byte values encoded in the
    Reuters Multi-Lingual Text Encoding Standard (RMTES). You need to use an
    RMTES conversion utility to convert the value to UTF-8. For example, “#DE#”
    is converted to the UTF-8 bytes 0xE2 0x87 0xA7, which display as the ⇧ symbol.

    Please note that there is no metadata indicating whether an enum value is
    RMTES or not, so the application needs to pass every received enum value
    through the RMTES decoding. If the value is not RMTES, the decoding leaves it unchanged.

    For UPA C, please see the “11.2.9 RMTES Decoding” section in the UPA
    Developers Guide for more information.

    For RFA C++, please see the “10.6 RMTESConverter Usage” section in the
    RFA C++ Developers Guide for more information.
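To make the two steps in the answer concrete, here is a minimal Python sketch. It is not taken from the UPA or RFA documentation, and every name in it (`display_to_bytes`, `decode_displays`, `placeholder_rmtes_to_utf8`) is hypothetical. It assumes the display column holds either a quoted ASCII string or pairs of hex digits between '#' characters. The real RMTES conversion has to come from the vendor utility described in the guides above; the placeholder below only reproduces the single #DE# example from the answer and passes everything else through untouched.

```python
import re
from typing import Callable, Dict

# Assumption: a hex display token is one or more byte pairs between '#' marks.
_HEX_DISPLAY = re.compile(r"#((?:[0-9A-Fa-f]{2})+)#")


def display_to_bytes(token: str) -> bytes:
    """Turn an enumtype.def display token into the raw bytes it denotes."""
    token = token.strip()
    match = _HEX_DISPLAY.fullmatch(token)
    if match:
        # "#DE#" denotes the single raw byte 0xDE, still RMTES-encoded.
        return bytes.fromhex(match.group(1))
    # Assumption: anything else is a quoted plain-ASCII display string.
    return token.strip('"').encode("ascii")


def decode_displays(
    displays: Dict[int, str],
    rmtes_to_utf8: Callable[[bytes], bytes],
) -> Dict[int, str]:
    """Pass every display value through the decoder, RMTES or not."""
    return {
        value: rmtes_to_utf8(display_to_bytes(token)).decode("utf-8")
        for value, token in displays.items()
    }


def placeholder_rmtes_to_utf8(raw: bytes) -> bytes:
    """NOT a real RMTES decoder: a stand-in for the vendor conversion utility.

    It knows only the one mapping quoted in the answer (0xDE -> E2 87 A7)
    and returns any other input unchanged.
    """
    known = {b"\xde": b"\xe2\x87\xa7"}
    return known.get(raw, raw)


ticks = decode_displays({0: '" "', 1: "#DE#"}, placeholder_rmtes_to_utf8)
print(ticks)  # {0: ' ', 1: '⇧'}
```

In a real application, the callable passed as `rmtes_to_utf8` would wrap the RMTES conversion from the API in use, so that plain ASCII values and hex-notated values follow the same code path.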