Record Matching API website support gives an error

Hello,

I'm trying to use the website feature from a Python script through the REST API, and I'm hitting a snag. It appears that there is a trailing \r after the data in the response, which breaks the JSON parsing:

URL='https://api.thomsonreuters.com/permid/match'

header={'Accept': 'application/json', 'Content-Type': 'text/plain', 'x-ag-access-token': 'xxx', 'x-openmatch-dataType': 'Organization', 'x-openmatch-numberOfMatchesPerRecord': 1}

data='LocalID,Standard Identifier,Name,Country,Street,City,PostalCode,State,Website \n ,,Apple,,,,,,http://www.apple.com'

Exception Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 819, in json
    return json.loads(self.text, **kwargs)
  ...
  File "/usr/lib/python2.7/site-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
JSONDecodeError: Invalid control character u'\r' at: line 1 column 692 (char 691)

Note: I am running Python 2.7 and requests 2.7 in Cygwin, on Windows 10.

Best Answer

  • @vna

    I found that the problem only happens when the request text contains the website information.


    LocalID,Standard Identifier,Name,Country,Street,City,PostalCode,State,Website

    ,,Apple,,,,,,www.apple.com


    The response will contain 0x0D after the "Input_Website" text.


    However, if I remove the website information, the response is fine.

    LocalID,Standard Identifier,Name,Country,Street,City,PostalCode,State,Website

    ,,Apple,,,,,,

    The response is:

    {
      "ignore": " ",
      "unMatched": 0,
      "matched": {
        "total": 1,
        "excellent": 1
      },
      "numReceivedRecords": 1,
      "numProcessedRecords": 1,
      "numErrorRecords": 0,
      "headersIdentifiedSuccessfully": [
        "localid",
        "standard identifier",
        "name",
        "country",
        "street",
        "city",
        "postalcode",
        "state",
        "website"
      ],
      "headersNotIdentified": [],
      "headersSupportedWereNotSent": [],
      "errorCode": 0,
      "errorCodeMessage": "Success",
      "resolvingTimeInMs": 238,
      "requestTimeInMs": 238,
      "outputContentResponse": [
        {
          "ProcessingStatus": "OK",
          "Match OpenPermID": "https://permid.org/1-4295905573",
          "Match OrgName": "Apple Inc",
          "Match Score": "92%",
          "Match Level": "Excellent",
          "Match Ordinal": "1",
          "Original Row Number": "2",
          "Input_Name": "Apple"
        }
      ]
    }

    Please confirm if it is the same problem that you found.
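
One way to check for the stray byte without a hex viewer is to scan the response text for raw control characters. The following is only a sketch: `find_control_chars` and the sample payload are hypothetical and merely shaped like the faulty response described above; they are not part of the API.

```python
import re

# Characters U+0000 to U+001F may not appear unescaped inside a JSON string.
CONTROL_CHARS = re.compile(r'[\x00-\x1f]')

def find_control_chars(text, context=20):
    """Return (offset, hex code, preceding text) for each raw control character.

    Newlines or tabs used as whitespace between JSON tokens are reported too,
    so on a pretty-printed response the hits need to be read with that in mind.
    """
    hits = []
    for match in CONTROL_CHARS.finditer(text):
        start = max(0, match.start() - context)
        code = '0x%02X' % ord(match.group())
        hits.append((match.start(), code, text[start:match.start()]))
    return hits

# Hypothetical fragment: a carriage return sits inside the Input_Website value.
sample = '{"Input_Name": "Apple", "Input_Website": "www.apple.com\r"}'

for offset, code, before in find_control_chars(sample):
    print(offset, code, repr(before))
```

On a real response, passing `response.text` instead of `sample` should show whether a 0x0D follows the website value.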

Answers

  • Yes, it is the same issue I have. Did you find any way to counteract that? (that ideally doesn't require me to hack into the requests library for this one special case)

  • I have contacted the support team to verify this problem.

  • @vna

    The development team can replicate the issue. They will handle it. I will keep you updated on its progress.

  • Hello Jira, any news on this bug? I was hoping to start using websites by next month :)

  • @vna

    There has been no response from the development team yet. I will contact them again for an update.

  • Hello Sunil & Jira,

    Any response yet? I am blocked from using the API if I can't match websites...

    best

    VnA

  • @vna

    I have tested it. The problem has been fixed. There is no 0x0D after the "Input_Website" text anymore.

  • Thanks! It seems ok now for me too.
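
For anyone who meets a similar stray control character before a server-side fix arrives, a client-side workaround that avoids patching requests may look like the sketch below. It assumes the only defect is a raw \r inside a string value. `parse_match_response`, `strip_cr`, and the sample payload are hypothetical names introduced for illustration.

```python
import json

def parse_match_response(text):
    """Parse JSON while tolerating raw control characters inside strings."""
    # strict=False tells the decoder to accept unescaped control characters.
    return json.loads(text, strict=False)

def strip_cr(value):
    """Recursively remove carriage returns from every string value."""
    if isinstance(value, str):
        return value.replace('\r', '')
    if isinstance(value, list):
        return [strip_cr(item) for item in value]
    if isinstance(value, dict):
        return {key: strip_cr(item) for key, item in value.items()}
    return value

# Hypothetical payload shaped like the faulty response.
raw = ('{"outputContentResponse": [{"Input_Name": "Apple", '
       '"Input_Website": "www.apple.com\r"}]}')

try:
    json.loads(raw)  # the default strict mode rejects the raw \r
except ValueError as exc:
    print("strict parse failed:", exc)

result = strip_cr(parse_match_response(raw))
print(result["outputContentResponse"][0]["Input_Website"])
```

The traceback in the question shows that `Response.json()` forwards its keyword arguments to `json.loads`, so calling `response.json(strict=False)` should have the same effect without touching the library. The sketch is written for Python 3; on Python 2.7 the string check in `strip_cr` would need `basestring` instead of `str`.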