Python Raw Data Extraction Request?

Can someone please post the correct requestBody to get all columns of raw data? I tried the code below but am getting a 400 response.

requestBody={
"ExtractionRequest": {
"@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.CorporateActionsIsoExtractionRequest",
"ContentFieldNames": [
"GMT Offset",
"Type",
"MSG/FID Number",
"Message Type",
"FID Name",
"FID Value",
"FID Enum String",
"PE Code",
"Template Number",
"RTL",
"Sequence Number",
"Source RIC",
"Number of FIDs"
],
"IdentifierList": {
"@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
"InstrumentIdentifiers": [{
"Identifier": "CARR.PA",
"IdentifierType": "Ric"
}],
"UseUserPreferencesForValidationOptions":"false"
}
}
}

Also, could you guys kindly add a few more examples in Python? You have only 1 example in Python and 100 in C#. Just some basic stuff, like pulling venue-level data by day, would be helpful.

Best Answer

  • noah.kauffman, by Excel sheet do you mean the TRTH Data Dictionary? Are you using the latest version, i.e. this one? Where did you find a field called "Type"? It does not exist, and that is what makes the request fail. The available field names are in the tab Field Descriptions, in column B. Column A contains the corresponding Tick History report (i.e. the type of request, which corresponds to a type of data); in the case discussed above it would be Tick History Time and Sales.

    I also could not find "GMT Offset" in the latest Excel file. I'm guessing you might be using a different Excel sheet. Can you send a link to where you found it, or attach it, so we can investigate your query further?

    Edit: added this:

    I like to use Postman to test queries: it automatically displays the error messages, which helps with debugging. When I try the request above, which includes the field "Type", the response has status 400 Bad Request and its body contains a clear message:

    {
    "error": {
    "message": "Validation Error:\r\n\r\nInvalid content FieldName \"Type\""
    }
    }
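
    The same message can be read from Python. The snippet below is only a minimal sketch: extract_error_message is a hypothetical helper name, and it assumes the error body always has the {"error": {"message": ...}} shape shown above.

```python
import json


def extract_error_message(response_text):
    """Return the server's error message from a JSON error body, or None."""
    try:
        payload = json.loads(response_text)
    except ValueError:
        return None  # the body was not valid JSON
    error = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(error, dict):
        return error.get("message")
    return None


# Error body copied from the 400 response shown above.
sample_body = r'{"error": {"message": "Validation Error:\r\n\r\nInvalid content FieldName \"Type\""}}'
message = extract_error_message(sample_body)
print(message)
```

    If you post the request with the requests library, you could pass response.text to this helper whenever response.status_code is 400.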

Answers

  • So I discovered that the following seems to return a result (though not in the format I expected):

    requestBody={
    "ExtractionRequest": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.TickHistoryRawExtractionRequest",
    "Condition": {
    "MessageTimeStampIn": "GmtUtc",
    "ReportDateRangeType": "Range",
    "QueryStartDate": "2017-08-01T12:00:00.000Z",
    "QueryEndDate": "2017-08-01T12:10:00.000Z",
    "ExtractBy": "Ric",
    "SortBy": "SingleByRic",
    "DomainCode": "MarketPrice",
    "DisplaySourceRIC": "true"
    },
    "IdentifierList": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
    "InstrumentIdentifiers": [{
    "Identifier": "CARR.PA",
    "IdentifierType": "Ric"
    }],
    "UseUserPreferencesForValidationOptions":"false"
    }
    }
    }

    I think the returned data is in FID format. In the old API, the response was a plain flat file with columns such as:

    #RIC
    Current RIC
    Date[G]
    Time[G]
    GMT Offset
    Type
    Ex/Cntrb.ID
    Price
    Volume
    Exch Time

    The new format has all of this FID information. Is there some way to get it in a smaller format without the FIDs?

    Second, what in this request distinguishes the tick type (for example, whether you are getting back trades, quotes or end-of-day data)? It is not clear from the message what controls this.

  • You need to use TickHistoryTimeAndSalesExtractionRequest instead of TickHistoryRawExtractionRequest.

    Please refer to REST API Tutorial 4: On Demand tick data extraction in TRTH - REST API Tutorials.
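
    To illustrate the switch, here is a rough Python sketch. It only covers the two request types named in this thread. The names ODATA_PREFIX, REPORT_TYPES, odata_type and group_fields_by_prefix are made up for the example. The grouping helper assumes that Time and Sales field names carry a prefix such as Trade, Quote or Auction, which appears to be how the kind of tick data is selected.

```python
ODATA_PREFIX = "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests."

# Only the two request types mentioned in this thread are listed here.
REPORT_TYPES = {
    "raw": "TickHistoryRawExtractionRequest",
    "time_and_sales": "TickHistoryTimeAndSalesExtractionRequest",
}


def odata_type(report):
    """Return the full @odata.type string for a report key."""
    return ODATA_PREFIX + REPORT_TYPES[report]


def group_fields_by_prefix(field_names):
    """Group field names by the text before ' - ' (e.g. Trade, Quote)."""
    groups = {}
    for name in field_names:
        prefix, separator, _ = name.partition(" - ")
        key = prefix if separator else "(no prefix)"
        groups.setdefault(key, []).append(name)
    return groups


fields = ["Trade - Price", "Trade - Volume", "Quote - Bid Price", "Auction - Volume"]
print(odata_type("time_and_sales"))
print(group_fields_by_prefix(fields))
```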

  • Can you please explain why the following request works:

    requestBody={
    "ExtractionRequest": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.TickHistoryTimeAndSalesExtractionRequest",
    "ContentFieldNames": [
    "Trade - Price",
    "Trade - Volume",
    "Trade - Exchange Time",
    "Type"
    ],
    "IdentifierList": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
    "InstrumentIdentifiers": [{
    "Identifier": "CARR.PA",
    "IdentifierType": "Ric"
    }]
    },
    "Condition": {
    "MessageTimeStampIn": "GmtUtc",
    "ApplyCorrectionsAndCancellations": "false",
    "ReportDateRangeType": "Range",
    "QueryStartDate": "2017-08-01T12:00:00.000Z",
    "QueryEndDate": "2017-08-01T12:10:00.000Z",
    "DisplaySourceRIC": "true"
    }
    }
    }

    However, this request does not:

    requestBody={
    "ExtractionRequest": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.TickHistoryTimeAndSalesExtractionRequest",
    "ContentFieldNames": [
    "Trade - Price",
    "Trade - Volume",
    "Trade - Exchange Time",
    "Type"
    ],
    "IdentifierList": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
    "InstrumentIdentifiers": [{
    "Identifier": "CARR.PA",
    "IdentifierType": "Ric"
    }]
    },
    "Condition": {
    "MessageTimeStampIn": "GmtUtc",
    "ApplyCorrectionsAndCancellations": "false",
    "ReportDateRangeType": "Range",
    "QueryStartDate": "2017-08-01T12:00:00.000Z",
    "QueryEndDate": "2017-08-01T12:10:00.000Z",
    "DisplaySourceRIC": "true"
    }
    }
    }

    Your Excel sheet, which has the available fields for EBD-TAS, includes tons of fields such as:

    Type
    Ex/Cntrb.ID
    LOC
    Price
    Volume
    Market VWAP
    Buyer ID
    Bid Price
    Bid Size
    No. Buyers
    Seller ID
    Ask Price
    ... and many other columns ....

    However, when I add some of these other columns into the request (above), it fails. Can you explain?

    Also, why does "GMT Offset" say "work in progress" in the Excel sheet?

  • Thanks - it was something a sales guy sent called "TRTH_v1v2_DataFieldDifferences_March2017". Anyhow, never mind that; what you sent makes sense. Thank you!

  • noah.kauffman, you are welcome!

    Note: I found the Excel file you mention. The tab EBD-TAS is only for VBD (Venue by Day) data. Custom extractions are under the tab Customs-TAS. But this Excel file is fairly high level, and only lists functional differences. The cells of the sheet do not contain actual field names that you can use in your calls.

    So for your purpose you definitely need to use the data dictionary.

  • So - one other thing...

    When I get the return for the following request:

    requestBody={
    "ExtractionRequest": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.TickHistoryTimeAndSalesExtractionRequest",
    "ContentFieldNames": [
    "Trade - Price",
    "Trade - Volume",
    "Trade - Exchange Time",
    "Quote - Ask Price",
    "Quote - Ask Size",
    "Quote - Bid Price",
    "Quote - Bid Size",
    "Auction - Volume",
    "Auction - Price",
    "Trade - Total Buy Value",
    "Trade - Total Buy Volume",
    "Trade - Total Demand",
    "Trade - Total Issues",
    "Trade - Total Moves",
    "Trade - Total Sell Value",
    "Trade - Total Sell Volume",
    "Trade - Total Volume"
    ],
    "IdentifierList": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
    "InstrumentIdentifiers": [{
    "Identifier": "CARR.PA",
    "IdentifierType": "Ric"
    }]
    },
    "Condition": {
    "MessageTimeStampIn": "GmtUtc",
    "ApplyCorrectionsAndCancellations": "false",
    "ReportDateRangeType": "Range",
    "QueryStartDate": "2017-08-01T01:00:00.000Z",
    "QueryEndDate": "2017-08-01T23:59:00.000Z",
    "DisplaySourceRIC": "true"
    }
    }
    }

    It appears to return various duplicate records. For example, there are 16 different Auction entries with volume = 494597.

    Is there some override to make sure that the data is consolidated and does not include duplicates?

    Next - I notice that when I request various fields, many of them are blank. For example, "Total Buy Volume" returns no values. Obviously any of the greeks would also be blank for an equity.

    So, is there some way to determine, for a standard equity ticker, what columns are usually available, so I can inspect what these are? For example, you list several fields such as Quote Imbalance. Not sure if these are available - but how can I find all standard equity columns that will be returned non-null?

    And plugging in all possible columns listed in your excel sheet returns an error response.

  • noah.kauffman, there are several queries here; let me try to answer them all:

    "Duplicate" auction entries: what you observe are not duplicates. These are special records that are published at the end of an auction (I cannot give you the exact details, as I'm not a financial specialist).

    Blank fields: there is no way to automatically find out which fields will be populated. If you look in the TRTH data dictionary under tab "Field Descriptions", and scroll to the right you see columns corresponding to asset classes. This gives an indication of what should be available, but it is no guarantee. We store what was sent by the data providers, no more, no less. So you might find data for some instruments / exchanges, but not for others. For the specific field you mention "Trade - Total Buy Volume" it should be available for Equity only, but might also be blank. Same goes for the 4 fields related to Quote Imbalance, all should be available for Equity, but again, no guarantee.

    Error: did you plug in the entire list from column B of the sheet, or only the 364 fields available for a Time and Sales request? The Excel file contains all fields for all request types. That said, I recommend you optimize your calls to request only the fields you really require. That makes the whole workflow faster and avoids storing useless data. You can easily experiment with various field names, and see what data is returned, by executing queries in Postman.
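
    One way to catch invalid names before sending a request is to compare them against the field list for the report type. This is only a sketch: split_known_fields is a hypothetical helper, and TIME_AND_SALES_FIELDS is a tiny illustrative subset, not the real list of 364 fields. You would load the real list from column B of the data dictionary, filtered on the Time and Sales report in column A.

```python
def split_known_fields(requested, allowed):
    """Split requested field names into (known, unknown), keeping order."""
    allowed_set = set(allowed)
    known = [name for name in requested if name in allowed_set]
    unknown = [name for name in requested if name not in allowed_set]
    return known, unknown


# Illustrative subset only; the real list comes from the data dictionary.
TIME_AND_SALES_FIELDS = [
    "Trade - Price",
    "Trade - Volume",
    "Trade - Exchange Time",
    "Quote - Bid Price",
    "Quote - Ask Price",
]

requested = ["Trade - Price", "Type", "GMT Offset", "Quote - Bid Price"]
known, unknown = split_known_fields(requested, TIME_AND_SALES_FIELDS)
print("send:", known)
print("rejected locally:", unknown)
```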

  • Thanks for the response, but what you say does not appear to be correct. The closing auction print of this security on 8/1, according to Bloomberg, was 493,119. The API response contains three records with this volume, each with a different timestamp. Additionally, the API collectively returns 112M shares labeled AUCTION with timestamps on 8/1 after 15:30. For reference, only 1.772M shares traded on this day, so the auction prints are already collectively 63x the entire day's volume.

    Can you please explain why the web service is returning such results and how to fix this issue?

  • noah.kauffman, as stated, I'm not a financial specialist. I'll ask a data specialist.

  • Hi Christiaan - can you please have a data specialist update the ticket? As mentioned, there appear to be duplicates in the response.

  • noah.kauffman

    I just received an answer from the data specialist:

    In auction data it is observed that the exchange replays the same Auction Price and Volume in instances where neither has changed. At the end of an auction, the same price and volume can also be refreshed two or more times. This is an observed behavior. (Attached file: carr-1-aug-2017-auctions.txt)

    Let us consider the same example, RIC <CARR.PA> on 1 Aug 2017 (auction data in attached file). Between 15:35:00.831799707Z and 15:35:04.864076469Z there are no changes to the price at which most orders can be executed, nor are there new orders adding to the existing auction prices, and consequently the volume stood at 494597. However, a new order recorded at 15:35:06.264160974Z changed the volume to 493119, which then remained the same until the auction close. To restate, this is an observed behavior.

    If you are interested in the Firm Auction Price (with Volumes) you need to refer to the latest “non-zero” Auction Price and Volumes in the respective auction sessions i.e. Opening Auctions, Closing Auctions, etc.
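
    As a rough Python sketch of that rule: collapse_repeats and firm_auction are hypothetical helper names, the records are assumed to be sorted by time and to belong to a single auction session, and the dictionary keys are my own. The volumes and the first three timestamps come from the example above. The price and the last timestamp are placeholders, since neither is quoted in this thread.

```python
def collapse_repeats(records):
    """Drop records that replay the previous record's price and volume."""
    collapsed = []
    for rec in records:
        if collapsed:
            last = collapsed[-1]
            if (last["price"], last["volume"]) == (rec["price"], rec["volume"]):
                continue  # replayed value, nothing changed
        collapsed.append(rec)
    return collapsed


def firm_auction(records):
    """Return the latest record with non-zero price and volume, or None."""
    for rec in reversed(records):
        if rec["price"] and rec["volume"]:
            return rec
    return None


PLACEHOLDER_PRICE = 20.0  # hypothetical value, not taken from the data

closing_auction = [
    {"time": "15:35:00.831799707Z", "price": PLACEHOLDER_PRICE, "volume": 494597},
    {"time": "15:35:04.864076469Z", "price": PLACEHOLDER_PRICE, "volume": 494597},
    {"time": "15:35:06.264160974Z", "price": PLACEHOLDER_PRICE, "volume": 493119},
    # Hypothetical replay of the final value at the auction close.
    {"time": "15:35:07.000000000Z", "price": PLACEHOLDER_PRICE, "volume": 493119},
]

distinct = collapse_repeats(closing_auction)
final = firm_auction(closing_auction)
print(len(distinct), final["volume"])
```

    Summing the volume over all auction records counts the replays several times, which would explain totals far above the day's traded volume.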

  • OK. Seems weird, but I guess it is what it is. Next question: can you explain why the following request fails? I verified that the same call returns data in the old system.

    requestBody={
    "ExtractionRequest": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.TickHistoryTimeAndSalesExtractionRequest",
    "ContentFieldNames": [
    "Trade - Price",
    "Trade - Volume",
    "Trade - Exchange Time"
    ],
    "IdentifierList": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
    "InstrumentIdentifiers": [{
    "Identifier": "STXEM7",
    "IdentifierType": "Ric"
    }]
    },
    "Condition": {
    "MessageTimeStampIn": "GmtUtc",
    "ApplyCorrectionsAndCancellations": "false",
    "ReportDateRangeType": "Range",
    "QueryStartDate": "2017-03-14T01:00:00.000Z",
    "QueryEndDate": "2017-03-15T23:59:00.000Z",
    "DisplaySourceRIC": "true"
    }
    }
    }

  • I can reproduce this. The syntax of the query is correct. But the instrument was not found, even though it was quoted in March 2017. This is a data issue, I will escalate it.

  • I disagree. Something else seems to be going on, because I have gone through many different futures symbols and every one of them returns a 200 response but still fails. Can you please escalate and have someone send sample code that works for futures symbols?

    For example:

    "IdentifierList": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
    "InstrumentIdentifiers": [{
    "Identifier": "SSIM7",
    "IdentifierType": "Ric"
    }]
    },
    "Condition": {
    "MessageTimeStampIn": "GmtUtc",
    "ApplyCorrectionsAndCancellations": "false",
    "ReportDateRangeType": "Range",
    "QueryStartDate": "2017-03-29T01:00:00.000Z",
    "QueryEndDate": "2017-04-01T23:59:00.000Z",
    "DisplaySourceRIC": "true"


    and


    "IdentifierList": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
    "InstrumentIdentifiers": [{
    "Identifier": "FFIM7",
    "IdentifierType": "Ric"
    }]
    },
    "Condition": {
    "MessageTimeStampIn": "GmtUtc",
    "ApplyCorrectionsAndCancellations": "false",
    "ReportDateRangeType": "Range",
    "QueryStartDate": "2017-03-24T01:00:00.000Z",
    "QueryEndDate": "2017-03-26T23:59:00.000Z",
    "DisplaySourceRIC": "true"
    }
    }
    }


    and


    "IdentifierList": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
    "InstrumentIdentifiers": [{
    "Identifier": "SXFM7",
    "IdentifierType": "Ric"
    }]
    },
    "Condition": {
    "MessageTimeStampIn": "GmtUtc",
    "ApplyCorrectionsAndCancellations": "false",
    "ReportDateRangeType": "Range",
    "QueryStartDate": "2017-05-30T01:00:00.000Z",
    "QueryEndDate": "2017-06-02T23:59:00.000Z",
    "DisplaySourceRIC": "true"
    }
    }
    }

  • noah.kauffman, please try adding this:

    "ValidationOptions": {
    "AllowHistoricalInstruments": true
    },
    "UseUserPreferencesForValidationOptions": false,

    Which will give:

    {
    "ExtractionRequest": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.TickHistoryTimeAndSalesExtractionRequest",
    "ContentFieldNames": [
    "Trade - Price",
    "Trade - Volume",
    "Trade - Exchange Time"
    ],
    "IdentifierList": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
    "ValidationOptions": {
    "AllowHistoricalInstruments": true
    },
    "UseUserPreferencesForValidationOptions": false,
    "InstrumentIdentifiers": [{
    "Identifier": "STXEM7",
    "IdentifierType": "Ric"
    }]
    },
    "Condition": {
    "MessageTimeStampIn": "GmtUtc",
    "ApplyCorrectionsAndCancellations": "false",
    "ReportDateRangeType": "Range",
    "QueryStartDate": "2017-03-14T01:00:00.000Z",
    "QueryEndDate": "2017-03-15T23:59:00.000Z",
    "DisplaySourceRIC": "true"
    }
    }
    }

    And thank you Romita for the suggestion :-) ... I should have thought of this.

  • Thanks, this works. Can you give me a brief explanation of what these additional params are doing? Is this a general optional param that I should be adding to all requests (such as equities as well)? And are there any other critically important parameters like this that I should be aware of?

  • noah.kauffman, by default the query will reject all instruments that are not currently quoted (like STXEM7). These parameters allow you to change this behavior.

    Setting "UseUserPreferencesForValidationOptions": false overrides the user preferences for instrument lists. Here is the full list of parameters you can set. You might want to systematically use the first two for such queries:

    "ValidationOptions": {
    "AllowOpenAccessInstruments": true,
    "AllowHistoricalInstruments": true,
    "AllowInactiveInstruments": true,
    "AllowDuplicateInstruments": false,
    "AllowUnsupportedInstruments": false,
    "ExcludeFinrAsPricingSourceForBonds": true,
    "UseExchangeCodeInsteadOfLipper": true,
    "UseUsQuoteInsteadOfCanadian": true,
    "UseConsolidatedQuoteSourceForUsa": true,
    "UseConsolidatedQuoteSourceForCanada": true,
    "UseDebtOverEquity": true
    },
    "UseUserPreferencesForValidationOptions": false,

    Hope this helps.
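
    Since the question was about Python, here is a minimal sketch that builds the same kind of request body as a dictionary, with the validation options included. The function name build_time_and_sales_request and its parameters are my own for illustration. The Condition values are kept as strings, exactly as in the requests posted in this thread.

```python
import json

ODATA_PREFIX = "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests."


def build_time_and_sales_request(ric, fields, start, end, allow_historical=True):
    """Build a Time and Sales request body like the ones in this thread."""
    return {
        "ExtractionRequest": {
            "@odata.type": ODATA_PREFIX + "TickHistoryTimeAndSalesExtractionRequest",
            "ContentFieldNames": list(fields),
            "IdentifierList": {
                "@odata.type": ODATA_PREFIX + "InstrumentIdentifierList",
                "ValidationOptions": {
                    "AllowHistoricalInstruments": allow_historical,
                },
                "UseUserPreferencesForValidationOptions": False,
                "InstrumentIdentifiers": [
                    {"Identifier": ric, "IdentifierType": "Ric"},
                ],
            },
            "Condition": {
                # String values kept as in the requests posted above.
                "MessageTimeStampIn": "GmtUtc",
                "ApplyCorrectionsAndCancellations": "false",
                "ReportDateRangeType": "Range",
                "QueryStartDate": start,
                "QueryEndDate": end,
                "DisplaySourceRIC": "true",
            },
        }
    }


request_body = build_time_and_sales_request(
    "STXEM7",
    ["Trade - Price", "Trade - Volume", "Trade - Exchange Time"],
    "2017-03-14T01:00:00.000Z",
    "2017-03-15T23:59:00.000Z",
)
request_json = json.dumps(request_body, indent=2)
print(request_json)
```

    The resulting JSON string can be sent as the POST body in place of a hand-written requestBody. Python's True and False are serialized as JSON true and false.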