x-direct-download aws access key error
Hi,
I'm trying to download the latest extracted file for a daily schedule. I am using the X-Direct-Download header to download the files directly from the AWS server, but I get an AWS access key error. I assume the key is something your API provides when it redirects. If we are supposed to provide it, how do we get it? I get the following error message when trying the direct download:
<Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message><AWSAccessKeyId>AKIAJVAI4XORJURKYMEA</AWSAccessKeyId><StringToSign>GET
text/plain
1508866448
x-amz-request-payer:requester
/tickhistory.query.production.hdc-results/569288DA08B4447C911246BAFDFF15E2/data/merged/merged.csv.gz?response-content-disposition=attachment; filename=latestdownload.csv.gz</StringToSign><SignatureProvided>jSLffd/FgvE9+OJxV4MJesQmru8=</SignatureProvided><StringToSignBytes>47 45 54 0a 0a 74 65 78 74 2f 70 6c 61 69 6e 0a 31 35 30 38 38 36 36 34 34 38 0a 78 2d 61 6d 7a 2d 72 65 71 75 65 73 74 2d 70 61 79 65 72 3a 72 65 71 75 65 73 74 65 72 0a 2f 74 69 63 6b 68 69 73 74 6f 72 79 2e 71 75 65 72 79 2e 70 72 6f 64 75 63 74 69 6f 6e 2e 68 64 63 2d 72 65 73 75 6c 74 73 2f 35 36 39 32 38 38 44 41 30 38 42 34 34 34 37 43 39 31 31 32 34 36 42 41 46 44 46 46 31 35 45 32 2f 64 61 74 61 2f 6d 65 72 67 65 64 2f 6d 65 72 67 65 64 2e 63 73 76 2e 67 7a 3f 72 65 73 70 6f 6e 73 65 2d 63 6f 6e 74 65 6e 74 2d 64 69 73 70 6f 73 69 74 69 6f 6e 3d 61 74 74 61 63 68 6d 65 6e 74 3b 20 66 69 6c 65 6e 61 6d 65 3d 6c 61 74 65 73 74 64 6f 77 6e 6c 6f 61 64 2e 63 73 76 2e 67 7a</StringToSignBytes><RequestId>6D0CF27BCFA8F1B8</RequestId><HostId>SsWFsDplls+pBOt4kIJ6oaHmPd2u5pajDLqEAgUzxOJYC3V8ihWBw14IyJAUV7wQLlEjvUTeDQg=</HostId></Error>
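The StringToSignBytes element in an error like this one is a hex dump of the exact string S3 tried to verify, so decoding it shows what the server saw. Below is a minimal Python 3 sketch, assuming the AWS Signature Version 2 layout for query-string authentication (verb, Content-MD5, Content-Type, Expires, then the canonicalized headers and resource). The function name decode_string_to_sign is hypothetical, and the sample is only the first few bytes of the dump above.

```python
def decode_string_to_sign(hex_bytes):
    """Decode a space-separated hex dump and split it on newlines."""
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return text.split("\n")


# First bytes of the <StringToSignBytes> dump quoted in the error above.
sample = ("47 45 54 0a 0a 74 65 78 74 2f 70 6c 61 69 6e 0a "
          "31 35 30 38 38 36 36 34 34 38")
fields = decode_string_to_sign(sample)

# Field labels assume the Signature Version 2 string-to-sign layout.
labels = ["HTTP verb", "Content-MD5", "Content-Type", "Expires"]
for label, value in zip(labels, fields):
    print("%-12s %r" % (label, value))
```

Here the third field is text/plain, which matches the Content-Type header the script sets. The thread does not confirm whether that is what invalidated the signature.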
Best Answer
-
Figured out the problem was with the requests library. Upgrading it made everything work perfectly.
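Since the behaviour differed between machines, comparing the installed library versions on a working and a failing machine is a quick first check. This is a minimal Python 3.8+ sketch, not the poster's method. The helper name report_versions is hypothetical, and including urllib3 in the list is an assumption (it is the HTTP layer requests is built on).

```python
import sys
from importlib import metadata  # standard library since Python 3.8


def report_versions(packages=("requests", "urllib3")):
    """Return the Python version and the installed version of each package."""
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed on this machine
    return versions


print(report_versions())
```

Running this on both machines and comparing the output shows whether an upgrade (for example with pip install --upgrade requests) is worth trying.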
1
Answers
-
You do not have to manage the AWS key; it should be transparent to you. Here is the mechanism:
When you send the call to retrieve your data with the X-Direct-Download: true header, you receive a response with HTTP status 302 (redirect). The response header contains a redirection URI in the Location item. This is a pre-signed URI: the AWS Access Key Id it uses is included directly in the URI.
Most HTTP clients automatically follow the redirection, which means you have nothing to do. A call is made in the background to this URI, and the data is retrieved. You can actually check if the redirection works by sniffing the traffic, with a tool like Fiddler or equivalent.
Some HTTP clients support and follow the redirection, but fail to connect to AWS, because they include the Authorization header in the request message redirected to AWS, which then returns a BadRequest status (400) with the error: “Only one auth mechanism allowed; only the X-Amz-Algorithm query parameter, Signature query string parameter or the Authorization header should be specified”. That is described in an advisory.
But the error you encountered is different.
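If you would rather not run a traffic sniffer, the Python requests library used in this thread records each hop of a followed redirect in response.history, and each response keeps the request that produced it. The following is a minimal Python 3 sketch under those assumptions. The helper name summarize_redirects is hypothetical.

```python
def summarize_redirects(response):
    """Describe every hop of a requests response, the final one included.

    Each entry holds the status code, the URL that was requested, and
    whether an Authorization header was sent with that request.
    """
    hops = list(response.history) + [response]
    summary = []
    for hop in hops:
        sent_headers = hop.request.headers
        summary.append({
            "status": hop.status_code,
            "url": hop.request.url,
            "sent_authorization": "Authorization" in sent_headers,
        })
    return summary
```

Called on the result of requests.get(url, headers=requestHeaders, stream=True), it should list a 302 hop followed by the AWS hop. If sent_authorization is true for the AWS hop, the client forwarded the header described above.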
If this generic description of the mechanism is not sufficient to solve your issue:
- You might want to look at some of our samples, they were recently enhanced to use AWS. Under the downloads tab there are the .Net SDK Tutorials code (2, 4 and 5 use AWS), the Java samples (2 for immediate schedules and 2 for On demand extractions) and a Python sample.
- Alternatively, if you'd like us to help you further, we'd need to know exactly what workflow you followed, from the call to retrieve the data until the occurrence of the error. The code you use would also help.
0 -
Hi,
Here's the full code:
#!/bin/python
# coding: utf-8
import requests
import json
import shutil
import time
import sys

base_url = "https://hosted.datascopeapi.reuters.com/RestApi/v1/"
auth_req_url = base_url+"Authentication/RequestToken"
instrument_url = base_url + "Extractions/InstrumentLists"
get_scheduled_url = base_url + "Extractions/Schedules"
get_report_status_url = base_url + "Extractions/ReportExtractions"
download_extracted_url = base_url + "Extractions/ExtractedFiles"
requestHeaders={ "Prefer":"respond-async", "Content-Type":"application/json" }
credData = { 'Credentials' : { 'Username' : CENSORED, 'Password' : CENSORED} }

def getAuthToken( data = credData):
    r = requests.post(url = auth_req_url, json = data, headers = requestHeaders)
    if(r.status_code == 200):
        jsonResponse = json.loads(r.text.encode('ascii', 'ignore'))
        token = jsonResponse["value"]
        return token

def addToInstrumentList( token, listName, identifierList ):
    requestHeaders["Authorization"] = "token " + token
    url = instrument_url+"('" +listName + "')/ThomsonReuters.Dss.Api.Extractions.InstrumentListAppendIdentifiers"
    data = {"Identifiers": identifierList, "KeepDuplicates":False}
    print data
    r =requests.post(url, json=data, headers=requestHeaders )
    # if(r.status_code == 200):
    print r.status_code, r.text

def getLatestData( token, scheduleId ):
    requestHeaders["Authorization"] = "token " + token
    url = get_scheduled_url+ "('" + scheduleId + "')/LastExtraction"
    # print url
    r = requests.get(url, headers=requestHeaders)
    if(r.status_code==200):
        jsonResponse = json.loads(r.text.encode('ascii', 'ignore'))
        return jsonResponse["ReportExtractionId"]
    else:
        print "Can't get report id. Here's some debug info: ",r.status_code, r.text

def getReportFiles( token, extractionId ):
    requestHeaders["Authorization"] = "token " + token
    url = get_report_status_url + "('" + extractionId + "')/Files"
    r = requests.get( url, headers=requestHeaders )
    if(r.status_code == 200):
        jsonResponse = json.loads(r.text.encode('ascii', 'ignore'))["value"]
        fileTuple = {}
        fileTuple["notes"] = jsonResponse[0]["ExtractedFileId"]
        fileTuple["file"] = jsonResponse[1]["ExtractedFileId"]
        return fileTuple
    else:
        print "Can't get extracted files. Here's some debug info: ",r.status_code, r.text

def downloadReportFiles( token, fileId, outfile ):
    requestHeaders["Authorization"] = "token " + token
    requestHeaders["Accept-Encoding"] = "gzip"
    requestHeaders["Content-Type"] = "text/plain"
    requestHeaders["X-Direct-Download"] = "true"
    url = download_extracted_url + "('" + fileId + "')/$value"
    # print url, requestHeaders
    r = requests.get( url, headers=requestHeaders, stream=True )
    if(r.status_code == 302):
        print r
    r.raw.decode_content = False
    print r.status_code, r.headers["Content-Type"]#, r.headers["Content-Encoding"], r.headers["Content-Length"]
    fileName = outfile
    chunk_size = 1024
    rr = r.raw
    with open(fileName, 'wb') as fd:
        shutil.copyfileobj(rr, fd, chunk_size)

if __name__ == "__main__":
    token = getAuthToken()
    print "Token is: ",token
    scheduleId = CENSORED
    reportId = getLatestData(token, scheduleId)
    extractedFiles = getReportFiles(token, reportId)
    timestr = time.strftime("%Y%m%d-%H%M%S")
    outFileName = "download_"+timestr+".csv.gz"
    downloadReportFiles(token, extractedFiles["file"], outFileName)
    # notesFileName = "notes_"+timestr+".csv.gz"
    # downloadReportFiles(token, extractedFiles["notes"], notesFileName)

Also, the problem seems machine-specific. It works on some machines, but on others I get this 403 Forbidden. Any idea why?
0 -
sgan208, your code is ok, I have just tested it successfully.
Your last sentence makes me wonder: are you by chance behind a firewall or proxy?
0 -
No. There's no firewall. The Python libraries may differ. I'll check and get back.
0 -
You can try to disable redirection in requests. For example:
requests.get(url, headers=requestHeaders, allow_redirects=False)
After that, the application will get the HTTP response with status code 302.
HTTP/1.1 302 Found
Cache-Control: no-cache
Date: Sun, 03 Sep 2017 10:34:17 GMT
Expires: -1
Location: https://s3.amazonaws.com/tickhistory.query.production.hdc-results/xxx/data/merged/merged.csv.gz?AWSAccessKeyId=xxx&Expires=1504456458&response-content-disposition=attachment; filename=_OnD_0x05dbb5f5a62b3016.csv.gz&Signature=xxx&x-amz-request-payer=requester
The Location header in the response contains the AWS URL for the download. The application then needs to send another GET request to this AWS URL, without any headers, to download the file.
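The two-step flow described here can be sketched in Python 3 as below. This is an illustration under stated assumptions, not the official sample: the function name download_direct and its http_get parameter are hypothetical (http_get defaults to requests.get and exists only so the flow can be tried without network access), and "no headers" here means no headers beyond the defaults the client adds itself.

```python
import shutil


def download_direct(url, api_headers, out_path, http_get=None):
    """Request `url` without following redirects, then fetch the Location target."""
    if http_get is None:
        import requests  # third-party; imported lazily so the sketch loads without it
        http_get = requests.get

    # Step 1: call the API with its own headers, but do not follow the redirect.
    first = http_get(url, headers=api_headers, allow_redirects=False, stream=True)
    if first.status_code != 302:
        raise RuntimeError("expected 302, got %s" % first.status_code)
    aws_url = first.headers["Location"]

    # Step 2: fetch the AWS URL with no extra headers, so that neither
    # Authorization nor Content-Type from the API call reaches AWS.
    second = http_get(aws_url, stream=True)
    if second.status_code != 200:
        raise RuntimeError("download failed with status %s" % second.status_code)

    second.raw.decode_content = False  # write the gzip bytes exactly as received
    with open(out_path, "wb") as fd:
        shutil.copyfileobj(second.raw, fd)
    return aws_url
```

In the script posted earlier in this thread, the call would be roughly download_direct(url, requestHeaders, outfile), with url and requestHeaders built as in downloadReportFiles.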
0 -
sgan208, thank you for letting us know :-)
0