Download Price of more instruments with DSWS

andrea.negri · April 2022

Hi want download price from list of code. But the datframe Price is empty

CodiceDatastream

MSACWFELBEURSELBGOBL$LGBDGCEIT0004782758LU1694212348FR0010149120LU0452734238LU0764816798

import numpy as np

import pandas as pd

import DatastreamDSWS as DSWS

ds = DSWS.Datastream(username="xxxxxxxxxxxx", password= "xxxxxxxxxxxx")

Codici = pd.read_excel('AnagraficaCodici.xlsx')

Codici['DatastreamRICs'] = Codici['CodiceDataStream']

ds_rics_str = Codici['DatastreamRICs'].str.cat(sep=',')

Price = ds.get_data(ds_rics_str, fields=['X(P)~EUR'], start='-5Y', end='0D', freq='D')

Shadab Hussain · April 2022

Hi @andrea.negri,

Hope Jirapongse answer was helpful, I'm adding code snippets for few of the RICs below assuming start and end day by my own using which you can read the RICs name from the file, and request for the data from DSWS API:

# Installing Packages
import DatastreamDSWS as DSWS
import pandas as pd
import numpy as np
import datetime
from IPython.display import display
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Loading RICs name from file
stock_list = list(pd.read_excel("Sample.xlsx")['CodiceDataStream'])

# Declaring Variables
ds = DSWS.Datastream(username ="******", password='******')
fields = ['(X)~EUR']
final_df = pd.DataFrame()
frames = []
year = 2018
start_day = 1
end_day = 20
month = 10
start_date = datetime.datetime(year, month, start_day).strftime('%Y-%m-%d')
end_date = datetime.datetime(year, month, end_day).strftime('%Y-%m-%d')


for stock in stock_list:
    df = ds.get_data(tickers=stock,fields=fields,start=start_date, end=end_date)[stock].reset_index().droplevel('Currency', axis=1)
    df['Instrument'] = stock
    frames.append(df)

temp_df = pd.concat(frames)    
final_df = temp_df.pivot_table(index='Dates', columns='Instrument', values=fields)
final_df.columns = final_df.columns.swaplevel(0, 1)
final_df.sort_index(axis=1, level=0, inplace=True)

final_df.head()

Below is the output I got:

Please change the parameters like fields, and start/end day based on the data you need.

Adding complete code for @Jirapongse answers as well, in case you need it:

# Installing Packages
import DatastreamDSWS as DSWS
import pandas as pd
import numpy as np
import datetime
from IPython.display import display
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Loading RICs name from file
stock_list = ",".join(pd.read_excel("Sample.xlsx")['CodiceDataStream'])

# Declaring Variables and Session Initialization
ds = DSWS.Datastream(username ="******", password='******')
fields = ['(X)~EUR']

# Load Data
df = ds.get_data(",".join(stock_list), fields=['(X)~EUR'], start='-1Y', end='0D', freq='D').reset_index().droplevel('Currency', axis=1)

# Top 5 rows data
df.head()

Gurpreet · April 2022

Hi @andrea.negri,

What is the in the excel sheet (post it here). Does the get_data API work, when you manually input the RICS from the excel sheet?

Jirapongse · April 2022

@andrea.negri

I assume that you retrieved data for the following instruments.

MSACWFE,LBEURSE,LBGOBL$,LGBDGCE,IT0004782758,LU1694212348,FR0010149120,LU0452734238,LU0764816798

MSACWFE, LBEURSE, LBGOBL$, and LGBDGCE don't have data for the P data type.

LU1694212348 has data from 10/24/2017.

You can use the following code to get data from the default data type.

Price = ds.get_data(ds_rics_str, fields=['(X)~EUR'], start='-5Y', end='0D', freq='D')

The output is:

andrea.negri · April 2022

I can't attached the file there is an error in upload. In the file there are only two columns:

andrea.negri · April 2022

The second code show this error:

'NoneType' object has no attribute 'reset_index'

andrea.negri · April 2022

I have the problem in first code when code don't exist (example LU2348335964). The code in for cycle generate error

Shadab Hussain · April 2022

Sorry to hear that, I tested this for the RICs which you can see in the image. I'm sending you an email, please do share the excel sheet so that we can test the solution.

Shadab Hussain · April 2022

Hi @andrea.negri,

I loaded the excel file and found for a few ISIN, there might be the possibility that data is not available for it, & it was also pointed out by @Jirapongse in his answer, so in that case, it will through an error if you will try to get data for all ISIN at once which is in the case of my previous answer.
To handle the exception here I have tried loading data for each ISIN individually, and if it throws an error that means there is no data available for it, so for those ISIN, I have created a separate dataframe.
Please find the code below for it:

# Installing Packages
import DatastreamDSWS as DSWS
import pandas as pd
import numpy as np
import datetime
from IPython.display import display
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Loading RICs name from file
stock_list = list(pd.read_excel("Dati.xlsx")['Codice_ISIN'])

# Declaring Variables
ds = DSWS.Datastream(username ="*****", password='*****')
fields = ['(X)~EUR']
final_df = pd.DataFrame()
frames = []
err_frames = []
year = 2018
start_day = 1
end_day = 20
month = 10
start_date = datetime.datetime(year, month, start_day).strftime('%Y-%m-%d')
end_date = datetime.datetime(year, month, end_day).strftime('%Y-%m-%d')

# Loading Data
for stock in stock_list:
    try:
        df = ds.get_data(tickers=stock, fields=fields, start=start_date, end=end_date)[stock].reset_index().droplevel('Currency', axis=1)
        df['Instrument'] = stock
        frames.append(df) # Dataframes of those ISIN for which data is available
    except:
        df = ds.get_data(tickers=stock,fields=fields,start=start_date, end=end_date)
        df['Instrument'] = stock
        err_frames.append(df) # Dataframes of those ISIN for which data is not available

# Processing ISIN with data
temp_df = pd.concat(frames)    
final_df = temp_df.pivot_table(index='Dates', columns='Instrument', values=fields)
final_df.columns = final_df.columns.swaplevel(0, 1)
final_df.sort_index(axis=1, level=0, inplace=True)
final_df.head()

# Processing ISIN without data
err_temp_df = pd.concat(err_frames)    
err_temp_df.head()

When I ran this code for your file, I found for 2844 ISIN, data was available for the period which I have used in the code, and for 371 ISIN, data was not available.

Also please note, that loading data individually for each ISIN will be time taking process, so you may use threading functionality in python to speed up the execution time.

andrea.negri · April 2022

Why have you indicate in for cycle 1420?

Shadab Hussain · April 2022

Updated it, that I was doing some other testing and forgot to remove it

andrea.negri · April 2022

Nice!! The code work.

One question I don't know this threading functionality in python. You could explain me if is possible for you.

Shadab Hussain · April 2022

@andrea.negri you can go through this article for understanding multithreading and multiprocessing functionality in python. Let me know if you have further queries or finding something difficult to understand.

Shadab Hussain · April 2022

Hi @andrea.negri,

The error that you were getting with the second code was because of the DSWS limit. You can read more about DSWS User Stats and Limits here.

So I have written an alternative code to split the request which can load data for 50 ISIN (instruments) at a time, and will be a little faster than the previous code:

# Installing Packages
import DatastreamDSWS as DSWS
import pandas as pd
from IPython.display import display
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# Loading RICs name from file
stock_list = list(pd.read_excel("Dati.xlsx")['Codice_ISIN'])

# Declaring Variables
ds = DSWS.Datastream(username ="*****", password='*****')
fields = ['(X)~EUR']
final_df = pd.DataFrame()
frames = []

# Load Data
for i in range(0, len(stock_list), 50):
    stock = stock_list[i:i+50]
    df = ds.get_data(",".join(stock), fields=['(X)~EUR'], start='-1Y', end='0D', freq='D')
    frames.append(df)

# Concatenate DataFrames
final_df = pd.concat(frames)

Let us know if any further queries you have with DSWS API.

andrea.negri · April 2022

Hi
@Shadab Hussain I understand article but i some problem to implementing threading in your code. Could you help me?

Shadab Hussain · April 2022

Sure @andrea.negri, let me post a new answer using multiprocessing. Thanks for your patience

Shadab Hussain · April 2022

Hi @andrea.negri,
Please find the below code using multiprocessing to fasten your data pull request:

# Installing Packages
import DatastreamDSWS as DSWS
import pandas as pd
from multiprocessing.pool import ThreadPool as Pool

# Loading RICs name from file
stock_list = list(pd.read_excel("../DataStream/Dati.xlsx")['Codice_ISIN'])
# Declaring Variables
ds = DSWS.Datastream(username ="*****", password='*****')
fields = ['(X)~EUR']
final_df = pd.DataFrame()
frames = []

# Multiprocessing Implementation
pool_size = 20  # your "parallelness"

# define worker function before a Pool is instantiated
def get_data(stock):
    df = ds.get_data(",".join(stock), fields=['(X)~EUR'], start='-1Y', end='0D', freq='D')
    frames.append(df)

pool = Pool(pool_size)

for i in range(0, len(stock_list), 50):
    stock = stock_list[i:i+50]
    pool.apply_async(get_data, (stock,))

pool.close()
pool.join()

# Concat Dataframe
final_df = pd.concat(frames)

# Top 5 rows
final_df.head()

andrea.negri · April 2022

Attention you don't have hidden username and your password.

andrea.negri · April 2022

I'm using the first solution not the second.

Shadab Hussain · April 2022

You can use the second solution as well, that will be faster to process request in chunks of 50

andrea.negri · April 2022

For all list the time is the same 17 minuts first solution and second solution. For all isin with 5 years price history.

I think there is an error because the file csv size created as 287.000 KB and before 21.000 KB

Shadab Hussain · April 2022

For me its taking ~3mins, please see the image below-

In your case, might be the number of ISIN has increased for you in the new excel file.

andrea.negri · April 2022

This is the code and the file contain 3215 ISIN code.

andrea.negri · April 2022

And now I have this error

Shadab Hussain · April 2022

You have exceeded the size limit, please refer to DSWS User Stats and Limits here.

Download Price of more instruments with DSWS

Best Answer

Answers

Categories