I want to get data using the Python API for multiple fields and RICs over a time period

The output that the get_history function returns is not in the best format. The first column contains dates, the first row contains RICs, and the second row contains fields. I want to have the data in the following format: the first column should contain dates, the second column should contain RICs, and the other columns should contain the field values corresponding to the date and RIC for each observation. How can I format it this way in Python?

Additionally, there is another problem: the get_history function sometimes returns multiple rows for the same date and RIC, with some field values displayed in the first row and some field values displayed in the second row. I don't know whether this is a bug or a feature. The output format I'd like to have is in the picture.

1716699640313.png

Best Answer

  • raksina.samasiri
    Answer ✓

    Hi @vsafak ,

    The DataFrame can be reshaped with Python code, for example:

    import pandas as pd
    import refinitiv.data as rd  # assumed from the rd alias; a session must already be open

    rics = ["SPYQ202454000.U^E24", "SPYQ202453900.U^E24"]
    fields = ['TR.OPENPRICE', 'TR.HIGHPRICE', 'TR.LOWPRICE', 'TR.CLOSEPRICE', 'TR.OPENINTEREST', 'TR.IMPLIEDVOLATILITY', 'TR.DELTA', 'TR.THETA', 'TR.GAMMA', 'TR.VEGA', 'TR.RHO']

    df = rd.get_history(universe=rics, fields=fields,
                        interval="1d",
                        start="2024-01-01",
                        end="2024-06-01")

    format_df = pd.DataFrame()
    idx = pd.IndexSlice
    first_loop = True

    for ric in rics:
        # get the dataframe of each RIC in the RICs list
        temp_df = df.loc[:, idx[ric, :]].copy()
        temp_df.columns = temp_df.columns.droplevel()
        temp_df.reset_index(inplace=True)
        temp_df['RIC'] = ric

        # put this value in the format_df dataframe
        if first_loop:
            format_df = temp_df
            first_loop = False
        else:
            # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
            format_df = pd.concat([format_df, temp_df])
    display(format_df)  # display() is available in Jupyter/IPython; use print() elsewhere

    1716909844640.png
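The question also mentions dates that come back split across two rows, each holding only some of the field values. A possible way to handle both issues at once is sketched below. It is not taken from the library documentation. It assumes the layout described in the question (dates in the index, RICs on the first column level, fields on the second) and uses a small synthetic frame with invented RIC names, field names, and values in place of a real get_history call. It also assumes the partial rows never disagree: where two rows for the same date both hold a value for a field, the first one wins.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the get_history output; all names and values are invented.
# Columns: level 0 = RIC, level 1 = field. Index: dates, named "Date" here.
columns = pd.MultiIndex.from_product([["RIC_A", "RIC_B"], ["OPEN", "CLOSE"]])
dates = pd.to_datetime(["2024-05-01", "2024-05-02", "2024-05-02"])
wide = pd.DataFrame(
    [
        [1.0, 1.5, 2.0, 2.5],
        [1.1, np.nan, 2.1, np.nan],  # 2024-05-02: first partial row
        [np.nan, 1.6, np.nan, 2.6],  # 2024-05-02: second partial row
    ],
    index=pd.Index(dates, name="Date"),
    columns=columns,
)

# Merge rows that share a date: first() keeps the first non-null value
# found in each column, so the two partial rows become one.
merged = wide.groupby(level="Date").first()

# Move the RIC level of the columns into the row index, then flatten
# the index so that Date and RIC become ordinary columns.
long_df = (
    merged.stack(level=0)
    .rename_axis(index=["Date", "RIC"])
    .reset_index()
)
print(long_df)
```

The result has one row per date and RIC, with Date first, RIC second, and one column per field. Depending on the pandas version, stack may reorder the field columns or emit a FutureWarning about its new implementation, so select the field columns by name rather than by position.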

Answers

  • @vsafak

    Please share the code that you are using to retrieve the data.

  • df = rd.get_history(universe=["SPYQ202454000.U^E24", "SPYQ202453900.U^E24"], fields= ['TR.OPENPRICE', 'TR.HIGHPRICE', 'TR.LOWPRICE', 'TR.CLOSEPRICE', 'TR.OPENINTEREST', 'TR.IMPLIEDVOLATILITY','TR.DELTA','TR.THETA','TR.GAMMA','TR.VEGA','TR.RHO'],
    interval="1d",
    start="2024-01-01",
    end="2024-06-01")

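  • Thanks. A note for those using pandas 2.0 or above: DataFrame.append was removed, so a call such as format_df.append(temp_df) fails. Swapping in the private ._append method works as a minor modification, although pd.concat is the documented replacement.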
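For readers who would rather not rely on a private method, the per-RIC frames can be collected in a list and joined with a single pd.concat call. The sketch below illustrates that pattern on a small synthetic frame. The RIC names, field names, and values are invented, and the column layout (RIC on the first level, field on the second) is assumed from the description in the question.

```python
import pandas as pd

# Synthetic stand-in for the get_history output; all names and values are invented.
columns = pd.MultiIndex.from_product([["RIC_A", "RIC_B"], ["OPEN", "CLOSE"]])
wide = pd.DataFrame(
    [[1.0, 1.5, 2.0, 2.5], [1.1, 1.6, 2.1, 2.6]],
    index=pd.Index(pd.to_datetime(["2024-05-01", "2024-05-02"]), name="Date"),
    columns=columns,
)

frames = []
for ric in wide.columns.get_level_values(0).unique():
    per_ric = wide[ric].reset_index()  # columns: Date, OPEN, CLOSE
    per_ric.insert(1, "RIC", ric)      # place RIC as the second column
    frames.append(per_ric)

# One concat at the end instead of one append per loop iteration.
long_df = pd.concat(frames, ignore_index=True)
print(long_df)
```

Concatenating once also avoids copying the growing frame on every iteration, which matters when the universe contains many RICs.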