Refinitiv.data proper fields names

vitali · April 4

Greetings,

I am currently involved in a project aimed at retrieving the closing price for companies that have been delisted due to bankruptcy, specifically the day following their bankruptcy announcement. The primary methodology I employ involves checking whether 'TR.MnAHasBankruptcy' is true. If this condition is met, we proceed to use 'TR.MnAHasBankruptcy.date' to extract the closing price on the subsequent day, utilizing "TR.PriceCloseDate".

I have encountered a challenge in executing the code intended for this purpose. For reference, the dataframe columns available for extraction are as follows: Index(['Instrument', 'Delisted Quote Flag', 'RetireDate', 'Bankruptcy Flag', 'Date', 'Acquisition Techniques', 'Date Effective or Unconditional', 'Price Close', 'Date'], dtype='object'). It is noteworthy that both 'TR.MnAHasBankruptcy.date' and "TR.PriceCloseDate" correspond to the same field name "Date".

I am seeking a solution to address this ambiguity between field names. What strategies could potentially resolve this issue?

Thank you!

import refinitiv.data as rd
import pandas as pd

rd.open_session()
results = rd.discovery.search(
    view=rd.discovery.Views.EQUITY_QUOTES,
    top=500,
    filter="(IssuerOAPermID eq '4295899979' and RCSAssetCategoryLeaf eq 'Ordinary Share')",
    select="RIC, DTSubjectName, RCSAssetCategoryLeaf, ExchangeName, ExchangeCountry"
)

df = results

substrings = ['.OQ', '.O', '.N', '.A']

def is_valid_ric(ric):
    for ending in substrings:
        if ending in ric:
            pos = ric.find(ending)
            if len(ric) == pos + len(ending) or (len(ric) > pos + len(ending) and ric[pos + len(ending)] == '^'):
                return True
    return False

filtered_df = df[df['RIC'].apply(is_valid_ric)]

rics = filtered_df['RIC'].tolist()

df = rd.get_data(
    universe=rics,
    fields=[
        'TR.IsDelistedQuote',
        'DELIST_DAT',
        'TR.RetireDate',
        'TR.MnAHasBankruptcy',
        'TR.MnAHasBankruptcy.date',
        "TR.MnAAcquisitionTechnique(Concat='|')",
        'TR.MnADateUnconditional',
        "TR.PriceClose",
        "TR.PriceCloseDate",
    ],
    parameters={}
)

# Filtering DataFrame where 'TR.MnAHasBankruptcy' is True
bankruptcy_df = df[df['Bankruptcy Flag'] == True].copy()

# Adding a new column for the target date, which is one day after 'TR.MnAHasBankruptcy.date'
bankruptcy_df['TargetDate'] = pd.to_datetime(bankruptcy_df['TR.MnAHasBankruptcy.date']) + pd.Timedelta(days=1)

# Prepare a DataFrame to hold the final results
final_columns = ['RIC', 'TR.MnAHasBankruptcy.date', 'TargetDate',  'Price Close']
final_df = pd.DataFrame(columns=final_columns)

# Iterate through the bankruptcy DataFrame
for index, row in bankruptcy_df.iterrows():
    target_date = row['TargetDate']
    ric = row['RIC']
    # Attempt to find the 'TR.PriceClose' where 'TR.PriceCloseDate' matches 'TargetDate'
    price_close_row = df[(df['RIC'] == ric) & (pd.to_datetime(df['TR.PriceCloseDate']) == target_date)]
    if not price_close_row.empty:
        price_close = price_close_row.iloc[0]['Price Close']
        final_df = final_df.append({
            'RIC': ric,
            'TR.MnAHasBankruptcy.date': row['TR.MnAHasBankruptcy.date'],
            'TargetDate': target_date,
            'Price Close': price_close
        }, ignore_index=True)
    else:
        # Handle the case where no matching 'TR.PriceClose' is found
        final_df = final_df.append({
            'RIC': ric,
            'TR.MnAHasBankruptcy.date': row['TR.MnAHasBankruptcy.date'],
            'TargetDate': target_date,
            'Price Close': "<NA>"
        }, ignore_index=True)

print(final_df)
final_df.to_csv("./bankruptcy_price_close.csv")

rd.close_session()

jonathan.legrand · April 4

Hi @vitali ,

If I understand your request properly, you would simply just like to diferenciate between the two 'Date' columns? Do let me know if I missunderstood.

There are 2 ways that come to mind. The 1st is to use the `use_field_names_in_headers` argument:

df = rd.get_data(
    universe=rics,
    fields=[
        'TR.IsDelistedQuote',
        'DELIST_DAT',
        'TR.RetireDate',
        'TR.MnAHasBankruptcy',
        'TR.MnAHasBankruptcy.date',
        "TR.MnAAcquisitionTechnique(Concat='|')",
        'TR.MnADateUnconditional',
        "TR.PriceClose",
        "TR.PriceCloseDate",
    ],
    parameters={},
    use_field_names_in_headers=True)

The 2nd is to rename the columns with

for i in range(len(my_list)):
    if my_list[i] == 'Date':
        my_list[i] = replacement_values[count]
        count += 1
        if count == len(replacement_values):
            break

It's rather ugly but it does the trick:

import refinitiv.data as rd
import pandas as pd

rd.open_session()
results = rd.discovery.search(
    view=rd.discovery.Views.EQUITY_QUOTES,
    top=500,
    filter="(IssuerOAPermID eq '4295899979' and RCSAssetCategoryLeaf eq 'Ordinary Share')",
    select="RIC, DTSubjectName, RCSAssetCategoryLeaf, ExchangeName, ExchangeCountry"
)

df = results

substrings = ['.OQ', '.O', '.N', '.A']

def is_valid_ric(ric):
    for ending in substrings:
        if ending in ric:
            pos = ric.find(ending)
            if len(ric) == pos + len(ending) or (len(ric) > pos + len(ending) and ric[pos + len(ending)] == '^'):
                return True
    return False

filtered_df = df[df['RIC'].apply(is_valid_ric)]

rics = filtered_df['RIC'].tolist()

df = rd.get_data(
    universe=rics,
    fields=[
        'TR.IsDelistedQuote',
        'DELIST_DAT',
        'TR.RetireDate',
        'TR.MnAHasBankruptcy',
        'TR.MnAHasBankruptcy.date',
        "TR.MnAAcquisitionTechnique(Concat='|')",
        'TR.MnADateUnconditional',
        "TR.PriceClose",
        "TR.PriceCloseDate",
    ],
    parameters={},)

my_list = df.columns.to_list()

replacement_values = ["MnAHasBankruptcyDate", "PriceCloseDate"]
count = 0

for i in range(len(my_list)):
    if my_list[i] == 'Date':
        my_list[i] = replacement_values[count]
        count += 1
        if count == len(replacement_values):
            break

df.columns = my_list

df

vitali · April 4

Greetings, @jonathan.legrand!
Thank you for your prompt response.

Upon reviewing your feedback, I've realized that it doesn't fully address my requirements. To provide some context, my code utilizes specific field names like 'Bankruptcy Flag' and 'Price Close', which are accurately retrieved using the "df.columns" method. However, I've retained the original names 'TR.MnAHasBankruptcy.date' and 'TR.PriceCloseDate' unchanged because both include the term 'Date', leading to potential conflicts. Due to this similarity, crafting an algorithm that distinguishes between these similarly named fields poses a challenge, as it results in errors.

So, if I leave 'TR.MnAHasBankruptcy.date' and 'TR.PriceCloseDate' unchanged I receive an error:

If I replace 'TR.MnAHasBankruptcy.date' and 'TR.PriceCloseDate' with their proper name 'Date', the error is:

Refinitiv.data proper fields names

Best Answer

Answers

Categories