Python Pandas - remove rows based on a given value in a cell?

grzeszczukmaciek · December 2021

I have the following script, which removes rows where the date in the DATE OF OPERATION column starts with 202110. I understood that spaces in column names are not allowed, so the script also replaces each space with _ and, after the rows are removed, puts the spaces back. For some reason I couldn't attach the CSV here, so please see the example below:

Column1  DATE OF OPERATION  NAVp   Units
dsasa    20211201           24324
dsasa    20211022           23232
sd       20211022           232
sd       20211022           3      -2802.6667

The code is as below and the error I'm getting is: KeyError: 'DATE_OF_OPERATION'

Could you advise what the cause of the error is? Thank you.

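import pandas as pd
from pathlib import Path

# Every CSV file in the Downloads folder
source_files = sorted(Path(r'/Users/maciejgrzeszczuk/Downloads/').glob('*.csv'))

for file in source_files:
    df = pd.read_csv(file)
    # Replace spaces in the column names with underscores
    df.columns = df.columns.str.replace(' ', '_')
    # Drop rows whose date starts with 202110 (this line raises the KeyError)
    df = df[~df['DATE_OF_OPERATION'].astype(str).str.startswith('202110')]
    # Restore the spaces in the column names
    df.columns = df.columns.str.replace('_', ' ')
    # Write to the current working directory under the same file name
    df.to_csv(file.name, index=False)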

pf · January 2022

Hi @grzeszczukmaciek ,

This is a pure Python question (and not an issue related to our APIs), but let's try to propose a solution.

Spaces are allowed in column names, so your code could be simplified to:

for file in source_files:
    df = pd.read_csv(file)
    # Keep only rows whose date does not start with 202110
    df = df[~df['DATE OF OPERATION'].astype(str).str.startswith('202110')]
    # Write to the current working directory under the same file name
    df.to_csv(file.name, index=False)
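
If the KeyError persists with your own file, one possibility is that the header pandas parsed is not exactly 'DATE OF OPERATION', for example because of stray whitespace around the name. This is only a guess, since I can't see the file. Here is a minimal sketch for checking it; `find_column` is a hypothetical helper and the sample data is made up for illustration:

```python
import io
import pandas as pd

def find_column(df, wanted):
    """Return the real column label that matches `wanted` once whitespace
    is trimmed and collapsed and case is ignored, or None if nothing matches."""
    def norm(label):
        return ' '.join(str(label).split()).upper()
    for col in df.columns:
        if norm(col) == norm(wanted):
            return col
    return None

# Made-up sample: note the spaces around the header name
sample = io.StringIO(
    "Column1, DATE OF OPERATION ,NAVp\n"
    "x,20211022,1\n"
    "y,20211201,2\n"
)
df = pd.read_csv(sample)

# Inspect exactly what pandas parsed as column names
print(df.columns.tolist())

col = find_column(df, 'DATE OF OPERATION')
filtered = df[~df[col].astype(str).str.startswith('202110')]
print(filtered)
```

Printing `df.columns.tolist()` for each file is usually enough to spot the mismatch; the helper only saves retyping the exact label.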

On my side, it worked with the following file.csv content:

DATE OF OPERATION,ASK,BID,TRADE PRICE
20210901,10,12,11
20211001,11,13,12
20211101,12,14,13
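
Another possible cause (an assumption, since I can't see your Downloads folder): the glob picks up every *.csv file there, and any file without a DATE OF OPERATION column would raise the KeyError. Below is a minimal sketch that skips such files instead of failing. `drop_rows_by_prefix` is a hypothetical helper, the temporary files exist only for the demonstration, and unlike the script above it overwrites each file in place:

```python
import tempfile
from pathlib import Path
import pandas as pd

def drop_rows_by_prefix(folder, column, prefix):
    """Filter every CSV in `folder` in place, dropping rows where `column`
    starts with `prefix`. Returns the names of files skipped because they
    do not contain `column`."""
    skipped = []
    for file in sorted(Path(folder).glob('*.csv')):
        df = pd.read_csv(file)
        if column not in df.columns:
            skipped.append(file.name)
            continue
        keep = ~df[column].astype(str).str.startswith(prefix)
        df[keep].to_csv(file, index=False)
    return skipped

# Demonstration on throwaway files (assumed content, for illustration only)
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, 'file.csv').write_text(
        'DATE OF OPERATION,ASK,BID,TRADE PRICE\n'
        '20210901,10,12,11\n'
        '20211001,11,13,12\n'
        '20211101,12,14,13\n'
    )
    Path(tmp, 'other.csv').write_text('A,B\n1,2\n')

    skipped = drop_rows_by_prefix(tmp, 'DATE OF OPERATION', '202110')
    result = pd.read_csv(Path(tmp, 'file.csv'))

print(skipped)
print(result)
```

The returned list of skipped files tells you which CSVs lack the column, which may point directly at the file causing the error.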
