
Dealing With Non Float Numbers And Nan Values From Tables When Parsing Dynamically From A Website

How do I deal with NaN values in a table from a website that updates periodically, where some table values are sometimes NaN? I am talking about the coronavirus website's table of cases. I a

Solution 1:

Something to get you started. Do this after you read the dataframe:

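cols = ['NewCases', 'NewDeaths']
for col in cols:
    # Work on string values so '+' and ',' can be stripped
    df[col] = df[col].astype(str).str.replace('+', '', regex=False)
    df[col] = df[col].str.replace(',', '', regex=False)
    # Missing values may now be the text 'nan'; blank them out
    df[col] = df[col].replace('nan', '')
    # Empty strings become NaN, everything else becomes a number
    df[col] = pd.to_numeric(df[col])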

'NewCases' and 'NewDeaths' are the columns you need to preprocess. For each one, I replace '+', ',' and 'nan' with '' and then convert the column to a numeric type.
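If it helps to try this outside your scraper, here is a minimal self-contained sketch of the same cleanup. The sample rows and the clean_counts helper are made up for illustration (they are not from your table), and I pass errors="coerce" so that any cell that still cannot be parsed becomes NaN instead of raising.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows standing in for the scraped table
df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "NewCases": ["+1,234", np.nan, "+56"],
    "NewDeaths": ["+12", "+3", np.nan],
})


def clean_counts(frame, cols):
    """Return a copy with '+' and ',' stripped and the columns made numeric."""
    out = frame.copy()
    for col in cols:
        # Strip both characters in one pass with a character class
        text = out[col].astype(str).str.replace(r"[+,]", "", regex=True)
        # errors="coerce" turns anything unparseable into NaN instead of raising
        out[col] = pd.to_numeric(text, errors="coerce")
    return out


cleaned = clean_counts(df, ["NewCases", "NewDeaths"])
print(cleaned)
```

Working on a copy keeps the raw scraped strings around, which can be handy if you need to check what the site actually sent.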

If you can explain what you need to do further, I might be able to help.

EDIT: In your case, fillna fills any NaNs with values from a different dataframe (df1), provided df and df1 have the same structure/size. Here df1 would be a previously fetched set of data.

df=df.fillna(df1)
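A small sketch of that fillna call, with two invented snapshots (the names previous and current and the numbers are assumptions for the example). When you pass a dataframe, pandas matches values by index and column labels, so both fetches should be indexed the same way, for example by country.

```python
import numpy as np
import pandas as pd

# Hypothetical snapshots: `previous` was fetched earlier, `current` just now
previous = pd.DataFrame({"NewCases": [100.0, 200.0, 300.0]}, index=["A", "B", "C"])
current = pd.DataFrame({"NewCases": [110.0, np.nan, 330.0]}, index=["A", "B", "C"])

# Only the NaN cell for "B" is taken from the earlier fetch
filled = current.fillna(previous)
print(filled)
```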

Solution 2:

The fastest way to fix this problem is to change:

 if '+' in table.loc[k,'NewCases'] and table.loc[k,'NewCases'] is not 'nan' :

to

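 if '+' in str(table.loc[k,'NewCases']) and not pd.isna(table.loc[k,'NewCases']):

table.loc[k,'NewCases'] returns a single value, which has no fillna method, so wrap it in str() instead. That way a NaN value is cast to the string 'nan', and the '+' check no longer raises a TypeError on a float.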

It's not the prettiest solution because it may ignore other edge cases (like an actual float being passed), but it should solve your immediate problem.
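If you want to keep the row-by-row loop but cover those edge cases, one option is to push the parsing into a small helper. This is only a sketch: parse_count is a name I made up, and returning None for missing cells is an assumption about what you want to do with them.

```python
import math


def parse_count(value):
    """Convert a cell such as '+1,234', 57.0 or NaN to an int, or None if missing."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, (int, float)):
        # An actual number was passed: nothing to strip
        return int(value)
    text = str(value).replace("+", "").replace(",", "").strip()
    if text == "" or text.lower() == "nan":
        return None
    return int(float(text))


for cell in ["+1,234", float("nan"), 57.0, ""]:
    print(repr(cell), "->", parse_count(cell))
```

Inside the loop you would then call parse_count(table.loc[k,'NewCases']) and skip the row when it returns None.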

