How To Aggregate Only The Numerical Columns In A Mixed Dtypes Dataframe
I have a mixed pd.DataFrame: import pandas as pd import numpy as np df = pd.DataFrame({ 'A' : 1., 'B' : pd.Timestamp('20130102'), 'C' : pd
Solution 1:
By using select_dtypes:
df.groupby(list(df.select_dtypes(exclude=[np.number]))).agg(np.median).reset_index()
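To see what this one-liner does, here is a minimal sketch on a small hypothetical frame (the question's original data is truncated, so the values below are assumptions for illustration only). Every non-numeric column becomes a grouping key and the median is taken over whatever is left. It uses .median() directly instead of .agg(np.median), which should give the same result here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame (assumed for illustration, not the question's data).
df = pd.DataFrame({
    "A": [1.0, 1.0, 1.0, 1.0],
    "B": pd.to_datetime(["2013-01-02", "2013-01-02", "2013-01-03", "2013-01-03"]),
    "C": pd.Timestamp("2018-01-01"),   # scalar, broadcast to every row
    "D": [0.1, 0.3, 0.5, 0.9],
    "F": "foo",                        # scalar, broadcast to every row
})

# All non-numeric columns (datetime and string here) become grouping keys.
key_cols = list(df.select_dtypes(exclude=[np.number]).columns)

# Only numeric columns remain to be aggregated, so median() is safe.
result = df.groupby(key_cols, as_index=False).median()
print(result)
```

Note that this only gives one row per 'B' if the other non-numeric columns are constant within each 'B' group; otherwise every distinct key combination gets its own row.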
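Or take the per-group medians of the numeric columns, then attach the remaining columns from the first row of each group (on pandas 2.x, numeric_only=True is needed to skip the non-numeric columns, and drop no longer accepts the axis positionally):
df1 = df.groupby('B', as_index=False).median(numeric_only=True)
pd.concat([df1, df.drop_duplicates(['B']).drop(columns=list(df1)).reset_index(drop=True)], axis=1)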
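One caveat with the concat approach is that it lines rows up by position: groupby sorts the groups by 'B', while drop_duplicates keeps the original row order. A possible variation, sketched below on a hypothetical frame (values assumed for illustration), joins on the key with merge instead, so the result does not depend on the frame being sorted.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose rows are deliberately NOT sorted by 'B'.
df = pd.DataFrame({
    "A": [1.0, 1.0, 1.0, 1.0],
    "B": pd.to_datetime(["2013-01-03", "2013-01-02", "2013-01-03", "2013-01-02"]),
    "C": pd.Timestamp("2018-01-01"),
    "D": [0.5, 0.1, 0.9, 0.3],
    "F": ["y", "x", "y", "x"],
})

# Medians of the numeric columns only, one row per 'B'.
medians = df.groupby("B", as_index=False).median(numeric_only=True)

# First row of each group, minus the columns that were aggregated above.
value_cols = [c for c in medians.columns if c != "B"]
others = df.drop_duplicates("B").drop(columns=value_cols)

# Join on the key rather than relying on row position.
result = medians.merge(others, on="B")
print(result)
```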
Solution 2:
If 'C' and 'F' are the same for each value of 'B', then you can include them in the groupby columns, like this:
df.groupby(['B','C','F']).agg(np.median).reset_index()
Or as @BradSolomn suggests:
df.groupby(['B','C','F'], as_index=False).agg(np.median)
Output:
           B          C    F    A         D
0 2013-01-02 2018-01-01  foo  1.0  0.392723
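Whether 'C' and 'F' really are constant within each 'B' group can be checked before adding them to the keys. The helper below is a hypothetical sketch (constant_within is not a pandas function, and the sample data is assumed); it counts distinct values per group and requires at most one.

```python
import pandas as pd

# Hypothetical sample frame (assumed for illustration).
df = pd.DataFrame({
    "A": [1.0, 1.0, 1.0],
    "B": pd.to_datetime(["2013-01-02", "2013-01-02", "2013-01-03"]),
    "C": pd.Timestamp("2018-01-01"),
    "D": [0.2, 0.4, 0.6],
    "F": ["foo", "foo", "foo"],
})

def constant_within(frame, key, cols):
    """Return True if every column in `cols` has a single value per `key` group."""
    counts = frame.groupby(key)[cols].nunique(dropna=False)
    return bool((counts <= 1).all().all())

ok = constant_within(df, "B", ["C", "F"])

# Change one 'F' value so the first 'B' group holds two different strings.
changed = df.assign(F=["foo", "bar", "foo"])
not_ok = constant_within(changed, "B", ["C", "F"])
print(ok, not_ok)
```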
If 'C' and 'F' vary within a 'B' group, then you'll need to aggregate them somehow, for example by taking the first value of 'C' and the last value of 'F':
df.groupby('B').agg({'D':np.median,'A':np.median,'C':'first','F':'last'}).reset_index()
           B          C    F    A         D
0 2013-01-02 2018-01-01  foo  1.0  0.392723
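With many columns, writing the aggregation dict by hand gets tedious. One way to build it from the dtypes is sketched below, assuming a hypothetical frame and the arbitrary choice of 'median' for numeric columns and 'first' for everything else.

```python
import numpy as np
import pandas as pd

# Hypothetical frame where 'C' and 'F' vary within each 'B' group.
df = pd.DataFrame({
    "A": [1.0, 1.0, 1.0, 1.0],
    "B": pd.to_datetime(["2013-01-02", "2013-01-02", "2013-01-03", "2013-01-03"]),
    "C": pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04"]),
    "D": [0.1, 0.3, 0.5, 0.9],
    "F": ["foo", "bar", "baz", "qux"],
})

key = "B"
num_cols = df.select_dtypes(include=[np.number]).columns

# Numeric columns get the median; all other columns keep their first value.
spec = {
    col: ("median" if col in num_cols else "first")
    for col in df.columns
    if col != key
}

result = df.groupby(key).agg(spec).reset_index()
print(result)
```

Swapping 'first' for 'last', 'min', or another reducer is a matter of what makes sense for each non-numeric column.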