Pandas Apply Function That Returns Two New Columns
I have a pandas dataframe that I would like to use an apply function on to generate two new columns based on the existing data. I am getting this error: ValueError: Wrong number o
Solution 1:
Based on your latest error, you can avoid it by returning the new columns as a Series:
def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return pd.Series([C, D])

df[['C', 'D']] = df.apply(myfunc1, axis=1)
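As a self-contained sketch of the approach above: the sample values and the labelled Series index below are illustrative assumptions, since the question's actual frame is not shown. Returning a Series from each row lets apply build a two-column DataFrame that can be assigned in one step.

```python
import pandas as pd

# Hypothetical sample data; the original question's frame is not shown.
df = pd.DataFrame({'A': [4, 5], 'B': [6, 1]})

def myfunc1(row):
    # Returning a Series makes apply() expand the result into columns.
    return pd.Series({'C': row['A'] + 10, 'D': row['A'] + 50})

new_cols = df.apply(myfunc1, axis=1)  # a DataFrame with columns C and D
df[['C', 'D']] = new_cols
print(df)
```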
Solution 2:
Please be aware of the high memory consumption and low speed of the accepted answer: https://ys-l.github.io/posts/2015/08/28/how-not-to-use-pandas-apply/
Using the suggestion presented there, a faster alternative looks like this:
def run_loopy(df):
    Cs, Ds = [], []
    for _, row in df.iterrows():
        c, d = myfunc1(row['A'])
        Cs.append(c)
        Ds.append(d)
    # Return a DataFrame on df's index so each list becomes one column
    return pd.DataFrame({'C': Cs, 'D': Ds}, index=df.index)

def myfunc1(a):
    c = a + 10
    d = a + 50
    return c, d

df[['C', 'D']] = run_loopy(df)
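For arithmetic as simple as this example, the row-wise call may not be needed at all. The sketch below swaps in a different technique, plain vectorized column arithmetic via DataFrame.assign, which is not the method from the linked post; the sample data is assumed.

```python
import pandas as pd

df = pd.DataFrame({'A': [4, 5], 'B': [6, 1]})  # assumed sample data

# Operate on the whole column at once instead of row by row.
df = df.assign(C=df['A'] + 10, D=df['A'] + 50)
print(df)
```

This only helps when the per-row logic can be written as column operations; otherwise the loop above remains an option.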
Solution 3:
df['C','D'] is treated as one column rather than two. To assign two columns at once, pass a list of column names, i.e. df[['C','D']]:
df[['C', 'D']] = df.apply(myfunc1, axis=1)
   A  B   C   D
0  4  6  14  54
1  5  1  15  55
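Or you can unpack the result into two separate assignments. Iterating over a DataFrame yields its column labels, so transpose the values first (this assumes myfunc1 returns a Series, as in Solution 1):
df['C'], df['D'] = df.apply(myfunc1, axis=1).T.values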
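To see the difference between the two bracket forms, here is a minimal sketch with assumed sample data and hypothetical frame names (one, two). A comma inside single brackets builds a tuple key, while a list key names each column separately.

```python
import pandas as pd

# Single brackets with a comma: the key is the tuple ('C', 'D'),
# so ONE new column is created whose label is that tuple.
one = pd.DataFrame({'A': [4, 5]})  # assumed sample data
one['C', 'D'] = one['A'] + 10
print(list(one.columns))

# Double brackets: the key is a list, so TWO columns are assigned.
two = pd.DataFrame({'A': [4, 5]})
two[['C', 'D']] = pd.DataFrame({'C': two['A'] + 10, 'D': two['A'] + 50})
print(list(two.columns))
```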
Solution 4:
Add extra brackets when querying for multiple columns.
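import pandas as pd
import numpy as np

def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return [C, D]

df = pd.DataFrame(np.random.randint(0, 10, size=(2, 2)), columns=list('AB'))
# On pandas 0.23+, a returned list is only split into columns with result_type='expand'
df[['C', 'D']] = df.apply(myfunc1, axis=1, result_type='expand')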
Solution 5:
It works for me:
import numpy as np
import pandas as pd

def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return C, D

df = pd.DataFrame(np.random.randint(0, 10, size=(2, 2)), columns=list('AB'))
df[['C', 'D']] = df.apply(myfunc1, axis=1, result_type='expand')
df
The key addition is result_type='expand', which splits the returned tuple into separate columns.
regards!
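To illustrate what result_type='expand' changes, the sketch below compares the default result with the expanded one. The fixed sample values and the variable names default and expanded are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [4, 5], 'B': [6, 1]})  # assumed sample data

def myfunc1(row):
    return row['A'] + 10, row['A'] + 50  # plain tuple

# Default: each returned tuple is kept as one element of a Series.
default = df.apply(myfunc1, axis=1)

# expand: each tuple element becomes its own column (labelled 0 and 1).
expanded = df.apply(myfunc1, axis=1, result_type='expand')

df[['C', 'D']] = expanded
print(df)
```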