
Pandas Apply Function That Returns Two New Columns

I have a pandas dataframe that I would like to use an apply function on to generate two new columns based on the existing data. I am getting this error: ValueError: Wrong number o

Solution 1:

Based on your latest error, you can avoid it by returning the new values as a Series, which apply expands into one column per value:

def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return pd.Series([C, D])

df[['C', 'D']] = df.apply(myfunc1, axis=1)
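A possible variation on this, sketched with a small hand-made frame (the values in `df` are illustrative, since the question's data is not shown): giving the returned Series an explicit index means the result of apply already carries the column names C and D.

```python
import pandas as pd

# Illustrative data; the question's actual frame is not shown.
df = pd.DataFrame({"A": [4, 5], "B": [6, 1]})

def myfunc1(row):
    # Label the values so apply() builds columns named C and D.
    return pd.Series({"C": row["A"] + 10, "D": row["A"] + 50})

new_cols = df.apply(myfunc1, axis=1)  # DataFrame with columns C and D
df[["C", "D"]] = new_cols
print(df)
```

Because the intermediate frame is labelled, it could also be attached with `df.join(new_cols)` on a frame that does not yet have those columns.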

Solution 2:

Please be aware that the accepted answer uses a lot of memory and is slow, because it builds a new Series for every row: https://ys-l.github.io/posts/2015/08/28/how-not-to-use-pandas-apply/

Following the suggestion presented there, a faster version would look like this:

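def myfunc1(a):
    c = a + 10
    d = a + 50
    return c, d

def run_loopy(df):
    # Collect plain Python values instead of building a Series per row
    Cs, Ds = [], []
    for _, row in df.iterrows():
        c, d = myfunc1(row['A'])
        Cs.append(c)
        Ds.append(d)
    # Reuse df's index so the new columns align row by row on assignment
    return pd.DataFrame({'C': Cs, 'D': Ds}, index=df.index)

df[['C', 'D']] = run_loopy(df)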
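For arithmetic as simple as in myfunc1, a row loop may not be needed at all. The following is a sketch of a vectorized alternative, which is not something the answer above or the linked post is quoted as proposing; the sample values in `df` are made up for illustration.

```python
import pandas as pd

# Illustrative data; the question's actual frame is not shown.
df = pd.DataFrame({"A": [4, 5], "B": [6, 1]})

# Column-wise arithmetic: each expression runs once over the whole column
# instead of once per row.
df["C"] = df["A"] + 10
df["D"] = df["A"] + 50
print(df)
```

This only applies when the per-row logic can be written as whole-column operations; otherwise the loop or apply versions remain the fallback.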

Solution 3:

df['C','D'] is treated as a single column (its key is the tuple ('C', 'D')) rather than two. To assign two columns, pass a list of column names instead, i.e. use df[['C','D']]:

df[['C', 'D']] = df.apply(myfunc1, axis=1)

   A  B   C   D
0  4  6  14  54
1  5  1  15  55
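The difference between the two kinds of key can be seen in a small sketch. The frame and the names `single` and `double` are made up for illustration, and a scalar is assigned only to keep the example short.

```python
import pandas as pd

# Illustrative data; the question's actual frame is not shown.
df = pd.DataFrame({"A": [4, 5], "B": [6, 1]})

single = df.copy()
single["C", "D"] = 0      # key is the tuple ("C", "D"): one new column
print(single.columns.tolist())

double = df.copy()
double[["C", "D"]] = 0    # key is a list: two new columns, C and D
print(double.columns.tolist())
```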

Or you can use tuple unpacking, provided myfunc1 returns a plain tuple (C, D) so that apply yields one tuple per row, i.e.

df['C'], df['D'] = zip(*df.apply(myfunc1, axis=1))

Solution 4:

Add extra brackets when querying for multiple columns.

import pandas as pd
import numpy as np

def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return [C, D]

df = pd.DataFrame(np.random.randint(0,10,size=(2, 2)), columns=list('AB'))

# result_type='expand' splits each returned list into separate columns
df[['C', 'D']] = df.apply(myfunc1, axis=1, result_type='expand')

Solution 5:

It works for me:

import pandas as pd
import numpy as np

def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return C, D

df = pd.DataFrame(np.random.randint(0,10,size=(2, 2)), columns=list('AB'))

df[['C', 'D']] = df.apply(myfunc1, axis=1, result_type='expand')
df

The key addition is result_type='expand', which splits the returned tuple into separate columns.
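To see what the option changes, here is a minimal sketch comparing the two return shapes. The data and the names `as_tuples` and `expanded` are illustrative assumptions, not part of the answer.

```python
import pandas as pd

# Illustrative data; the answer itself uses random integers.
df = pd.DataFrame({"A": [4, 5], "B": [6, 1]})

def myfunc1(row):
    return row["A"] + 10, row["A"] + 50

# Default: one tuple per row, collected in a single Series.
as_tuples = df.apply(myfunc1, axis=1)

# With result_type='expand': a DataFrame with integer column labels 0 and 1.
expanded = df.apply(myfunc1, axis=1, result_type="expand")

print(type(as_tuples).__name__, type(expanded).__name__)
```

The expanded frame has one column per tuple element, which is why it can be assigned to df[['C', 'D']] directly.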

regards!
