Pandas Left Merge Keeping Data In Right Dataframe On Duplicate Columns
Solution 1:
Frankenstein Answer
df[['ser','no']].merge(df2,'left').set_axis(df.index).fillna(df)
   ser  no     c     d
0    0   0   1.0   NaN
1    0   1   1.0   NaN
2    0   2   1.0   NaN
3    1   0   1.0   NaN
4    1   1   1.0   NaN
5    1   2  88.0  90.0
6    2   0   1.0   NaN
7    2   1   1.0   NaN
8    2   2   1.0   NaN
Explanation
I'm going to merge on the columns ['ser', 'no'] and don't want to specify them in the merge call; with no on argument, merge joins on the columns the two dataframes have in common. Also, I don't want goofy duplicate column names like 'c_x' and 'c_y', so I slice only the columns that I want in common and then merge:
df[['ser', 'no']].merge(df2, 'left')
When I merge, I want only rows from the left dataframe. However, merge can produce a different number of rows than either original dataframe, and therefore it produces a new index. NOTE: assuming the right dataframe (df2) has NO DUPLICATES with respect to ['ser', 'no'], a 'left' merge should produce exactly the same number of rows as the left dataframe (df). But it won't necessarily have the same index. It turns out that in this example it does, but I don't want to take chances, so I use set_axis:
set_axis(df.index)
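To make the index point concrete, here is a minimal sketch. The frames left and right are hypothetical (they are not the question's data), and the left one deliberately has a non-default index. The validate argument is an optional addition that the original answer does not use; it asks merge to raise an error if the key pair is duplicated in the right dataframe.

```python
import pandas as pd

# Hypothetical frames for illustration; the left one has a non-default index.
left = pd.DataFrame({"ser": [0, 1], "no": [0, 2], "c": [1, 1]}, index=[10, 20])
right = pd.DataFrame({"ser": [1], "no": [2], "c": [88], "d": [90]})

# validate="many_to_one" raises pandas.errors.MergeError if the (ser, no)
# pair appears more than once in the right frame.
merged = left[["ser", "no"]].merge(right, how="left", validate="many_to_one")
print(list(merged.index))    # merge builds a fresh default index

# Put the left frame's index back so later alignment works row by row.
restored = merged.set_axis(left.index, axis=0)
print(list(restored.index))

# With a duplicated key on the right, the validation fails loudly.
dup_right = pd.concat([right, right], ignore_index=True)
try:
    left[["ser", "no"]].merge(dup_right, how="left", validate="many_to_one")
    duplicates_detected = False
except pd.errors.MergeError:
    duplicates_detected = True
print(duplicates_detected)
```

Without the validation, duplicated keys on the right would silently add rows, and set_axis(df.index) would then fail with a length mismatch.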
Finally, since the resulting dataframe has the same index as df and shares its columns, I can fill in the missing bits with:
fillna(df)
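The three steps can be run end to end as below. The input frames are an assumption, reconstructed from the output shown in this answer (the question's exact data is not reproduced here), so treat this as a sketch and not the original setup.

```python
import pandas as pd

# Assumed inputs, reconstructed from the output shown in the answer.
df = pd.DataFrame({
    "ser": [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "no":  [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "c":   [1, 1, 1, 1, 1, 1, 1, 1, 1],
})
df2 = pd.DataFrame({"ser": [1], "no": [2], "c": [88], "d": [90]})

# Step 1: keep only the key columns on the left, so 'c' comes from df2 alone
# and no 'c_x' / 'c_y' suffixes appear.
merged = df[["ser", "no"]].merge(df2, how="left")

# Step 2: restore the left frame's index on the merged result.
merged = merged.set_axis(df.index, axis=0)

# Step 3: fill the gaps left by the merge with values from df, aligned on
# index and column labels. 'd' has no counterpart in df, so it stays NaN.
result = merged.fillna(df)
print(result)
```

Because fillna aligns on both index and columns, only the c column is filled here; values that came from df2 are never overwritten.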
Solution 2:
Update: What you are looking for is combine_first:
(df2.set_index(['ser','no'])
.combine_first(df.set_index(['ser','no']))
.reset_index()
)
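A runnable version of the combine_first approach might look like the following. As before, df and df2 are assumed inputs reconstructed from the output in this post, and the keys variable is introduced only for readability.

```python
import pandas as pd

# Assumed inputs, reconstructed from the output shown in this post.
df = pd.DataFrame({
    "ser": [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "no":  [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "c":   [1, 1, 1, 1, 1, 1, 1, 1, 1],
})
df2 = pd.DataFrame({"ser": [1], "no": [2], "c": [88], "d": [90]})

keys = ["ser", "no"]  # hypothetical helper name for the key columns

# Values from df2 take priority; df only fills positions df2 leaves empty.
combined = (
    df2.set_index(keys)
       .combine_first(df.set_index(keys))
       .reset_index()
)
print(combined)
```

Note that combine_first works on the union of both indexes, so a key pair that exists only in df2 would also appear in the result. That differs from a strict left merge, which keeps only the rows of df.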
You can also try concat, which is more similar to an 'outer' merge when the (ser, no) pairs are unique:
pd.concat([df2,df]).groupby(['ser','no'], as_index=False).first()
Output:
   ser  no   c     d
0    0   0   1   NaN
1    0   1   1   NaN
2    0   2   1   NaN
3    1   0   1   NaN
4    1   1   1   NaN
5    1   2  88  90.0
6    2   0   1   NaN
7    2   1   1   NaN
8    2   2   1   NaN
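For completeness, the concat variant can be sketched the same way. The inputs are again assumed, reconstructed from the output above. The order of the frames matters: df2 comes first so that its values win within each (ser, no) group.

```python
import pandas as pd

# Assumed inputs, reconstructed from the output shown above.
df = pd.DataFrame({
    "ser": [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "no":  [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "c":   [1, 1, 1, 1, 1, 1, 1, 1, 1],
})
df2 = pd.DataFrame({"ser": [1], "no": [2], "c": [88], "d": [90]})

# Stack df2 on top of df, then keep the first non-null value of each column
# within every (ser, no) group.
stacked = pd.concat([df2, df])
result = stacked.groupby(["ser", "no"], as_index=False).first()
print(result)
```

Since first() skips missing values column by column, a group can end up mixing values from both frames, for example c from df2 and another column from df when df2 has a gap there.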