Pandas Left Merge Keeping Data In Right Dataframe On Duplicate Columns
Solution 1:
Frankenstein Answer
df[['ser','no']].merge(df2,'left').set_axis(df.index).fillna(df)
   ser  no     c     d
0    0   0   1.0   NaN
1    0   1   1.0   NaN
2    0   2   1.0   NaN
3    1   0   1.0   NaN
4    1   1   1.0   NaN
5    1   2  88.0  90.0
6    2   0   1.0   NaN
7    2   1   1.0   NaN
8    2   2   1.0   NaN
Explanation
I'm going to merge on the columns ['ser', 'no'] and don't want to specify them in the merge call. Also, I don't want goofy duplicate column names like 'c_x' and 'c_y', so I slice only the columns I want in common and then merge:
df[['ser', 'no']].merge(df2, 'left')
When I merge, I want only the rows from the left dataframe. However, merge can in general produce a different number of rows than either input, so it returns a new index. NOTE: assuming the right dataframe (df2) has NO DUPLICATES with respect to ['ser', 'no'], a 'left' merge produces exactly the same number of rows as the left dataframe (df), in the same order. But it won't necessarily have the same index. It turns out that in this example it does, but I don't want to take chances, so I use set_axis:
set_axis(df.index)
Finally, since the resulting dataframe has the same index as df and shares its columns (plus the extra column d), I can fill in the missing bits with:
fillna(df)
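The steps above can be sketched end to end as follows. The input frames are not shown in this post, so df and df2 here are hypothetical, chosen only to be consistent with the output above. The validate argument is an addition that is not part of the original one-liner; it makes merge raise an error if df2 has duplicate keys, which is the assumption the set_axis step relies on.

```python
import pandas as pd

# Hypothetical inputs (assumption): reconstructed to match the displayed output.
df = pd.DataFrame({
    "ser": [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "no":  [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "c":   [1] * 9,
})
df2 = pd.DataFrame({"ser": [1], "no": [2], "c": [88], "d": [90]})

# Step 1: keep only the key columns on the left so 'c' is not duplicated
# into 'c_x' / 'c_y'. validate="many_to_one" raises MergeError if df2
# contains duplicate (ser, no) keys.
merged = df[["ser", "no"]].merge(df2, how="left", validate="many_to_one")

# Step 2: put the left dataframe's index back on the result.
merged = merged.set_axis(df.index, axis=0)

# Step 3: fill the gaps from df. Alignment is by index and column label,
# so only 'c' is filled; 'd' stays NaN where df2 had no matching row.
result = merged.fillna(df)
print(result)
```

Splitting the chain into named steps makes it easier to inspect the intermediate frame if the row count ever differs from what you expect.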
Solution 2:
Update: What you are looking for is combine_first:
(df2.set_index(['ser','no'])
.combine_first(df.set_index(['ser','no']))
.reset_index()
)
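One thing to keep in mind is that combine_first keeps the union of the keys from both frames, so a key that exists only in df2 adds a row. A minimal sketch of that behaviour, using hypothetical data (df2_extra and its (9, 9) key are made up for illustration), with an optional reindex to get back to left-only rows:

```python
import pandas as pd

# Hypothetical inputs (assumption): df2_extra has one key, (9, 9), not in df.
df = pd.DataFrame({
    "ser": [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "no":  [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "c":   [1] * 9,
})
df2_extra = pd.DataFrame({"ser": [1, 9], "no": [2, 9], "c": [88, 77], "d": [90, 70]})

left = df.set_index(["ser", "no"])
right = df2_extra.set_index(["ser", "no"])

# Values from `right` take priority; holes are filled from `left`.
# The result contains every key found in either frame.
combined = right.combine_first(left).reset_index()

# To mimic a left merge, restrict the result to the keys of the left frame.
left_only = right.combine_first(left).reindex(left.index).reset_index()

print(len(combined), len(left_only))
```

If df2 never contains keys missing from df, the reindex step is unnecessary and both results are the same.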
You can also try concat, which behaves more like an 'outer' merge when each (ser, no) pair is unique within each dataframe:
pd.concat([df2,df]).groupby(['ser','no'], as_index=False).first()
Output:
   ser  no   c     d
0    0   0   1   NaN
1    0   1   1   NaN
2    0   2   1   NaN
3    1   0   1   NaN
4    1   1   1   NaN
5    1   2  88  90.0
6    2   0   1   NaN
7    2   1   1   NaN
8    2   2   1   NaN
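The order of the frames passed to concat matters here, because first() picks the first non-null value per column within each group. A small sketch with hypothetical inputs (assumed, chosen to match the output above) that shows both orders:

```python
import pandas as pd

# Hypothetical inputs (assumption): reconstructed to match the displayed output.
df = pd.DataFrame({
    "ser": [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "no":  [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "c":   [1] * 9,
})
df2 = pd.DataFrame({"ser": [1], "no": [2], "c": [88], "d": [90]})

keys = ["ser", "no"]

# df2 first: its values win wherever a key appears in both frames.
right_wins = pd.concat([df2, df]).groupby(keys, as_index=False).first()

# df first: df's 'c' wins, but 'd' still comes from df2, because
# first() skips nulls and df has no 'd' column.
left_wins = pd.concat([df, df2]).groupby(keys, as_index=False).first()

print(right_wins)
print(left_wins)
```

Note that groupby collapses every group to a single row, so if df itself contained repeated (ser, no) pairs, those rows would be merged together. That is why this approach only matches a merge when the key pairs are unique.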