Strange Behavior With Dataframe Copy
Consider this code:
In [16]: data = [['Alex',10],['Bob',12],['Clarke',13]]
In [17]: df = pd.DataFrame(data,columns=['Name','Age'])
Out[18]:
     Name  Age
0    Alex   10
1     Bob ...
Solution 1:
Use either of the following. They are equivalent, because deep=True is the default:
df_new = df.copy()
or
df_new = df.copy(deep=True)
This is the standard way of making an independent copy of a pandas object's indices and data.
From the pandas documentation:
When deep=True (default), a new object will be created with a copy of the calling object’s data and indices. Modifications to the data or indices of the copy will not be reflected in the original object.
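A minimal sketch of what that guarantee means in practice, assuming the same Name/Age frame as in the question. The value 99 is made up for illustration:

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

df_new = df.copy()           # deep=True is the default

# Change a value in the copy only
df_new.loc[0, 'Age'] = 99

print(df.loc[0, 'Age'])      # 10: the original is unchanged
print(df_new.loc[0, 'Age'])  # 99: only the copy was modified
```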
Explanation
If you print the object IDs of the DataFrames you create, you can see what is happening.
When you write df_new = df, you are not creating a new DataFrame. You are creating a variable named df_new and binding it to the same object that df refers to, so both names have the same id.
Example
import pandas as pd

data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
df_new = df          # plain assignment: a second name for the same object
df_copy = df.copy()  # a new, independent DataFrame
print("ID of old df: {}".format(id(df)))
print("ID of new df: {}".format(id(df_new)))
print("ID of copy df: {}".format(id(df_copy)))
Output (the actual ID values will differ from run to run)
ID of old df: 113414664
ID of new df: 113414664
ID of copy df: 113414832
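The shared ID has a practical consequence: a change made through one name shows up under the other, while the copy stays as it was. A small sketch under the same setup. The City column and its values are hypothetical and used only to make the effect visible:

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

df_new = df          # second name for the same object
df_copy = df.copy()  # independent object

# Add a column through the alias (made-up values for illustration)
df_new['City'] = ['Paris', 'Oslo', 'Lima']

print(df_new is df)               # True
print(df_copy is df)              # False
print('City' in df.columns)       # True: the change is visible through df
print('City' in df_copy.columns)  # False: the copy is unaffected
```

Checking with `is` gives the same answer as comparing id() values, without having to read the numbers.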