
Strange Behavior With Dataframe Copy

Consider this code:

In [16]: data = [['Alex',10],['Bob',12],['Clarke',13]]
In [17]: df = pd.DataFrame(data,columns=['Name','Age'])
Out[18]:
   Name  Age
0  Alex   10
1  Bob

Solution 1:

Use either of the following; they are equivalent, because deep=True is the default:

df_new = df.copy()

OR

df_new = df.copy(deep=True)

This is the standard way of making a copy of a pandas object’s indices and data.

From the pandas documentation:

When deep=True (default), a new object will be created with a copy of the calling object’s data and indices. Modifications to the data or indices of the copy will not be reflected in the original object.
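A minimal sketch of what that documentation passage means in practice, reusing the sample data from the question. The value 99 and the choice of row 0 are arbitrary and only for illustration:

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# deep=True is the default, so this creates an independent copy
df_copy = df.copy()

# Modify only the copy (99 is an arbitrary illustrative value)
df_copy.loc[0, 'Age'] = 99

print(df.loc[0, 'Age'])       # the original still holds Alex's age
print(df_copy.loc[0, 'Age'])  # the copy holds the new value
```

The original DataFrame keeps its value because the copy owns its own data.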

Explanation

If you print the object IDs (using id()) of the DataFrames you create, you can see what is happening.

When you write df_new = df, you are not creating a new DataFrame. You are creating a second name, df_new, and binding it to the same object that df refers to, so both names have the same id.

Example

import pandas as pd

data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])

df_new = df
df_copy = df.copy()
print("ID of old df: {}".format(id(df)))
print("ID of new df: {}".format(id(df_new)))
print("ID of copy df: {}".format(id(df_copy)))

Output (the exact ID values will differ from run to run)

ID of old df: 113414664
ID of new df: 113414664
ID of copy df: 113414832
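The same point can be checked without comparing raw ID numbers. The sketch below uses Python's `is` operator and then changes a value through the df_new name; the value 50 and the choice of row 1 are arbitrary and only for illustration:

```python
import pandas as pd

data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

df_new = df          # second name for the same object
df_copy = df.copy()  # separate object with its own data

print(df_new is df)   # True: both names refer to one DataFrame
print(df_copy is df)  # False: the copy is a different object

# Changing a value through df_new changes df as well,
# because there is only one underlying DataFrame.
df_new.loc[1, 'Age'] = 50  # 50 is an arbitrary illustrative value

print(df.loc[1, 'Age'])       # 50, the change is visible through df
print(df_copy.loc[1, 'Age'])  # 12, the copy is unaffected
```

This is why an edit made through df_new appears to "leak" into df, while df_copy stays as it was.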
