
How To Add/append A Row To A Particular Partition In The Dask Dataframe?

I want to append a row to a particular partition of a Dask dataframe. I have tried many methods, but none of them worked. Can anyone help me with this? Thanks in advance. I tr

Solution 1:

Using map_partitions, you can modify that particular partition.

Then build a new dataframe with the modified partition swapped in: convert the dataframe to a list of delayed objects (one per partition), replace the entry for that partition with the modified delayed object, and convert the list back to a Dask dataframe.

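import numpy as np
import pandas as pd
import dask.dataframe as dd

def append_row_dict(df, row_dict):
    # Build a small frame from the dict and append it to this partition.
    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
    small_df = pd.DataFrame(row_dict)
    return pd.concat([df, small_df])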
    
p_df = pd.DataFrame({'a':np.arange(0,10)})

dask_df = dd.from_pandas(p_df,npartitions=4)
part_to_change = 1

new_partion = dask_df.get_partition(part_to_change).map_partitions(append_row_dict,{'a':[-1]})
list_of_delayed = dask_df.to_delayed()

# we only have 1 delayed object for 1 partition
assert new_partion.npartitions==1
list_of_delayed[part_to_change]=new_partion.to_delayed()[0]

new_dask_df = dd.from_delayed(list_of_delayed, meta=dask_df._meta)
new_dask_df.get_partition(part_to_change).compute()
   a
3  3
4  4
5  5
0 -1
