How To Add/append A Row To A Particular Partition In The Dask Dataframe?
I want to append a row to a particular partition of a Dask dataframe. I have tried many methods, but none of them worked. Can anyone help me with this? Thanks in advance.
Solution 1:
Using map_partitions, you can modify that particular partition.
Then build a new dataframe in which the modified partition replaces the original one: convert the dataframe to a list of delayed objects, swap the modified partition's delayed object into that list, and convert the list back to a Dask dataframe.
import numpy as np
import pandas as pd
import dask.dataframe as dd

def append_row_dict(df, row_dict):
    # Build a small frame from the dict and append it to this partition.
    # pd.concat is used because DataFrame.append was removed in pandas 2.0.
    small_df = pd.DataFrame(row_dict)
    return pd.concat([df, small_df])

p_df = pd.DataFrame({'a': np.arange(0, 10)})
dask_df = dd.from_pandas(p_df, npartitions=4)
part_to_change = 1
new_partition = dask_df.get_partition(part_to_change).map_partitions(append_row_dict, {'a': [-1]})
list_of_delayed = dask_df.to_delayed()
# get_partition returns a single partition, so there is exactly one delayed object
assert new_partition.npartitions == 1
list_of_delayed[part_to_change] = new_partition.to_delayed()[0]
new_dask_df = dd.from_delayed(list_of_delayed, meta=dask_df._meta)
new_dask_df.get_partition(part_to_change).compute()
   a
3  3
4  4
5  5
0 -1