How To Count Overlapping Datetime Intervals In Pandas?
I have the following DataFrame with two datetime columns:
              start               end
0  01.01.2018 00:47  01.01.2018 00:54
1  01.01.2018 00:52  01.01.2018 01:03
2  01.01.2018
Solution 1:
Use Series.cumsum with Series.map (or Series.replace):
new_df = df.melt(var_name='status', value_name='time').sort_values('time')
new_df['counter'] = new_df['status'].map({'start': 1, 'end': -1}).cumsum()
print(new_df)

   status                time  counter
0   start 2018-01-01 00:47:00        1
1   start 2018-01-01 00:52:00        2
11    end 2018-01-01 00:54:00        1
2   start 2018-01-01 00:55:00        2
3   start 2018-01-01 00:57:00        3
13    end 2018-01-01 00:59:00        2
4   start 2018-01-01 01:00:00        3
12    end 2018-01-01 01:03:00        2
5   start 2018-01-01 01:07:00        3
15    end 2018-01-01 01:12:00        2
14    end 2018-01-01 01:16:00        1
16    end 2018-01-01 01:24:00        0
6   start 2018-01-01 01:33:00        1
7   start 2018-01-01 01:34:00        2
8   start 2018-01-01 01:37:00        3
9   start 2018-01-01 01:38:00        4
17    end 2018-01-01 01:38:00        3
10  start 2018-01-01 01:39:00        4
19    end 2018-01-01 01:41:00        3
20    end 2018-01-01 01:41:00        2
18    end 2018-01-01 01:47:00        1
21    end 2018-01-01 01:55:00        0
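For anyone who wants to try the approach without the original data, here is a minimal self-contained sketch. The three intervals are made-up sample values (not the asker's full data), and the name `events` is only illustrative. Each start adds 1, each end subtracts 1, and the running sum is the number of intervals open at that moment.

```python
import pandas as pd

# Hypothetical sample data: three intervals, some of which overlap.
df = pd.DataFrame({
    "start": pd.to_datetime(["2018-01-01 00:47", "2018-01-01 00:52", "2018-01-01 00:55"]),
    "end": pd.to_datetime(["2018-01-01 00:54", "2018-01-01 01:03", "2018-01-01 00:59"]),
})

# One row per event: 'status' holds the source column name, 'time' its value.
events = df.melt(var_name="status", value_name="time").sort_values("time")

# A start opens an interval (+1), an end closes one (-1).
events["counter"] = events["status"].map({"start": 1, "end": -1}).cumsum()

print(events)
# The largest counter value is the peak number of simultaneously open intervals.
print(events["counter"].max())
```

With these sample values no two events share a timestamp, so the sort order is unambiguous.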
We could also use numpy.cumsum:
new_df['counter'] = np.where(new_df['status'].eq('start'),1,-1).cumsum()
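One thing to watch: in the output above, a start and an end both fall on 01:38, and the counter depends on which of the two rows is sorted first. The sketch below is one possible way to make that choice explicit by sorting on a second key. The data, the `delta` column, and the variable names are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second interval starts exactly when the first one ends.
df = pd.DataFrame({
    "start": pd.to_datetime(["2018-01-01 01:30", "2018-01-01 01:38"]),
    "end": pd.to_datetime(["2018-01-01 01:38", "2018-01-01 01:45"]),
})

events = df.melt(var_name="status", value_name="time")
events["delta"] = np.where(events["status"].eq("start"), 1, -1)

# Ties sorted with -1 (end) before +1 (start): touching intervals do not overlap.
ends_first = events.sort_values(["time", "delta"])
ends_first["counter"] = ends_first["delta"].cumsum()

# Ties sorted with +1 (start) before -1 (end): touching intervals count as overlapping.
starts_first = events.sort_values(["time", "delta"], ascending=[True, False])
starts_first["counter"] = starts_first["delta"].cumsum()

print(ends_first["counter"].max())
print(starts_first["counter"].max())
```

Which tie-break is right depends on whether the intervals are meant to be closed or half-open, so treat this as a choice to make rather than a fix.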
Solution 2:
Just putting everything together to help newbies like me.
import pandas as pd
import numpy as np
# Read the intervals; the first CSV column is used as the row index.
df = pd.read_csv('startend.csv', sep=',', index_col=0)
# Stack 'start' and 'end' into a single long column of timestamps.
df = df.stack().to_frame()
df = df.reset_index(level=1)
df.columns = ['status', 'time']
df = df.reset_index(drop=True)
# The timestamps are day-first (e.g. 01.01.2018 00:47).
df['time'] = pd.to_datetime(df['time'], dayfirst=True)
# df is already in long form (status, time), so no further melt is needed.
new_df = df.sort_values('time')
# +1 for each start, -1 for each end; the running sum is the number of open intervals.
new_df['counter'] = np.where(new_df['status'].eq('start'), 1, -1).cumsum()
print(new_df)