
How To Count Overlapping Datetime Intervals In Pandas?

I have the following DataFrame with two datetime columns:

              start               end
0  01.01.2018 00:47  01.01.2018 00:54
1  01.01.2018 00:52  01.01.2018 01:03
2  01.01.2018

Solution 1:

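Reshape the frame so that every start and every end becomes its own row, sort the rows by time, map each start to +1 and each end to -1 with Series.map (or Series.replace), and take the running total with Series.cumsum. The running total is the number of intervals open at that moment: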

new_df = df.melt(var_name='status', value_name='time').sort_values('time')
new_df['counter'] = new_df['status'].map({'start': 1, 'end': -1}).cumsum()
print(new_df)

   status                time  counter
0   start 2018-01-01 00:47:00        1
1   start 2018-01-01 00:52:00        2
11    end 2018-01-01 00:54:00        1
2   start 2018-01-01 00:55:00        2
3   start 2018-01-01 00:57:00        3
13    end 2018-01-01 00:59:00        2
4   start 2018-01-01 01:00:00        3
12    end 2018-01-01 01:03:00        2
5   start 2018-01-01 01:07:00        3
15    end 2018-01-01 01:12:00        2
14    end 2018-01-01 01:16:00        1
16    end 2018-01-01 01:24:00        0
6   start 2018-01-01 01:33:00        1
7   start 2018-01-01 01:34:00        2
8   start 2018-01-01 01:37:00        3
9   start 2018-01-01 01:38:00        4
17    end 2018-01-01 01:38:00        3
10  start 2018-01-01 01:39:00        4
19    end 2018-01-01 01:41:00        3
20    end 2018-01-01 01:41:00        2
18    end 2018-01-01 01:47:00        1
21    end 2018-01-01 01:55:00        0
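
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. It assumes a frame with `start` and `end` columns and uses only the first three intervals that can be read off the output above; the names `events`, `counters` and `peak` are made up for the illustration.

```python
import pandas as pd

# First three intervals, as they can be read off the printed output (rows 0-2).
df = pd.DataFrame({
    "start": pd.to_datetime(["2018-01-01 00:47", "2018-01-01 00:52", "2018-01-01 00:55"]),
    "end": pd.to_datetime(["2018-01-01 00:54", "2018-01-01 01:03", "2018-01-01 00:59"]),
})

# One row per event: 'status' is the original column name, 'time' its value.
events = df.melt(var_name="status", value_name="time").sort_values("time")

# +1 when an interval opens, -1 when it closes; the running sum is the number open.
events["counter"] = events["status"].map({"start": 1, "end": -1}).cumsum()

counters = events["counter"].tolist()
peak = int(events["counter"].max())  # largest number of simultaneous intervals
print(events)
```

The `counter` column reads as "how many intervals are open right after this event", so its maximum is the largest number of intervals that overlap at any one time.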

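We could also build the +1/-1 steps with numpy.where and take the cumulative sum of the resulting array (this needs import numpy as np):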

new_df['counter'] = np.where(new_df['status'].eq('start'),1,-1).cumsum()
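
One thing to watch with either variant is what happens when a start and an end share the same timestamp, as at 01:38 in the output above. With the default sort, `sort_values` does not promise any particular order for equal timestamps, so it may help to make the tie-break explicit. The sketch below is one possible way to do that, not part of the original answer; the `delta` column and the two toy intervals are assumptions for the illustration.

```python
import pandas as pd

# Two intervals that touch at 01:38: one ends exactly when the other starts.
df = pd.DataFrame({
    "start": pd.to_datetime(["2018-01-01 01:33", "2018-01-01 01:38"]),
    "end": pd.to_datetime(["2018-01-01 01:38", "2018-01-01 01:41"]),
})

events = df.melt(var_name="status", value_name="time")
events["delta"] = events["status"].map({"start": 1, "end": -1})

# Ends before starts at equal times: touching intervals do NOT count as overlapping.
ends_first = events.sort_values(["time", "delta"], ascending=[True, True])
ends_first_counts = ends_first["delta"].cumsum().tolist()

# Starts before ends at equal times: touching intervals DO count as overlapping.
starts_first = events.sort_values(["time", "delta"], ascending=[True, False])
starts_first_counts = starts_first["delta"].cumsum().tolist()

print(ends_first_counts, starts_first_counts)
```

Which ordering is right depends on whether the intervals are meant to include their end point, so pick the one that matches your data.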

Solution 2:

Just putting everything together to help newbies like me.

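import pandas as pd
import numpy as np

# Read the intervals; the first CSV column becomes the row index.
df = pd.read_csv('startend.csv', sep=',', index_col=0)

# Stack the 'start' and 'end' columns into one long column of timestamps.
df = df.stack().to_frame()
df = df.reset_index(level=1)
df.columns = ['status', 'time']
df = df.reset_index(drop=True)

# Parse the timestamps (pass dayfirst=True if the dates are day.month.year).
df['time'] = pd.to_datetime(df['time'])
df = df.sort_values('time')

# df already has one row per event, so no further melt is needed.
# +1 for each start, -1 for each end; the running sum is the number of open intervals.
new_df = df.reset_index(drop=True)
new_df['counter'] = np.where(new_df['status'].eq('start'), 1, -1).cumsum()
print(new_df)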
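
If you do not have a startend.csv at hand, the same pipeline can be tried on an in-memory file. This is only a sketch: the CSV layout below (an unnamed index column followed by `start` and `end`, with day.month.year dates) is an assumption about what the file looks like, and `csv_text` and `long_df` are made-up names.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for startend.csv; the real file's layout is an assumption.
csv_text = """,start,end
0,01.01.2018 00:47,01.01.2018 00:54
1,01.01.2018 00:52,01.01.2018 01:03
2,01.01.2018 00:55,01.01.2018 00:59
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Long format: one row per timestamp, labelled with the column it came from.
long_df = df.stack().reset_index(level=1)
long_df.columns = ["status", "time"]

# An explicit format avoids day/month ambiguity (assumed day.month.year here).
long_df["time"] = pd.to_datetime(long_df["time"], format="%d.%m.%Y %H:%M")
long_df = long_df.sort_values("time").reset_index(drop=True)

# Running count of open intervals.
long_df["counter"] = np.where(long_df["status"].eq("start"), 1, -1).cumsum()
print(long_df)
```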
