Are Values In One Dataframe In Bins Of Another Dataframe?

I have a dataframe named loc_df with two columns of bins that looks like this... > loc_df loc_x_bin loc_y_bin (-20, -10] (0, 50] (-140, -130]

Solution 1:

UPDATE2: if you want to check that both x and y belong to bins from the same row in df_loc (or loc_df):

import numpy as np
import pandas as pd
xstep = 10
ystep = 50

In [201]: (df.assign(bin=(pd.cut(df.loc_x, np.arange(-500, 500, xstep)).astype(str)
   .....:                 +
   .....:                 pd.cut(df.loc_y, np.arange(-500, 500, ystep)).astype(str)
   .....:                )
   .....:           )
   .....: )['bin'].isin(df_loc.sum(axis=1))
Out[201]:
0     True
1    False
2     True
3    False
4    False
Name: bin, dtype: bool

Explanation:

In [202]: (df.assign(bin=(pd.cut(df.loc_x, np.arange(-500, 500, xstep)).astype(str)
   .....:                 +
   .....:                 pd.cut(df.loc_y, np.arange(-500, 500, ystep)).astype(str)
   .....:                )
   .....:           )
   .....: )
Out[202]:
   loc_x  loc_y                       bin
0    -15     25         (-20, -10](0, 50]
1     30     35           (20, 30](0, 50]
2      5    -45           (0, 10](-50, 0]
3   -135   -200  (-140, -130](-250, -200]
4      5     25            (0, 10](0, 50]

In [203]: df_loc.sum(axis=1)
Out[203]:
0         (-20, -10](0, 50]
1    (-140, -130](100, 150]
2           (0, 10](-50, 0]
dtype: object
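
For readers who want to run the same-row check end to end, here is a minimal self-contained sketch. The contents of df and df_loc are reconstructed from the outputs above, and building df_loc by cutting one representative value per bin is an assumption made only so the example runs on its own. The pair_key helper and the '|' separator are hypothetical additions, not part of the original answer; the separator just keeps the two labels from running together.

```python
import numpy as np
import pandas as pd

# Points to test; values copied from the output shown above.
df = pd.DataFrame({"loc_x": [-15, 30, 5, -135, 5],
                   "loc_y": [25, 35, -45, -200, 25]})

xstep, ystep = 10, 50
xbins = np.arange(-500, 500, xstep)
ybins = np.arange(-500, 500, ystep)

# Assumed reconstruction of df_loc: cutting one representative value per bin
# gives the three (x_bin, y_bin) pairs printed by df_loc.sum(axis=1) above.
df_loc = pd.DataFrame({
    "loc_x_bin": pd.cut([-15, -135, 5], xbins),
    "loc_y_bin": pd.cut([25, 125, -25], ybins),
})


def pair_key(x_bin, y_bin):
    """Join two bin labels into one string key (hypothetical helper)."""
    return x_bin.astype(str) + "|" + y_bin.astype(str)


# One key per point and one key per df_loc row; a point matches only if
# both of its bins come from the same df_loc row.
points = pair_key(pd.cut(df.loc_x, xbins), pd.cut(df.loc_y, ybins))
allowed = pair_key(df_loc.loc_x_bin, df_loc.loc_y_bin)

df["in_same_row_bins"] = points.isin(allowed)
print(df)
```

With this data the new column should agree with Out[201]: only rows 0 and 2 match a complete (x, y) bin pair.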

UPDATE: if you want to check whether x belongs to loc_x_bin and y belongs to loc_y_bin (not necessarily from the same row in df_loc):

If df_loc.dtypes doesn't show category for both columns, you may want to convert them to category dtype first:

df_loc.loc_x_bin = df_loc.loc_x_bin.astype('category')
df_loc.loc_y_bin = df_loc.loc_y_bin.astype('category')

Then you can bin the columns of df "on the fly":

xstep = 10
ystep = 50

df['in_bins'] = (   (pd.cut(df.loc_x, np.arange(-500, 500, xstep)).isin(df_loc.loc_x_bin))
                    &
                    (pd.cut(df.loc_y, np.arange(-500, 500, ystep)).isin(df_loc.loc_y_bin))
                )

Test:

In [130]: df['in_bins'] = (   (pd.cut(df.loc_x, np.arange(-500, 500, xstep)).isin(df_loc.loc_x_bin))
   .....:                     &
   .....:                     (pd.cut(df.loc_y, np.arange(-500, 500, ystep)).isin(df_loc.loc_y_bin))
   .....:                 )

In [131]: df
Out[131]:
   loc_x  loc_y in_bins
0    -15     25    True
1     30     35   False
2      5    -45    True
3   -135   -200   False
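
The independent check can also be sketched as a standalone script. This version compares string labels instead of categoricals, which is a substitution made here to avoid relying on how isin treats categorical intervals in a given pandas version; it is not the exact call used in the answer. The df_loc contents are an assumed reconstruction, and the fifth point (5, 25) is added for contrast.

```python
import numpy as np
import pandas as pd

# Points to test; the first four rows match the Test output above,
# the fifth (5, 25) is an extra point added for illustration.
df = pd.DataFrame({"loc_x": [-15, 30, 5, -135, 5],
                   "loc_y": [25, 35, -45, -200, 25]})

xbins = np.arange(-500, 500, 10)
ybins = np.arange(-500, 500, 50)

# Assumed reconstruction of df_loc, built by cutting one value per bin.
df_loc = pd.DataFrame({
    "loc_x_bin": pd.cut([-15, -135, 5], xbins),
    "loc_y_bin": pd.cut([25, 125, -25], ybins),
})

# Test each coordinate against its own column of bins, independently.
x_ok = pd.cut(df.loc_x, xbins).astype(str).isin(df_loc.loc_x_bin.astype(str))
y_ok = pd.cut(df.loc_y, ybins).astype(str).isin(df_loc.loc_y_bin.astype(str))

df["in_bins"] = x_ok & y_ok
print(df)
```

The point (5, 25) passes here because (0, 10] and (0, 50] each appear somewhere in df_loc, even though they never appear in the same row, which is why the UPDATE2 check rejects it.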
