
Check One-on-one Relationship Between Two Columns

I have two columns A and B in a pandas dataframe, where values are repeated multiple times. For a unique value in A, B is expected to have 'another' unique value too. And each uniq

Solution 1:

Suppose you have a dataframe:

import pandas as pd

d = pd.DataFrame({'A': [1, 3, 1, 2, 1, 3, 2], 'B': [4, 6, 4, 5, 4, 6, 5]})

d has a groupby method, which returns a GroupBy object. This is the interface for grouping rows that share the same value in a column.

gb = d.groupby('A')
grouped_b_column = gb['B']
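
To see what the grouping actually holds, here is a small sketch. It rebuilds the sample frame so that it runs on its own, and the name groups is only illustrative. It collects the B values that fall under each A value:

```python
import pandas as pd

d = pd.DataFrame({'A': [1, 3, 1, 2, 1, 3, 2], 'B': [4, 6, 4, 5, 4, 6, 5]})

# Iterating over the grouped column yields (group key, sub-Series) pairs.
groups = {int(key): values.tolist() for key, values in d.groupby('A')['B']}
print(groups)  # {1: [4, 4, 4], 2: [5, 5], 3: [6, 6]}
```

Each A value here is paired with only one distinct B value, which is the property the aggregation checks for.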

You can perform an aggregation on the grouped rows. Let's find the min and max value in every group.

# Named aggregation keeps the amin/amax column names regardless of pandas version
res = grouped_b_column.agg(amin='min', amax='max')

>>> print(res)
   amin  amax
A
1     4     4
2     5     5
3     6     6

Now we just need to check that amin and amax are equal in every group, which means every group contains a single B value:

res['amin'].equals(res['amax'])

If this check returns True, then every A value maps to a single B value. To confirm the relationship is one-to-one, run the same check with the A and B columns swapped.
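
Putting both directions together, a minimal sketch might look like the following. The helper names is_one_to_one and maps_to_single_value are hypothetical and not part of pandas, and the second frame, bad, is made-up data added only to show a failing case:

```python
import pandas as pd

def is_one_to_one(frame, left, right):
    """Return True if values in `left` and `right` pair up one-to-one."""
    def maps_to_single_value(key, value):
        # Within each `key` group, min == max means one distinct `value`.
        res = frame.groupby(key)[value].agg(['min', 'max'])
        return res['min'].equals(res['max'])

    # Check A -> B, then the same criterion with the columns swapped.
    return maps_to_single_value(left, right) and maps_to_single_value(right, left)

d = pd.DataFrame({'A': [1, 3, 1, 2, 1, 3, 2], 'B': [4, 6, 4, 5, 4, 6, 5]})
print(is_one_to_one(d, 'A', 'B'))    # True

# Two different A values share B == 4, so the swapped check fails.
bad = pd.DataFrame({'A': [1, 2], 'B': [4, 4]})
print(is_one_to_one(bad, 'A', 'B'))  # False
```

Note that the min/max comparison assumes the column values can be ordered. It also assumes there are no missing values, because groupby drops NaN keys by default.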
