Sample Maximum Possible Data Points From Distribution To New Distribution

September 29, 2022

Solution 1:

You can calculate the maximum total sample size for each week, then multiply it by the desired distribution. The idea is:

Divide the Count by the Desired Distribution to get the total that each class could support on its own
Take the minimum of these possible totals within each week with groupby; this is the largest total that every class can still supply
Multiply that weekly total by the Desired Distribution and round down to get the sample numbers.

In code:

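# Possible total per row, capped at the smallest one within each week,
# then scaled by the desired share and rounded down to whole samples.
df['new_count'] = (df['Count'].div(df['Desired Distribution'])
    .groupby(df['Week']).transform('min')
    .mul(df['Desired Distribution'])
    .floordiv(1)
    .astype(int)
)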

Output:

   Week Class  Count  Distribution  Desired Distribution  new_count
0     1     A    954          0.36                  0.55        954
1     1     B    554          0.21                  0.29        503
2     1     C   1145          0.43                  0.16        277
3     2     A    454          0.21                  0.55        454
4     2     B    944          0.44                  0.29        239
5     2     C    748          0.35                  0.16        132
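
To try this end to end, here is a minimal self-contained sketch that rebuilds the table above and checks the resulting shares. The helper name `max_sample_counts`, the `eps` guard, and the `achieved` column are illustrative additions, not part of the original answer. The small `eps` is an assumption meant to keep a value such as 953.9999999 from being floored to 953 due to floating-point rounding.

```python
import numpy as np
import pandas as pd

# Sample data copied from the output table above.
df = pd.DataFrame({
    "Week": [1, 1, 1, 2, 2, 2],
    "Class": ["A", "B", "C", "A", "B", "C"],
    "Count": [954, 554, 1145, 454, 944, 748],
    "Desired Distribution": [0.55, 0.29, 0.16, 0.55, 0.29, 0.16],
})


def max_sample_counts(frame, eps=1e-9):
    """Hypothetical helper: largest per-class sample sizes matching the desired shares."""
    # Total sample size each class could support on its own.
    possible_total = frame["Count"].div(frame["Desired Distribution"])
    # The tightest class in each week limits the total for that week.
    week_total = possible_total.groupby(frame["Week"]).transform("min")
    # Scale back to per-class counts and round down to whole samples.
    raw = week_total.mul(frame["Desired Distribution"])
    return np.floor(raw + eps).astype(int)


df["new_count"] = max_sample_counts(df)

# Share of each class in the sampled data, for comparison with the target.
df["achieved"] = df["new_count"] / df.groupby("Week")["new_count"].transform("sum")

print(df)
```

Because the counts are rounded down, the achieved shares match the desired distribution only approximately, and no class is ever asked for more rows than it has.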
