
Optimization Of The Python Code Using Numpy And Pandas

I have the following code working: import numpy as np import pandas as pd colum1 = [0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05] colum2 = [1,2,3,4,5,6,7,8,9,10,11,1

Solution 1:

The first step is to remove the loop over the index and replace the tests for values greater than 0 with np.maximum. This works because, for our purposes, np.where(a > 0, a, 0) is equivalent to np.maximum(0, a): both clamp negative values to 0 and leave positive values unchanged.
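As a quick sanity check of that equivalence, here is a minimal sketch on a small made-up array (the array `a` and the variable names are only illustrative, not taken from the question):

```python
import numpy as np

# Hypothetical sample values, including negatives and zero.
a = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])

# Original style: keep a where it is positive, otherwise use 0.
clipped_where = np.where(a > 0, a, 0)

# Replacement: element-wise maximum of 0 and a.
clipped_max = np.maximum(0, a)

# True when both arrays have the same shape and values.
same = np.array_equal(clipped_where, clipped_max)
print(same)
```

The two differ only on NaN input: np.where(a > 0, a, 0) turns a NaN into 0, while np.maximum(0, a) propagates it. That is why the equivalence holds "for our purposes" rather than in general.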

At the same time, define the longer expressions separately to make the code more readable:

s1 = df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3']))
s2 = df['colum4'] - (df['result'].shift(1) * df['colum1'])

df['result'] = np.where(df['colum2'] <= 5,
                        np.where(df['colum2'] == 1, df['colum4'],
                                 np.maximum(0, s1)),
                        np.maximum(0, s2))

The next step is to use np.select to remove the nested np.where statements:

m1 = df['colum2'] <= 5
m2 = df['colum2'] == 1

conds = [m1 & m2, m1 & ~m2]
choices = [df['colum4'], np.maximum(0, s1)]

df['result'] = np.select(conds, choices, np.maximum(0, s2))

This version is more manageable: each condition sits next to its corresponding choice, and the fallback case is passed as the default.
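To confirm that the np.select form gives the same values as the nested np.where form, the sketch below runs both on a small hypothetical DataFrame. Only the 0.05 values in colum1 come from the question. The other columns, including the starting 'result' column that shift(1) reads from, are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only colum1's value (0.05) comes from the question.
df = pd.DataFrame({
    "colum1": [0.05] * 8,
    "colum2": [1, 2, 3, 4, 5, 6, 7, 8],
    "colum3": [2.0] * 8,
    "colum4": [10.0, 0.5, 12.0, 0.2, 9.0, 0.1, 8.0, 7.0],
    "result": [10.0, 20.0, 5.0, 30.0, 4.0, 50.0, 3.0, 2.0],  # assumed starting values
})

prev = df["result"].shift(1)  # previous row's existing result (NaN in the first row)
s1 = df["colum4"] - prev * df["colum1"] * df["colum3"]
s2 = df["colum4"] - prev * df["colum1"]

# Nested np.where version.
nested = np.where(df["colum2"] <= 5,
                  np.where(df["colum2"] == 1, df["colum4"], np.maximum(0, s1)),
                  np.maximum(0, s2))

# np.select version: one condition per choice, plus a default.
m1 = df["colum2"] <= 5
m2 = df["colum2"] == 1
selected = np.select([m1 & m2, m1 & ~m2],
                     [df["colum4"], np.maximum(0, s1)],
                     default=np.maximum(0, s2))

print(np.allclose(nested, selected))
```

Note that shift(1) reads the values already stored in 'result' before the assignment. If the original loop fed each newly computed result into the next row, a single vectorized expression like this would not reproduce that recursion.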

