Optimization Of The Python Code Using Numpy And Pandas
I have the following code working:
import numpy as np
import pandas as pd
colum1 = [0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05]
colum2 = [1,2,3,4,5,6,7,8,9,10,11,1
Solution 1:
The first step is to remove the loop over the index and replace the tests for numbers greater than 0 with np.maximum. This works because np.where(a > 0, a, 0) is, for our purposes, equivalent to np.maximum(0, a).
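A minimal sketch of that equivalence, using a made-up array a (the values are illustrative, not taken from the question). It also shows the one case where the two forms differ: np.maximum propagates NaN, while the np.where test treats NaN as "not greater than 0" and returns 0.

```python
import numpy as np

# Illustrative values only; `a` stands in for any numeric column.
a = np.array([-2.0, 0.0, 3.5, np.nan])

via_where = np.where(a > 0, a, 0)   # keep positives, replace the rest with 0
via_maximum = np.maximum(0, a)      # element-wise maximum against 0

# For finite inputs the two forms agree.
finite = ~np.isnan(a)
print(np.array_equal(via_where[finite], via_maximum[finite]))

# For NaN they differ: np.where yields 0.0, np.maximum yields nan.
print(via_where[-1], via_maximum[-1])
```

If NaN can reach the final result (for example from shift(1) in the first row), it may be worth checking which behaviour is wanted before swapping one form for the other.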
At the same time, define the longer expressions separately to make the code more readable:
s1 = df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3']))
s2 = df['colum4'] - (df['result'].shift(1) * df['colum1'])
df['result'] = np.where(df['colum2'] <= 5,
                        np.where(df['colum2'] == 1, df['colum4'],
                                 np.maximum(0, s1)),
                        np.maximum(0, s2))
The next step is to use np.select to remove the nested np.where statements:
m1 = df['colum2'] <= 5
m2 = df['colum2'] == 1
conds = [m1 & m2, m1 & ~m2]
choices = [df['colum4'], np.maximum(0, s1)]
df['result'] = np.select(conds, choices, np.maximum(0, s2))
This version is more manageable: each condition is named once, and the conditions and their outcomes sit side by side in two lists.
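To check that the np.select form matches the nested np.where form, the two can be run side by side on a small frame. This is only a sketch: the column values below are hypothetical placeholders (the question's full data is not shown here), and it assumes a result column already exists so that shift(1) has something to read.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not the values from the original question.
df = pd.DataFrame({
    "colum1": [0.05] * 6,
    "colum2": [1, 2, 5, 6, 7, 1],
    "colum3": [2.0] * 6,
    "colum4": [100.0, 3.0, 50.0, 4.0, 80.0, 60.0],
    "result": [100.0, 90.0, 80.0, 70.0, 60.0, 50.0],  # assumed to exist already
})

prev = df["result"].shift(1)  # previous row's result (NaN in the first row)
s1 = df["colum4"] - prev * (df["colum1"] * df["colum3"])
s2 = df["colum4"] - prev * df["colum1"]

# Nested np.where version.
nested = np.where(df["colum2"] <= 5,
                  np.where(df["colum2"] == 1, df["colum4"], np.maximum(0, s1)),
                  np.maximum(0, s2))

# np.select version: one mask per case, with s2 as the fallback.
m1 = df["colum2"] <= 5
m2 = df["colum2"] == 1
selected = np.select([m1 & m2, m1 & ~m2],
                     [df["colum4"], np.maximum(0, s1)],
                     np.maximum(0, s2))

print(np.allclose(nested, selected))
```

In this sample the first row has colum2 == 1, so it takes colum4 directly and the NaN produced by shift(1) never reaches the output.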