
Pandas Rolling Vs Scipy Kurtosis - Serious Numerical Inaccuracy

First and foremost, I'm sorry for the clearly not minimal examples that I listed below. I am fully aware this doesn't meet SO's minimally reproducible constraint, however, having b

Solution 1:

This looks like a bug in older Pandas versions. I could reproduce it on an old installation (Python 3.6.2 64-bit on win32, Pandas 1.0.3, numpy 1.15.4):

>>> s3.rolling(20,min_periods=3).kurt().tail(10)
890         9.591071
891         9.591071
892         9.591071
893         9.591071
894        19.663685
895        15.248361
896        40.444894
897      1368.233241
898    251407.375343
899    902540.031652
dtype: float64

It seems to be fixed on my newer installation (Python 3.8.4 64-bit, Pandas 1.2.2, numpy 1.20.1):

>>> s3.rolling(20,min_periods=3).kurt().tail(10)
890     9.591067
891     9.591067
892     9.591067
893     9.591067
894    19.663666
895    14.872262
896    14.147158
897    16.716989
898     7.037037
899    20.000000
dtype: float64

Both installations are on the same Windows 10 machine.

I cannot say which component (Pandas or numpy) is the cause. As your tests using scipy.stats.kurtosis give the correct result, I would suspect Pandas, but without further analysis by Pandas experts (and I am not one) I cannot be sure.
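To narrow down which output is trustworthy without relying on either Pandas or numpy, one option is a plain-Python reference. The sketch below is my own illustration, not code from the question: it computes the bias-corrected excess kurtosis with a two-pass formula, which to my understanding is the same estimator that Pandas' kurt() and scipy.stats.kurtosis(x, fisher=True, bias=False) report. The function names and the handling of constant windows (NaN) are assumptions made for this example.

```python
import math


def sample_excess_kurtosis(values):
    """Bias-corrected excess kurtosis, computed in two passes.

    Centering on the mean before taking powers avoids the cancellation
    that running-sum (one-pass) formulas can suffer from.
    """
    n = len(values)
    if n < 4:
        return math.nan  # the estimator needs at least 4 observations
    mean = math.fsum(values) / n
    deviations = [x - mean for x in values]
    m2 = math.fsum(d ** 2 for d in deviations) / n
    m4 = math.fsum(d ** 4 for d in deviations) / n
    if m2 == 0.0:
        return math.nan  # constant window: treated as undefined in this sketch
    g2 = m4 / (m2 * m2) - 3.0  # biased excess kurtosis
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)


def rolling_reference_kurtosis(values, window, min_periods):
    """Recompute every window from scratch (slow, but independent of pandas)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        if len(chunk) < min_periods:
            out.append(math.nan)
        else:
            out.append(sample_excess_kurtosis(chunk))
    return out


if __name__ == "__main__":
    data = [0.0] * 19 + [1.0]
    print(rolling_reference_kurtosis(data, window=20, min_periods=3)[-1])
```

Calling rolling_reference_kurtosis(list(s3), 20, 3) and comparing the tail with the rolling output should show which installation is off. As a sanity check, a window of 19 equal values plus one outlier has a bias-corrected excess kurtosis equal to the window length, so the example above prints a value very close to 20.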

IMHO, the most reasonable solution is either to upgrade your existing installation, or to add a fresh, independent Python installation with the latest Pandas version available to you.
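When comparing two installations like this, it helps to record exactly which versions each one runs. A minimal sketch using only the standard library (importlib.metadata, so it assumes Python 3.8 or later; the helper name installed_versions is made up for this example):

```python
import sys
from importlib import metadata


def installed_versions(packages=("pandas", "numpy", "scipy")):
    """Return the interpreter version plus the version of each package.

    Packages that are not installed are reported as None.
    """
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```

Running it in both environments gives a quick side-by-side of what differs before and after the upgrade.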

