Working With Large Primes In Python
Solution 1:
For determining if a number is a prime, there a sieves and primality tests.
# for large numbers, xrange will throw an error.# OverflowError: Python int too large to convert to C long# to get over this:defmrange(start, stop, step):
while start < stop:
yield start
start += step
# benchmarked on an old single-core system with 2GB RAM.from math import sqrt
defis_prime(num):
if num == 2:
returnTrueif (num < 2) or (num % 2 == 0):
returnFalsereturnall(num % i for i in mrange(3, int(sqrt(num)) + 1, 2))
# benchmark is_prime(100**10-1) using mrange# 10000 calls, 53191 per second.# 60006 function calls in 0.190 seconds.
This seems to be the fastest. There is another version using not any
that you see,
defis_prime(num)
# ...returnnotany(num % i == 0for i in mrange(3, int(sqrt(num)) + 1, 2))
However, in the benchmarks I got 70006 function calls in 0.272 seconds.
over the use of all
60006 function calls in 0.190 seconds.
while testing if 100**10-1
was prime.
If you're needing to find the next highest prime, this method will not work for you. You need to go with a primality test, I have found the Miller-Rabin algorithm to be a good choice. It is a little slower the Fermat method, but more accurate against pseudoprimes. Using the above mentioned method takes +5 minutes on this system.
Miller-Rabin
algorithm:
from random import randrange
defis_prime(n, k=10):
if n == 2:
returnTrueifnot n & 1:
returnFalsedefcheck(a, s, d, n):
x = pow(a, d, n)
if x == 1:
returnTruefor i in xrange(s - 1):
if x == n - 1:
returnTrue
x = pow(x, 2, n)
return x == n - 1
s = 0
d = n - 1while d % 2 == 0:
d >>= 1
s += 1for i in xrange(k):
a = randrange(2, n - 1)
ifnot check(a, s, d, n):
returnFalsereturnTrue
Fermat
algoithm:
defis_prime(num):
if num == 2:
returnTrueifnot num & 1:
returnFalsereturnpow(2, num-1, num) == 1
To get the next highest prime:
defnext_prime(num):
if (not num & 1) and (num != 2):
num += 1if is_prime(num):
num += 2whileTrue:
if is_prime(num):
break
num += 2return num
print next_prime(100**10-1) # returns `100000000000000000039`# benchmark next_prime(100**10-1) using Miller-Rabin algorithm.1000 calls, 337 per second.
258669 function calls in2.971 seconds
Using the Fermat
test, we got a benchmark of 45006 function calls in 0.885 seconds.
, but you run a higher chance of pseudoprimes.
So, if just needing to check if a number is prime or not, the first method for is_prime
works just fine. It is the fastest, if you use the mrange
method with it.
Ideally, you would want to store the primes generated by next_prime
and just read from that.
For example, using next_prime
with the Miller-Rabin
algorithm:
print next_prime(10^301)
# prints in 2.9s on the old single-core system, opposed to fermat's 2.8s
1000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000531
You wouldn't be able to do this with return all(num % i for i in mrange(3, int(sqrt(num)) + 1, 2))
in a timely fashion. I can't even do it on this old system.
And to be sure that next_prime(10^301)
and Miller-Rabin
yields a correct value, this was also tested using the Fermat
and the Solovay-Strassen
algorithms.
See: fermat.py, miller_rabin.py, and solovay_strassen.py on gist.github.com
Edit: Fixed a bug in next_prime
Solution 2:
In response to the possible inaccuracy of math.sqrt
I have benchmarked two different methods for performing an isqrt(n)
call. isqrt_2(n)
is coming from this article and this C code
The most common method seen:
def isqrt_1(n):
x = n
while True:
y = (n // x + x) // 2if x <= y:
returnxx= y
cProfile.run('isqrt_2(10**308)')
Benchmark results:
isqrt_1 at 10000 iterations: 12.25
Can perform 816 calls per second.
10006functioncallsin 12.904 secondsOrderedby: standardnamencallstottimepercallcumtimepercallfilename:lineno(function)10.0000.00012.90412.904 <string>:1(<module>)
10.6900.69012.90412.904math.py:10(func)
1000012.2130.00112.2130.001math.py:24(isqrt_1)
10.0000.0000.0000.000 {method 'disable' of '_lsprof.Profiler' objects}
10.0000.0000.0000.000 {range}
20.0000.0000.0000.000 {time.time}
This method is incredibly slow. So we try the next method:
def isqrt_2(n):
if n < 0:
raise ValueError('Square root is not defined for negative numbers.')
x = int(n)
ifx== 0:
return0
a, b = divmod(x.bit_length(), 2)
n = 2 ** (a + b)
while True:
y = (n + x // n) >> 1if y >= n:
returnnn= y
cProfile.run('isqrt_2(10**308)')
Benchmark results:
isqrt_2 at 10000 iterations: 0.391000032425
Can perform 25575 calls per second.
30006functioncallsin 1.059 secondsOrderedby: standardnamencallstottimepercallcumtimepercallfilename:lineno(function)10.0000.0001.0591.059 <string>:1(<module>)
10.6870.6871.0591.059math.py:10(func)
100000.3480.0000.3720.000math.py:34(isqrt_2)
100000.0130.0000.0130.000 {divmod}
100000.0110.0000.0110.000 {method 'bit_length' of 'long' objects}
10.0000.0000.0000.000 {method 'disable' of '_lsprof.Profiler' objects}
10.0000.0000.0000.000 {range}
20.0000.0000.0000.000 {time.time}
As you can see, the difference in isqrt_1(n)
and isqrt_2(n)
is an amazing 11.858999967575 seconds
faster.
You can see this in action on Ideone.com or get the code
note: Ideone.com resulted in execution timeout for isqrt_1(n)
so the benchmark was reduced to 10**200
Post a Comment for "Working With Large Primes In Python"