Saturday, February 20, 2010

SICP Exercise 1.24: The Fermat Test

From SICP section 1.2.6 Example: Testing for Primality

Exercise 1.24 asks us to once again modify the timed-prime-test procedure from exercise 1.22, this time using fast-prime? (which uses the Fermat test), then test the improvement using the timing statistics we gathered before.

The Fermat test is a probabilistic method for testing primality, meaning that if a given number passes the Fermat test, the best we can say is that the number is very probably prime. It's based on Fermat's Little Theorem which states:
If n is a prime number and a is any positive integer less than n, then a raised to the nth power is congruent to a modulo n.
So to test n for primality, we select a number, a, which is less than n, and raise it to the nth power. We then divide an by n to get the remainder. If n is prime, then the remainder will be equal to a, the base we selected.

Fermat's Little Theorem says that this property will always hold true if n is prime, but it doesn't say anything about when n isn't prime. If an mod n is not equal to a, then we know for certain that n is not prime. However, there are values of n and a that will pass the Fermat test even when n is not prime. If we test enough values of a for a given n, we can increase our confidence that n is prime. Unfortunately, there are extremely rare values of n called Carmichael numbers that pass the Fermat test for any value of a. There are variations on the Fermat test that cannot be fooled. We'll look at one of those variations, the Miller-Rabin test, in a later exercise.

The complete code listing below shows what modifications were necessary to use the Fermat test in the timed-prime-test procedure. The fast-prime? procedure takes two arguments, n, and times. The first argument is the number to test, the second argument tells the procedure how many random values to test with. The more random values we use, the higher our confidence that n is prime, so we should test a lot of values. I've (rather arbitrarily) chosen to test 100 random values.

One other change that you may notice is the inclusion of a library module in the first line of code. When testing the original code listing from the book, I found that Scheme's primitive random procedure has a limit of 4,294,967,087. This won't do for the values we're testing, so I replaced it with the random-integer procedure, which doesn't have this limitation. I found out about this procedure and library through the Scheme Cookbook.
(require (lib "27.ss" "srfi"))

(define (timed-prime-test n)
(newline)
(display n)
(start-prime-test n (current-inexact-milliseconds)))

(define (start-prime-test n start-time)
(cond ((fast-prime? n 100)
(report-prime (- (current-inexact-milliseconds) start-time)))))

(define (report-prime elapsed-time)
(display " *** ")
(display elapsed-time))

(define (square x)
(* x x))

(define (even? n)
(= (remainder n 2) 0))

(define (expmod base exp m)
(cond ((= exp 0) 1)
((even? exp)
(remainder (square (expmod base (/ exp 2) m))
m))
(else
(remainder (* base (expmod base (- exp 1) m))
m))))

(define (fermat-test n)
(define (try-it a)
(= (expmod a n n) a))
(try-it (+ 1 (random-integer (- n 1)))))

(define (fast-prime? n times)
(cond ((= times 0) true)
((fermat-test n) (fast-prime? n (- times 1)))
(else false)))

Test Results


We need to run timed-prime-test again with the new modifications using the same set of inputs that we found in the previous exercises, then compare the results to see how much we improved the run time. The following table compares the original data we gathered in exercise 1.22, the improved algorithm we used in exercise 1.23 (improved time column), and the new results we got using the Fermat test (all three time columns are in milliseconds). The ratio column is the ratio of the improved values from exercise 1.23 to the fermat time column.
























prime original time improved time fermat time ratio
100000000191411031780.58
100000000331721011870.54
100000000611541121730.65
1000000000035163101811.71
1000000000194932951861.59
1000000000575273051891.61
100000000003916278611894.56
100000000006115598841954.53
100000000006315498731974.43
100000000000375014267120812.84
100000000000514932266821112.64
100000000000994855269720812.97
10000000000003115745853123336.61
10000000000006716022826423135.77
10000000000009715861853022537.91
10000000000000374895026346244108.0
10000000000000914883626210255102.8
10000000000001594900826256257102.2


Analysis

The timing numbers from the Fermat test start out looking pretty poor compared to what we've seen previously. This has mostly to do with the completely arbitrary number of random values I chose to test each prime with. As we increase the size of the numbers we're testing, you can see that the time using the Fermat test barely increases at all. We can verify that the time increase is logarithmic by observing that as the numbers under test increase by a factor of 10, the ratio column increases by a factor of roughly three. This logarithmic characteristic is due to the fact that the expmod procedure used in fermat-test has a logarithmic time complexity.


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

1 comment:

Petar said...

Hey, I know it's a bit late,
but I was wondering about the following procedure:

(define (start-prime-test n start-time)
(cond ((fast-prime? n 100)

Why is the second argument's value 100, if it was 10 how would that effect the probability of n being a prime number?

I was wondering which margins of error are linked to different values of the second argument, and if you could suggest me some statistics or probabibility book related with this domain?

Thanks!