Wednesday, January 23, 2008

Scheme For Enlightenment Part One - Factorials

The dreaded factorial example.

This series is starting to look like a Scheme tutorial. While most Scheme tutorials start with this example, they usually gush over the language as the ultimate for expression, hackability, etc. My opinion is that if you invest enough in a language, your tendency is to advocate that others share your pain. My approach here is to expose the weaknesses when noticed, without apology. Even my current favorite language, C (or maybe English), has plenty of warts. This is my approach to sales. I say things like, Have a free bagel. It only has 500 calories. That's probably only a quarter of the calories you consume in a day. So giving away free bagels can be difficult. On the bright side, my investigations continue with the hope that something good will come of it eventually. Feel free to share my pain.

Some Scheme syntax. (First comes an open parenthesis. Then comes a verb, which is a function or a method. Then comes space separated list of all the parameters: the nouns. The list has zero or more of these nouns. The list ends with a close parenthesis.) There is a verb that performs multiplies.

(* 5 6)

The asterisk is for multiplication. The 5 and 6 are things to multiply. The expression returns 30. Since expressions return nouns, they can be used as parameters to other functions.

(+ (* 5 6) 7)

This returns 37. The 30 returned by the multiply is added to the 7. These functions can be nested arbitrarily deep. The Scheme language does not place limits, as far as i know, though implementations certainly do.

This is functional programming. Apparently, the idea that one might assign a value to a variable is not part of functional programming. Scheme allows variables, but there are pure functional languages, like j, which don't.

The factorial of an integer is the product of all the integers from 1 to that number. The factorial of 0 is 1. The factorial of 3 is 6 (3 * 2 * 1). Now the mathematician says, we know the factorial of zero is one. We know the factorial of one is one times the factorial of zero, which we already know is one. And so on - the factorial of a number x is x times the factorial of x - 1. Consider this Scheme implementation.

(define (factorial x)
(if (zero? x) 1
(* x (factorial (- x 1)))))

It is purely recursive, it matches the definition of the function, doesn't use any extra variables, and is short. There are a boatload of parenthesis, each one in the correct place. Get used to it. In the second line, the function checks to see if the parameter is zero, and if it is, it returns the number one. This is the factorial that we know - the base condition. In the third line, we are in the else clause of the if, so x is not zero. It subtracts one from it's parameter, calls itself with that, and multiplies the result by it's original parameter. This function works for small positive numbers. Let's step through an example.

(factorial 3)

We're going to substitute the actual numbers. In the first call, 3 is not zero, so the return is (* 3 (factorial 2)). There is still a call to do. In that call, 2 is not zero, so the return is (* 2 (factorial 1)), substituting that in above, real result is (* 3 (* 2 (factorial 1))). The number 1 is still not zero, so (* 1 (factorial 0)) is returned. In this case, 0 is equal to zero, so the number 1 is returned. Substituting that in, we get (* 3 (* 2 (* 1 1))). The inner most multiply is computed, resulting in (* 3 (* 2 1)). The next inner most multiply is computed, resulting in (* 3 2). Then finally, this multiply is computed, resulting in 6. If you type in this example into any Scheme system, the result, 6, is printed.

While it works for small positive numbers, if you feed it a thousand, Guile reports this:

ERROR: Stack overflow

Now, the factorial of 1000 is a very large number indeed. But the stack overflow has little to do with the size of the answer. Scheme supports arbitrarily large integers. Millions of digits, for example are not only supported, but practical. Computations with million digit numbers can be performed in a reasonable amount of time on modern machines.

The reason it gets a stack overflow is that it needs a stack level for each of the thousand calls to itself. The system runs out of stack space. The Earth rests on the back of a turtle. What does that turtle stand on? Another turtle! It's turtles, all the way down.

No comments: