Friday, January 25, 2008

Scheme For Enlightenment Part Three - More Factorials

The dreaded Scheme factorial example is continued recursively.

Scheme (and Lisp) purists continue to promote recursion as the right way to do things. Scheme, or at least Guile, Lispme, mzScheme, umb-scheme and probably all others, support free tail recursion. That is, if the very last thing your function does is call itself, then no stack penalty is paid. Unfortunately, the first example has a multiply after the last call to itself, so it must use a stack. There is a solution, but it isn't as simple.

(define (factorial x)
(letrec ((tr-factorial
(lambda (count product)
(if (> count x)
(tr-factorial (+ count 1)
(* count product))))))
(tr-factorial 1 1)))

This tail-recursion version is much more complex. The second line defines a helper function, which calls itself. This helper function takes two parameters. The first is the current number to multiply by, and the second is the current accumulated result. In the forth line, it checks if it's done by comparing with x, the original parameter. If it is done, it returns the answer. Otherwise, it multiplies the current number by the accumulated answer, adds one to the current number and calls itself with these values. The last thing it does is calls itself, so the Scheme implementation can do the tail recursion optimization, and no additional stack is used on each call. The implementation can be said to jump to the start of the function, instead of call it. The outer definition calls the inner definition in the seventh line, starting the count at one and the accumulator at one, and the result from this call is returned. And, if one feeds this version 1000, it quickly spits out all 2,568 digits of the answer.

One is expected to use recursion for loops. And this really seems to be how one is expected to do program recursion in Scheme. Enlightenment isn't going to come easy. It works, but there are some issues with it. First, there's a weird interaction between the nested function definitions. The inner definition needs access to the outer definition's parameter. One supposes that it could just be passed as a third parameter. Then it could be a standalone function, and one would always call it with two extra parameters, set to one. In fact, Guile supports optional arguments with default values. So this can be written as a single function, at least in principal. We won't go there.

Another issue is that it isn't obvious from the source that the inner function is actually tail-recursive. The self call is smack-dab in the middle of the function definition. This has implications if one is going to use this language. As a programmer or debugger, one must know. Are there tools that could quickly tell you? Well, you can (trace) your call to find out. The tutorials talk about how you can do it with inspection. Plan to get good at it.

In timing tests, the looping structure is quicker than recursion. It isn't much faster in this example, because for large factorials, most of the computer's time is spent doing the large multiplies. For one brought up with procedural languages, the loops are easier to write, to read, and to debug. In fact, having variables to watch in a Scheme debugger provides clues as to what is going on. In a pure functional program there are no variables, so the debugger can only tell you what happens in function calls. This can be less than helpful.

Apparently, at least part of incentives to use a pure functional style is that such programs are easier to prove are correct, at least in a mathematical sense. While noting that setting variables is convenient, this quote from The History of Lisp is an example: (Fans of other programming languages are challenged to write a program to concatenate lists and prove that the operation is associative). Presumably, such a proof would involve having another program use logic to make such a statement.

But one still must cope with other language issues. For example, how does one cope with the parenthesis? A good editor can tell you if your parenthesis are lined up. Vi and Emacs can do this. Some editors, like Emacs, and other external tools can pretty print functions. The pretty printing can expose some of the structure. And, after a year of beating on the language, it gets a bit easier. But there's still the problem of where the various bits of the do loop should go. The syntax is horrible. And i've gotten comfortable with the full Unix find command, and regular expressions. I grok horrible.

No comments: