Tuesday, May 24, 2011

Forth For Enlightenment, part three of ten, Logic two

Last time, the four digit logic problem was introduced, with solution. This time, we'll develop a brute force solution for the HP-28, which we're thinking of as a Forth language engine.

We're looking for a four digit number. It must be greater than 2000. The digits of this number add to 14. The one hundred's digit is less than the one's digit. The ten's digit is three times the thousands digit. The number must be odd.

Since the number is four digits and greater than 2000, the number must be between 2001 and 9999. That suggests a loop. In Forth, the style is to use lots of little functions to get things done. So, in writing the top level function, we'll imagine that we have helper functions to solve smaller problems. The top level could be a loop that checks every digit. Call it 'TOP'. It effectively enforces the first two rules. The development plan was more or less top down design, bottom up implementation, though with some feedback leading to a little jumping around. It's presented as top down, that is callers first, however. For testing, it should be entered bottom up. The called functions can be tested first. If you don't recall the syntax for something, you can refer to my previous articles. Hewlett Packard has the full manual available as well, though not online or free.

By the way, using lots of little functions isn't how i learned to write C. I use C when performance is everything. When i learned C, subroutine calling always introduced a few instructions of overhead. The incentive is to only write subroutines when they'll be called more than once. But modern C compilers can perform function inlining automatically. And, statically declared functions can be inlined with the original function eliminated entirely. So the effect can be better readability and maintainability without the subroutine call performance penalties or extra code. However, there's still the issue of variable scope, parameter passing and return. So it's not something you can always do. By the way, when entering this routine, NEXT isn't the NEXT key on the machine. It's in the Program Branch menu on the 2nd level. This procedure ironically has you use the NEXT button to get to the 2nd level of the menu. Alternately, you can enter the 4 letters N E X T and a space.

« 2001 9999 FOR x x CHECK NEXT

So, the next thing to consider is CHECK, which enforces the last four rules. It takes a 4 digit number from the stack (and consumes it). How does it flag any correct answers? The stack is not a good place for this. We could store any answer we get in a variable called 'ANS'. If we assume that we don't know how many correct answers that there might be, 'ANS' could be a list which we can append to. So, TOP needs to initialize this list with an empy list before the loop that calls CHECK. We'll do that in a function called INIT. We might modify INIT later to add anything else that comes up. Also, TOP should recall the value of ANS. Here's the brand new INIT along with the new version of TOP.

« { } 'ANS' STO

« INIT 2001 9999 FOR x x CHECK NEXT ANS

Now CHECK needs to break the number into digits. I considered using an array for the digits, and index into the array. But there doesn't seem to be much advantage to that approach. A variable for each digit is OK. I went with 'THO', 'HUN', 'TEN' and 'ONE'. A function 'BRK' breaks the number into digits. The idea is to divide the number by 10 using the HP-28's MOD function - which gets the remainder. That's the low digit. Then divide the number by 10 and use FLOOR to get the integer part. This is repeated, so we'll write a GETDIG function to do that much. GETDIG takes the number to break and returns the new number to break and the digit obtained on the stack. BRK calls it 4 times, storing results.



This much is easily tested. Type 1234 and using the USER function key setter run 'BRK'. Four variables are created, ONE, TEN, HUN and THO, and pressing those buttons shows their digit values. Testing again with 5678 shows that each value changes appropriately.

The CHECK function has four rules to enforce. The order doesn't matter. If the number passes all four rules, then the 4 digit number is added to the list. The tests are in the functions IS14, ISHLO, IST3T and ISODD. These functions look at the digits stored by BRK and return true or false. They could be all tested and the results ANDed together, but it's probably faster to have nested IFs so that if any are false, it can stop checking. Using AND would eliminate all but one of the IFs. A new typography thing. The →LIST notation is for an HP28 built in function. The function name really starts with a right arrow. In development, i wrote a version that kept the current potential answer on the stack as an unnamed variable. It even worked. But storing it in a real variable called 'PANS' is easier and shorter than goofing around with the stack. It looks like this:

ANS PANS 1 →LIST + 'ANS' STO ; a solution

If there is a solution, it is converted to a list and appended to the ANS list of solutions. This function can't be checked yet. All the called check functions must in place first.

The first rule function is IS14. It checks to see that the digits add to 14. It recalls each digit and adds. It doesn't perform the IF, it just does the compare. The compare returns 0 for false or 1 for true. So Booleans are just numbers.

« THO HUN + TEN + ONE + 14 ==
» 'IS14' STO

One can check this function using 1234 BRK IS14. It should report 0. Then 2165 BRK IS14 should report 1.

The next rule is ISHLO, which checks that the one hundred's digit is less than the one's digit.


2165 BRK ISHLO should report 1. 4321 BRK ISHLO should report 0.

The next rule is IST3T, which checks that the ten's digit is three times the thousands digit. I'm using * instead of X for multiply because that's what you get on an HP-28 when you press the X button when you are creating a function. The letter X could be a variable or function.

« THO 3 * TEN ==

4321 BRK IST3T should report 0. 2165 BRK IST3T should report 1.

The last rule is ISODD, which checks that the number must be odd. One imagines that such a function could be handy for other things, and therefore is a candidate for a library function. Then any mistakes you make in creating it could be reused. No, no. Really, libraries are a good thing.

« ONE 2 MOD 1 ==

2165 BRK ISODD should report 1. 1234 BRK ISODD should report 0. Visually, it may not be totally clear what it's doing. We don't need to know anything other than the least significant digit. Recall the one's digit, divide it by 2 with the MOD function, and compare the result to 1.

The CHECK function can now be checked using INIT 1234 CHECK ANS. It should report { }. Then INIT 2165 CHECK ANS should report { 2165 }. Of course, if we didn't know a solution to the problem, you'd have to check that TOP yields an answer with TOP.

Running the full program by invoking TOP takes 49:22 (2962 seconds) to run on my HP-28C. It should be faster on an HP-28S, since the processor speed is higher. Optimization or speed was not a particularly big consideration with this program. A logic program like this would be run once. There are no parameters. It should always produce the same answer. But it should be noted that some of the tests report false more often than others, so should be evaluated first. For example, one expects that IS14 is true 1/14th of the time, while ISODD is true half of the time. In fact, ISODD could have been incorportated into TOP's FOR loop by using a STEP clause of 2 and count just the odd numbers. And of course, if speed were required, a modern machine could have been used and with an efficient compiler. One expects that a C version should consume less than a second. That's because 8000 just isn't a very large number. And in fact, even a decade old PC requires only a millisecond, and sometimes reports zero time.

One of the things that is attractive about this language is that the check functions are all really tiny. Indeed, having a compact language makes considerable sense for small machines. I had heard that Forth for the 1970's batch of microcomputers could achieve 1/3rd of the speed that typical programmers could achieve using assembly language. However, i doubt it. If i recall right, the inner loop for a Forth engine required 21 instructions on a z80. This is all overhead. But perhaps the figure was for a Forth system where many of the primitives were in assembly language, and therefore much of the code executed in a Forth system was, in fact, assembly language. Given that Forth is arguably easier to write, that's quite promising. The low memory foot print for Forth is also encouraging. Java has achieved 1/3rd the performance of C, but it started out at about 1/350 of C's speed. Java has only achieved this performance through considerable heroics. It took over ten years to happen. By available evidence, every opportunity to trade memory for speed has been used to achieve it. Java is incredibly memory hungry.

The CHECK function shows that even a little complexity can give rise to an increased programmer burden tracking how the stack is used. Tracking what's on the stack is one of the major challenges with writing and debugging Forth. One ends up drawing stack diagrams all the time. When i wrote this program, the only problem was that the first version of the CHECK function left an extra value on the stack. I didn't happen to notice it, so my first run of TOP ran out of memory due to the accumulating values on the stack every loop.

Even as short as ISODD is, it's not evident from the code what it might be doing to the novice. It's probably more evident to the seasoned Forth programmer. But this forms a barrier to entry, and makes procedural (or imperitive) languages easier in general. In fact, i find that documenting my Forth code to the level of this article is what i need to have a clue even a year later. That's distressing.

Is the application of logic, or the crafting of a program faster if the goal is getting the answer? I'd have to say that the logic was marginally quicker for me. It's not as complicated. It took nearly twice as much text to describe the program as it did the mathematical logic. I don't attempt to solve this sort of logic problem very often. While i'm not currently an HP-28 RPL or Forth guru, i do write programs all the time. I'm much quicker at it than i was towards the start of my programming career. Although the debug cycle was short, the run time was pretty long. I don't count it, as i was able to do other things while it ran. Uhm, i added 440 1 BEEP at the end of TOP to alert me when it was finished with a 1 second long musical A note.

No comments: