Wednesday, February 06, 2013

Forth for Enlightenment, part twelve of ten, Four Digit Puzzle

This is another number based logic puzzle. It can be solved without a computer. It's not difficult. The mostly brute force coded solution is straight forward too. It's included because of the rewrites. Like the previous puzzle, individual digits need to be examined. An experiment for this code is to use strings to extract digits. First the problem statement.

What is the four digit number in which the first digit is one-third the second, the third is the sum of the first and second, and the last is three times the second?

First, just a minor bit of analysis. We're looking for a four digit number. One imagines that four digit numbers have a range from 1000 to 9999. I mean, what's the difference between 0000 and 00000? Is the later a five digit number? But 0000 is a solution, right? This program assumes 1000 to 9999, but to save time, it stops when it finds a solution. The code 9999 'X' STO sets the variable X to its final value, so that the FOR loop ends. Without this, the program takes well over an hour to execute, and finds only the one solution.

«
 1000 9999 FOR X
  X →STR 1 1 SUB STR→ 'A' STO
  X →STR 2 2 SUB STR→ 'B' STO
  X →STR 3 3 SUB STR→ 'C' STO
  X →STR 4 4 SUB STR→ 'D' STO
  IF A 3 * B ==
   A B + C == AND
   B 3 * D == AND THEN
   X
   9999 'X' STO
  END
 NEXT
 'A' PURGE 'B' PURGE 'C' PURGE 'D' PURGE
 440 .1 BEEP
» 'PUZ4' STO

This code requires 307 bytes of memory, and takes and 225 seconds for execution. It produces the right answer, 1349.

So, X →STR takes the number X, and converts it to a string. 1 1 SUB returns the substring that is the first character. Since more arithmetic needs to be done, STR→ converts it back to a number. Once each digit is isolated, it is assigned to a global variable. The number, X or ABCD, is a candidate solution. The IF statement figures out if ABCD is a solution or not.

The IF statement performs each of the tests. A 3 * B == multiplies A * 3, and compares it to B. A has to be an integer from 0 to 9. And so does B. One can multiply A by 3 or divide B by 3 and get the same result. The next bit, A B + C == AND adds A and B and compares the result to C. The AND bit combines the first two logic results. The result of the AND is true only if both of the conditions are true. Finally, B 3 * D == AND multiplies the second digit by 3 and compares that to the 4th digit. AND combines the result with the first two results.

The code 440 .1 BEEP makes the machine beep at the end. That makes it easier to time the execution with a stopwatch.

The next idea explored is saving X →STR on the stack, instead of recalling X and converting it to a string 4 times. The code X DUP DUP2 puts 4 copies of X converted to a string onto the stack. One copy is consumed in each of the following lines.

«
 1000 9999 FOR X
  X →STR DUP DUP2
  1 1 SUB STR→ 'A' STO
  2 2 SUB STR→ 'B' STO
  3 3 SUB STR→ 'C' STO
  4 4 SUB STR→ 'D' STO
  IF A 3 * B ==
   A B + C == AND
   B 3 * D == AND THEN
   X
   9999 'X' STO
  END
 NEXT
 'A' PURGE 'B' PURGE 'C' PURGE 'D' PURGE
 440 .1 BEEP
» 'PUZ4' STO

This version is smaller, using only 291 bytes. It's also faster, taking 190 seconds.

Using strings to extract digits was an experiment. We had a numeric digit extractor called G in our previous number puzzle. Here it is.

«
 DUP 10 / FLOOR SWAP 10 MOD 
» 'G' STO

The G function takes a number, and produces the number divided by ten, and the remainder. This peels off the least significant digit. So the right most digit, D is obtained first. G is then called with the new smaller number to peel off the next digit. It doesn't get called for the last digit, since we know that the number had four digits. The third call to G must produce the first two digits. The program that uses it looks like this:

«
 1000 9999 FOR X
  X G 'D' STO
  G 'C' STO
  G 'B' STO
  'A' STO
  IF A 3 * B ==
   A B + C == AND
   B 3 * D == AND THEN
   X
   9999 'X' STO
  END
 NEXT
 'A' PURGE 'B' PURGE 'C' PURGE 'D' PURGE
 440 .1 BEEP
» 'PUZ4' STO

It's slightly bigger than our optimized string version at 306 bytes of memory. But it's faster, taking only 103 seconds. The substring idea looked like it might be quicker, but it wasn't.

One thing of note is the PURGE commands towards the end. Global variables are convenient, but they've got to be cleaned up at the end, otherwise, they stay around for the user to clean up. This isn't much of a big deal on the HP-28c, but on the HP-28s, there's much more memory, and more than one program might be loaded at a time. These variables would add clutter everywhere. Local variables get cleaned up automatically. And we learned how to use them in a recent post. However, we need to change the G function to produce the digits on the stack all at once. That is, it leaves the shortened number on the stack where it can be used to produce the next digit. All the digits produced stay on the stack until converted to local variables.

«
 DUP 10 MOD SWAP 10 / FLOOR
» 'G' STO

Then the program follows.

«
 1000 9999 FOR X
  X G G G 
  → D C B A «
   IF A 3 * B ==
    A B + C == AND
    B 3 * D == AND THEN
    X
    9999 'X' STO
   END
  »
 NEXT
 440 .1 BEEP
» 'PUZ4' STO

This local variable version is shorter, using 266 bytes and faster, taking 69 seconds. I'd have thought this version would be slower, since variables are created and destroyed every loop. So, not only is this version smaller and faster, but it looks easier to read to me.

The code → D C B A « takes the four digits that are on the stack, assigns them to local variables D, C, B and A. Local variable blocks can go anywhere.

Curiously, in RPL, variables of any kind must be given an initial value. It's impossible to have a variable that isn't initialized. Using a variable that doesn't yet have a value is simply not a bug you can code in this language. Fortunately, there are lots of other kinds of bugs you can code, more than making up for this deficiency.

The next idea is to try nested IF statements. Nested IFs are logically the same as the AND statements. They should be quicker since the tests can stop as soon as one is false. The C language allows shortcut logicals where additional conditionals are not evaluated. RPL on the HP-28c does not.

«
 1000 9999 FOR X
  X G G G 
  → D C B A «
   IF A 3 * B == THEN
    IF A B + C == THEN
     IF B 3 * D == THEN
      X
      9999 'X' STO
     END
    END
   END
  »
 NEXT
 440 .1 BEEP
» 'PUZ4' STO

This version is larger at 296 bytes, but it's also faster at 54 seconds. Is it easier to read? Experimentation with the order of the IF statements did not yield a change in speed. One might expect that putting the condition that is false most often first would be faster, as less code is executed on average.

Finally, we'll inline the G function. The HP-28c's RPL encourages using and calling functions, but it takes time to call a function and return. The resulting code is bigger since the G function is 49 bytes by itself, and this version has three copies of it. It's not necessarily easier to read, but it is quicker.

«
 1000 9999 FOR X
  X
  DUP 10 MOD SWAP 10 / FLOOR
  DUP 10 MOD SWAP 10 / FLOOR
  DUP 10 MOD SWAP 10 / FLOOR
  → D C B A «
   IF A 3 * B == THEN
    IF A B + C == THEN
     IF B 3 * D == THEN
      X
      9999 'X' STO
     END
    END
   END
  »
 NEXT
 440 .1 BEEP
» 'PUZ4' STO

This version requires 334 bytes, but comes in at 47 seconds. That's about 4.8 times faster than the first version. Is that as fast as it can be done? Of course not. For one thing, the problem can be solved without a program, and it doesn't take very long. This exercise is about optimization. In this case, we didn't know the relative speeds of different functions before we started. So, a black box trial and error system was used to deduce what turns out to be fast or small. For RPL, this is what it takes. But it often takes something like it in other languages. You might think that instruction speed in assembly language would be well known. But most modern processors sometimes can perform instructions while waiting on external memory, or can even execute multiple instructions simultaneously. And often, one needs to sign a non-disclosure form to get the documentation. If that's not enough, various levels of cache miss or TLB thrashing can toss your best estimates out the window. Trial and error has the advantage of trial, which is sort of a final arbiter anyway. So make your best guess, and ask the universe if you're right.

No comments: