Tuesday, March 29, 2011

Top Ten, 20 of 10

We last left our hero in the top ten list of ways to be Screwed by "C". Today's entry is 64 Bit Madness.

No example is given, just a general warning that integers are often signed by default, and therefore do not always behave as expected when moving from 32 bit architectures to 64 bit architectures. Since I learned C on the PDP-11, I'm predisastered. The PDP-11 is a 16 bit architecture and commonly used 16 bit integers. C's long type was 32 bits, but was not generally used for pointer arithmetic on this system. The VAX-11/780 was a commonly accessible 32 bit computer, available at the time I learned C. So there was a considerable amount of code that I ported from the 32 bit VAX to the 16 bit PDP-11. I had fewer issues moving from 16 bits to 32 bits than from 32 bits to 16.

The sign extension problem was not much of an issue. If there was a sign issue on the 16 bit PDP-11, it generally showed up right away. That's because using more than 2^15 = 32,768 bytes was common on these machines. They typically had anywhere from a quarter of a megabyte to four megabytes of physical RAM.

On 32 bit machines, until very recently, it wasn't very common to have more than 2 GB of RAM, which is what you need to run into sign extension problems. However, it's already very common to have more than 4 GB of RAM on 64 bit computers. I already have one with 8 GB.






Computer         char  short  int  long  long long
PDP-11 16 bits      8     16   16    32        n/a
32 bits             8     16   32    32         64
64 bits             8     16   32    64         64
64 bits             8     16   64    64         64

Dennis Ritchie's C compiler was nearly always used on the PDP-11. And the int type was explicitly stated to be the most natural size for the machine. For most computers, this is the width of the CPU registers. And, for most machines, this is also the width of the program counter, so that general registers could be used to compute addresses. The gcc compiler introduced the long long data type. It was a way to allow the int type to remain flexible while allowing access to a new longer integer type, whose time had come. It also had the feature of not breaking the language. No new keywords were introduced. Old compilers did not break on new code (though they usually did not produce what was desired). But when 64 bit computers came out, there were two standards, often for the same hardware. Depending on the compiler, the int data type may be 32 or 64 bits in length. So some C compilers decided that the int data type was not a redundant optimization type but rather a type in its own right, and further, that this type is not the same size as pointers on that machine, as was historically the case.

So while porting C code from the PDP-11 to the 32 bit VAX was generally easy, it can be awkward to port code from one compiler to another on the same 64 bit machine. There are workarounds.

Could Dennis have planned better when he designed his language? While 64 bit integers weren't much in demand on the PDP-11, 64 bit architectures have existed in hardware since 1961. That's much older than the C language. It wouldn't have hurt much to add 64 bit support for the PDP-11. Indeed, Unix was eventually ported to the 64 bit Cray-1 architecture of 1976, a machine that arrived only a few years after the C language, which dates from the early 1970s.
