begriffs.com features a wonderful article on the different things C compilers had to do for different computer architectures.
We all tend to think a char & short are 8 bits, an int 16 bits, etc. But not on some machines:
In this article we’ll go on a journey from 4-bit microcontrollers to room-sized mainframes and learn how porting C to each of them helped people separate the essence of the language from the environment of its birth. I’ve found technical manuals and videos for this article to help bring each computer to life.
The Unisys 1100/2200 went with multiples of 9! The char size is 9 and the word size is 18 bits! On Motorola 68000 and ARM2 based machines, the processor had two-byte granularity and lacked the circuitry to cope with unaligned addresses. When presented with such an address, the processor would throw an exception. The original Mac (also based on the 68000) would usually demand the user restart the machine after an alignment error. (Similarly, some Sparc machines would raise a SIGBUS exception for alignment problems.)
The PDP-11 above is special:
The C home planet (C was conceived on a PDP-11). Not much to say about it, because things work smoothly. The real surprises happened when porting PDP code to other machines. Pointers of all types and integers can be interchanged without casting.
One strange thing about this machine is that whereas 16-bit words are stored in little endian, 32-bit long ints use a weird mixed endian format. The four bytes in the string “Unix” when stored in the PDP-11 are arranged as “nUxi” if interpreted as big endian. In fact that scrambled string itself resulted when porting code from the PDP to a big endian machine.
The article summarizes things:
It’s amazing that, by carefully writing portable ANSI C code and sticking to standard library functions, you can create a program that will compile and work without modification on almost any of these weird systems.