2018-06-08

Why Signed Integer Overflow is Undefined Behaviour in C

People often take integer overflow for granted in C/C++, assuming it will simply "wrap around". A closer look at the C standard, however, reveals that signed integer overflow is actually undefined.

On the surface, there doesn't seem to be a good reason for signed integer overflow to be undefined. After all, just not doing anything when you overflow signed 2's complement integers will give you the "wrap around" that everyone expects.

But there's the key to why it's undefined: 2's complement integers. As it happens, C is old enough that there were multiple standards for representing signed integers, all of which are equally valid as far as the standard is concerned. I'll discuss 2 here briefly, using 4 bit integers, which wouldn't really be useful in real life, but save me typing.

Sign-Magnitude Integers

Under sign-magnitude, numbers with the leftmost bit a 0 represent non-negative, as usual.

To represent negative numbers, we simply switch the leftmost bit of the positive number to a 1.

For example, 0100 is positive 4, so 1100 is negative 4.

One thing to note about sign-magnitude representation is that there is a positive and a negative 0: 0000 is positive 0, and 1000 is negative 0.

So what happens when we overflow a sign magnitude number, without any checks to see whether we have overflowed, that is, just throwing away whatever bits might be "carried off the left" of the result?

0111 is our biggest positive sign magnitude number, so what happens when we add 1? We get 1000, or negative 0! Completely different from what happens when we overflow 2's complement numbers!

1's Complement Integers

Another possibility for representing signed integers is called 1's complement. To negate a 1's complement integer, you simply flip the bits.

For example, to negate 0100, you flip the bits: 1011.

Note that, just as with sign magnitude, there are two 0's: 0000 and 1111.

Now let's see what happens when we overflow 1's complement integers: The biggest integer (positive 7) 0111 plus one gives 1000, which is negative 7. This is again different from 2's complement, where we would have gotten negative 8.

Conclusion

So there you have it, the reason why signed integer overflow is undefined in C/C++, and also the reason why this is rarely an issue: I can't think of a modern computer that doesn't use 2's complement.