Integer Type Selection in C++ in Safe, Secure and Correct Code: a good talk from Robert Seacord

Excellent talk from Robert Seacord at CppNow 2023: Integer Type Selection in C++: in Safe, Secure and Correct Code

Part of what I love about this talk is the ignorance of some of the audience, which I admit stunned me. Just as an example, note the audience member at around 1:09:30 arguing that using a pointer increment solves the problems Robert was discussing about loop termination. Newsflash: pointers have no special powers that solve this problem. Moving a pointer more than one element past the end of an array is UB, and so is dereferencing the one-past-the-end pointer.

Robert started this part of the talk with the following loop, where size is of type size_t:

for (ssize_t i = (ssize_t)size-1; i >= 0; i--)

The defect that Robert points out is that roughly half of the possible values of type size_t can’t be represented as ssize_t, so the cast gives the wrong result for those values of size. On typical two's-complement implementations, a size larger than the largest value representable by ssize_t converts to a negative number (the result was implementation-defined before C++20), so i starts out negative and the loop body never runs.
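To make that concrete, here is a minimal sketch of what the loop's initializer computes. It assumes a two's-complement platform where the signed type is the same width as size_t. Since ssize_t is a POSIX type rather than standard C++, the sketch uses std::make_signed_t<std::size_t> as a stand-in; the names signed_size, initialIndex and loopRuns are mine, not Robert's.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Stand-in for POSIX ssize_t (assumption: same width as size_t).
using signed_size = std::make_signed_t<std::size_t>;

// The value that i starts with in the loop from the talk.
// Caveat: for size == SIZE_MAX / 2 + 1 the cast yields the most negative
// value, and subtracting 1 from that is signed overflow (UB), so don't
// call this with that particular value.
constexpr signed_size initialIndex(std::size_t size)
{
    return static_cast<signed_size>(size) - 1;
}

// True if the loop body would execute at least once.
constexpr bool loopRuns(std::size_t size)
{
    return initialIndex(size) >= 0;
}
```

With these definitions, loopRuns(10) is true, while loopRuns(SIZE_MAX) is false because the converted value is already negative before the subtraction.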

The audience member is arguing that pointer arithmetic solves this problem. For a start, he’s assuming that the whole point of the code is to use i as an array index, which is not Robert’s point at all. Any array on his slide? Nope.

But suppose we go along with his argument, and make the assumption that i is being used as an array index in the loop. We’ll flesh this out a little more just to make a useful example of what the audience member was proposing.

Let’s suppose Robert’s code was part of this:

void fillCharArray(char *array, char c, size_t size)
{
    for (ssize_t i = (ssize_t)size-1; i >= 0; i--) {
        array[i] = c;
    }
}

The audience member was proposing this as a fix:

void fillCharArray(char *array, char c, size_t size)
{
    char *p = array;
    const char *arrEnd = array + size;
    for ( ; p != arrEnd; ++p) {
        *p = c;
    }
}

Does this fix the problem? No. In fact, from a safety and security perspective, it’s potentially much worse. Computing array + size is undefined behavior for any size that’s greater than the actual length of array; one past the end is as far as a pointer is allowed to go. And dereferencing p beyond the end of the array (and worse, writing through it!) is exactly how a large percentage of our published security flaws have appeared in C and C++ over the last many decades: out-of-bounds access.
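For contrast, one common idiom (I'm not claiming it's the one Robert recommends in the talk) keeps the index in size_t and tests before decrementing, so there's no signed conversion at all. The function name is hypothetical, and note what it does not fix: if the caller passes a size larger than the real array, this writes out of bounds just like the others.

```cpp
#include <cstddef>

// Hypothetical variant of fillCharArray that counts down with an unsigned index.
// Precondition (cannot be checked here): array points to at least size chars.
void fillCharArrayUnsigned(char *array, char c, std::size_t size)
{
    // "i-- > 0" compares the old value of i, then decrements it,
    // so the body sees size-1 down to 0 and never runs when size is 0.
    for (std::size_t i = size; i-- > 0; ) {
        array[i] = c;
    }
}
```

Calling fillCharArrayUnsigned(buf, 'x', 4) on a five-element buffer sets the first four elements and leaves the fifth alone; passing 0 touches nothing.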

The separation of a pointer from the length of what it points to is at the root of many problems in C (and hence C++ when C-style code is used). It is why you often see sentinel values at the end of constant, initialized arrays where practical; loops can then terminate when they see the sentinel instead of relying on array index arithmetic. This isn’t novel; C-style strings are just character arrays with a null character ('\0') as the sentinel (and string literals in C have a null character implicitly appended).
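A small sketch of the sentinel approach, with made-up names (kColorNames, countEntries, cStringLength) purely for illustration. It only works, of course, if the sentinel is actually there; a missing terminator puts you right back in out-of-bounds territory.

```cpp
#include <cstddef>

// Hypothetical constant table, terminated by a nullptr sentinel
// instead of carrying a separate length.
static const char *const kColorNames[] = { "red", "green", "blue", nullptr };

// Counts entries by walking the table until the sentinel is seen.
std::size_t countEntries(const char *const *table)
{
    std::size_t n = 0;
    while (table[n] != nullptr) {
        ++n;
    }
    return n;
}

// The same idea for a C-style string: '\0' is the sentinel.
std::size_t cStringLength(const char *s)
{
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}
```

Neither loop needs to know the length up front; both stop when they reach the terminator.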

To the sane among us: ideally you never use C-style arrays at all in C++ code. We have std::array and std::vector (in <array> and <vector>), which carry their own size. Use them.
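As a rough sketch of what that buys you, here is the fill example again with the size traveling along with the data. The name fillChars is hypothetical; std::fill, begin() and end() are the standard library's.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Works with std::vector<char>, std::array<char, N>, and similar containers.
// The container knows its own bounds, so there is no separate size to get wrong.
template <typename Container>
void fillChars(Container &chars, char c)
{
    std::fill(chars.begin(), chars.end(), c);
}
```

There's no index type to choose and no cast; an empty container simply results in no writes.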
