by cousin_it 4299 days ago
I don't completely understand the C spec. Would the following approach work for zeroing a buffer?

1) Zero the buffer.

2) Check that the buffer is completely zeroed.

3) If you found any non-zeros in the buffer, return an error.

Is the compiler still allowed to optimize away the zeroing in this case?

4 comments

You are mixing up the C-level buffer abstraction with whatever RAM may lie underneath it. C doesn't deal with RAM, only with the abstraction, so you can't look at the RAM from C; you can only look at the abstract buffer. The only thing the compiler has to guarantee is that the abstraction holds: after you write zeroes into the abstract buffer, a subsequent conditional that checks whether the buffer is zeroed will branch accordingly. That fact is trivial to evaluate at compile time. Once the compiler has determined that the conditional is statically decided, it can eliminate the alternative branches as dead code and translate the abstract buffer write into a no-op at the machine code level.
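
For concreteness, here is a minimal sketch of the zero-then-check idea from the question. The name `zero_and_check` and its return convention are my own assumptions, not anything from the original post. The check only ever reads the abstract buffer, so a compiler that inlines this can conclude the loop never finds a non-zero byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero buf, then verify it reads back as zero.
 * Returns 0 on success, -1 if a non-zero byte is seen. */
int zero_and_check(unsigned char *buf, size_t len)
{
    memset(buf, 0, len);                 /* step 1: zero the buffer */
    for (size_t i = 0; i < len; i++) {   /* step 2: check every byte */
        if (buf[i] != 0)
            return -1;                   /* step 3: report an error */
    }
    return 0;
}
```

If the buffer is not used again after the call, nothing in this code forces the stores to reach memory.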
> Is the compiler still allowed to optimize away the zeroing in this case?

Yes, completely. In the snippet below, the compiler is allowed to eliminate all code after “leave secrets in array c”.

  {
    char c[2];
    ... /* leave secrets in array c */
    memset(c, 0, 2);
    c[0] = 0;
    c[1] = 0;
    memset(c, 0, 2);
    if (c[0] || c[1]) exit(1);
  }
The compiler is also allowed to compile the last three instructions below as if they were “return 0;”

  {
    char c[2];
    ... /* leave secrets in array c */
    c[0] = 0;
    c[1] = 0;
    return c[0] + c[1];
  }
> In the snippet below, the compiler is allowed to eliminate all code after “leave secrets in array c”

gcc 4.4.5 doesn't, though (-O3): it still clears the stack once and performs the comparison.

I believe these optimizations can be defeated by declaring a global

  volatile char fill = 0;
and using that instead of 0 in memset().
It's not guaranteed to defeat the optimization. For instance, it could just read fill into two registers and do the comparison there.
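
A minimal sketch of the `fill` suggestion, to show where the guarantee stops. The wrapper name `clear_with_fill` is hypothetical. The read of `fill` is a volatile access, but the stores that `memset()` makes into the buffer are ordinary stores.

```c
#include <stddef.h>
#include <string.h>

volatile char fill = 0;   /* the global suggested above; its value is read at run time */

/* Hypothetical wrapper. The compiler must perform the read of `fill`,
 * but the writes into buf are not volatile accesses. */
void clear_with_fill(char *buf, size_t len)
{
    memset(buf, fill, len);
}
```

So the compiler can no longer assume the fill value is 0, yet if `buf` is dead after the call it may still be free to read `fill` and skip the stores.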
> Is the compiler still allowed to optimize away the zeroing in this case?

With 'volatile', generally not, modulo bugs. Without volatile, it would never return an error.
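
One common way to bring `volatile` into the zeroing itself is to write through a volatile-qualified pointer. This is only a sketch; the name `zero_volatile` is hypothetical. Compilers generally keep such stores in practice, but I am not certain the standard strictly requires it when the underlying object was not itself declared volatile.

```c
#include <stddef.h>

/* Hypothetical helper: every store goes through a volatile-qualified lvalue. */
void zero_volatile(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}
```

Where they are available, library routines meant for this job, such as `explicit_bzero()` or the optional C11 `memset_s()`, are probably a better choice than a hand-rolled loop.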

I was wondering that too. I would think that simply accessing any byte in the buffer afterwards would prevent the compiler from optimizing it out.
That depends how much the compiler can optimize (away). If the next call is free(), it's quite trivial to skip the zeroing and just take the correct branch.
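
A sketch of the `free()` case, with a hypothetical function name and a stand-in secret. The byte read after the `memset()` has a value the compiler already knows, so it can take the branch without the stores, and nothing else reads the buffer before it is freed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: returns 0 if the buffer "looks" cleared,
 * 1 if not, -1 on allocation failure. */
int use_secret(void)
{
    char *secret = malloc(16);
    if (secret == NULL)
        return -1;
    strcpy(secret, "hunter2");        /* stand-in for real secret data */

    memset(secret, 0, 16);            /* candidate dead store */
    int cleared = (secret[0] == 0);   /* foldable to 1 at compile time */
    free(secret);                     /* buffer is dead from here on */
    return cleared ? 0 : 1;
}
```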

I am still uncertain why people want to just 'zero' it. Filling it with random data (just one random() call, then an inline PRNG), summing the result, and storing the sum in a global volatile would reliably 'zero' the data, but it's quite CPU-intensive.

You still are not guaranteed to clear the buffer like that.

  for (int i=0; i<len; i++){
    sensitiveBuffer[i]=random();
  }
  int sum=0;
  for (int i=0; i<len; i++){
    sum+=sensitiveBuffer[i];
  }
  volatileVar=sum;
Using loop fusion, the compiler can optimize this to:

  int sum=0;
  for (int i=0; i<len; i++){
    sensitiveBuffer[i]=random();
    sum+=sensitiveBuffer[i];
  }
  volatileVar=sum;

Which it can then optimize to:

  int sum=0;
  for (int i=0; i<len; i++){
    sum+=random();
  }
  volatileVar=sum;

In fact, as the article points out, the compiler can legally transform:

  reallyZeroBuffer(sensitiveBuffer);
into

  pointlesslyCopy(sensitiveBuffer);
  reallyZeroBuffer(sensitiveBuffer);
Not exactly like that: I didn't mean it that simply, more like a hash. The first example is not optimal, but it shows the idea.

I disagree about "pointlesslyCopy". Of course the standard permits it, but it's not an optimization. Anyone using such a broken compiler is beyond help.

-------

  volatileVar *= random();
  for (int i=0;i<len;i++){
    volatileVar+=buf[random()%len]+0x61c88647;
    buf[i]=volatileVar;
  }

=====

  static volatile unsigned int volatileVar;

  for (int i=0; i<len; i++){
    sensitiveBuffer[i]=random();
  }

  for (int i=0; i<len; i++){
    volatileVar+=sensitiveBuffer[volatileVar%len];
  }
Why shouldn't the compiler be able to figure out that you sum a series of numbers that aren't used anywhere else, thus don't need to be spilled to RAM?
You write them to a global volatile variable.
How does that require writing the random data to RAM, apart from the volatile variable itself?
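
To illustrate the question with the simpler sum variant from earlier in the thread, here is a sketch in which both functions perform the same single volatile write. Everything named here (`sink`, `next_rand`, `fill_then_sum`, `sum_only`) is hypothetical, and the small LCG merely stands in for `random()` so that the two runs are comparable.

```c
#include <stddef.h>

static volatile unsigned int sink;   /* the only volatile object */

/* Hypothetical tiny LCG standing in for random(); deterministic per seed. */
static unsigned int next_rand(unsigned int *state)
{
    *state = *state * 1103515245u + 12345u;
    return *state >> 16;
}

/* As written in the thread: fill the buffer, then sum it into the volatile. */
unsigned int fill_then_sum(unsigned char *buf, size_t len, unsigned int seed)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)next_rand(&seed);
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    sink = sum;                      /* the one observable write */
    return sink;
}

/* What an optimizer could emit if buf is dead afterwards:
 * the same volatile write, with no stores to the buffer at all. */
unsigned int sum_only(size_t len, unsigned int seed)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += (unsigned char)next_rand(&seed);
    sink = sum;
    return sink;
}
```

For the same seed and length, both store the same value in `sink`, so the volatile write by itself gives no evidence that the buffer was ever touched.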