Hacker News new | ask | show | jobs
by scottlamb 6 days ago
> This is a good example of what slips through LLM attention. It forces all allocations to be calloc as if it is a strict upgrade.

I wouldn't assume Claude made that decision; it's not as if that was some incidental thing that it snuck into a large commit. The commit message starts with "zero all new memory from allocations", and that's exactly what the commit does. What do you imagine the prompt was?

It seems totally plausible to me that a human initially thought this was an improvement, then rethought after discovering the RSS regression. And it's not a law of nature anyway that this change has to increase RSS; calloc could special-case the case in which memory was freshly returned from the OS, knowing fresh memory mappings are zeroed anyway.

I blame AI for these regressions mostly in the sense that it caused a flurry of vulnerability reports. Those led to a flurry of quick fixes. Sometimes quick fixes cause other problems.

1 comments

You don't really have to guess. The guy told us the AI didn't suggest this specific change:

> The change to zero memory was my idea and my change. It was a reaction to a security report I got which caused use of an element past the end of an array. By zeroing the allocation I could ensure that misuse of that memory if a similar bug came up in the future could only cause a null ptr deref, which is better than the chance of a valid pointer. It got a claude co-authored tag on it as I got it to do some tidy ups of a series of commits, and that is just what it does when it makes any modification. It doesn't mean the change was written by claude. It was written by me.

https://github.com/RsyncProject/rsync/issues/959#issuecommen...

> … By zeroing the allocation …

How does that prevent reading past the end of the buffer? Or change how bytes outside the buffer are used? Are these arrays of pointers so that the “null ptr deref” comment makes sense?

Or am I the bozo and don’t know what’s happening here?

It doesn’t. It’s just that dereferencing a zeroed pointer reliably crashes the program (unless you specifically do funky things with mmap) but dereferencing garbage memory as a pointer could do a lot more insidious damage.
My point is that the developer's comment doesn't make sense. Zeroing the allocated memory doesn't change anything about overrunning the buffer.

edit: removed unnecessary examples

Haven't looked at the code, but the allocated memory could be larger than necessary to make "off-by-one" or "off-by-a-few" errors less deadly. Then zeroing it out makes it even less so. Defense in depth.

Or it's an allocation for an arena? The zeroing might help trigger 0 derefs earlier if the overrun happens for the object that are then allocated in the arena (and not by allocating more objects than the arena can provide)

This doesn't prevent overrunning the buffer -- it means that when you do overrun the buffer, it does less damage
The code is part of a function called expand item list. It looks like it over allocates memory and uses a bump pointer for internal allocation, only expanding the allocation when necessary. Thus OOB writes to the list would hit the allocated memory.

You’re not a bozo but it is helpful to read the code.

okay I had not read this or any discussions there (except the one linked in the post), but this looks weirder. the comment you linked is a dev responding to what is very clearly a bot comment. I am sure they have good intentions and I have no reason to believe otherwise as I have no connection to the project whatsoever, but the original commit being 4-5 lines long (what did claude do then?) and the revert description is almost certainly written by an LLM makes in my mind the slop argument stronger.

I hope if this doesn't come across as unkind towards the dev who gives their time and energy to the project. Grateful for that.

> the original commit being 4-5 lines long (what did claude do then?)

I've said "rebase onto <newbase>" and let it handle all the merge conflicts. I wouldn't expect this particular commit to conflict with anything, but it could have been part of a big series where it'd be worth doing that instead of running the rebase command yourself. It wouldn't surprise me if I picked up some Co-Authored-By:s along the way.

It is certainly unkind, when a developer asserts the opposite of what you have assumed about their code, to double down and imply they are lying.