by petcat 99 days ago

> A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent.

Here are the miscompilation bugs reported against GCC so far in 2026, i.e. the ones labeled "wrong-code".

https://gcc.gnu.org/bugzilla/buglist.cgi?chfield=%5BBug%20cr...

I count 121 of them.

2 comments

sarchertech 99 days ago

If you can’t understand the difference between a bug that will rarely cause a compiler encountering an edge case to generate a wrong instruction and an LLM that will generate 2 completely different programs with zero overlap because you added a single word to your prompt, then I don’t know what to tell you.

petcat 99 days ago

The point is that expert humans (the GCC developers) writing code (C++) that generates code (ASM) does not appear to be as deterministic as you seem to think it is.

sarchertech 99 days ago

I’m very aware of that, but I’m also aware that a compiler failing to emit semantically equivalent code is rare enough that most people can ignore it. That’s not the case with LLMs.

I’m also less concerned with non-determinism than with chaos. Determinism in LLMs is likely solvable; prompt instability is not.
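
A toy sketch of that distinction, under loud assumptions: `toy_generator` is a hypothetical stand-in built on SHA-256, not a model of any real LLM, and the two prompts are made up. It only shows that a process can be fully deterministic (same input, same output) and still be unstable (a one-word input change gives an almost unrelated output).

```python
import hashlib


def toy_generator(prompt: str) -> str:
    """Hypothetical deterministic 'generator': hashes the prompt (not a real LLM)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def overlap(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Illustrative prompts (assumptions, not taken from the thread).
prompt_a = "write a function that sorts a list"
prompt_b = "write a function that quickly sorts a list"  # one word added

first_run = toy_generator(prompt_a)
second_run = toy_generator(prompt_a)
changed = toy_generator(prompt_b)

# Deterministic: repeating the same prompt reproduces the same output.
print("same prompt, same output:", first_run == second_run)

# Unstable: a one-word change leaves little positional agreement.
print("overlap after one-word change:", overlap(first_run, changed))
```

Making a generator reproducible addresses only the first property; the second is a separate one.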

jplusequalt 99 days ago

Classic HN-ism: focusing on the semantics of a statement while ignoring the greater point in order to argue that someone is wrong.

anthonyrstevens 99 days ago

I think it's a perfectly fine point. The OP said (my interpretation) that LLMs are messy, non-deterministic, and can produce bad code. The same is true of many humans, even those whose "job" is to produce clean, predictable, good code. The OP would like the argument to be narrowly about LLMs, but the bigger question is "who generates the final code, and why and how much do we trust them?"

sarchertech 98 days ago

As of right now agents have almost no ability to reason about the impact of code changes on existing functionality.

A human can produce a 100k LOC program with absolutely no external guardrails at all. An agent can't do that. To produce a 100k LOC program, agents require external feedback that keeps them from spiraling off into building something completely different.

This may change. Agents may get better.

petcat 99 days ago

I argued the greater point? Software code-generation is not deterministic, whether it's done by expert humans or by LLMs.

sarchertech 99 days ago

It has nothing to do with determinism. It's the difference between nearly perfectly but not quite perfectly translating between rigorously specified formal languages and translating an ambiguous natural language specification into a formal one.

The first is a purely mechanical process; the second is not, and it requires thousands of decisions that can go either way.

raw_anon_1111 99 days ago

And that’s no different from human developers.

jcranmer 99 days ago

Compilers are some of the largest, most complex pieces of software out there. It should be no surprise that they come with bugs as all other large, complex pieces of software do.

Kye 99 days ago

This seems to apply easily to LLMs as language coprocessors that can output code. How long was it before people trusted compilers?

sarchertech 99 days ago

If you don't understand the difference between something that rigorously translates one formal language to another and something that will spit out a completely different piece of software with 0 lines of overlap based on a one-word prompt change, I don't know what to tell you.

anthonyrstevens 99 days ago

"rigorously" is doing a lot of heavy lifting here.

sarchertech 99 days ago

Let's substitute rigorously with "in an extremely thorough, careful, and methodical way."
