HN Mirror


	by Voultapher 811 days ago
	Non-deterministic compilers, yay! Where do I sign up? More seriously, miscompilations, or more generally unexpected behavior caused by the layers below you, are expensive to find and fix. I think LLMs have a long way to go before such use cases seem appealing to me.

4 comments

taneq 810 days ago

Non-determinism is an implementation detail, not an intrinsic property, as I understand it (at least as long as you're setting temperature to zero).

I likewise don't really think LLMs are the right tool for this job, though. There's a whole class of systems that we built because humans take a long time to learn new skills, are fallible and non-repeatable, and get bored easily. Compilers are in this group along with sewing machines, CNC machines, automatic gearboxes, and design rules checking in CAD.

Maybe they could provide heuristics for optimising compilers with the output run through a formal verification check afterwards?

knightoffaith 810 days ago

>Non-determinism is an implementation detail, not an intrinsic property, as I understand it (at least as long as you're setting temperature to zero).

Right. A transformer outputs a probability distribution over all possible tokens, from which the next token is sampled and then appended to the input sequence, at which point the process repeats. Temperature controls the entropy of the distribution: higher temperature, higher entropy; conversely, lower temperature, lower entropy. Technically zero temperature involves dividing by zero, so under the hood it's simply set to an epsilon so small that the entropy of the distribution is low enough that sampling from it effectively always gives one token: the token with the highest probability. And so at every step in inference, the highest-probability token is emitted.
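
To make the temperature point concrete, here is a minimal sketch in plain Python. The function names and the example logits are made up for illustration, and treating a temperature of exactly zero as a plain argmax is an assumption of this sketch (it is the limit of the epsilon approach described above), not a claim about how any particular inference stack does it.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def sample_token(logits, temperature, rng):
    """Pick the index of the next token.

    Assumption of this sketch: temperature == 0 means "take the argmax",
    which sidesteps the division by zero entirely.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5]  # hypothetical scores for three tokens
    for t in (2.0, 1.0, 0.5, 1e-6):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 4) for p in probs], round(entropy(probs), 4))
    print(sample_token(logits, 0, random.Random()))
```

With a tiny epsilon the lower-scoring tokens underflow to probability zero, so sampling and argmax pick the same token.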

eeue56 810 days ago

I (kinda) solved this with neuro-lingo[0] with the concept of pinning. Basically, once you have a version of a function implementation that works, you can pin it and it won't be regenerated when it's "compiled". The alternative approach would be to have tests be the only code a developer writes, and then have LLMs generate an implementation for them, running the tests to ensure it's valid.

- [0] https://github.com/eeue56/neuro-lingo
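
A rough sketch of both ideas in Python, for illustration only. This is not how neuro-lingo is implemented; `build`, `generate_until_passing`, and the `generate` / `passes_tests` callables are hypothetical names, with `generate` standing in for whatever LLM call produces source code.

```python
def build(names, pinned, generate):
    """'Compile' a program: reuse pinned sources, regenerate everything else.

    names:    function names that make up the program
    pinned:   dict of name -> source that is known to work and must not change
    generate: callable name -> source (stand-in for a non-deterministic LLM call)
    """
    output = {}
    for name in names:
        if name in pinned:
            output[name] = pinned[name]  # pinned: never regenerated
        else:
            output[name] = generate(name)  # unpinned: may differ on every build
    return output


def generate_until_passing(name, generate, passes_tests, max_attempts=5):
    """Tests-first variant: keep asking for candidates until one passes."""
    for _ in range(max_attempts):
        candidate = generate(name)
        if passes_tests(candidate):
            return candidate
    raise RuntimeError(f"no passing implementation for {name!r} "
                       f"after {max_attempts} attempts")
```

A candidate that passes could then be added to `pinned`, so later builds stop depending on the generator for that function.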

ginko 811 days ago

Even regular compilers need quite a bit of nudging to give deterministic results.

Phillipharryt 810 days ago

Correct me if I'm wrong here, but I am under the impression they're only non-deterministic in the practical sense (i.e., it produces this output on my machine, and I can't know what minute differences there are on your machine), but that's not non-deterministic in the truest sense. If you have completely identical inputs you will get the exact same output, ergo, deterministic.

layer8 810 days ago

You are correct. Compilers are deterministic, but reproducible builds can be a challenge.
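
A toy illustration of the distinction, assuming a made-up `toy_build` function in place of a real compiler: the function is fully deterministic, but its output depends on inputs that are easy to overlook, such as the build path and a timestamp.

```python
import hashlib


def toy_build(source, build_path, timestamp):
    """Deterministic stand-in for a compiler that embeds build metadata."""
    artifact = f"{source}\n// built in {build_path} at {timestamp}"
    return hashlib.sha256(artifact.encode()).hexdigest()


if __name__ == "__main__":
    src = "int main(void) { return 0; }"
    # Identical inputs, identical output: deterministic.
    print(toy_build(src, "/home/alice/proj", 1700000000)
          == toy_build(src, "/home/alice/proj", 1700000000))
    # Same source, different hidden inputs: the artifacts no longer match.
    print(toy_build(src, "/home/alice/proj", 1700000000)
          == toy_build(src, "/home/bob/proj", 1700000042))
```

Making a build reproducible then amounts to pinning or removing those hidden inputs so that the source alone determines the artifact.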

tiborsaas 811 days ago

It's better to have a non-deterministic compiler for a task that would be really hard to write an algorithm for otherwise.

clbrmbr 810 days ago

But are LLMs really better at algorithm writing than you? I’ve found that they work best when I’ve already pseudocoded the algorithm.

tiborsaas 810 days ago

I didn't mean making it write the actual code and using that, but there are tasks that are more error-prone or near impossible to write with a traditional approach. So using some zero-shot prompting is better than running code.

That's why I find non-determinism acceptable when it would otherwise be a pain to do something similar.
