| HN Mirror



	by ptx 397 days ago
	> As we learn to use LLMs in our work, we have to figure out how to live with this non-determinism [...] but there will also be things we'll gain that few of us understand yet.

	No thanks. Let's not give up determinism for vague promises of benefits "few of us understand yet".

2 comments

aradox66 397 days ago

Determinism isn't always ideal. Determinism may trade off with things like accuracy, performance, etc. There are situations where the tradeoff is well worth it.

link

pixl97 397 days ago

Yep, there are plenty of problems that can't be solved exactly without burning all the entropy in the visible universe, yet if you swap the exact computation for a heuristic you can get a good-enough answer in polynomial time.

Weather forecasts are a good example of this.
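
A minimal sketch of that trade-off, using a small travelling-salesman instance rather than weather (the points and function names are made up for illustration). The brute-force search is exact but factorial in the number of points; the nearest-neighbour heuristic is quadratic and usually only "good enough". Note that the heuristic here is still fully deterministic.

```python
import itertools
import math

def tour_length(points, order):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def exact_tour(points):
    """Brute force: tries every ordering that starts at point 0, O(n!) time."""
    rest = range(1, len(points))
    best = min(
        itertools.permutations(rest),
        key=lambda p: tour_length(points, (0, *p)),
    )
    return (0, *best)

def nearest_neighbour_tour(points):
    """Greedy heuristic: always visit the closest unvisited point, O(n^2) time."""
    order = [0]
    unvisited = set(range(1, len(points)))
    while unvisited:
        here = points[order[-1]]
        # Ties are broken by index so the result never depends on set order.
        nxt = min(unvisited, key=lambda i: (math.dist(here, points[i]), i))
        order.append(nxt)
        unvisited.remove(nxt)
    return tuple(order)

# Hypothetical coordinates, small enough for the brute force to finish quickly.
points = [(0, 0), (3, 1), (1, 4), (5, 5), (2, 2), (6, 0), (4, 3)]
exact = tour_length(points, exact_tour(points))
greedy = tour_length(points, nearest_neighbour_tour(points))
print(f"exact: {exact:.3f}  heuristic: {greedy:.3f}")
```

The heuristic tour can never be shorter than the exact one, and running it twice on the same input gives the same tour.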

link

betenoire 397 days ago

I understand there are probabilities and shortcuts in weather forecasts... but what part is non-deterministic?

link

josefx 397 days ago

Most heuristics are still deterministic.

link

aradox66 397 days ago

Also, at temperature 0 LLMs can behave deterministically! Indeterminism isn't necessarily quite the right word for the kind of abstraction LLMs provide.
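
A rough sketch of why temperature 0 removes the randomness, assuming the common convention that temperature 0 means greedy decoding (take the highest-scoring token). The logits and the `sample_token` helper are made up for illustration and are not any particular library's API.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw logits at the given temperature."""
    if temperature == 0:
        # Limit of the softmax as temperature -> 0: all mass on the largest logit,
        # so the RNG is never consulted.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [1.2, 3.4, 0.5, 3.1]  # made-up scores for a 4-token vocabulary

# Unseeded RNG on purpose: at temperature 0 it makes no difference.
greedy_picks = {sample_token(logits, 0, random.Random()) for _ in range(100)}
warm_picks = {sample_token(logits, 1.0, random.Random()) for _ in range(100)}
print(greedy_picks, warm_picks)
```

In this toy setup every temperature-0 call returns index 1, the largest logit. Real inference stacks can still differ in how they break ties or round intermediate values.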

link

gpm 397 days ago

Even at temperature != 0 it's trivial to just use a fixed seed in the RNG... it's just a computer being used in a naive, not even multi-threaded (i.e. with race conditions), way.
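
A small sketch of the fixed-seed idea. The per-step logits are made up and stand in for a model's output; in a real model each step's logits depend on the tokens sampled so far, but the argument is the same, since the same seed gives the same sequence of random draws.

```python
import math
import random

def generate(logits_per_step, temperature, seed):
    """Sample one token index per step using a privately seeded RNG."""
    rng = random.Random(seed)  # fixed seed -> same pseudo-random stream every run
    tokens = []
    for logits in logits_per_step:
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp((x - peak) / temperature) for x in logits]
        tokens.append(rng.choices(range(len(logits)), weights=weights, k=1)[0])
    return tokens

# Hypothetical logits for 15 decoding steps over a 3-token vocabulary.
steps = [[0.1, 0.9, 0.4], [1.5, 1.4, 1.6], [0.2, 0.2, 0.3]] * 5

run_a = generate(steps, temperature=0.8, seed=1234)
run_b = generate(steps, temperature=0.8, seed=1234)
print(run_a == run_b)
```

Both runs sample at a non-zero temperature and still produce identical token sequences, because each call builds its own `random.Random` from the same seed.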

I wouldn't be surprised to find out different stacks multiply fp16s slightly differently or something. Getting determinism across machines might take some work... but there's really nothing magic going on here.
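
One plausible source of such cross-stack differences is that floating-point addition is not associative, so two kernels that accumulate the same products in a different order can round differently. The sketch below uses Python's 64-bit floats instead of fp16 (an assumption made to stay within the standard library); the effect is the same in kind, only smaller.

```python
import math

# Four numbers whose exact sum is 2.0.
values = [1e16, 1.0, -1e16, 1.0]

# Strict left-to-right accumulation: the first 1.0 is lost when added to 1e16,
# because neighbouring doubles near 1e16 are 2 apart.
left_to_right = ((values[0] + values[1]) + values[2]) + values[3]

# Same numbers, different grouping: the large terms cancel first.
regrouped = (values[0] + values[2]) + (values[1] + values[3])

# math.fsum tracks the lost low-order bits and returns the correctly rounded sum.
exact = math.fsum(values)

print(left_to_right, regrouped, exact)  # 1.0 2.0 2.0
```

Each order is perfectly deterministic on its own; the results only disagree when two implementations choose different orders.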

link

bird0861 397 days ago

Quite pleased you mentioned this. I would like to add that transformer LLMs can be Turing complete; see the work of Franz Nowak and his colleagues (I think there were at least one or two other papers by other teams, but I read Nowak's the closest as it was the latest one when I became aware of this).

link

josefx 397 days ago

That runs into the issue that nobody runs LLMs with a temperature of zero.

link

bird0861 397 days ago

Not true. Perhaps very few do, but some do in fact run them at 0. I've done it myself. There are many small models that will gladly perform well in QA with temp 0. Of course there are few situations where this is the recommended setup -- we all know RAG takes less than a billion parameters now to do effectively. But nevertheless there are people who do this, and there are plausibly some use cases for it.

link

billyp-rva 397 days ago

Nobody was stopping anyone from making compilers that introduced random different behavior every time you ran them. I think it's telling this didn't catch on.

link

gpm 397 days ago

I think there was actually a very big push to stop people from doing that - https://en.wikipedia.org/wiki/Reproducible_builds

There were definitely compilers that used things like data structures with an unstable iteration order, resulting in non-determinism, and people worked to stop that. This behavior would result in non-deterministic performance everywhere and, combined with race conditions or just undefined behavior, other random non-deterministic behaviors too.

At least in part this was achieved with techniques that can be used to make LLMs deterministic too, such as seeding the RNGs in hash tables deterministically. LLMs are in that sense no less deterministic than iterating over a hash table (they are just a bunch of matrix multiplications with a sampling procedure at the end, after all).
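
A toy illustration of the hash-table point, not how any particular compiler does it. The `iteration_order` helper is hypothetical: it orders keys by a keyed hash, which mimics a table whose iteration order depends on a per-process seed. With a random seed the order can change between runs; with a fixed seed it cannot.

```python
import hashlib
import os

def iteration_order(keys, seed):
    """Order keys as a toy hash table would: by their seeded hash value."""
    def slot(key):
        digest = hashlib.blake2b(key.encode(), key=seed, digest_size=8).digest()
        return int.from_bytes(digest, "big")
    return sorted(keys, key=slot)

# Made-up symbol names standing in for whatever a compiler iterates over.
symbols = ["main", "parse", "emit", "optimize", "link"]

# Random per-process seed: the order may differ from run to run.
unstable = iteration_order(symbols, os.urandom(16))

# Fixed seed: the same order on every run.
stable_a = iteration_order(symbols, b"fixed-seed")
stable_b = iteration_order(symbols, b"fixed-seed")

print(unstable)
print(stable_a == stable_b)
```

The same move applies to sampling: the procedure stays pseudo-random, but pinning the seed pins the outcome.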

link

danenania 397 days ago

I think this gets at a major hurdle that needs to be overcome for truly human-level AGI.

Because the human brain is also non-deterministic. If you ask a software engineer the same question on different days, you can easily get different answers.

So I think what we want from LLMs is not determinism, just as that's not really what you'd want from a human. It's more about convergence. Non-determinism is ok, but it shouldn't be all over the map. If you ask the engineer to talk through the best way to solve some problem on Tuesday, then you ask again on Wednesday, you might expect a marginally different answer considering they've had time to think on it, but you'd also expect quite a lot of consistency. If the second answer went in a completely different direction, and there was no clear explanation for why, you'd probably raise an eyebrow.

Similarly, if there really is a single "right" answer to a question, like something fact-based or where best practices are extremely well established, you want convergence around that single answer every time, to the point that you effectively do have determinism in that narrow scope.

LLMs struggle with this. If you ask an LLM to solve the same problem multiple times in code, you're likely to get wildly different approaches each time. Adding more detail and constraints to the prompt helps, but it's definitely an area where LLMs are still far behind humans.
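
One rough way to put a number on "convergence" is to ask the same question several times and measure how often the answers agree with the most common one. The sketch below assumes answers can be compared as normalised strings, which only suits short factual replies; the sample answers are invented, not real model output.

```python
from collections import Counter

def agreement_rate(answers):
    """Fraction of answers that match the most common answer."""
    if not answers:
        raise ValueError("need at least one answer")
    normalised = [a.strip().lower() for a in answers]
    most_common_count = Counter(normalised).most_common(1)[0][1]
    return most_common_count / len(normalised)

# Hypothetical outputs from asking the same factual question five times.
factual = ["Paris", "paris", "Paris ", "Paris", "Paris"]

# Hypothetical outputs from asking for a design approach five times.
open_ended = [
    "use a queue",
    "use a cron job",
    "use a queue",
    "poll the API",
    "use webhooks",
]

print(agreement_rate(factual))     # 1.0
print(agreement_rate(open_ended))  # 0.4
```

A rate near 1.0 is the "effectively deterministic in that narrow scope" case; a low rate on a question with a single right answer is the eyebrow-raising one.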

link