
by mccoyb 58 days ago

LLMs define a categorical distribution over tokens via the logits computed by the matrix multiplies, and many sampling strategies are used to draw from it. This is one of the core mechanisms of token generation.
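To make that concrete, here is a minimal sketch of the mechanism, assuming a made-up three-token vocabulary and hand-picked logits (the function names and values are illustrative, not taken from any particular model or library). It shows softmax with a temperature, one stochastic draw, and greedy decoding as the no-randomness case:

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw logits into a categorical distribution over token ids."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token id from the categorical distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def greedy_token(logits):
    """Argmax decoding: always pick the highest logit, no randomness."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical logits for a 3-token vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
token = sample_token(logits, temperature=0.8)
best = greedy_token(logits)
```

Temperature, top-k, and nucleus sampling are all variations on how `probs` is reshaped before the draw; the draw itself is the stochastic step.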

There's no peculiarity to discuss; that's how they work. That's how they are trained (the loss is defined by probability density computations), that's how inference works, etc.

> I guess my central claim is that there hasn't been a salient argument made as to why the randomness here is relevant for consensus. Maybe the models exhibit some variability in their output, but in practice does this substantially change how they approach consensus? Can we model this as artefacts of how they are initialised rather than some inherent stochasticity? Why not? It feels like randomness is being introduced here as a sort of magic "get out of jail free" card.

I'm really surprised to hear this given the content of the post. The claims in the post are quite strong, yet I'm the one who has to argue why it matters whether the consensus claim applies to pseudorandom processes?

I don't think it's necessary to furnish a counterexample when pointing out that a formal claim is overreaching. It's not clear what the results are in this case! So it feels premature to claim that the results cover a wider array of things than has been shown.

For instance, this is a strong claim:

> it means that in any multi-agentic system, irrespective of how smart the agents are, they will never be able to guarantee that they are able to do both at the same time:
>
> Be Safe - i.e produce well formed software satisfying the user's specification.
>
> Be Live - i.e always reach consensus on the final software module.

I'm confused as to the stance: either we're hand-waving or we're not, so which is it?

2 comments

mccoyb 58 days ago

Re: hand-waving, I'm totally fine with it for intuition.

I just came away from the read thinking that this post was pointing to something very strong, and was a bit irked to find that the state of the results is more subtle than the post conveys.


gopiandcode 58 days ago

If you're pushing me, let's say we're not hand-waving then. LLMs, with the abstraction removed, are deterministic computations built from matrix multiplications, f(x) -> y. If you want, we can make them pseudo-random, but that is still a deterministic process. FLP then holds. I'm not sure what your confusion is.
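A minimal sketch of the "pseudo-random, but still deterministic" point, assuming a toy stand-in for a model (the `next_token` helper and the logits are hypothetical). Once the seed is treated as part of the input, sampling is a pure function of (logits, seed):

```python
import math
import random

def next_token(logits, seed):
    """Pseudo-random sampling as a pure function of (logits, seed)."""
    rng = random.Random(seed)  # every bit of "randomness" comes from the seed
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]  # unnormalised softmax weights
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Hypothetical logits for a 3-token vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.1]

# Two independent runs over the same seeds produce identical outputs.
run_a = [next_token(logits, seed=s) for s in range(20)]
run_b = [next_token(logits, seed=s) for s in range(20)]
same = run_a == run_b
```

This only illustrates that a seeded sampler is reproducible; it does not by itself settle whether the assumptions behind FLP carry over to a given multi-agent setup.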
