by mccoyb 58 days ago
There is not a single mention of probability in this post.

The post acts like agents are a highly complex but well-specified deterministic function. Perhaps, under certain temperature limits, this is approximately true ... but that's a serious restriction and glossed over.

For instance, perhaps the most striking constraint about FLP is that it is about deterministic consensus ... the post glosses over this:

> establishes a fundamental impossibility result dictating consensus in any asynchronous distributed system (yes! that includes us).

No, not any asynchronous distributed system, that might not include us. For instance, Ben-Or (1983, https://dl.acm.org/doi/10.1145/800221.806707) (as a counterexample to the adversary in FLP) essentially says "if you're stuck, flip a coin". There's significant work studying randomized consensus (yes, multi-agents are randomized consensus algorithms): https://www.sciencedirect.com/science/article/abs/pii/S01966...

Now, in Ben-Or, the coins have to be independent sources of randomness, and that's obviously not true in the multi-agent case.

But the language in this post clearly argues that these results apply, without acknowledging possibly the most fundamental fact about agents: they are probability distributions. Inherently, they are stochastic creatures.

Difficult to take seriously without a more rigorous justification.

1 comments

It really depends on your model in my opinion.

At the lowest level of abstraction, LLMs are just matrix multiplication. Deterministic functions of their inputs. Of course, we can argue about the details of how the peculiarities of inference in practice lead to non-deterministic behaviours, but then our model is being complicated by vague aspects of reality.

One convenient way of sidestepping these is to model them as random functions, sure. I wouldn't go so far as to say they are "inherently stochastic creatures". Maybe that's the case, but you haven't really given substantial evidence to justify that claim.

At a higher level of abstraction, one possible model of LLMs is again as deterministic functions of their inputs, but now as functions of token streams or higher abstractions like sentences rather than the underlying matrix multiplication. In this case we expect LLMs to produce roughly consistent outputs given the same prompt, so again we can apply deterministic theorems.

I guess my central claim is that there hasn't been a salient argument made as to why the randomness here is relevant for consensus. Maybe the models exhibit some variability in their output, but in practice does this substantially change how they approach consensus? Can we model this as artefacts of how they are initialised rather than some inherent stochasticity? Why not? It feels like randomness is being introduced here as a sort of magic "get out of jail free" card.

Just my two cents I suppose.

LLMs utilize categorical distributions defined by the logits computed by the matrix multiplies, and there are many sampling strategies which are employed. This is one of the core mechanisms for token generation.
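As a rough illustration of the sampling step described above, here is a minimal sketch in plain Python. The logit values, the function names `softmax` and `sample_token`, and the treatment of temperature 0 as greedy argmax are all assumptions made for the example; this is not any particular library's API, and real inference stacks layer further strategies (top-k, top-p, and so on) on top.

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Scale by temperature, then subtract the max for numerical stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=1.0, rng=random):
    # Assumption for this sketch: temperature 0 means greedy decoding (argmax).
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    # Draw one index from the categorical distribution defined by probs.
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]          # made-up logits for a 3-token vocabulary
probs = softmax(logits)           # categorical distribution over tokens
greedy = sample_token(logits, temperature=0)
rng = random.Random(0)            # seeded generator, used only for the demo
draws = [sample_token(logits, 1.0, rng) for _ in range(1000)]
```

With a nonzero temperature the draws vary across tokens in proportion to `probs`; as the temperature shrinks, the distribution concentrates on the highest logit, which is the "approximately deterministic" regime mentioned earlier in the thread.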

There's no peculiarity to discuss, that's how they work. That's how they are trained (the loss is defined by probabilistic density computations), that's how inference works, etc.

> I guess my central claim is that there hasn't been a salient argument made as to why the randomness here is relevant for consensus. Maybe the models exhibit some variability in their output, but in practice does this substantially change how they approach consensus? Can we model this as artefacts of how they are initialised rather than some inherent stochasticity? Why not? It feels like randomness is being introduced here as a sort of magic "get out of jail free" card.

I'm really surprised to hear this given the content of the post. The claims in the post are quite strong, yet I'm the one who needs to supply a counterargument to the claim that the consensus result applies to pseudorandom processes?

I don't think it's necessary to furnish a counterexample when pointing out that a formal claim is overreaching. It's not clear what the results are in this case! So it feels premature to claim that the results cover a wider array of things than has been shown.

For instance, this is a strong claim:

> it means that in any multi-agentic system, irrespective of how smart the agents are, they will never be able to guarantee that they are able to do both at the same time:
>
> Be Safe - i.e produce well formed software satisfying the user's specification.
>
> Be Live - i.e always reach consensus on the final software module.

I'm confused as to the stance: we're either hand-waving, or we're not. So which is it?

Re: hand-waving, I'm totally fine with it for intuition.

I just came away from the read thinking that this post was pointing to something very strong and was a bit irked to find that the state of results was more subtle than the post conveys.

If you're pushing me, let's say we're not hand-waving then. LLMs, abstraction removed, are deterministic computations built from matrix multiplication, f(x) -> y. If you want, we can make them pseudo-random, but that is still a deterministic process. FLP then holds. I'm not sure what your confusion is.
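To make the "pseudo-random, but still deterministic" point concrete, here is a minimal sketch using Python's standard `random` module. The helper name `sample_sequence` and the weights are hypothetical, chosen only for illustration. It shows that a seeded sampler is a deterministic function of its inputs and its seed; it does not by itself settle whether real deployments fix their seeds, or whether FLP carries over to that setting.

```python
import random

def sample_sequence(weights, seed, length=10):
    # A fresh generator with a fixed seed makes every draw reproducible.
    rng = random.Random(seed)
    indices = range(len(weights))
    return [rng.choices(indices, weights=weights, k=1)[0] for _ in range(length)]

weights = [0.6, 0.3, 0.1]   # made-up categorical distribution over 3 tokens
run_a = sample_sequence(weights, seed=42)
run_b = sample_sequence(weights, seed=42)

# Same inputs and same seed give the same "random" token sequence.
assert run_a == run_b
```

Treated this way, the seed is just one more input to f(x) -> y. Whether the seeds of different agents can be treated as fixed and known is exactly the modelling question the thread is arguing about.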