Hacker News
by jsjohnst 835 days ago
> LLM does not have any of this, architecturally, it just has the text itself.

I feel like you are focusing too closely on the specifics of how the LLM works, whereas:

> The way a human interjects is that you have a parallel thought chain going

You are more abstract in the human case.

They really don’t need to be different here. Each time you type another token, the LLM could be running predictions in parallel, playing out where the conversation is going. You could then layer on another model which blends these together (vaguely like how MoE works) and is trained to recognize opportune times to interject. Think of it like a chess-playing AI, but with the goal of interjecting appropriately rather than reaching checkmate.

Running all these inferences at once would take a fairly expensive amount of compute, but it’s all technically possible today and wouldn’t be that much different from the human case for this specific scenario imho.
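
To make the idea concrete, here is a rough toy sketch of the loop I have in mind. Everything in it is hypothetical: `rollout` stands in for one sampled LLM continuation, `should_interject` stands in for the trained blending model (here it is just a majority vote), and the "pause after a question" rule is made up purely so the example runs.

```python
from concurrent.futures import ThreadPoolExecutor


def rollout(prefix, seed, horizon=3):
    """Hypothetical stand-in for one sampled LLM continuation.

    Toy rule: the imagined speaker pauses right after a question; a few
    seeds pause anyway so that the rollouts disagree with each other.
    """
    pauses = prefix[-1].endswith("?") or seed % 4 == 0
    first = "<pause>" if pauses else "and"
    return [first] + ["..."] * (horizon - 1)


def should_interject(continuations, threshold=0.5):
    """Hypothetical stand-in for the trained blending model: a majority vote."""
    votes = sum(c[0] == "<pause>" for c in continuations)
    return votes / len(continuations) > threshold


def watch(tokens, n_rollouts=8):
    """After each typed token, play the conversation forward n_rollouts times."""
    decisions = []
    with ThreadPoolExecutor(max_workers=n_rollouts) as pool:
        for i in range(1, len(tokens) + 1):
            prefix = tokens[:i]
            futures = [pool.submit(rollout, prefix, seed) for seed in range(n_rollouts)]
            decisions.append(should_interject([f.result() for f in futures]))
    return decisions


typed = "so I tried the new build , does that make sense? anyway".split()
print(watch(typed))
```

In a real system the rollouts would be actual model samples and the vote would be a learned model, so the cost scales with the number of rollouts per typed token.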

1 comment

Running predictions in parallel is just doing prediction, and we're back at square one. Why do things in parallel in that case? At that point, you are just training an "opportune injection model" on the existing token stream as it arrives, which is subject to exactly the limitation that I described.

These models do have an implicit model of thought, but it is only accessible through the token interface. You need more explicit access, which is not possible given the current architecture.

I'd like to be wrong here.