| > like we all do Do we though? Sure, we communicate sequentially, but that doesn't mean our internal effort is piecewise and linear. A modern transformer LLM's generation, however, is. Each token is sampled from a probability distribution that depends only on the tokens that came before it. Mechanistically speaking, it works much like autocomplete, but at a very different scale. How much of an unavoidable handicap this incurs, if any, is absolutely up for debate. But yes, taking this mechanistic truth and considering it only in a shallow manner underestimates the capability of LLMs by a large degree. |
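
To make the "one token at a time" point concrete, here is a minimal toy sketch of that sampling loop. It is not how any real LLM is implemented: `VOCAB`, `next_token_weights`, and `generate` are hypothetical names, and the hand-written weighting rules stand in for what a transformer would compute from the whole context.

```python
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]


def next_token_weights(context):
    """Toy stand-in for a model: weights depend only on the tokens so far."""
    weights = []
    for token in VOCAB:
        weight = 1.0
        if context and token == context[-1]:
            weight = 0.1  # discourage repeating the previous token
        if token == "." and len(context) < 3:
            weight = 0.05  # discourage ending very early
        weights.append(weight)
    return weights


def generate(prompt, max_new_tokens, seed=0):
    """Sample tokens one at a time, each conditioned on everything before it."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        weights = next_token_weights(tokens)
        next_token = rng.choices(VOCAB, weights=weights, k=1)[0]
        tokens.append(next_token)  # the sample becomes part of the next context
        if next_token == ".":
            break
    return tokens


print(" ".join(generate(["the"], max_new_tokens=8, seed=1)))
```

The only state carried from step to step is the token list itself, which is the mechanistic claim above. Whatever richer computation a real model does happens inside the equivalent of `next_token_weights`, and this sketch says nothing about how capable that step can be.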