by jameshart
1283 days ago
It’s not ‘more complexity’ than a Markov chain - it essentially is a Markov chain, just looking at a really deep sequence of preceding tokens to decide the probabilities for what comes next. And it’s not just looking that up in a state machine, it’s ‘calculating’ it based on weights. But in terms of ‘take sequence of input tokens; use them to decide probable next token’, it’s functionally indistinguishable from a Markov chain.
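To make the ‘take sequence of input tokens; use them to decide probable next token’ framing concrete, here is a minimal sketch of the lookup-table version: a fixed-order Markov chain over tokens. It is illustrative only. The function names (`train`, `next_token_probs`, `generate`), the order of 2, and the toy sentence are all assumptions made up for this example, not anything taken from a real model.

```python
import random
from collections import Counter, defaultdict


def train(tokens, order=2):
    """Count which token follows each run of `order` preceding tokens."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        table[context][tokens[i + order]] += 1
    return table


def next_token_probs(table, context):
    """Look up the next-token distribution for a context (empty if unseen)."""
    counts = table.get(tuple(context), Counter())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()} if total else {}


def generate(table, seed, length, rng=None):
    """Repeatedly sample a next token, conditioning on the last len(seed) tokens."""
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    out = list(seed)
    order = len(seed)
    for _ in range(length):
        probs = next_token_probs(table, out[-order:])
        if not probs:  # context never seen in training: nothing to look up
            break
        toks, weights = zip(*probs.items())
        out.append(rng.choices(toks, weights=weights)[0])
    return out


# Toy corpus, purely hypothetical.
tokens = "the cat sat on the mat the cat ran".split()
table = train(tokens, order=2)
print(next_token_probs(table, ("the", "cat")))  # {'sat': 0.5, 'ran': 0.5}
print(generate(table, ("on", "the"), 3))
```

In this sketch the only piece that would change for the ‘calculating it based on weights’ version is `next_token_probs`: rather than reading counts out of a table keyed on the last two tokens, it would compute the distribution from a much longer context. Note that the table version returns nothing for any context it has not seen verbatim.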
|
You don't have to believe the hype, but if you think you can get GPT performance out of anything remotely resembling a Markov chain, I encourage you to try.