Hacker News new | ask | show | jobs
by sterlind 440 days ago
It's not so surprising to me. It's like how Markov chains get better at passing for human the more N-grams they memorize. larger models will continue getting marginally better at predicting the distribution (human language.) but that doesn't translate into improved intelligence.
1 comments

The point is, it isn't marginally better. I agree the setup is not a demonstration of intelligence, but the difference is pretty significant. Not to mention that on conventional benchmarks Llama 405B is usually worse than GPT-4o.