by Buttons840 1122 days ago

I don't think this is a fair argument. If we compare a GPT-4 architecture with 5,000 parameters to the same architecture with 1 trillion parameters, should we judge the capabilities of both by the 5,000-parameter version just because they share an architecture?
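To make the scale gap concrete, here is a rough sketch of how one decoder-only transformer design yields wildly different parameter counts depending on width and depth. It uses the common approximation of about 12 * d_model^2 weights per layer plus a token embedding table. The function name and the tiny configuration are made up for illustration. The large configuration uses GPT-3's published shape (96 layers, d_model 12288, vocabulary 50257), since GPT-4's architecture and size have not been published.

```python
def approx_param_count(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Rough parameter count for a decoder-only transformer.

    Assumes 4*d^2 attention weights and 8*d^2 MLP weights per layer
    (a 4x hidden expansion), plus a token embedding table. Biases,
    layer norms, and positional embeddings are ignored.
    """
    per_layer = 12 * d_model ** 2
    embeddings = vocab_size * d_model
    return n_layers * per_layer + embeddings


# Hypothetical toy configuration, chosen only to land near 5,000 parameters.
tiny = approx_param_count(n_layers=1, d_model=16, vocab_size=100)

# GPT-3's published shape, used here as a stand-in for a large model.
large = approx_param_count(n_layers=96, d_model=12288, vocab_size=50257)

print(f"tiny:  {tiny:,} parameters")
print(f"large: {large:,} parameters")
print(f"ratio: {large / tiny:,.0f}x")
```

Both calls go through the same formula, so "same architecture" tells you nothing by itself about which of the two models you are looking at.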

Architecture is not the only thing that can set them apart, either. GPT-4 may have been trained with a slightly different algorithm, or on different data, and either can produce fundamentally different results.

Most of these conversations are not about one specific version but about the capabilities of LLMs in general. The implication is that we are talking about state-of-the-art LLMs, and GPT-3 is no longer state of the art.