Hacker News
by
euclaise
1174 days ago
I don't trust this. The article cites Semafor (https://www.semafor.com/article/03/24/2023/the-secret-histor...), but Semafor states the 1T parameter count without any source.
3 comments
QuadrupleA
1174 days ago
Yeah, seems spotty, especially considering the recent "Chinchilla scaling" laws suggesting that training set size is generally the current bottleneck, the mileage LLaMA/Alpaca get out of 7B/13B, the huge inference cost of 1T, etc.
brianjking
1174 days ago
Yeah, I'm highly suspicious too. Even the arXiv paper from the MS researchers doesn't give specifics about the number of parameters in GPT-4.
lambo4bkfast
1173 days ago
Sam Altman said it had 1T parameters in the Lex Fridman podcast