by mxwsn, 28 days ago
No, LLMs are trained on more tokens than they have parameters, so they are in the classical regime, i.e. the first descent of the double descent curve.
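As a rough sanity check on the arithmetic, here is a minimal sketch that computes tokens per parameter for two models. The figures are the approximate publicly reported sizes for GPT-3 and Chinchilla and are used only as illustrative assumptions; `tokens_per_parameter` is a hypothetical helper, and treating one token as one training sample is itself a simplification.

```python
# Rough check of the tokens-vs-parameters claim.
# (parameters, training tokens): approximate publicly reported figures,
# used here as illustrative assumptions rather than exact values.
REPORTED = {
    "GPT-3": (175e9, 300e9),
    "Chinchilla": (70e9, 1.4e12),
}


def tokens_per_parameter(params: float, tokens: float) -> float:
    """Hypothetical helper: ratio of training tokens to model parameters."""
    return tokens / params


if __name__ == "__main__":
    for name, (params, tokens) in REPORTED.items():
        ratio = tokens_per_parameter(params, tokens)
        # A ratio above 1 means more training tokens than parameters.
        print(f"{name}: {ratio:.1f} tokens per parameter")
```

A ratio above 1 for both is what the comment is pointing at. Whether token count is the right notion of "sample size" for double descent is a separate question that this sketch does not settle.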