Hacker News
by mxwsn 28 days ago
No, LLMs are trained on more tokens than they have parameters. That puts them in the classical, underparameterized regime, i.e. the first descent of the double descent curve.