Hacker News
by
delis-thumbs-7e
23 days ago
Wouldn’t that be extremely computationally expensive, considering how resource-intensive training is?
colechristensen
23 days ago
No. Training a state-of-the-art model involves on the order of 10 trillion tokens.
We're talking about a step that updates weights based on, say, 10k to 1M tokens.
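
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes compute scales roughly linearly with the number of tokens processed for a fixed model, which is my simplification rather than anything measured, and the function name is just illustrative:

```python
def fewer_tokens_factor(pretraining_tokens: int, update_tokens: int) -> int:
    """How many times fewer tokens the update step sees than full pretraining."""
    return pretraining_tokens // update_tokens


# "On the order of 10 trillion tokens" for full pretraining (figure from the comment above).
PRETRAINING_TOKENS = 10 * 10**12

# The two ends of the "10k to 1M tokens" range for a single update step.
for update_tokens in (10_000, 1_000_000):
    factor = fewer_tokens_factor(PRETRAINING_TOKENS, update_tokens)
    print(f"{update_tokens:,} tokens is {factor:,}x fewer than pretraining")
```

Under that assumption, the update step touches somewhere between a ten-millionth and a billionth of the tokens a full pretraining run does.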
delis-thumbs-7e
23 days ago
I learned something. Thank you!