by beernet 1385 days ago
It depends on the "DL model", which is a highly vague term. A model with 10K parameters and a model with 10T parameters both fit this description equally well.