Hacker News
by
ketchup32613
11 days ago
Do you want to see scaling curves w.r.t. data and param size? I agree that 1.2B params and 10B tokens is not representative, but what scale of parameters and dataset sizes would be convincing?
zxexz
10 days ago
Not to sound facetious, but perhaps enough runs at different param/token sizings to define a curve?
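For what "enough runs to define a curve" could look like in practice, here is a rough sketch, assuming the loss follows a pure power law in parameter count (loss ≈ a * N^(-b), with no irreducible-loss term). The function name `fit_power_law` and every number below are hypothetical placeholders, not results from any real run; the points are generated from a known law only so the fit can be checked.

```python
import numpy as np

def fit_power_law(sizes, losses):
    """Fit loss ~= a * size**(-b) by linear least squares in log-log space.

    Assumes a pure power law (an assumption for this sketch); real scaling
    fits often add an irreducible-loss offset, which this does not model.
    """
    slope, intercept = np.polyfit(np.log(sizes), np.log(losses), 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic, noise-free placeholder points, NOT real measurements.
param_counts = np.array([1e7, 1e8, 1e9, 1e10])
true_a, true_b = 20.0, 0.1
losses = true_a * param_counts ** (-true_b)

a, b = fit_power_law(param_counts, losses)
print(f"fitted a={a:.3f}, b={b:.3f}")
```

With only one or two (param, token) settings the exponent is unconstrained, which is the point of the reply: several runs spread across orders of magnitude are needed before a slope means anything. The same fit can be repeated with token counts on the x-axis.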