Hacker News
by aabhay 549 days ago
Disagree. Stanford researchers have been making datasets for years and years. SQuAD is another Stanford dataset. Everyone knows that publishing datasets gets you citations. But that gig is now sorta over because the word is out.
1 comment

Sure, maybe that's true now. It was less true when ImageNet was created, though. And in any case, what is this argument that this is a "citation-hack"? Like, yeah, if you found the dataset useful during your own research, you should cite it... ImageNet did in fact provide value for many years. All the original work on guided diffusion was trained on ImageNet, just as one example. Of course, now we have superior, larger datasets like LAION and whatever OpenAI uses internally. But for its time, it was valuable.