by throwaway314155 549 days ago
Data work is traditionally both incredibly valuable and something no one wants to actually do. Further, ImageNet has probably been cited a _lot_ in other important research.

I agree that it feels a _tad_ underwhelming but that's what these people do - try to strike big with some research and then spend their lives convincing others of the value of that research. If you're lucky, you might even do this a few times as she appears to be trying to do.


Disagree. Stanford researchers have been making datasets for years and years. SQuAD is another Stanford dataset. Everyone knows that publishing datasets gets you citations. But that gig is now sorta over because the word is out.
Sure, maybe that's true now. Less true when ImageNet was created, though. And in any case, what is this argument that this is a "citation-hack"? Like, yeah, if you found the dataset useful during your own research, you should cite it... ImageNet did in fact provide value for many years. All the original work for guided diffusion trained on ImageNet, just as a for instance. Of course, now we have superior, larger datasets like LAION and whatever OpenAI uses internally. But w.r.t. the times, it was valuable.