Hacker News
by rytill 404 days ago
Don’t forget the training data!

We are far from open training data... training data might even be incriminating.
100%, though I still feel as though open training data will eventually become a thing. It'll have to be mostly new data, synthetic data, or meticulously curated from public domain / open data.

Synthetic data generation, even robotically acquired real-world "synthetic" data, can produce training sets rapidly. It's just a matter of coordinating these efforts and building high-quality data.

I've made a few data sets using Unreal Engine, and I've been wanting to put various objects on turntables and go out on backpack 3D-scanning adventures.

Someone will have to pay for it, though.