| HN Mirror


	by morganK 3735 days ago
	Would have liked to hear at least one concrete example of a startup actually doing that. Seems a bit theoretical at the moment, as big companies don't need to do that thanks to existing datasets, and I've never heard of any startups using dozens (hundreds?) of contractors for this kind of job.

7 comments

tariqali34 3735 days ago

Netflix used humans to tag movies for their recommendation system.

Source: http://www.theatlantic.com/technology/archive/2014/01/how-ne...

LunaSea 3735 days ago

Netflix is not a startup.

true_religion 3735 days ago

At one point, Netflix was a startup.

LunaSea 3735 days ago

Yes, but it wasn't in 2014 or 2012.

HillRat 3735 days ago

CrowdFlower does AI and ML-focused microtasking, though I have no experience with them. Even large companies need plenty of preprocessing done on their datasets, so it's common to use offshored services companies or divisions to do annotation and cleanup work on corpora before using them as training sets.

johndavi 3735 days ago

In very broad strokes this is how we power many of our API features at Diffbot. We have hundreds of thousands of human-trained web pages amounting to millions of individual elements that have helped to train our system.

RobertoG 3735 days ago

Not a start-up and not deep learning (until now, I suppose), but this has been done for years in the translation industry.

They feed their automatic systems with the output of the human translator. Every input means less and less manual work that needs to be done in the future.

globba22 3735 days ago

The post office used humans for many years to train OCR models, e.g. zip code readers.

I visited a postal routing facility once in the 90s and saw a long metal row staffed by about 20 people, 10 on each side. Envelopes passed through on a sort of pneumatic tube-like conveyor and paused in front of a human operator, who read a single digit of a zip code, keyed it in, and sent the envelope on to be read by the next person.

nl 3735 days ago

Many, many startups use Amazon Mechanical Turk and/or CrowdFlower for this exact thing.

See http://blog.echen.me/2012/04/25/making-the-most-of-mechanica... for some examples.

klochner 3735 days ago

hunch
