| HN Mirror

	by patentatt 2661 days ago
	If your AI can accurately label my training data, why wouldn’t I just use your AI for my application?

2 comments

hyunsoo90 2661 days ago

Great question! Our AI is trained specifically for data annotation, and that leads to a few differences. For example, our data annotation pipeline is not "end-to-end" and would not be suitable for a real-time deployment -- it works in multiple stages, and some stages are done by human workers (i.e. editing or verifying the output of the AI), while others are automated.

jonghyuk0605 2661 days ago

Hi, I'm Jonghyuk, one of the co-founders. One point I would like to add is that a 90% accurate AI model may not be very useful for an application, but with the right data pipeline and a well-designed system, we can get quite a bit of a boost out of it for data annotation.

jeromebaek 2661 days ago

Numerically, how much is "quite a bit of a boost"?

hyunsoo90 2661 days ago

It depends on how accurately our AI performs on a particular task, but as a back-of-the-envelope calculation, if we had a 90% accurate AI, human annotators would only have to work on the remaining 10%, giving us a 10x boost. Obviously, there is some overhead not accounted for in this calculation, but with our current technology we can get a boost of up to 10x depending on the type of data.
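
To make the back-of-the-envelope estimate concrete, here is a minimal Python sketch of the arithmetic. It is only an illustration under stated assumptions, not the company's actual model: the function name `annotation_speedup` and the `review_cost` parameter are hypothetical, and the overhead is modeled as a flat per-item checking cost, which the thread does not specify.

```python
def annotation_speedup(accuracy: float, review_cost: float = 0.0) -> float:
    """Estimate manual effort divided by AI-assisted effort, per item.

    accuracy:    fraction of items the model labels correctly (0 to 1).
    review_cost: ASSUMPTION, not from the thread. Human cost of checking one
                 AI label, as a fraction of labeling that item from scratch.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if review_cost < 0.0:
        raise ValueError("review_cost must be non-negative")

    # Humans redo the items the model got wrong, plus any per-item checking.
    human_effort = (1.0 - accuracy) + review_cost
    if human_effort == 0.0:
        return float("inf")  # perfect model and free checking
    return 1.0 / human_effort


if __name__ == "__main__":
    # The thread's example: 90% accuracy, overhead ignored.
    print(f"{annotation_speedup(0.9):.1f}x")
    # Same model with a hypothetical 10% per-item checking cost.
    print(f"{annotation_speedup(0.9, review_cost=0.1):.1f}x")
```

With no overhead, 90% accuracy gives roughly 10x, matching the figure quoted above. Under the assumed 10% checking cost the same model gives roughly 5x, which shows how quickly the unaccounted overhead can eat into the headline number.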

treis 2661 days ago

How do you know which are the 10% it got wrong and which are the 90% it got right?

hyunsoo90 2661 days ago

We have both AI-assisted and manual inspections in the pipeline. A good analogy would be an assembly line where humans and machines collaborate not only on building things but also on quality control (i.e. vision inspection system + manual inspection).

thoughtstheseus 2661 days ago

Do other training data providers use ML/AI to do initial screens?
