Hacker News
by patentatt 2661 days ago
If your AI can accurately label my training data, why wouldn’t I just use your AI for my application?
2 comments

Great question! Our AI is trained specifically for data annotation, and that leads to a few differences. For example, our data annotation pipeline is not "end-to-end" and would not be suitable for a real-time deployment -- it works in multiple stages, some of which are done by human workers (i.e. editing or verifying the output of the AI), while others are automated.
Hi, I'm Jonghyuk, one of the co-founders. One point I would like to add is that a 90% accurate AI model may not be very useful for an application, but with the right data pipeline and a well-designed system, we can get quite a bit of a boost out of it for data annotation.
Numerically, how much is "quite a bit of a boost"?
It depends on how accurately our AI performs on a particular task, but as a back-of-the-envelope calculation, if we had a 90% accurate AI, human annotators would only have to work on the remaining 10%, giving us a 10x boost. Obviously, there is some overhead not accounted for in this calculation, but with our current technology we can get a boost of up to 10x depending on the type of data.
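To make the back-of-the-envelope estimate above concrete, here is a minimal sketch of the arithmetic. The function name `annotation_speedup` and the `review_overhead` parameter are hypothetical, introduced only for illustration. The overhead model (a fixed fraction of manual effort spent checking every item) is an assumption, not something described in the thread.

```python
def annotation_speedup(ai_accuracy: float, review_overhead: float = 0.0) -> float:
    """Estimate the speedup over fully manual labeling.

    ai_accuracy: fraction of items the AI labels correctly (0.0 to 1.0).
    review_overhead: ASSUMED extra human effort per item, expressed as a
        fraction of the effort needed to label that item from scratch.
    """
    if not 0.0 <= ai_accuracy <= 1.0:
        raise ValueError("ai_accuracy must be between 0 and 1")
    if review_overhead < 0.0:
        raise ValueError("review_overhead must be non-negative")

    # Humans redo the items the AI got wrong, plus any per-item overhead.
    human_fraction = (1.0 - ai_accuracy) + review_overhead
    if human_fraction == 0.0:
        return float("inf")  # no human effort left at all
    return 1.0 / human_fraction


# No overhead: 90% accuracy leaves 10% of the work, i.e. roughly 10x.
print(round(annotation_speedup(0.90), 2))
# With an assumed 10% per-item checking overhead, the boost drops to roughly 5x.
print(round(annotation_speedup(0.90, 0.10), 2))
```

Under this simple model, any verification overhead quickly eats into the headline number, which is consistent with the "up to 10x" wording in the reply.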
How do you know which 10% it got wrong and which 90% it got right?
We have both AI-assisted and manual inspections in the pipeline. A good analogy would be an assembly line where humans and machines collaborate not only on building things but also on quality control (i.e. a vision inspection system plus manual inspection).
Do other training data providers use ML/AI to do initial screens?