Hacker News
by gajju3588 3018 days ago
Auto-labeling would be the way forward for supervised algorithms. Get some data annotated by your team, and tag the rest using auto-labeling. https://dataturks.com/ could be one such player, though I'm not sure how these will survive against Google.

There's no such thing as auto-labeling.

Data Turks is manual labeling.

There is active learning[1] and related algorithms where you trace the boundary of your classifier and pass examples along that boundary to be manually labeled (as they are the ones the classifier is most unsure about).

But there is nothing "auto" about this - it's just being smart about where to deploy the manual labor.

[1] https://en.wikipedia.org/wiki/Active_learning_(machine_learn...
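
As a rough illustration of the boundary idea above, here is a minimal sketch of uncertainty sampling for a binary classifier. It is only one simple form of active learning, and everything in it is an assumption made for illustration: the helper names `uncertainty` and `select_for_labeling`, the `budget` parameter, and the made-up probabilities. The selected examples still go to human labelers.

```python
def uncertainty(p_positive):
    """Distance from the decision boundary: 0.0 at p == 0.5 (most unsure)."""
    return abs(p_positive - 0.5)


def select_for_labeling(probabilities, budget):
    """Return indices of the `budget` examples the classifier is least sure about.

    `probabilities` holds the classifier's P(label = 1) for each unlabeled example.
    """
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: uncertainty(probabilities[i]))
    return ranked[:budget]


# Hypothetical classifier outputs for five unlabeled examples.
probs = [0.97, 0.52, 0.10, 0.45, 0.80]

# Indices of the two examples nearest the boundary; these are sent to humans.
to_label = select_for_labeling(probs, budget=2)
print(to_label)
```

With these numbers the examples scored 0.52 and 0.45 are picked, since they sit closest to 0.5. The confident predictions (0.97, 0.10) are left alone, which is where the savings in manual effort come from.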

Let's say we want to create a labeled dataset for text summarization of Medium articles. Could the highlighted part be used as the summary? It's not auto-labeled per se, but it could serve as a proxy and be passed to labelers to verify/edit.
Sure. There are lots of useful proxies for labeled data.

It's worth noting that highlighted sections in Medium articles probably aren't great summaries (they are more a representation of important points - which is a useful thing to predict as well).

For example, many summarizer systems are trained on the single-line summaries that accompany news articles. There have been attempts to use Tweets as summaries for linked articles too.