Hacker News
by gajju3588 3016 days ago
Let's say we want to create labeled data for text summarization of Medium articles. Could the highlighted part be used as the summary? It's not auto-labeled per se, but it could serve as a proxy and be passed to labelers to verify/edit.
1 comment

Sure. There are lots of useful proxies for labeled data.

It's worth noting that highlighted sections in Medium articles probably aren't great summaries (they are more a representation of important points - which is a useful thing to predict as well).

As examples of such proxies, many summarization systems are trained on the single-line summaries that news media provide with their articles. There have also been attempts to use tweets as summaries of the articles they link to.
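
A minimal sketch of the proxy-label workflow discussed in this thread, in case it helps: highlights become candidate summaries, and every candidate is flagged for a labeler to verify or edit. The record layout (dicts with "text" and "highlights" keys) and the names ProxyExample and build_proxy_examples are hypothetical, made up for illustration; nothing here reflects a real Medium API.

```python
from dataclasses import dataclass


@dataclass
class ProxyExample:
    article_text: str
    candidate_summary: str      # proxy label taken from highlights, not yet verified
    needs_review: bool = True   # every proxy label goes to a human labeler


def build_proxy_examples(articles):
    """Turn article records into candidate summaries for a labeling queue.

    `articles` is an assumed list of dicts with "text" and "highlights" keys.
    """
    examples = []
    for article in articles:
        # Drop empty highlights and trim surrounding whitespace.
        highlights = [h.strip() for h in article.get("highlights", []) if h.strip()]
        if not highlights:
            continue  # no proxy label available for this article
        examples.append(ProxyExample(article["text"], " ".join(highlights)))
    return examples


articles = [
    {"text": "Long article body ...", "highlights": ["Key point one.", " Key point two. "]},
    {"text": "Another article ...", "highlights": []},
]
queue = build_proxy_examples(articles)
```

Articles without highlights are skipped rather than given an empty label, and the needs_review flag keeps the proxy clearly separate from a verified summary.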