by jondot 4974 days ago
Circa is a good idea, actually. From my close experience with this field, when a news article is published it gets edited and republished many times, in many forms and shapes (Web, RSS, etc.). Many of these steps need manual, human work, and this affects the volume of published news.

Further, much of the news really originates from a relatively limited set of sources (Reuters, etc.), so you can plug your solution in there as well.

Therefore it should be OK to assume that if you put humans in the same pipeline to summarize news manually, the capacity and efficiency will be reasonable.

1 comments

The problem with summarizing news manually is that it takes too much effort for a human to do it. The efficiency may be good at first, but as more and more news passes by, his efficiency will go down (assuming he's the only one summarizing).

True, but my point is that people are already doing it at the start of the pipeline. Think what happens when Reuters decides to make a SaaS offering of their summarized content. Even regardless of that, you can hire a battery of professional summarizers instead of PhDs and do it pretty well.

Where this doesn't apply, and where I do think you're completely right, is non-news articles: think blogs, tweets (although there's not much to summarize in 140 chars), product descriptions, scientific articles, etc. These things are produced in much greater volume and with much less workflow around them.