| HN Mirror



	by fortes 3849 days ago
	I used to work at Flipboard, and we invested a lot in this issue. It is not easy, and requires constant (constant!) maintenance. Getting to 80% quality isn't hard. 90% is tricky. 95% incredibly costly.

2 comments

rkho 3849 days ago

Completely agree. A friend and I tried to do something like this as a fun project at a hackathon. Getting to 80% wasn't difficult; it was mostly a lot of parsing the DOM for articles. Dealing with things like adverts, photo captions, comments, and other text that shouldn't be in the actual article was the real pain, especially when it came to detecting paragraph/subheader breaks, since we wanted to parse articles and run them through text-to-speech.
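
For readers who want a feel for the "80%" part, here is a rough sketch of that kind of DOM pass, using only Python's standard-library `html.parser`. It is not what rkho's team or Flipboard actually built. The tag list, the noise tokens, and the names `ArticleText` and `extract_blocks` are all made up for illustration, and the code assumes reasonably well-formed HTML with closed `<p>` tags.

```python
import re
from html.parser import HTMLParser

# Illustrative assumptions, not a vetted list: real sites need far more rules.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
SKIP_TAGS = {"script", "style", "nav", "aside", "footer", "figcaption", "form"}
NOISE_TOKENS = {"ad", "ads", "advert", "sponsored", "promo",
                "comment", "comments", "caption"}
BLOCK_TAGS = {"p": "paragraph", "h1": "heading", "h2": "heading", "h3": "heading"}


class ArticleText(HTMLParser):
    """Collect (kind, text) blocks while skipping likely non-article subtrees."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # ordered list of ("heading" | "paragraph", text)
        self._skip_depth = 0   # > 0 while inside a subtree judged to be noise
        self._kind = None      # kind of the block currently being collected
        self._buf = []

    def _is_noise(self, tag, attrs):
        if tag in SKIP_TAGS:
            return True
        names = " ".join(v for k, v in attrs if k in ("class", "id") and v)
        # Match whole tokens so "ad-slot" is caught but "header" is not.
        tokens = set(re.split(r"[\s_-]+", names.lower()))
        return bool(tokens & NOISE_TOKENS)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:   # void elements never get an end tag
            return
        if self._skip_depth:
            self._skip_depth += 1
        elif self._is_noise(tag, attrs):
            self._skip_depth = 1
        elif tag in BLOCK_TAGS:
            self._kind = BLOCK_TAGS[tag]
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS and self._kind:
            text = " ".join("".join(self._buf).split())  # collapse whitespace
            if text:
                self.blocks.append((self._kind, text))
            self._kind = None

    def handle_data(self, data):
        if not self._skip_depth and self._kind:
            self._buf.append(data)


def extract_blocks(html):
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Keeping headings and paragraphs as separate blocks is what would let a text-to-speech step pause at the breaks. The hard remaining 20% is everything these hand-written rules miss on any given site.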


dang 3849 days ago

Good point, the constant (constant!) maintenance aspect means there would need to be a sustainable plan. On the other hand, if lots of projects started depending on the library, you'd at least get a steady supply of notifications about breakage, and perhaps fixes as well.
