Hacker News
by orion138 2554 days ago
Looks interesting, but it’s unclear to me: do you copy the thread’s content?

One other comment, there seems to be a spammer on the review page...

1 comment

> do you copy the thread's content?

Basically, we take each message of the thread, normalize the typography so that it can be consistently styled, and then strip off the signatures and quoted reply text while leaving any inline quoted replies intact. There are some open-source libraries that do parts of this, but none of them worked well enough for this particular use case, so we ended up developing some new techniques ourselves.
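
For a rough idea of what that kind of cleanup looks like, here is a minimal Python sketch. It is not our actual implementation, just an illustration under simple assumptions: the function names (`normalize_typography`, `clean_message`) are hypothetical, signatures are assumed to start at a conventional `-- ` delimiter line, and quoted reply text is assumed to be a trailing block of `>`-prefixed lines, optionally introduced by an `On ... wrote:` header. Real mail is much messier than this.

```python
import re
import unicodedata

# Illustrative mapping only: fold common "smart" punctuation to plain ASCII
# so every message can be styled consistently afterwards.
_TYPOGRAPHY = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u00a0": " ",                  # non-breaking space
})

# Assumed conventions, not guaranteed to hold for real-world mail.
_SIGNATURE_DELIMITER = re.compile(r"^-- ?$")
_REPLY_HEADER = re.compile(r"^On .+ wrote:$")


def normalize_typography(text: str) -> str:
    """Normalize Unicode form, punctuation, and line endings."""
    text = unicodedata.normalize("NFC", text).translate(_TYPOGRAPHY)
    return text.replace("\r\n", "\n").replace("\r", "\n")


def clean_message(text: str) -> str:
    """Strip the signature and trailing quoted reply, keep inline quotes."""
    lines = normalize_typography(text).split("\n")

    # 1. Cut everything from the first signature delimiter onwards.
    for i, line in enumerate(lines):
        if _SIGNATURE_DELIMITER.match(line):
            lines = lines[:i]
            break

    # 2. Drop trailing blank lines.
    while lines and not lines[-1].strip():
        lines.pop()

    # 3. Drop a trailing run of quoted lines. Quoted lines that are followed
    #    by the author's own text (inline replies) are never reached here.
    end = len(lines)
    while end > 0 and (lines[end - 1].startswith(">") or not lines[end - 1].strip()):
        end -= 1

    # 4. If a quote block was removed, also drop the header that introduced it.
    if 0 < end < len(lines) and _REPLY_HEADER.match(lines[end - 1].strip()):
        end -= 1

    lines = lines[:end]
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines)
```

With this sketch, a top-posted reply such as "Sounds good." followed by an `On ... wrote:` header and a quoted block is reduced to just "Sounds good.", while a message that answers a `>`-quoted question inline keeps both the quoted question and the answer.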

The other thing the tech does is let people run much more accurate NLP on threads. This isn't fully productized yet, so right now there are just a handful of companies we're working with to clean and normalize their email data.

In terms of the spammer, yeah I see that but I'm not sure what I can really do. As the app creator I don't seem to have any special ability to delete comments or flag things as spam.