by mariarmestre 1647 days ago
Thanks so much for your comment! You're right that this annotation tool can be used on any kind of free-form document found online. I tackled Wikipedia first because it was an obvious choice and it has an API for reading the HTML. This could be opened up to other data sources, but I also do not want it to become a scraping tool, so we would need to weigh the costs and benefits of adding each one. The main cost of adding a new source is how difficult it is to read and parse its content. In the future, I could integrate with some paid sources (e.g. news publications), where people have to pay for the content they scrape & label.

I have a pitch deck and I'm looking for all the things you mentioned :-). I can send the pitch deck to anyone interested.