by mariarmestre 1649 days ago
The point is to build your own knowledge base. In many cases, Wikidata might not have the data you're looking for. For example, in the tutorial I linked, the task is to find all the products released by a list of companies. Toutiao, for instance, is a product of ByteDance. That relation might not exist on Wikidata (I tried to search for it but could not find it: https://www.wikidata.org/wiki/Q24835387).
4 comments

I added ByteDance as the creator and owner of Toutiao. (It was already listed as a "product or material produced" on the ByteDance page https://wikidata.org/wiki/Q55606242 )
Wikidata is a very complete knowledge base, but I think there is still room for a tool like this to be used on Wikipedia data. Some information that is missing from Wikidata can still be found on Wikipedia (e.g. the list of ByteDance products or investors is incomplete, and the number of employees is missing). This tool can be used to uncover these relationships for your own application, or to feed them back into Wikidata if they are of public interest.

It could also be that you are trying to extract data to train a named entity recognition model. In that case, you want to extract the paragraph or sentence that contains the information, together with its label.
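
To make that concrete, here is a rough sketch of what pulling out labeled sentences could look like. It is not the tool's actual method: the function name `find_labeled_sentences`, the label names (`ORG`, `PRODUCT`, `product_of`) and the sample text are all made up for illustration, and the matching is plain substring search with a naive sentence splitter.

```python
import re


def find_labeled_sentences(text, subject, obj, relation,
                           subject_label="ORG", object_label="PRODUCT"):
    """Return every sentence that mentions both `subject` and `obj`.

    Each result carries character spans for the two mentions and the
    relation label. All label names here are hypothetical.
    """
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    examples = []
    for sentence in sentences:
        s_start = sentence.find(subject)
        o_start = sentence.find(obj)
        if s_start == -1 or o_start == -1:
            continue  # this sentence does not mention both entities
        examples.append({
            "sentence": sentence,
            "entities": [
                (s_start, s_start + len(subject), subject_label),
                (o_start, o_start + len(obj), object_label),
            ],
            "relation": relation,
        })
    return examples


# Toy input; only the middle sentence mentions both entities.
text = (
    "ByteDance is a company. "
    "Toutiao is a product of ByteDance. "
    "Wikidata may not list this relation."
)
examples = find_labeled_sentences(text, "ByteDance", "Toutiao", "product_of")
```

A real pipeline would need proper sentence segmentation and alias handling (a company is rarely mentioned by one exact string), but the output shape of sentence, spans and label is the part that matters for training data.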

Why not use that to enrich Wikidata?
You can add relationships to Wikidata. Something like "is a product of" probably already has a property, and would be well within the scope of Wikidata.
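
Before adding a statement, it helps to check what is already recorded. As a small sketch, the snippet below only builds a SPARQL query string; it does not send it anywhere. It assumes P1056 is the ID of the "product or material produced" property mentioned earlier in the thread, and uses Q55606242, the ByteDance item linked above. The helper name `build_products_query` is made up.

```python
def build_products_query(company_qid, property_id="P1056"):
    """Build a SPARQL query listing the values of one property for one item.

    P1056 is assumed to be Wikidata's "product or material produced".
    """
    return (
        "SELECT ?product ?productLabel WHERE {\n"
        f"  wd:{company_qid} wdt:{property_id} ?product .\n"
        # The label service resolves item IDs to English labels.
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


# Q55606242 is the ByteDance item linked earlier in the thread.
query = build_products_query("Q55606242")
print(query)
```

The printed query can be pasted into the Wikidata Query Service to see which products are already listed, which makes it easier to tell what an extraction tool would actually be adding.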