Hacker News
by abraxaz 1823 days ago
> I’ve been considering using it for some projects, but the main thing that’s keeping me away is the concern that some moderator will decide that my data doesn’t fit and remove it.

To me the greatest value of Wikidata was making me aware of RDF and SPARQL.

In most cases, if you rely on the data for business needs, it would be best to maintain your own RDF dataset and host it either over plain HTTP or on something like https://dydra.com/.

Wikidata desperately needs RDF ingestion, and if this is made available (it can be done outside of Wikidata), it would be easier to periodically sync datasets with Wikidata.

On that note, however, you could export all the Wikidata triples you need and host them on your own SPARQL server (e.g. Jena) or use them with RDF tools like rdflib.
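
For what it's worth, here is a rough sketch of the export step, using only the Python standard library. It relies on Wikidata's per-entity RDF export at Special:EntityData; the function names, the User-Agent string, and the choice of Q42 as the example item are my own placeholders, not anything official. For a large slice of the graph you would more likely use the dumps or a SPARQL CONSTRUCT query, so treat this as an illustration for a handful of items.

```python
import re
import urllib.request

# Per-entity RDF export endpoint on Wikidata.
ENTITY_DATA = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.{fmt}"

# RDF serializations this sketch allows (assumed subset of what the endpoint offers).
RDF_FORMATS = {"ttl", "nt", "rdf", "jsonld"}


def entity_data_url(qid: str, fmt: str = "ttl") -> str:
    """Return the RDF export URL for a single Wikidata item or property."""
    if not re.fullmatch(r"[QP][1-9]\d*", qid):
        raise ValueError(f"not a Wikidata item or property ID: {qid!r}")
    if fmt not in RDF_FORMATS:
        raise ValueError(f"unsupported RDF format: {fmt!r}")
    return ENTITY_DATA.format(qid=qid, fmt=fmt)


def download_entity(qid: str, path: str, fmt: str = "ttl") -> None:
    """Save the triples for one entity to a local file (needs network access)."""
    request = urllib.request.Request(
        entity_data_url(qid, fmt),
        # Hypothetical client name; use something that identifies your script.
        headers={"User-Agent": "example-export-script/0.1"},
    )
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())


if __name__ == "__main__":
    download_entity("Q42", "Q42.ttl")
```

The resulting Turtle file can then be loaded into Jena or parsed with rdflib, and since it lives on your side nobody else can remove it.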

1 comment

RDF ingestion is problematic for Wikidata, because importing a dataset into Wikidata requires reconciling it against existing entities so as to avoid duplicate entries. The easiest way to achieve that is to publish your dataset online, create a linking Wikidata property for it, then ask for it to be imported in https://mix-n-match.toolforge.org where reconciliation can be done by the crowd.
Last I checked, mix-n-match was using CSV. While this is okay, it would still be nicer to have direct RDF ingestion. And yes, I realize why Wikidata does not have it, but it is not impossible to provide, just really difficult. I would work on it if I had more time, and I likely will at some point in the future.