by hos234 2377 days ago
I am still a fan of Google's OpenRefine tool. Its reconciliation feature, which helps disambiguate named entities etc. based on Wikidata, is really powerful - https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation

You can hook in your own reconciliation endpoint, which we do at work to expand internal knowledge graphs.
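
For anyone wondering what a custom endpoint roughly involves, here is a minimal sketch of the matching logic only. It assumes the JSON shapes I understand the reconciliation protocol to use: a batch of queries keyed "q0", "q1", ... and a "result" list of scored candidates per key. ENTITIES, MANIFEST, score and reconcile are hypothetical names made up for illustration, the example.com URLs are placeholders, and the HTTP layer is left out entirely. Check the wiki page linked above before relying on any of the field names.

```python
import json
from difflib import SequenceMatcher

# Hypothetical stand-in for an internal knowledge graph.
ENTITIES = [
    {"id": "E1", "name": "Acme Corporation", "type": "Organization"},
    {"id": "E2", "name": "Acme Labs", "type": "Organization"},
    {"id": "E3", "name": "Ada Lovelace", "type": "Person"},
]

# Service metadata a client would fetch first (placeholder values).
MANIFEST = {
    "name": "Internal KG reconciliation (example)",
    "identifierSpace": "http://example.com/kg/id/",
    "schemaSpace": "http://example.com/kg/schema/",
}


def score(query, name):
    """Case-insensitive string similarity scaled to 0..100."""
    return 100.0 * SequenceMatcher(None, query.lower(), name.lower()).ratio()


def reconcile(queries_json):
    """Answer a batch of queries given as a JSON string keyed q0, q1, ..."""
    queries = json.loads(queries_json)
    response = {}
    for key, q in queries.items():
        wanted_type = q.get("type")  # optional type filter
        limit = q.get("limit", 3)    # assumed default, not from any spec
        candidates = []
        for entity in ENTITIES:
            if wanted_type and entity["type"] != wanted_type:
                continue
            s = score(q["query"], entity["name"])
            candidates.append({
                "id": entity["id"],
                "name": entity["name"],
                "type": [{"id": entity["type"], "name": entity["type"]}],
                "score": s,
                # Only flag a confident match when the names are identical.
                "match": s == 100.0,
            })
        # Best candidates first, truncated to the requested limit.
        candidates.sort(key=lambda c: c["score"], reverse=True)
        response[key] = {"result": candidates[:limit]}
    return response
```

Calling reconcile('{"q0": {"query": "acme corporation"}}') returns the candidates for q0 ranked by similarity. A real service would wrap this in an HTTP handler and replace the in-memory list with a lookup against the actual graph.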

2 comments

Note that OpenRefine isn't really kept up to date.

The basic capabilities work OK, but many of the additional ones have atrophied.

This is awesome, thanks so much for sharing. I'm really surprised I've never come across it because I've thought of building something like this before.

I really want to look into how this could ingest my own post-GDPR data exports, and how it could help with data sanitization for ML projects.