Hacker News
by ktk 2657 days ago
Unfortunately a lot of stuff going on in the RDF domain is behind firewalls so it's a bit hard to give a lot of details. But I can contribute some public and some private use-cases of where RDF is used:

Refinitiv (formerly Thomson Reuters Financial and Risk) knowledge graph is built completely on the RDF stack: https://www.refinitiv.com/en/products/knowledge-graph-feed

When I talked to them in late 2017, they told me they have 100 billion triples in their database, plus more in a versioning back-end. Their triplestore is open source: https://github.com/CM-Well/CM-Well

Several government agencies all over the world are starting to build public RDF knowledge graphs. I'm closely involved in the one from the Swiss government, see my presentation from last week: http://presentations.zazuko.com/Swiss-LOD-Platform/

There are similar projects in other countries like the Netherlands, Belgium, UK, etc. This stack makes a lot of sense for open data, as you can do some pretty crazy queries without spending 2 days on preparing your data. See for example the Swiss Open Data Advent Calendar of 2018: https://twitter.com/linkedktk/status/1076064066525949952

As I said, there are many "behind the firewall" use-cases where people use the stack precisely because of features like OWL. Yes, it comes at a price (bootstrapping is not easy), but this is stuff we will still be running 40 years from now. I see it in:

* Finance: Fraud detection, compliance, customer 360° views, ... Stardog (https://www.stardog.com/) lists Moody's, BNY Mellon and National Bank of Canada as customers. Last week I met someone from Credit Suisse who is Mr. RDF there.

* Production: You have a ton of databases containing the products you create, but there is no way to figure out what a final product consists of because the data is scattered across at least 5 of them. The automotive supplier I am talking about here is using RDF to get that view.

* Life sciences: The largest RDF dataset available to the public is UniProt and related datasets. In total they provide a SPARQL endpoint (RDF database) with 50 billion (!!) triples available. This is a highly popular dataset and is used in pretty much every larger pharmaceutical enterprise as well. See https://www.uniprot.org/ as a starting point. I know of at least one large life sciences company that just recently decided that RDF will be the base of all future data unification standards within the organization.

* Insurance business: One of our customers is using RDF to unify a ton of different systems and get a 360° view of their customers as well.
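To make the production case above a bit more concrete, here is a minimal sketch in plain Python (no RDF library) of why the triple model helps when product data is scattered across databases. Everything in it is an assumption for illustration: the three "databases", the `ex:` identifiers and the `all_parts` helper are made up, and a real setup would use a triplestore instead of Python sets.

```python
# Hypothetical exports from three separate systems. Each row is already a
# (subject, predicate, object) triple; all ex: names are invented.
erp_db = {
    ("ex:Car1", "ex:hasPart", "ex:Engine7"),
    ("ex:Car1", "ex:hasPart", "ex:Chassis2"),
}
engine_db = {
    ("ex:Engine7", "ex:hasPart", "ex:Piston3"),
    ("ex:Engine7", "ex:hasPart", "ex:Turbo9"),
}
supplier_db = {
    ("ex:Piston3", "ex:suppliedBy", "ex:AcmeCorp"),
}

# Every source has the same shape, so merging is a plain set union.
graph = erp_db | engine_db | supplier_db


def all_parts(graph, product):
    """Return every direct and indirect part of `product` via ex:hasPart."""
    # Index the ex:hasPart edges by subject.
    children = {}
    for s, p, o in graph:
        if p == "ex:hasPart":
            children.setdefault(s, set()).add(o)
    # Walk the part hierarchy depth-first.
    found, stack = set(), [product]
    while stack:
        node = stack.pop()
        for part in children.get(node, ()):
            if part not in found:
                found.add(part)
                stack.append(part)
    return found


print(sorted(all_parts(graph, "ex:Car1")))
```

In a real triplestore the traversal would not be hand-written; a SPARQL 1.1 property path such as `ex:hasPart+` expresses the same "all parts, at any depth" question.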

RDF is an absolutely amazing stack and I do not see anything else available that gets remotely close to its power. The day I find something more powerful, I will be the first to use it. But most of the time, people dismissing RDF have zero clue about what it really can do.

2 comments

I am part of a team building an RDF database to be used for environmental footprinting and industrial ecology (https://github.com/BONSAMURAIS), and am also slowly becoming part of the Swiss open data scene - I would appreciate a chance to chat with you about your experiences!

For us, RDF seems like the only technology that can easily adapt to the large number of data types that we envision collecting.

Sure, more than happy to. You will find me at @linkedktk on twitter or adrian.gschwend @ zazuko . com

You sound like you may understand RDF well enough to answer.

If I have transcribed voice convo data, with date/time, names, location, sentiment and extracted subject matter keywords, would Jena/RDF and/or related tools be appropriate for exploring relationships and trends between data points?

Thank you.

Yes, that sounds like a pretty good fit for RDF. Do you have some examples of the data? It is probably pretty straightforward to build an RDF model that could be used for analytics later.
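As a rough illustration of what such a model could look like, here is a sketch in plain Python that represents each conversation as a handful of triples and then asks a simple trend question. The vocabulary (`ex:date`, `ex:speaker`, `ex:sentiment`, ...), the sample conversations and both helper functions are assumptions invented for this example, not taken from the actual data; in practice the same triples would be loaded into Jena or another triplestore and queried with SPARQL.

```python
from collections import Counter

# Hypothetical sample data: one conversation = a few triples.
triples = [
    ("conv:1", "ex:date", "2019-01-07T10:15"),
    ("conv:1", "ex:speaker", "person:alice"),
    ("conv:1", "ex:location", "place:zurich"),
    ("conv:1", "ex:sentiment", "negative"),
    ("conv:1", "ex:keyword", "billing"),
    ("conv:1", "ex:keyword", "outage"),
    ("conv:2", "ex:date", "2019-01-08T14:02"),
    ("conv:2", "ex:speaker", "person:bob"),
    ("conv:2", "ex:sentiment", "positive"),
    ("conv:2", "ex:keyword", "upgrade"),
    ("conv:3", "ex:date", "2019-01-09T09:30"),
    ("conv:3", "ex:speaker", "person:alice"),
    ("conv:3", "ex:sentiment", "negative"),
    ("conv:3", "ex:keyword", "billing"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def keywords_by_sentiment(triples, sentiment):
    """Count keywords across all conversations with the given sentiment."""
    convs = {s for s, _, _ in match(triples, p="ex:sentiment", o=sentiment)}
    return Counter(o for s, _, o in match(triples, p="ex:keyword") if s in convs)


print(keywords_by_sentiment(triples, "negative"))
```

Adding a new kind of data point later (say, call duration) only means adding more triples with a new predicate; the existing triples and queries stay untouched.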