HN Mirror


by bhuga 2826 days ago

I was involved in the semweb community ~7 years ago, particularly the "RDF knowledge graph" end, and it's still a bewitching idea. A lot of smart people have worked on it and still do, but it never reached any kind of success on the commercial (as opposed to academic) web, because:

Serialization is not the hard part.

The semweb community was obsessed with ontologies and OWL and schemas and taxonomies. If we can just break the problem down enough, the logic went, then systems will be able to infer new data about the world. But it never worked out that way.

Eventually you just have to write some code. If you have to write code anyway, all the taxonomies and RDF in the world aren't helpful (indeed, they're almost certainly the least efficient way to model the problem). You just scrape the pieces of knowledge out of JSON, HTML, or whatever else and glue them together with code. You don't need the all-knowing semantic web, you just need a .csv of whatever tiny piece of it you care about.
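To make "scrape it and put it in a .csv" concrete, here is a minimal sketch of the kind of glue code meant here. The payload, the field names, and the `json_to_csv` helper are all made up for illustration; a real source would have its own shape.

```python
import csv
import io
import json

# Hypothetical payload standing in for whatever a site's API or page embeds.
RAW_JSON = """
{"products": [
  {"name": "Widget", "price": {"amount": "9.99", "currency": "USD"}},
  {"name": "Gadget", "price": {"amount": "24.50", "currency": "USD"}}
]}
"""


def json_to_csv(raw: str) -> str:
    """Pull out the few fields we care about and flatten them into CSV text."""
    data = json.loads(raw)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "amount", "currency"])
    for item in data["products"]:
        price = item["price"]
        writer.writerow([item["name"], price["amount"], price["currency"]])
    return out.getvalue()


if __name__ == "__main__":
    print(json_to_csv(RAW_JSON), end="")
```

No ontology, no inference: the "schema" is three column names chosen by whoever needed the table.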

I have a distinct memory of trying to sell someone on the startup I was working on, a SPARQL database. I was pitching RDF as a way to model the problem, but eventually the person I was pitching just said "well, we can just outsource the scraping to our Eastern European devs and put it all in one big table." I had a kind of "oh my" moment where I realized that the startup was never going to work: in the real world, you just write code and move on. Taking part in the great semantic knowledge base of the world doesn't matter and isn't needed.

The other end of semweb, the "machine-readable web", more or less came to pass. schema.org, opengraph, and that sort of thing did 99% of what the semweb community wanted at 5% of the effort. The fact that all of that data is not in one giant database doesn't really matter to anyone; you rarely care about more than 2 or 3 web pages at once.
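As a rough illustration of how little it takes to consume that machine-readable data, here is a sketch that reads Open Graph `<meta property="og:...">` tags from a single page using only the Python standard library. The sample HTML and the `OpenGraphParser` / `extract_open_graph` names are hypothetical, and a real page would be fetched first.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        if prop.startswith("og:") and "content" in attr_map:
            self.properties[prop] = attr_map["content"]


# Hypothetical page; only the og: tags are of interest.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example Article" />
<meta property="og:type" content="article" />
<meta name="viewport" content="width=device-width" />
</head><body></body></html>
"""


def extract_open_graph(html: str) -> dict:
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    print(extract_open_graph(SAMPLE_HTML))
```

Each page carries its own small set of key/value pairs, which is all you need when you only care about 2 or 3 pages at once.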

1 comment

pbhat 2825 days ago

I worked for a semantic web startup. The idea was we'd build private "knowledge graphs" for companies, especially pharma and biotech. We experienced something similar to what you describe. We had a nice RDF generator and a query engine. The idea was we'd parse data from clients' databases and unstructured sources and generate semantic graphs, which would be used for semantic graph apps like search and inference. Looking back, it was never going to work. Most clients came to us for an "analytics dashboard". They were happy with giant tables to power these dashboards (and they were right!)
