by joe_the_user 5586 days ago
Graph databases intrigue me too.

The thing is, it took me a little while to realize that "the semantic web" is a very specific model where providers are expected to explicitly provide the semantic decoration/meta-data for all their content. http://en.wikipedia.org/wiki/Semantic_Web

I basically don't believe that this particular approach will ever work (i.e., the flood-gates won't open and content providers won't suddenly label all their data). I mean, this approach has been the failed model of hypertext since ... Project Xanadu, mid-sixties (a well-tended, fully meaningful store of data).

Instead, Google and other search engines and tools will just get smarter.

We'll find more ways to incidentally get semantic information from the raw data that's out there. But no one will have enough incentive to manually provide that much deep meaning for their data themselves (and anything whose semantic meaning can be automatically processed can be put on the web for someone else to automatically process). The semantic web approach is always going to be behind the curve compared to just putting raw, unstructured data out there.

The more uses we find for information, the more ability we'll get to extract meaning from it without the data starting out labeled.

Look at what Watson could do.

And I am working on a tool that extracts implicit information from the process of people interacting with data. Extracting implicit, inferred and deduced relations has much more promise even if it can't rely on explicit semantic labels. This is more or less what Google does also (it's true that so far Google's stuff is considered "semantically meaningless", and I know Google bought Metaweb. We'll see what they get from it...)

1 comment

I've tried many times over the years to get excited about the semantic web, but I invariably arrive at the same conclusion as you. I also think that semweb really suffers from over-engineering a solution in search of a problem. If you have a problem that it seems like the semantic technologies could be helpful for, you can probably solve it faster using logic programming and automated reasoning techniques from the 90s (ask the vast majority of semantic web enthusiasts how RDF triples can be represented as Prolog clauses and you'll likely be treated as though you asked how a raven is like a writing desk). Worse yet, though, are tools like OWL, which is over-engineered to the point where out-of-the-box it describes logic in a way that is computationally intractable. If you were designing semantic technologies to solve a real problem you would never arrive at something like OWL.
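To make the RDF-as-Prolog point concrete, here is a rough sketch in Python (not Prolog, and not any particular library's API). It assumes a triple is just a (subject, predicate, object) tuple, renders each one as a `triple/3` fact, and runs a naive forward-chaining loop over two Horn-style rules. The data, the `triple/3` naming and both function names are made up for illustration; the two rules are only loosely modelled on RDFS subclass reasoning.

```python
# Hypothetical example data; none of these names come from a real dataset.
TRIPLES = {
    ("raven", "rdf:type", "Bird"),
    ("Bird", "rdfs:subClassOf", "Animal"),
    ("Animal", "rdfs:subClassOf", "LivingThing"),
}


def to_prolog_fact(triple):
    """Render one (subject, predicate, object) triple as a Prolog fact string."""
    # Quote every term as a Prolog atom so colons and capital letters are safe.
    return "triple({}, {}, {}).".format(*("'{}'".format(t) for t in triple))


def infer(triples):
    """Naive forward chaining to a fixed point over two Horn-style rules:

    triple(A, subClassOf, D) :- triple(A, subClassOf, B), triple(B, subClassOf, D).
    triple(X, type, D)       :- triple(X, type, B),       triple(B, subClassOf, D).
    """
    facts = set(triples)
    while True:
        new = set()
        for (a, p1, b) in facts:
            for (c, p2, d) in facts:
                # Only join facts that chain through a subClassOf edge.
                if b != c or p2 != "rdfs:subClassOf":
                    continue
                if p1 in ("rdfs:subClassOf", "rdf:type"):
                    new.add((a, p1, d))
        if new <= facts:  # nothing new was derived, so we are done
            return facts
        facts |= new


if __name__ == "__main__":
    for t in sorted(TRIPLES):
        print(to_prolog_fact(t))
    for t in sorted(infer(TRIPLES) - TRIPLES):
        print("derived:", t)
```

In actual Prolog the two rules in the docstring would be the whole program, with the facts loaded alongside them; the quadratic loop here is only meant to show how little machinery the idea needs on a small data set.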

I actually think something like semantic technologies could be more useful than the tools that drive Watson and Google for smaller data sets, where machine/statistical learning is less useful, but even then they are an over-engineered solution.