| HN Mirror


by PaulHoule 643 days ago

If you can get the math right you can frequently develop a very good system for representing data in RDF and writing SPARQL queries against it. A week of high-quality thinking can save you six months of time developing an alternate query system; the custom query system might be better but it probably won't be. It's easy to make something that is faster for specialized queries, but it's unlikely you can build something that lets you write complex and versatile queries better than SPARQL does.

The key though is coining good identifiers, developing a good set of properties, and understanding how datatype properties work and using them well. It's very easy to develop a bad standard like Dublin Core that, unfortunately, perpetuates the bad stereotypes people have of the RDF world.

The SPARQL spec is dense reading

https://www.w3.org/TR/sparql12-query/

but it's a tiny spec. The SQL spec, on the other hand, is broken up into numerous $200 documents, and if you did look at them you'd find it's much, much messier. If you felt SPARQL needed something extra, it's a good base to work from to develop some kind of SPARQL++, and the same is true of the RDF model (e.g. add something to every triple to record provenance).
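To make the provenance idea concrete, here is a minimal sketch in plain Python of what "add something to every triple" could look like. It is an illustration only, not how any particular triple store does it: the `Quad` type, the `source` slot, and the names `ex:catalogA` and `ex:catalogB` are all invented for the example. The shape is similar in spirit to the named graphs that RDF datasets already offer.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # hypothetical fourth slot: who asserted the triple


# Made-up statements; ex:catalogA and ex:catalogB are invented source names.
QUADS = [
    Quad(":Paper", "dcterms:creator", "Alpher, Ralph", "ex:catalogA"),
    Quad(":Paper", "dcterms:creator", "Bethe, Hans", "ex:catalogA"),
    Quad(":Paper", "dcterms:creator", "Gamow, George", "ex:catalogB"),
]


def triples_from(quads, source):
    """Return the plain (s, p, o) triples asserted by one source."""
    return [(q.subject, q.predicate, q.obj) for q in quads if q.source == source]


def sources_of(quads, triple):
    """Return the set of sources that assert the given (s, p, o) triple."""
    return {q.source for q in quads if (q.subject, q.predicate, q.obj) == triple}
```

With the extra slot in place, "which catalog said this?" becomes a lookup instead of something you have to bolt on with reification.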

My two complaints with SPARQL are: (1) there are two official ways to represent ordered collections and a third unofficial one; if you are good at SPARQL you can write queries that do the obvious things you want to do with ordered collections (like you'd see in JSON query languages like N1QL or AQL), but there ought to be built-in functions that just do it; (2) you can write path queries like

   ?s (ex:motherOf|ex:fatherOf)+ ?o .

which will match ?s being an ancestor of ?o (reading ex:motherOf as "is the mother of"). Sometimes you need to capture the matching path, and SPARQL as it is doesn't provide a way to do that.
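For a sense of what "capturing the matching path" means, here is a rough sketch in plain Python rather than SPARQL, since the point is that SPARQL doesn't give you this. It runs a breadth-first search over a toy list of triples and returns the chain of triples that connects two nodes. The data, the `find_path` name, and the choice of shortest path are all assumptions made for the example.

```python
from collections import deque

# Hypothetical toy data: (subject, predicate, object) triples, where
# "X ex:motherOf Y" is read as "X is the mother of Y".
TRIPLES = [
    ("ex:Ann", "ex:motherOf", "ex:Bob"),
    ("ex:Bob", "ex:fatherOf", "ex:Cid"),
    ("ex:Cid", "ex:fatherOf", "ex:Dee"),
    ("ex:Eve", "ex:motherOf", "ex:Cid"),
]
PARENT_PREDICATES = {"ex:motherOf", "ex:fatherOf"}


def find_path(triples, start, goal, predicates=PARENT_PREDICATES):
    """Return a shortest list of triples linking start to goal, or None.

    Follows the same edges as `start (ex:motherOf|ex:fatherOf)+ goal`,
    but keeps the triples that were traversed.
    """
    # Index outgoing edges for the predicates of interest.
    edges = {}
    for s, p, o in triples:
        if p in predicates:
            edges.setdefault(s, []).append((s, p, o))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        for triple in edges.get(node, []):
            nxt = triple[2]
            new_path = path + [triple]
            if nxt == goal:
                return new_path  # always at least one step, like `+`
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, new_path))
    return None
```

Calling `find_path(TRIPLES, "ex:Ann", "ex:Dee")` gives back the three triples along the way, which is exactly the information the property path throws away.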

2 comments

ktk 637 days ago

I'm just here to say that I fully agree with your comment. Hardcore RDF person myself, and I wouldn't trade it for anything else. Once you master it, it feels like cheating and/or magic sauce :)

spothedog1 643 days ago

Can you explain why you think Dublin Core is a bad standard?

PaulHoule 643 days ago

(1) You can line it up side by side with the 1970 MARC standard

https://www.loc.gov/marc/

and, in terms of capabilities, MARC comes out way ahead. MARC is a standard for a university library; Dublin Core seems to be a standard that almost works for an elementary school library.

(2) Specifically, people who write a paper or a book will get prickly about the order that authors are listed in, but Dublin Core doesn't provide a good answer, particularly if you want to use authority records. I mean

   :Paper
       dcterms:creator "Alpher, Ralph" ;
       dcterms:creator "Bethe, Hans" ;
       dcterms:creator "Gamow, George" .

doesn't cut it because when you get the results back they could come back in any order. RDF has two different ways to represent ordered collections and they could have let you (required you to) write

    :Paper
       dcterms:creator ("Alpher, Ralph" "Bethe, Hans" "Gamow, George") .

which looks just like a Lisp list and internally is structured like one, but they didn't. In the XMP specification Larry Masinter specified that you do this

https://github.com/adobe/XMP-Toolkit-SDK/blob/main/docs/XMPS...

and boy, there were a lot of good ideas in the XMP spec, but Adobe wound up nerfing the implementation because Adobe was accused of throwing its weight around too much. Sure, you could write

   :Paper dcterms:creator "Ralph Alpher, Hans Bethe, George Gamow" .

but that won't work if you want to use URIs that point to authority records like the DC spec advises you to do. People hear RDF and think "Nothing to see here, move on" because of standards like Dublin Core that seem inadequate and overcomplicated at the same time.
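To show why the parenthesized list form keeps the author order while repeated dcterms:creator triples do not, here is a small sketch in plain Python. It writes out the rdf:first / rdf:rest chain that the Turtle list syntax expands to and walks it like a Lisp list. The blank node labels `_:b1` to `_:b3` and the `read_list` helper are made up for the example, and the code assumes the list is well formed.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"

# The triples behind ("Alpher, Ralph" "Bethe, Hans" "Gamow, George").
# Stored as a set on purpose: a triple store promises no ordering.
TRIPLES = {
    (":Paper", "dcterms:creator", "_:b1"),
    ("_:b1", FIRST, "Alpher, Ralph"),
    ("_:b1", REST, "_:b2"),
    ("_:b2", FIRST, "Bethe, Hans"),
    ("_:b2", REST, "_:b3"),
    ("_:b3", FIRST, "Gamow, George"),
    ("_:b3", REST, NIL),
}


def read_list(triples, head):
    """Walk an rdf:first / rdf:rest chain from head and return its items."""
    first = {s: o for s, p, o in triples if p == FIRST}
    rest = {s: o for s, p, o in triples if p == REST}
    items = []
    node = head
    while node != NIL:  # rdf:nil plays the role of Lisp's empty list
        items.append(first[node])
        node = rest[node]
    return items
```

The order lives in the links between the cells, so it survives no matter what order the store hands the triples back in. The same cells could hold URIs for authority records instead of string literals.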