Hacker News
LinkML (linkml.io)
4 points by saxomoose 965 days ago
1 comments

Quoting from the project page: "LinkML is a flexible modeling language that allows you to author schemas in YAML that describe the structure of your data. Additionally, it is a framework for working with and validating data in a variety of formats (JSON, RDF, TSV), with generators for compiling LinkML schemas to other frameworks."

I work on data modelling for a public organisation based in Europe. I am evaluating LinkML as a way to improve the production and management of our models. We would use it mainly for the linked data features, but are also interested in the implementation-oriented artefacts (JSON Schema, SQL DDL, ...).

Trying to gather some feedback and open a discussion on this topic.

It's pretty clear the semantic web made a mistake in pushing OWL first and validation later. I don't know how many products that generate OWL really generate OWL DL but I know it isn't most of them. Without a validation system that gives good error messages for both publishers and consumers it is garbage in, garbage out with an emphasis on garbage.

I agree. Keep ontologies lightweight (RDFS/OWL) and work validation constraints out in SHACL.

I like the notion of polyglot modelling pushed forward by LinkML. Have a single source of truth in YAML (or in some format which conforms to the LinkML metamodel) and derive whichever schemas are needed from there. The management of dependencies is made easy thanks to the imports of other models. LinkML also forces the user to work with a well-identified set of datatypes.
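
To make the "single source of truth" idea concrete, here is a minimal sketch in plain Python. It is not LinkML's metamodel or its generators: the `MODEL` dict only loosely imitates the shape of a schema, and `to_json_schema` / `to_sql_ddl` are hypothetical helpers written for this comment. The point is just that one definition, with one fixed set of datatypes, can drive several target artefacts.

```python
# Toy "single source of truth": a dict loosely shaped like a schema.
# This is NOT the LinkML metamodel or its generators, only an illustration.
MODEL = {
    "classes": {
        "Person": {
            "attributes": {
                "id": {"range": "string", "required": True},
                "name": {"range": "string", "required": True},
                "age": {"range": "integer"},
            }
        }
    }
}

# One well-identified set of datatypes, mapped once per target format.
JSON_TYPES = {"string": "string", "integer": "integer"}
SQL_TYPES = {"string": "TEXT", "integer": "INTEGER"}


def to_json_schema(model, class_name):
    """Derive a JSON-Schema-like dict for one class (hypothetical helper)."""
    attrs = model["classes"][class_name]["attributes"]
    return {
        "title": class_name,
        "type": "object",
        "properties": {n: {"type": JSON_TYPES[a["range"]]} for n, a in attrs.items()},
        "required": [n for n, a in attrs.items() if a.get("required")],
    }


def to_sql_ddl(model, class_name):
    """Derive a CREATE TABLE statement for one class (hypothetical helper)."""
    attrs = model["classes"][class_name]["attributes"]
    cols = []
    for n, a in attrs.items():
        null = " NOT NULL" if a.get("required") else ""
        cols.append(f"  {n} {SQL_TYPES[a['range']]}{null}")
    return f"CREATE TABLE {class_name} (\n" + ",\n".join(cols) + "\n);"


if __name__ == "__main__":
    print(to_json_schema(MODEL, "Person"))
    print(to_sql_ddl(MODEL, "Person"))
```

Changing an attribute in `MODEL` changes both outputs at once, which is the property I find attractive in the polyglot approach.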

[SOML](https://platform.ontotext.com/semantic-objects/soml/) is a similar initiative developed at Ontotext.

I'd also add that the SPARQL database paradigm runs into trouble because it doesn't have a completely clear model of where a "document" or a "record" begins and ends. If I want to build an everyday boring bizapp I need to be able to delete a "customer record", and that's not as simple as nuking

   :ThatCustomer ?p ?o .
because there could be RDF lists and other hierarchical structures that I consider part of the customer record, alongside related things (say, an order record) that I don't consider part of it.
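
A rough illustration of the problem, using plain Python tuples instead of a real triple store. Everything here is made up for the example: the data, the `naive_delete` and `record_delete` helpers, and the rule "a record is the subject plus whatever is reachable through blank nodes only" (similar in spirit to a concise bounded description). That rule is just one possible definition of a record, which is exactly the part that needs specifying.

```python
# Triples as (subject, predicate, object); names starting with "_:" are blank nodes.
# All data below is invented for illustration.
GRAPH = {
    (":ThatCustomer", ":name", '"Ada"'),
    (":ThatCustomer", ":phones", "_:l1"),        # head of an RDF list
    ("_:l1", "rdf:first", '"555-0100"'),
    ("_:l1", "rdf:rest", "_:l2"),
    ("_:l2", "rdf:first", '"555-0101"'),
    ("_:l2", "rdf:rest", "rdf:nil"),
    (":ThatCustomer", ":placed", ":Order42"),    # link to a separate record
    (":Order42", ":total", '"99.00"'),
}


def naive_delete(graph, subject):
    """Equivalent of deleting every triple matching `subject ?p ?o`."""
    return {t for t in graph if t[0] != subject}


def record_delete(graph, subject):
    """Delete the subject's triples plus anything reachable via blank nodes only.

    Named resources such as :Order42 are treated as separate records and kept.
    """
    doomed, stack = set(), [subject]
    while stack:
        node = stack.pop()
        if node in doomed:
            continue
        doomed.add(node)
        for s, _p, o in graph:
            if s == node and o.startswith("_:"):
                stack.append(o)
    return {t for t in graph if t[0] not in doomed}


if __name__ == "__main__":
    print(sorted(naive_delete(GRAPH, ":ThatCustomer")))   # list cells are orphaned
    print(sorted(record_delete(GRAPH, ":ThatCustomer")))  # only the order survives
```

The naive version leaves the list cells dangling in the store, while the second version only works because I hard-coded an opinion about where the record ends.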

Document databases like ArangoDB give a clearer model for updates and transactions, as do SQL databases with rows. I guess I could write some framework to help, but it seems very dangerous to build an OLTP application on top of a SPARQL database without one. Such a framework would also need some specification of what exactly a "record" is. Does LinkML answer that?

Interesting point. TerminusDB works as a graph of documents and I believe it addresses the issue you are discussing.

LinkML, on the other hand, is a modelling language and does not concern itself with how things are queried once the models are implemented.