by afrisch 2404 days ago
CDuce was the result of my PhD thesis (about 20 years ago); it is mostly a research prototype with enough engineering effort to make it usable for small enough projects. It came after XDuce, which introduced the idea of building a functional language around regular expression types (used to model XML schema languages: DTD, XSD, Relax). My work focused on distilling the theory from XDuce into more primitive constructs from type theory (products, unions, recursion) and embedding them into a more expressive type system and language (with set-theoretic intersection and negation, function types, extensible records -- used to model XML attributes -- etc.), together with a powerful XML pattern matching engine and an efficient implementation of type-checking (just deciding subtyping is in theory exponential in the size of the schema, but works well in practice). The theory could probably serve as the basis of statically-typed languages working on, say, "typed" JSON structures. The work was/is continued by my PhD advisor and other colleagues to include parametric polymorphism (the original CDuce supported only ad hoc overloading polymorphism).
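
To make the "products, unions, recursion" point a bit more concrete, here is a rough sketch in TypeScript rather than CDuce. It is only an analogy: the names (`Author`, `Title`, `AfterAuthor`, `Book`, `countAuthors`) are made up for illustration, and TypeScript has no negation types or regular expression types. It hand-encodes what a regular expression type like "one or more authors followed by a title" could unfold into.

```typescript
// Hypothetical element types; not taken from CDuce or any real schema.
type Author = { tag: "author"; text: string };
type Title = { tag: "title"; text: string };

// "Author+ Title" unfolded by hand:
//   records play the role of products,
//   `|` is the union,
//   AfterAuthor refers to itself (recursion).
type AfterAuthor =
  | { kind: "author"; head: Author; tail: AfterAuthor }
  | { kind: "title"; head: Title };

// A book starts with one mandatory author, then the rest of the sequence.
type Book = { head: Author; tail: AfterAuthor };

// Walk the sequence; the compiler narrows `rest` on the `kind` field.
function countAuthors(book: Book): number {
  let count = 1;
  let rest = book.tail;
  while (rest.kind === "author") {
    count += 1;
    rest = rest.tail;
  }
  return count;
}

const example: Book = {
  head: { tag: "author", text: "A" },
  tail: {
    kind: "author",
    head: { tag: "author", text: "B" },
    tail: { kind: "title", head: { tag: "title", text: "T" } },
  },
};

const authorCount = countAuthors(example);
```

A value with zero authors, or with a title in the middle, simply cannot be written at type `Book`, which is the kind of constraint a schema would impose.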

The idea was just that if your language could directly express constraints on your document types in its native type system, the compiler could statically type-check complex transformations and make sure they produce documents that conform to the expected output schema (assuming the input complies with the announced input schema). This is more direct than relying on mappings between XML and "native" data types, which (usually) don't fully preserve the constraints imposed by XML schema languages and are themselves tedious and fragile to write. This works well for XML->XML transformations. Of course, in most applications, XML parsing and/or generation is just a tiny part, which shouldn't affect the choice of an implementation language. With OCamlDuce, I explored the idea of extending OCaml to include XML types. The combination felt a bit ad hoc, but was ok. Today, it could indeed be rebuilt around PPX extension points + some type-checking hooks in the OCaml compiler.
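
For readers who want to see the "check the transformation against the input and output schemas" idea in action, here is a small sketch, again in TypeScript on JSON-like data rather than in CDuce on XML. The schemas and names (`AddressBook`, `Directory`, `toDirectory`) are invented for the example. The output type requires at least one email per contact, which the input type does not guarantee, so the compiler only accepts the function if it establishes that fact.

```typescript
// Hypothetical input schema: a person may have zero or more emails.
type Person = { name: string; emails: string[] };
type AddressBook = { people: Person[] };

// Hypothetical output schema: every contact has at least one email.
// The tuple type [string, ...string[]] means "one string, then any number more".
type Contact = { name: string; emails: [string, ...string[]] };
type Directory = { contacts: Contact[] };

function toDirectory(book: AddressBook): Directory {
  const contacts: Contact[] = [];
  for (const p of book.people) {
    const [first, ...rest] = p.emails;
    // Skip people without any email; they cannot appear in the output schema.
    if (first !== undefined) {
      // Writing `emails: p.emails` here is rejected by the type checker,
      // because string[] does not guarantee a first element.
      contacts.push({ name: p.name, emails: [first, ...rest] });
    }
  }
  return { contacts };
}

const directory = toDirectory({
  people: [
    { name: "a", emails: [] },
    { name: "b", emails: ["b@example.org", "b2@example.org"] },
  ],
});
```

This is much weaker than what a set-theoretic type system with regular expression types can express, but it shows the direction: the output constraint lives in the type, and the transformation is checked against it at compile time.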