Hacker News
by HelloNurse 1497 days ago
It helps if this machinery can reject data and thus perform validation. Since recursive construction of union types (valid records can look like this, or also like that...) is trivial, a programmer somewhere has to draw the line between "loosen the schema to allow this record" and "reject this record to enforce the schema".
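To make the "loosen or reject" choice concrete, here is a minimal Python sketch. It assumes JSON-like records and treats a schema as a plain set of type descriptions (a union). The names `type_of`, `accepts`, and `loosen` are hypothetical, invented for illustration, and do not come from the article or from any library.

```python
from typing import Any, FrozenSet


def type_of(value: Any) -> str:
    """Describe a JSON-like value's type as a string, recursing into containers."""
    if isinstance(value, dict):
        # Sort by field name so that field order does not change the type.
        items = sorted(value.items(), key=lambda kv: kv[0])
        return "{" + ", ".join(f"{k}: {type_of(v)}" for k, v in items) + "}"
    if isinstance(value, list):
        # A list's element type is the union of the types of its elements.
        return "[" + " | ".join(sorted({type_of(v) for v in value})) + "]"
    return type(value).__name__


def accepts(schema: FrozenSet[str], record: Any) -> bool:
    """Enforce the schema: a record is valid only if its type is already in the union."""
    return type_of(record) in schema


def loosen(schema: FrozenSet[str], record: Any) -> FrozenSet[str]:
    """Loosen the schema: add the record's type as one more member of the union."""
    return schema | {type_of(record)}


schema = frozenset({type_of({"id": 1})})
odd_record = {"id": "abc"}

print(accepts(schema, odd_record))                      # enforce: reject it
print(accepts(loosen(schema, odd_record), odd_record))  # or loosen: accept it
```

Growing the union is a one-liner, which is the point: nothing in the machinery decides between `accepts` and `loosen`, so someone has to.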

Author here. Agreed! Validation is important. While I didn't make this point in the article, our thinking is that schema validation does not require the serialization format to use schemas as its building block: you can always implement schema (or type) validation (and versioning) on top of super-structured data, as can also be done with document databases.
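As a rough illustration of validation layered on top of self-describing data, here is a small Python sketch. It assumes records arrive as plain dicts that carry their own structure, and that the expected shape is kept separately as a mapping from field name to allowed Python type(s). The function `validate` and the sample `EXPECTED` mapping are hypothetical and are not part of any real format or library.

```python
from typing import Any, Dict, List


def validate(record: Dict[str, Any], expected: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    errors = []
    for name, allowed in expected.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], allowed):
            errors.append(f"bad type for {name}: {type(record[name]).__name__}")
    return errors


# The expected shape lives outside the data, so it can be versioned on its own.
EXPECTED = {"id": int, "name": str}

records = [{"id": 1, "name": "a"}, {"id": "1"}]
accepted = [r for r in records if not validate(r, EXPECTED)]
rejected = [r for r in records if validate(r, EXPECTED)]

print(accepted)
print(rejected)
```

The records can be read and stored without `EXPECTED` ever existing; the check is an optional pass applied afterwards, which is the separation described above.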
This is a major hassle when converting from Avro (from Kafka, which uses a schema registry, so schemas are not shipped with the Avro data) and storing in Parquet, which requires a schema in the file, though you can 'upgrade' it with another schema when reading it. It would be great to have a binary, protocol-like format (schema-less Avro) and a schema-less columnar storage format, which I guess is what these guys are doing.
Hear, hear!