| I see lots of questions about the problems with TLV schemes. I should have listed them last night. First, some generic problems with TLV encodings:

- they necessarily produce redundant encodings -- this is wasteful bloat
- that redundancy is of zero help to a compiler
- that redundancy is a psychological crutch for any programmer hand-coding codecs, and it has often led to serious bugs
- tag allocation has to be managed, and here again you really want a compiler to do it for you -- ASN.1 eventually added AUTOMATIC tags, but by then the damage of not having had them was done
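
To make the redundancy point concrete, here is a minimal sketch that encodes the same pair of integers with hand-rolled DER-style TLV and with XDR. The helper names (`der_len`, `der_integer`, `der_sequence`, `xdr_pair`) are made up for illustration and don't come from any library, and the schema is assumed to be a struct of two 32-bit integers.

```python
import struct


def der_len(n: int) -> bytes:
    """DER definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_integer(v: int) -> bytes:
    """DER INTEGER (tag 0x02) with minimal two's-complement content."""
    magnitude = v if v >= 0 else ~v
    size = magnitude.bit_length() // 8 + 1  # one extra bit for the sign
    return b"\x02" + der_len(size) + v.to_bytes(size, "big", signed=True)


def der_sequence(*members: bytes) -> bytes:
    """DER SEQUENCE (tag 0x30) wrapping already-encoded members."""
    body = b"".join(members)
    return b"\x30" + der_len(len(body)) + body


def xdr_pair(a: int, b: int) -> bytes:
    """XDR: two 32-bit big-endian signed ints, no tags and no lengths."""
    return struct.pack(">ii", a, b)


tlv = der_sequence(der_integer(1), der_integer(2))
xdr = xdr_pair(1, 2)

print(tlv.hex())                  # 3006020101020102
print(xdr.hex())                  # 0000000100000002
print(struct.unpack(">ii", xdr))  # (1, 2), decoded from the schema alone
```

In the TLV output, three of the eight octets are tags that a decoder generated from the schema already knows, and the outer SEQUENCE length is derivable from its members. Total size is not the point of this particular example (DER's variable-length integers happen to keep it as short as the XDR form); the point is that the tag octets tell a schema-aware decoder nothing new.
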
Next, some problems specific to DER-like definite-length TLV encoding rules:

- streaming encoding is infeasible -- you have to know the definite lengths before you start encoding, so you lose
- you either have to compute the length of the encoding of any value before you begin encoding it, or you have to encode "back to front" (and then possibly realloc as needed), or both
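
The "back to front" approach can be sketched roughly like this. It's a minimal illustration under assumptions: `BackwardWriter` and the other names are hypothetical, only SEQUENCE (0x30) and OCTET STRING (0x04) are handled, and the buffer regrowth is a Python stand-in for what a C encoder would do with realloc.

```python
class BackwardWriter:
    """Fills a buffer from the end toward the front, growing when needed."""

    def __init__(self, capacity: int = 16) -> None:
        self.buf = bytearray(capacity)
        self.pos = capacity  # index of the first used byte

    def written(self) -> int:
        return len(self.buf) - self.pos

    def prepend(self, data: bytes) -> None:
        if len(data) > self.pos:  # no room in front: the "realloc" case
            used = self.written()
            new_cap = max(2 * len(self.buf), used + len(data))
            new_buf = bytearray(new_cap)
            new_buf[new_cap - used:] = self.buf[self.pos:]
            self.buf, self.pos = new_buf, new_cap - used
        self.pos -= len(data)
        self.buf[self.pos:self.pos + len(data)] = data

    def getvalue(self) -> bytes:
        return bytes(self.buf[self.pos:])


def der_len(n: int) -> bytes:
    """DER definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def put_header(w: BackwardWriter, tag: int, content_len: int) -> None:
    """Prepend length, then tag, so they end up in tag-length order."""
    w.prepend(der_len(content_len))
    w.prepend(bytes([tag]))


def encode_sequence_of_octet_strings(items: list[bytes]) -> bytes:
    w = BackwardWriter(capacity=8)
    for item in reversed(items):           # last member goes in first
        w.prepend(item)
        put_header(w, 0x04, len(item))     # OCTET STRING
    put_header(w, 0x30, w.written())       # SEQUENCE length is known only now
    return w.getvalue()


print(encode_sequence_of_octet_strings([b"ab", b"cde"]).hex())
# 300904026162040363646
```

Each content length is known by the time its header is written, so no separate length-computation pass is needed. The cost is that nothing can be sent until the outermost header has been written, which is why streaming loses.
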
There's more, but I'm not too familiar with the issues around CER-like indefinite-length encodings.

Bottom line: TLV is an unnecessary crutch. Compilers simply don't need it. As an existence proof, consider that Sun's rpcgen(1) existed in 1986, a mere two years after ASN.1's 1984 standard, and rpcgen(1) uses XDR syntax and encoding -- XDR is NOT a TLV encoding at all. But ASN.1 tooling (proprietary and open source) took much longer to catch up with XDR and IDL/NDR and other things. It's almost as if TLV encodings made it harder to get to compilation because they were a crutch for hand-coding codecs. But even XDR is easy to hand-write codecs for!

BTW, XDR and NDR were basically the first flatbuffers-like encodings. Lustre RPC has an even more flatbuffers-like encoding, but it's hand-coded. There's just nothing new in this space, and there hasn't really been anything new in it in many years.

> At least in file formats it to me seem they would be instrumental to have a extendible and flexible format, where you can skip unknown or uninteresting chunks (in say, PNG chunks, or IFF-based formats like OBJ, etc.).

TLV is NOT necessary for this sort of extensibility. You naturally end up with something like TLV when using non-TLV encodings with support for extensibility, though it's often more like LTV.

Let's say you have a struct you want to make extensible in some non-TLV encoding you're designing... What would you do? Well, knowing ASN.1's PER/OER and knowing how we've dealt with this in XDR, I would do this: add an octet string field to the end of every extensible struct! What would that octet string contain? The encoding of the extensions.

What if you want to support different kinds of extensions in a mix-and-match way? Well, that's easy too: add a discriminated union or "typed hole" to the end of every extensible struct, with every choice taken having a Length prepended to it so you can skip it.
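
Here is a rough sketch of the simpler of the two ideas, the trailing octet string, in an XDR-style encoding. Everything in it is an assumption made for illustration: a hypothetical `Point` struct with `x` and `y`, a later version that adds `z` as an extension, and made-up function names. The only XDR rule relied on is that variable-length opaque data is a 4-byte length followed by the bytes, padded to a multiple of four.

```python
import struct


def xdr_opaque(data: bytes) -> bytes:
    """XDR variable-length opaque: 4-byte length, data, zero padding to 4."""
    pad = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def read_xdr_opaque(buf: bytes, off: int) -> tuple[bytes, int]:
    """Return (data, offset just past the padded opaque)."""
    (n,) = struct.unpack_from(">I", buf, off)
    start, end = off + 4, off + 4 + n
    if end > len(buf):
        raise ValueError("truncated opaque")
    return buf[start:end], end + (-n) % 4


def encode_point_v1(x: int, y: int) -> bytes:
    # Version 1 knows no extensions, so the trailing opaque is empty.
    return struct.pack(">ii", x, y) + xdr_opaque(b"")


def encode_point_v2(x: int, y: int, z: int) -> bytes:
    # Version 2 puts the encoding of its extension (z) in the opaque.
    return struct.pack(">ii", x, y) + xdr_opaque(struct.pack(">i", z))


def decode_point_v1(buf: bytes, off: int = 0):
    """Old decoder: reads x and y, then skips whatever the extension holds."""
    x, y = struct.unpack_from(">ii", buf, off)
    _ignored, off = read_xdr_opaque(buf, off + 8)
    return (x, y), off


def decode_point_v2(buf: bytes, off: int = 0):
    """New decoder: also understands z, defaulting to 0 when it is absent."""
    x, y = struct.unpack_from(">ii", buf, off)
    ext, off = read_xdr_opaque(buf, off + 8)
    z = struct.unpack_from(">i", ext, 0)[0] if len(ext) >= 4 else 0
    return (x, y, z), off


# A v2 point followed by another field; the v1 decoder still finds that field.
msg = encode_point_v2(1, 2, 3) + struct.pack(">i", 99)
point, off = decode_point_v1(msg)
print(point, struct.unpack_from(">i", msg, off)[0])  # (1, 2) 99
```

The length in front of the extension bytes is what lets the old decoder skip them, which is the LTV-ish shape mentioned above. The mix-and-match variant would replace the single opaque with a sequence of (type id, length, bytes) entries.
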
Extensibility is something that has been beaten to death in the ASN.1 space, and it has all of these options:

- extensibility markers in SEQUENCE / SET types (i.e., "struct" types)
- extensibility markers in CHOICE types (i.e., discriminated union types)
- extensibility markers in INTEGER and BIT STRING constraints (i.e., enum types)
- rules for handling known and unknown extensions in each ER (encoding rules)
- typed holes

A typed hole is just a glorified discriminated union with an "external" sort of discriminant and specification of the union arms' types. Basically, a typed hole is just a struct with two fields: a) a type identifier of some sort (an integer, a string, an OID, a relative OID, whatever), b) an octet string containing an encoding of the value of a type identified by (a). ASN.1 has syntax and semantics for expressing what type IDs go with what types, and so you can actually have compilers that recursively and automatically decode/encode through typed holes.

> Do you feel that the same doesn't apply to serialisation formats? How are the non-tlv binaries encoded then? Just implied offsets according to the schema? Can you then evolve the schema at all, or do you feel that both producer and consumer should have always access to the full schema, and flexiblity here is a non-feature?

I address this above. This is all addressed in ASN.1 (and also in XML, because of XMLNS). Many very smart people who came before you and me saw to it that ASN.1 addressed all these issues definitively long ago. |