| Firstly, `parsing` is just a way to say "deserialise from a string". The reverse operation, serialising to a string, can be written for every type you create. If that reverse operation does not exist in the interface, then adding it gives you a single place to catch all the bugs. I'm thinking of that recent git bug that occurred because the round-trip of
`string -> type -> string` had an error (stripping out the CR character). Using a specific type for a value that is being round-tripped means that a bugfix only needs to be made in the parser function. Storing the value as a plain string would mean putting your fix everywhere that string is handled.

> The trouble I have with this approach (which, conceptually, I agree with) is that it's damned hard to do anything with the parse results.

You're right, it is damn hard, but that is on purpose: if you're doing something with the email that boils down to "treat it like a `char *`" then the potential for error is large. If you're forced to add a new use-case to the `email_t` interface, then you have reduced the space of potential errors. For example:

> Want to print that `email_t`? Then you're right back to `char *`, unless you somehow write your own I/O system that knows about your opaque conventions.

That is a bug waiting to surface, because it's an email, not a string, and if you decide to print an email that was read as a `char *` you might not get what you expect.

It's all a trade-off: if you want more flexibility with the value stored in a variable, then sure, you can have it, but it comes at a cost: some code somewhere will almost certainly end up using that flexibility to mismatch the type! If you want to prevent type mismatches, then a lot of flexibility goes out the window. |
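To make the `email_t` argument concrete, here is a minimal sketch of what such an interface could look like. Everything in it is an assumption for illustration: the function names (`email_parse`, `email_to_string`, `email_print`, `email_free`), the struct layout, and the deliberately naive "exactly one `@`" rule are mine, not something from the thread or from any real library. The struct is shown in full so the snippet compiles on its own; in a real project only the `typedef` would live in the header.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical opaque type: callers would only ever see this typedef. */
typedef struct email email_t;

struct email {
    char *local;   /* part before the '@' */
    char *domain;  /* part after the '@'  */
};

/* Copy n bytes of s into a fresh NUL-terminated buffer. */
static char *copy_n(const char *s, size_t n)
{
    char *out = malloc(n + 1);
    if (out != NULL) {
        memcpy(out, s, n);
        out[n] = '\0';
    }
    return out;
}

/* Parse: string -> type. Deliberately simplistic (exactly one '@', both
 * sides non-empty); real address validation is far more involved.
 * Returns NULL on rejected input or allocation failure. */
email_t *email_parse(const char *text)
{
    const char *at;
    email_t *e;

    if (text == NULL)
        return NULL;
    at = strchr(text, '@');
    if (at == NULL || at == text || at[1] == '\0' ||
        strchr(at + 1, '@') != NULL)
        return NULL;

    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->local = copy_n(text, (size_t)(at - text));
    e->domain = copy_n(at + 1, strlen(at + 1));
    if (e->local == NULL || e->domain == NULL) {
        free(e->local);
        free(e->domain);
        free(e);
        return NULL;
    }
    return e;
}

/* Serialise: type -> string. Returns the length that was needed
 * (snprintf semantics), so the caller can detect truncation. */
int email_to_string(const email_t *e, char *buf, size_t size)
{
    assert(e != NULL);
    return snprintf(buf, size, "%s@%s", e->local, e->domain);
}

/* The one sanctioned way to print an email_t. */
int email_print(const email_t *e, FILE *out)
{
    assert(e != NULL);
    return fprintf(out, "%s@%s", e->local, e->domain);
}

/* Release an email_t obtained from email_parse; NULL is allowed. */
void email_free(email_t *e)
{
    if (e != NULL) {
        free(e->local);
        free(e->domain);
        free(e);
    }
}
```

With something like this, a round-trip bug of the CR-stripping kind would have exactly two places to hide, `email_parse` and `email_to_string`, and printing goes through `email_print` instead of falling back to a bare `char *`.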
“Serialization” is the act of taking an internal data structure (of whatever shape and depth) and outputting it for transmission or storage. The opposite is “deserialization,” restoring the original shape and depth.
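As a small illustration of that definition, the sketch below round-trips a nested structure through a text form. The `segment` and `point` types, the function names, and the `x,y;x,y` wire format are all made up for the example; it only aims to show "shape and depth" being flattened and then restored.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical nested structure, used only for illustration. */
struct point { int x, y; };
struct segment { struct point from, to; };

/* Serialise: structure -> text. Returns the length that was needed
 * (snprintf semantics). */
int segment_serialise(const struct segment *s, char *buf, size_t size)
{
    assert(s != NULL);
    return snprintf(buf, size, "%d,%d;%d,%d",
                    s->from.x, s->from.y, s->to.x, s->to.y);
}

/* Deserialise: text -> structure. Returns 0 on success and -1 on
 * malformed input; *out is only written on success. */
int segment_deserialise(const char *text, struct segment *out)
{
    struct segment tmp;
    char trailing;

    /* The final %c only matches if something follows the fourth
     * number, so a result other than 4 means missing or extra data. */
    if (sscanf(text, "%d,%d;%d,%d%c", &tmp.from.x, &tmp.from.y,
               &tmp.to.x, &tmp.to.y, &trailing) != 4)
        return -1;
    *out = tmp;
    return 0;
}
```

Serialising a segment and feeding the result back into `segment_deserialise` should give back the same four coordinates, which is the round-trip property the discussion above cares about.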