by justinpombrio
1820 days ago
Haha, "Those symbols need not be text", you say, right after quoting a definition that says they need to be "a string or text"! There's a field of study called "parsing", which studies "parsers". Hundreds of papers. Very well-defined problem: turning a list of symbols into a tree-shaped parse tree (or data structure). The defining aspect of parsing, the thing that makes it difficult and interesting to study, is that you're starting with a list and ending with a tree. If you're converting tree to tree (that is, a typical data structure to a typical data structure), all the problems vanish (or change drastically) and all the parsing techniques are inapplicable. I'm kind of annoyed that people are starting to use the word "parse" metaphorically. Bit by bit, precise words turn fuzzy. Alas, it will be a lost battle.
Voila, now your 'string' is 'binary data', not 'text'.
Parsing binary data is my bread and butter, so I might be biased, but it works fine.
Anything which comes over the wire is a string, anything which comes out of store is a string. If you're using something like protobufs, that's great, because having to marshal/serialize/parse along every process boundary is expensive and probably unnecessary.
But at some point, and anywhere on the 'surface' of the system, data has to be un-flattened into a shape. That's parsing.
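To make "un-flattened into a shape" concrete, here is a minimal sketch in Python. The wire format is made up for illustration (the `LEAF` and `NODE` tags and the `parse` function are assumptions, not any real protocol): a flat run of bytes goes in, a nested list comes out.

```python
import struct

# Hypothetical wire format, invented for this example:
LEAF = 0x01  # tag byte, then a 4-byte big-endian unsigned integer
NODE = 0x02  # tag byte, then a 1-byte child count, then that many values


def parse(buf, pos=0):
    """Parse one value starting at buf[pos]; return (tree, next_pos)."""
    tag = buf[pos]
    pos += 1
    if tag == LEAF:
        (value,) = struct.unpack_from(">I", buf, pos)
        return value, pos + 4
    if tag == NODE:
        count = buf[pos]
        pos += 1
        children = []
        for _ in range(count):
            # Each child is parsed recursively; this is where the flat
            # byte sequence turns into a tree.
            child, pos = parse(buf, pos)
            children.append(child)
        return children, pos
    raise ValueError(f"unknown tag {tag:#04x} at offset {pos - 1}")


# A flat sequence of bytes, as it might arrive over the wire.
wire = bytes([NODE, 2, LEAF, 0, 0, 0, 7, NODE, 1, LEAF, 0, 0, 0, 9])
tree, end = parse(wire)
print(tree)  # [7, [9]]
```

The input is a list of symbols (bytes) and the output is a tree, with recursive descent doing the work, which is the same list-to-tree problem as with text.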