by edd_dumbill 5544 days ago
Line-based processing is still important! This work reminds me of an article I wrote 11 years ago covering Sean McGrath's work on PYX—a line-based format for XML—see http://www.xml.com/pub/a/2000/03/15/feature/index.html.
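For anyone who hasn't seen PYX, here is a rough sketch of the idea in Python: each parsing event goes on its own line, tagged by its first character (`(` start tag, `A` attribute, `-` character data, `)` end tag). The function names and the exact escaping rules below are my own assumptions for illustration, not Sean McGrath's implementation, and processing instructions are ignored.

```python
import xml.etree.ElementTree as ET


def escape(text):
    # Assumed escaping: keep every event on one physical line.
    return text.replace("\\", "\\\\").replace("\n", "\\n").replace("\t", "\\t")


def to_pyx(elem):
    """Return PYX-style lines for an element and its descendants."""
    lines = ["(" + elem.tag]
    for name, value in elem.attrib.items():
        lines.append("A" + name + " " + escape(value))
    if elem.text:
        lines.append("-" + escape(elem.text))
    for child in elem:
        lines.extend(to_pyx(child))
        # Text that follows a child's end tag belongs to the parent.
        if child.tail:
            lines.append("-" + escape(child.tail))
    lines.append(")" + elem.tag)
    return lines


def xml_to_pyx(xml_string):
    return to_pyx(ET.fromstring(xml_string))


if __name__ == "__main__":
    print("\n".join(xml_to_pyx('<a href="x">hi<b/>\nbye</a>')))
```

Once the document is in this shape, something like `grep '^(' file.pyx` lists every start tag, which is the whole appeal.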

That work in turn derived from Charles Goldfarb's work on SGML (ISO 8879), in particular ESIS, dating from 1989.

We'll always be downsampling to something we can use with sed, grep and awk. They're too handy not to.

1 comment

Does line-based processing still hold up? I tend to use it less and less these days, in favor of tools that process records instead of lines. There's only so much you can meaningfully store in a line of text, there is no standardized parsing, and it has all kinds of escaping issues if you have fields with embedded newlines / separators.
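To make the escaping problem concrete, here is a small Python sketch, using made-up sample data, that writes three records as CSV and then reads them back two ways: by naively splitting on newlines and commas, and with a record-aware parser. The helper names are hypothetical.

```python
import csv
import io

# Made-up records: one field has an embedded newline, another an embedded comma.
ROWS = [
    ["id", "message"],
    ["1", "first line\nsecond line"],
    ["2", "a,b"],
]


def serialize(rows):
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def naive_parse(data):
    # What a line-oriented pipeline effectively does: one record per line.
    return [line.split(",") for line in data.splitlines()]


def record_parse(data):
    # The csv module understands quoting, so records may span several lines.
    return list(csv.reader(io.StringIO(data, newline="")))


if __name__ == "__main__":
    data = serialize(ROWS)
    print(len(naive_parse(data)), "records from naive splitting")
    print(len(record_parse(data)), "records from csv.reader")
```

The naive version reports four "records" for three rows and splits the last row into three fields, while the record-aware parser round-trips the data unchanged.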

(FYI even syslog is moving from a strictly line-based format to a more structured one, RFC 5424/5425)