by textmode
2878 days ago
When processing HTML, XML or JSON with sed, I have often used tr to reformat the input into something sed-friendly (e.g., delete newlines, add a non-printable delimiter, then replace the delimiter with a newline). However, an easy alternative to tr for this is flex. As an example, below is a one-off/reusable HTML/XML reformatter in flex. It makes HTML/XML easier for me to read, and it makes the output very easy to process with sed and other line-based utilities.
ftp -4o 1.xml http://web.archive.org/web/20130814000845/http://zombofant.net/blog/ |a.out |less
flex -8iCrfa 038.l
cc -static lex.yy.c
cat 038.l
%{
#define echo ECHO
#define jmp BEGIN
#define nl putchar(10)
#define ind fputs("\40\40\40",stdout)
%}
%s xa xb
xa \11|\40
%%
^\x0d\x0a jmp xb;
\<{xa}*script nl;ind;echo;jmp xa;
<xa>\<{xa}*\/script{xa}*\> echo;jmp xb;
<xa>{xa}{xa}* putchar(32);
<xa>. echo;
<xb>\< nl;ind;echo;
<xb>\> echo;
<xb>{xa}{xa}* putchar(32);
<xb>. echo;
.|\n
%%
int main(){ yylex(); return 0;}
int yywrap(){ nl; return 1;}