I usually write a test exerciser for the library/command and a "known good output" file, then diff the exerciser's output against the known good.
Will take a look at this for certain.
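
For anyone who wants to try the exerciser-plus-known-good approach, here is a rough sketch in Python. The function names (`run_exerciser`, `diff_against_known_good`) and the idea of passing the exerciser as a command list are my own assumptions for illustration, not anything from a particular tool; it only uses the standard `subprocess` and `difflib` modules.

```python
import difflib
import subprocess
from pathlib import Path


def run_exerciser(cmd):
    """Run the exerciser command (hypothetical; a list of args) and return its stdout."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def diff_against_known_good(actual, known_good_path):
    """Return unified-diff lines; an empty list means the output matches."""
    expected = Path(known_good_path).read_text()
    return list(
        difflib.unified_diff(
            expected.splitlines(keepends=True),
            actual.splitlines(keepends=True),
            fromfile=str(known_good_path),
            tofile="actual output",
        )
    )
```

Typical use would be something like `diff = diff_against_known_good(run_exerciser(["./exerciser"]), "known_good.txt")`, where both the command and the file name are placeholders, then failing the test if `diff` is non-empty and printing it so the mismatch is visible. When a change in output is intentional, you regenerate the known good file and review that change like any other.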