Testing was mostly manual, using a test corpus we generated. It's not perfect, but it's pretty close for most files we've seen.