You can use the filter tool to set up rules that run in sequence and apply labels to emails from certain senders. Having labels makes searching and sorting much easier.
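You could then clean up that file in a few ways: remove duplicates by hand; use an A2=A1 type formula in a spreadsheet and fill it down to flag dupes (sort the column first, since the formula only catches duplicates that sit next to each other); or copy the relevant column to a text file and run it through sort and uniq on *nix: http://linux.about.com/library/cmd/blcmdl1_uniq.htm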
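If you'd rather script it, here's a rough Python sketch of the same two ideas: the "compare each row to the one above it" spreadsheet check, and the sort-then-uniq pipeline. The function names and the sample addresses are made up for illustration, and it assumes you've already pulled the relevant column into a list of strings.

```python
def find_adjacent_duplicates(values):
    """Sort the values, then return every entry that equals the one before it.

    This mirrors filling an A2=A1 formula down a sorted spreadsheet column.
    """
    ordered = sorted(values)
    return [cur for prev, cur in zip(ordered, ordered[1:]) if cur == prev]


def sort_uniq(values):
    """Return the sorted values with repeats collapsed, like `sort | uniq`."""
    return sorted(set(values))


# Hypothetical sample column; in practice you'd read this from your file.
senders = [
    "bob@example.com",
    "alice@example.com",
    "bob@example.com",
    "carol@example.com",
    "bob@example.com",
]

dupes = find_adjacent_duplicates(senders)
unique_senders = sort_uniq(senders)
```

Here `dupes` holds the repeated entries (one per extra occurrence) and `unique_senders` is the cleaned-up list. Note that Python's default string sort may order things slightly differently than a locale-aware `sort` would, though the set of unique values comes out the same.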
And then GPLv3 your script, put it on GitHub, use a default GitHub template to create a nice looking site for it, and post the link back to HN with the cool doc saying how to use it.