It's in a very rough state, but I have been working on something similar. The idea is to build a superoptimizer by exhaustive enumeration of instruction sequences. In a first pass, each sequence is executed on some test machine states, and bucketed by the hash of the results. On the second pass, we use a theorem prover to check the equivalence of the sequences.