Thank you for freeing me from one of my to-do projects. I wanted to do a similar autoencoder with optimisations. Did you write about it anywhere? I'd love to read the details.
https://github.com/SuperOptimizer/supercompiler
There's code there to generate unoptimized/optimized pairs via C program generators like yarpgen and csmith, then compile, train, run inference, and disassemble the results.
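For anyone wondering what the pair-generation step might look like, here is a rough sketch. It is not taken from the repo. It assumes a C file already produced by csmith or yarpgen, and it assumes gcc and GNU objdump are on the PATH. The function names and the choice of -O0 vs -O3 are made up for illustration.

```python
import subprocess
from pathlib import Path


def compile_cmd(src, obj, opt_level, include_dirs=()):
    """Build a gcc command that compiles src to an object file at one -O level."""
    cmd = ["gcc", f"-{opt_level}", "-c", str(src), "-o", str(obj)]
    for d in include_dirs:
        # Generated programs may need a runtime header directory (assumption).
        cmd += ["-I", str(d)]
    return cmd


def disassemble_cmd(obj):
    """Build an objdump command that prints disassembly without raw bytes."""
    return ["objdump", "-d", "--no-show-raw-insn", str(obj)]


def make_pair(src, workdir, include_dirs=()):
    """Return (unoptimized, optimized) disassembly text for one C source file."""
    src, workdir = Path(src), Path(workdir)
    listings = {}
    for level in ("O0", "O3"):
        obj = workdir / f"{src.stem}.{level}.o"
        subprocess.run(compile_cmd(src, obj, level, include_dirs), check=True)
        result = subprocess.run(
            disassemble_cmd(obj), check=True, capture_output=True, text=True
        )
        listings[level] = result.stdout
    return listings["O0"], listings["O3"]
```

Calling make_pair("prog.c", "build") would give two disassembly listings of the same program, which is roughly the kind of pair a model could be trained on.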