Well, no, but yes.

The critical piece is that this can be done in training. If I collect a large number of C programs from GitHub and compile them (in a deterministic fashion), I can split the results into training, validation, and test sets. The output of the ML ought to compile to the same binary given the same environment. Indeed, I can train over multiple deterministic build environments (e.g. different compilers, different compiler flags) to be even more robust.

The second critical piece is that for something like a GAN, the output doesn't need to be identical. You have two ML algorithms competing:

- One is trying to identify generated versus ground-truth source code
- One is trying to generate source code

Virtually all ML tasks are trained without requiring an exact match, and it doesn't matter. If I have images and descriptions, all the ML needs to do is generate an indistinguishable description.

So if I give the poster a lot more benefit of the doubt on what they wanted to say, it can make sense.
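To make the "compiles the same way in the same environment" check concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `round_trip_score` and the `Compiler` type are hypothetical names, and the two toy "compilers" are stand-ins for real deterministic builds (in practice each would wrap a pinned compiler and flag set).

```python
import hashlib
from typing import Callable, Iterable

# Hypothetical type: one deterministic build environment, mapping source text
# to the bytes it compiles to. A real one would invoke a pinned compiler.
Compiler = Callable[[str], bytes]


def digest(blob: bytes) -> str:
    """Hash build output so two builds can be compared cheaply."""
    return hashlib.sha256(blob).hexdigest()


def round_trip_score(candidate_source: str,
                     original_source: str,
                     environments: Iterable[Compiler]) -> float:
    """Fraction of build environments in which the generated source
    compiles to exactly the same bytes as the ground-truth source."""
    envs = list(environments)
    if not envs:
        raise ValueError("need at least one build environment")
    matches = sum(
        digest(env(candidate_source)) == digest(env(original_source))
        for env in envs
    )
    return matches / len(envs)


# Toy stand-ins for compilers (assumptions, not real toolchains):
def toy_strip_whitespace(source: str) -> bytes:
    """Ignores all whitespace, so formatting differences vanish."""
    return "".join(source.split()).encode()


def toy_verbatim(source: str) -> bytes:
    """Keeps every character, so any textual difference shows up."""
    return source.encode()


if __name__ == "__main__":
    score = round_trip_score("int  x;", "int x;",
                             [toy_strip_whitespace, toy_verbatim])
    print(f"matched in {score:.0%} of environments")
```

Scoring over several environments is the point of training across different compilers and flags: source that only matches under one build setup gets a lower score than source that matches under all of them.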
If what they're actually saying is that it's possible to train a model to low loss and then you just have to trust the results, yes, what you say makes sense.