Why would the ability to generate source code imply the ability to generate bytecode? Also you wouldn’t want that, humans can’t review bytecode. I think you may be taking the metaphor too literally.
Because the semantics of each construct in a programming language map pretty much 1:1 onto a sequential, logic-based ordering of instructions in bytecode (which is still code).
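To make the mapping concrete, here's a rough sketch using CPython's standard `dis` module on a trivial function. The function `add_one` is just a made-up example, and the exact opcode names differ between CPython versions, so treat the printed listing as illustrative rather than canonical.

```python
import dis


def add_one(x):
    # Hypothetical example function: one load, one constant, one add, one return.
    return x + 1


# Collect (opcode name, argument) pairs for the compiled function.
# Opcode names are CPython-version specific, so no exact listing is assumed here.
instructions = [(ins.opname, ins.argrepr) for ins in dis.get_instructions(add_one)]

for opname, argrepr in instructions:
    print(opname, argrepr)
```

For something this small, each piece of the source expression lines up with a handful of instructions. Optimizing compilers blur that picture, so the "pretty much 1:1" holds best for simple, unoptimized bytecode.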
> Also you wouldn’t want that, humans can’t review bytecode
The one great thing about automation (and formalism) is that you don't have to continuously review it. You vet it once, then you add another mechanism that monitors for wrong output/behavior. And now, the human is free for something else.
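A minimal sketch of what "vet once, then monitor" could look like, assuming the generated artifact is exposed as a plain callable. The names `monitored` and `is_sorted_permutation` are hypothetical and only there for illustration; a real setup would more likely monitor at the process or service level.

```python
from collections import Counter
from typing import Any, Callable, Sequence


def monitored(fn: Callable[..., Any],
              check: Callable[[Sequence[Any], Any], bool]) -> Callable[..., Any]:
    """Wrap fn so every result is checked against an invariant before being returned."""
    def wrapper(*args: Any) -> Any:
        result = fn(*args)
        if not check(args, result):
            # The monitor flags bad output instead of a human re-reviewing the code.
            raise RuntimeError(f"monitor rejected output of {fn!r} for args {args!r}")
        return result
    return wrapper


def is_sorted_permutation(args: Sequence[Any], result: Any) -> bool:
    """Invariant: the output is in order and contains exactly the input items."""
    (items,) = args
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and Counter(items) == Counter(result)


# Stand-in for an unreviewable generated routine; here it is just the builtin sort.
safe_sort = monitored(sorted, is_sorted_permutation)
print(safe_sort([3, 1, 2]))
```

The check is far cheaper to write and to review than the thing it guards, which is the whole point: the human reviews the invariant, not the bytecode.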
I don't think they are... LLMs can learn from anything that's been tokenized. Feed in enough decompiled, labeled data alongside the bytecode and it's likely the machine will be able to dump out an executable. I wouldn't be surprised if an LLM could output a valid ELF right now, except that the relevant tokens may have been stripped in pretraining.