| I said nothing about writing any program. I explicitly said I don't care what AIs can output. It can read and devour every program. Imagine a database where all possible programs are stored. Any github project, any private repository of source code is reduced to an LLM pattern and stored away. We can now generalize the meaning of integers and capture every single program that has been written and that you could write. I'm not going to rewrite my comment a third time. The idea is plainly explained. LLMs can map out and store all possible programs. Think about the structural implications of this fact. If you make money from programming, you won't like where it leads. What AIs generate is irrelevant and non-theeatening. What AIs are capable of storing, is a threat to programming. A "complete map". I'm about to set my mongoose on the snakes today. |
You can easily "store" all programs written in the history of humankind and all those who are yet to be written, in the same way you can "store" all books ever written by humans and those yet to be written (see "The Library of Babel" by Jorge Luis Borges)
The threat to the working programmer is having something that "understands" the programs and tells which ones are "good" (correct, useful, etc), from those who don't work (perhaps because they are a little wrong or perhaps because they are just a random jumble of instructions).
That's what I understand when I read the word "store" in the context of what you've written, but I'm pretty sure I'm misunderstanding something here.
Are you referring to the fact that the incentives behind the current wave of AI is to get them trained by copying all source code publicly available and that that is a problem because these AI models now absorbed the code without honouring the intellectual rights of the authors (and not honouring the license the authors have chosen for their creations)?
BTW, I agree that that is a problem. But it is a problem because of the outputs: the models _output_ snippets of code that is distilled from patterns (and sometimes outright copies) of the code (and other text) it was trained on. The process of extracting that information from the model *is* the process of generating an output.