by joshka
513 days ago
The only reason to believe that statement would be that training data is finite and cannot be synthetically generated in a way that is useful to the model. If you agree that certain qualities can be measured by deterministic logic (e.g. "does this build", "what is the cyclomatic complexity of this", "does this pass the unit tests", "what is the performance characteristic of this", "can this be proven to be susceptible to an XSS bug", ...), and you can see that there are ways to feed this information back into the models, then there's no reason to think that the available training data is finite, or that it is limited by unclean generated data. There are several missing steps in that logic that would be difficult to (linguistically) prove with certainty, but I'm reasonably sure that your statement is false.
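
As a rough illustration of the kind of feedback loop described above, here is a minimal Python sketch that filters generated code candidates using only deterministic checks. Everything in it is an assumption made for illustration: the function names (`builds`, `passes_tests`, `branch_count`, `filter_candidates`), the `max_branches` threshold, and the use of a simple branch count as a crude stand-in for cyclomatic complexity. It is not how any particular model is actually trained.

```python
import ast


def builds(source: str) -> bool:
    """Deterministic check: does the candidate parse as valid Python?"""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def passes_tests(source: str, tests: str) -> bool:
    """Deterministic check: do the assertions in `tests` hold for `source`?

    WARNING: exec() on untrusted code is unsafe; a real pipeline would
    sandbox this step. It is used here only to keep the sketch short.
    """
    namespace = {}
    try:
        exec(source, namespace)
        exec(tests, namespace)
        return True
    except Exception:
        return False


def branch_count(source: str) -> int:
    """Crude proxy for cyclomatic complexity: count branching nodes."""
    branching = (ast.If, ast.For, ast.While, ast.BoolOp)
    return sum(isinstance(node, branching) for node in ast.walk(ast.parse(source)))


def filter_candidates(candidates, tests, max_branches=10):
    """Keep only candidates that build, pass the tests, and stay simple."""
    return [
        c for c in candidates
        if builds(c) and passes_tests(c, tests) and branch_count(c) <= max_branches
    ]


# Hypothetical generated candidates: one correct, one wrong, one malformed.
candidates = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b):\n    return a - b\n",
    "def add(a, b)\n    return a + b\n",
]
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"

kept = filter_candidates(candidates, tests)
```

Only the first candidate survives the filter. The surviving samples (or the pass/fail signals themselves) are the sort of thing that could be fed back as cleaner synthetic data, which is the step the argument leaves unproven.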