„Fitting“ is still too nice of a word choice, because it implies that it’s easy to identify the best solution.
I suggest „randomly adjusting parameters while trying to make things better“ as that accurately reflects the „precision“ that goes into stuffing LLMs with more data.
It was called learning already back when the field was called cybernetics and foundational figures like Shannon worked on this kind of stuff. People tried to decipher learning in the nervous system and implement the extracted principles in machines. Such as Hebbian learning, the Perception algorithm etc. This stuff goes back to the 40s/50s/60s, so things must have gone south pretty early then.
I suggest „randomly adjusting parameters while trying to make things better“ as that accurately reflects the „precision“ that goes into stuffing LLMs with more data.