LLMs didn’t exist back then. The attention-based Transformer only came out in 2017…
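The network itself can, in principle, be trained to approximate most functions (not literally all: the universal approximation theorem guarantees that a big enough network can approximate any continuous function on a compact domain, though not that training will find those weights).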
But a language model is not necessarily capable of computing all of those functions, because its weights were already trained on language.
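
To make the "a network can approximate most functions" point concrete, here is a rough sketch, not a proof. It fits a one-hidden-layer tanh network to sin(x) on a bounded interval. Everything in it is my own assumption for illustration: the helper name `fit_one_hidden_layer`, the choice of sin as the target, and the shortcut of leaving the hidden layer random and fitting only the output weights by least squares (real networks are normally trained end to end with gradient descent).

```python
import numpy as np

def fit_one_hidden_layer(x, y, width, seed=0):
    """Fit y ~ tanh(x*w + b) @ c and return the mean squared error.

    Hypothetical helper: the hidden weights w and biases b are random and
    fixed; only the output weights c are fit, via least squares.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=2.0, size=width)        # hidden weights (not trained)
    b = rng.uniform(-np.pi, np.pi, size=width)   # hidden biases (not trained)
    hidden = np.tanh(np.outer(x, w) + b)         # shape (n_samples, width)
    features = np.column_stack([hidden, np.ones_like(x)])  # add an output bias
    c, *_ = np.linalg.lstsq(features, y, rcond=None)       # fit output layer
    pred = features @ c
    return float(np.mean((pred - y) ** 2))

# Target: a continuous function on a bounded interval.
x = np.linspace(-np.pi, np.pi, 200)
y = np.sin(x)

mse_small = fit_one_hidden_layer(x, y, width=2)   # too few hidden units
mse_large = fit_one_hidden_layer(x, y, width=50)  # many more hidden units
print(f"width=2:  mse={mse_small:.2e}")
print(f"width=50: mse={mse_large:.2e}")
```

The wider network should fit the curve far more closely, which is the "given enough hidden units" part of the claim. It only shows this on the training points of one smooth function, so it says nothing about discontinuous functions or about what an already trained language model can do.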