This has definitely been discussed. There have even been some projects, although I haven't checked on the status of any of them lately. As best I can recall, there are some specific structural reasons why it's hard to train LLMs this way, but I don't remember all the details offhand.