by davebren
55 days ago
At the same time, treating everything as tokens and next-word prediction will never produce real understanding of the kind humans develop when they learn to program. The bitter lesson is an admission that we still have no clue what is at the core of human learning and reasoning, so we have to brute-force it with tons of human-generated data. I also don't know if expert systems and ML techniques like feature extraction are really any worse in practice, or if we just didn't have enough engineering resources or a proper way to organize and scale their development. They seemed to work quite well in a lot of cases, with more predictable results and several orders of magnitude less compute. And LLMs still suffer from the long-tail problem despite their insane amounts of data. If we're at the end of the data and most new data is now produced by LLMs with little human oversight, where do we go? Figuring out ways to mix LLMs with more structured models that can reliably handle important classes of problems seems like the next logical step. In a way, that is what programming languages and frameworks/libraries are doing, but they've massively disincentivized work on those by claiming that LLMs will do everything. The chess example is a good one: it's effectively solved, so why shouldn't an LLM have a submodule that it can use to play chess and save some energy?