| Congrats on the upcoming PhD! I'm curious where you are drawing your definition or scope for 'reasoning' from? For example, in Shuren The Neurology of Reasoning (2002) the definition selected was "the ability to draw conclusions from given information." While I agree that LLMs can only process token to token and that juggling context is critical to effective operation such that CoT or ToT approaches are necessary to maximize the ability to synthesize conclusions, I'm not quite sure what the definition of reasoning you have in mind is such that these capabilities fall outside of it. The typical lay audience suggestion that LLMs cannot generate new information or perspectives outside of the training data isn't the case, as I'm sure you're aware, and synthesizing new or original conclusions from input is very much within their capabilities. Yes, this has to happen within a context window and occurs on a token by token basis, but that seems like a somewhat arbitrary distinction. Humans are unquestionably better at memory access and running multiple subprocesses on information than an LLM. But if anything, this simply suggests that continuing to move in the direction of multiple pass processing of NLP tasks with selective contexts and a variety of fine tuned specializations of intermediate processing is where practical short term gains might lie. As for the issue of new domains outside of training data, I'm somewhat surprised by your perspective. Hasn't one of the big research trends over the past twelve months been that in context learning has proven more capable than was previously expected? I'd agree that a zero shot evaluation of a problem type that isn't represented in a LLMs training data is setting it up for failure, but the capacity to extend in context examples outside of training data has proven relatively more successful, no? |