Amazing insight, particularly section 6:

"- The two important but different abilities of GPT-3.5 are *knowledge* and *reasoning*. Generally, it would be ideal if we could *offload the knowledge part to the outside retrieval system and let the language model only focus on reasoning.* This is because:
- The model’s internal knowledge is always cut off at a certain time. The model always needs up-to-date knowledge to answer up-to-date questions.
- Recall we have discussed that 175B parameter is heavily used for storing knowledge. If we could offload knowledge to be outside the model, then the model parameter might be significantly reduced such that eventually, it can run on a cellphone (call this crazy here, but ChatGPT is already science fiction enough, who knows what the future will be)."

and

"Yet there was a WebGPT paper published in Dec 2021. It is likely that this is already tested internally within OpenAI."

It definitely feels like this may be the next step in making this kind of system robust. It ends up being an interface for search.
- Reasoning typically requires base knowledge to work from. A side effect of training reasoning is embedding knowledge into the model parameters.
- Even if you offload the search portion (either through outputting special tokens that are postprocessed, or applying the model in multiple steps with postprocessing), you still need embedded knowledge for the model to decide what to search for, and then to successfully integrate that knowledge (in the multi-step case).
Maybe some kind of post-facto pruning of model weights?
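
To make the "special tokens that are postprocessed" idea above a bit more concrete, here is a rough sketch of what such a multi-step loop might look like. Everything in it is an assumption for illustration: the `<search>...</search>` marker format, the `answer_with_retrieval` helper, and the `toy_generate` / `toy_search` stand-ins are hypothetical, not how WebGPT or any OpenAI system actually works.

```python
import re
from typing import Callable

# Hypothetical marker format; a real system would define its own special tokens.
SEARCH_PATTERN = re.compile(r"<search>(.*?)</search>", re.DOTALL)


def answer_with_retrieval(
    question: str,
    generate: Callable[[str], str],
    search: Callable[[str], str],
    max_steps: int = 3,
) -> str:
    """Run the model repeatedly, resolving search markers between steps."""
    prompt = question
    output = ""
    for _ in range(max_steps):
        output = generate(prompt)
        match = SEARCH_PATTERN.search(output)
        if match is None:
            return output  # no search requested: treat this as the final answer
        query = match.group(1).strip()
        evidence = search(query)
        # Feed the retrieved text back in so the next step can integrate it.
        prompt = f"{question}\n\nSearch results for '{query}':\n{evidence}"
    return output  # step budget exhausted: return the last raw output


def toy_generate(prompt: str) -> str:
    # Stand-in for a language model: asks for a search first, then answers.
    if "Search results" not in prompt:
        return "<search>WebGPT publication date</search>"
    return "Based on the search results: " + prompt.splitlines()[-1]


def toy_search(query: str) -> str:
    # Stand-in for a retrieval system backed by a tiny in-memory index.
    index = {"WebGPT publication date": "WebGPT was published in Dec 2021."}
    return index.get(query, "no results")


print(answer_with_retrieval("When was WebGPT published?", toy_generate, toy_search))
```

Note that even in this toy version the query string is hardcoded inside `toy_generate`. In a real model, choosing that query and making sense of the returned text is exactly the part that would still rely on knowledge embedded in the parameters.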