by timdellinger
551 days ago
My personal view is that the roadmap to AGI requires an LLM acting as a prefrontal cortex: something designed to think about thinking. It would decide what circumstances call for double-checking facts for accuracy, which would hopefully catch hallucinations. It would write its own acceptance criteria for its answers, etc. It's not clear to me how to train each of the sub-models required, or how big (or small!) they need to be, or what architecture works best. But I think that complex architectures are going to win out over the "just scale up with more data and more compute" approach.
With 4o-mini I now have a similar problem, though a less obvious one.
Just writing this down convinced me that there are some ideas to try here: taking a 'report' of the thought process out of context and judging it there, changing the temperature, or maybe even cross-checking with a different model.
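
For what it's worth, here is a rough sketch of how those three ideas could be wired together. Everything in it is an assumption for illustration: the `Model` type is a hypothetical stand-in for whatever client you actually call (a function taking a prompt and a temperature and returning text), the `judge_report` and `cross_check` names are made up, and the PASS/FAIL prompt and exact-match voting are just the simplest things that could work, not a tested recipe.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical model interface (an assumption, not a real library API):
# a callable taking (prompt, temperature) and returning the model's text.
Model = Callable[[str, float], str]


def judge_report(report: str, judge: Model) -> bool:
    """Judge a 'report' of the thought process with no other context.

    The judge sees only the report, not the original conversation.
    """
    prompt = (
        "Below is a report of a reasoning process. "
        "Reply PASS if it is sound, otherwise reply FAIL.\n\n" + report
    )
    verdict = judge(prompt, 0.0)  # temperature 0 for a steadier verdict
    return verdict.strip().upper().startswith("PASS")


def cross_check(
    question: str,
    models: Sequence[Model],
    temperatures: Sequence[float],
) -> tuple[str, float]:
    """Ask every model at every temperature; return (top answer, agreement).

    Agreement is the fraction of runs that gave the top answer. Answers are
    compared by exact match after trimming and lowercasing, which is crude
    and only suits short, closed-form answers.
    """
    answers = [
        model(question, temp).strip().lower()
        for model in models
        for temp in temperatures
    ]
    if not answers:
        raise ValueError("need at least one model and one temperature")
    best, count = Counter(answers).most_common(1)[0]
    return best, count / len(answers)
```

A low agreement score from `cross_check`, or a FAIL from `judge_report`, would be the signal to go back and double-check, rather than proof that the answer is wrong.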