by highfrequency
2 days ago
Unless I'm missing something, this argument seems to apply only to the original pretraining era (e.g., GPT-1 through GPT-4). The post-training and reinforcement learning paradigms are clearly doing variation, evaluation, and selective retention, no?

I wonder if this is a precursor to Keen Tech leaning into David Silver's Ineffable Intelligence approach.