by cubefox
490 days ago
This suggests that fine-tuning a base model (with SL or RL) generally doesn't make the model inherently smarter; only the initial self-supervised learning during pretraining does. Still, it would be strange if no amount of reinforcement learning could make the LLM truly smarter.