by ren_engineer
503 days ago
Not sure why people are surprised. It's been known for a long time that RLHF essentially lobotomizes LLMs by training them to give answers the base model wouldn't give. DeepSeek is better because they didn't gimp their own model.