|
|
|
|
|
by Der_Einzige
404 days ago
|
|
Me being old man yelling at cloud about how your chat/tool template matters more than your post-training technique. DeepSeek-R1 is trivially converted back to a non reasoning model with just chat template modifications. I bet you can chat template your way into a good quality model from a base model, no RLHF/DPO/SFT/GRPO needed. |
|