A prompt-trained DeepSeek R1 70B can perform better than GPT-o1 using AdalFlow (colab.research.google.com)
4 points by meame2010 503 days ago
3 comments

Time to move to open-source, smaller reasoning models.

Here are the top three learnings from auto-prompt optimizing DeepSeek R1 LLaMA70B for RAG:

1. A prompt-trained DeepSeek R1 LLaMA 70B (R1 distilled) is even better than GPT-o1 without prompt training.
2. The “reasoning” model is less susceptible to overfitting than non-reasoning models. In a comparison with GPT-3.5, both GPT-3.5 and R1 distilled start at the same accuracy and reach similar accuracy on the validation dataset. However, on the test dataset, R1 distilled often achieves much higher accuracy.
3. R1 can think too long and run out of output tokens before finishing the task. The optimized prompt specifically added instructions for it to “think less.”
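
To make the third point concrete, here is a minimal sketch of how truncated reasoning could be detected. It assumes the model wraps its reasoning in `<think>...</think>` tags, as DeepSeek R1 style models typically do; the helper names `split_reasoning` and `truncation_rate` are made up for illustration and are not part of AdalFlow.

```python
from typing import List, Optional, Tuple

OPEN_TAG, CLOSE_TAG = "<think>", "</think>"


def split_reasoning(output: str) -> Tuple[str, Optional[str]]:
    """Split a raw response into (reasoning, answer).

    The answer is None when the response looks truncated: either the
    reasoning block never closes, or nothing follows the closing tag.
    """
    start = output.find(OPEN_TAG)
    if start == -1:
        # No reasoning block at all: treat the whole output as the answer.
        return "", (output.strip() or None)

    end = output.find(CLOSE_TAG, start)
    if end == -1:
        # Reasoning never closed: the token budget likely ran out mid-thought.
        return output[start + len(OPEN_TAG):].strip(), None

    reasoning = output[start + len(OPEN_TAG):end].strip()
    answer = output[end + len(CLOSE_TAG):].strip()
    return reasoning, (answer or None)


def truncation_rate(outputs: List[str]) -> float:
    """Fraction of responses that ended without a final answer."""
    if not outputs:
        return 0.0
    truncated = sum(split_reasoning(o)[1] is None for o in outputs)
    return truncated / len(outputs)
```

A high truncation rate on the training set would be one signal that the prompt needs a "think less" style instruction, or that the output token limit is too low.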

This is really exciting news! I can't wait to see how things progress from here. I'd love to see some cost comparisons alongside the performance gains in the future.
So this is mainly optimizing the prompt itself? E.g. RAG + optimizing few-shot examples?
Not few-shot. It is prompt tuning: the new prompt text is generated through auto-differentiation.

https://arxiv.org/abs/2501.16673
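
For readers who want the rough idea, the loop below is a toy sketch of prompt tuning with textual feedback. It is not AdalFlow's API and not the method from the paper: `run_model`, `critique`, and `rewrite` are hypothetical callables that would each be backed by an LLM in practice, and are stubbed here so the example runs on its own.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (question, expected answer)


def accuracy(prompt: str, data: List[Example],
             run_model: Callable[[str, str], str]) -> float:
    """Exact-match accuracy of a prompt on a dataset."""
    hits = sum(run_model(prompt, q).strip() == a for q, a in data)
    return hits / len(data)


def optimize_prompt(prompt: str, train: List[Example], val: List[Example],
                    run_model, critique, rewrite, steps: int = 3):
    """Keep a rewritten prompt only if validation accuracy improves."""
    best, best_score = prompt, accuracy(prompt, val, run_model)
    for _ in range(steps):
        # "Forward pass": collect the training examples the prompt gets wrong.
        results = [(q, a, run_model(best, q)) for q, a in train]
        failures = [r for r in results if r[2].strip() != r[1]]
        if not failures:
            break
        feedback = critique(best, failures)   # textual "gradient"
        candidate = rewrite(best, feedback)   # textual "update step"
        score = accuracy(candidate, val, run_model)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Stubs standing in for LLM calls (assumptions, for demonstration only).
ANSWERS = {"2+2": "4", "3+3": "6"}


def fake_model(prompt: str, question: str) -> str:
    # Pretend the model runs out of tokens unless told to think less.
    return ANSWERS[question] if "think less" in prompt.lower() else ""


def fake_critique(prompt: str, failures) -> str:
    return "Answers are empty; the reasoning ran out of output tokens."


def fake_rewrite(prompt: str, feedback: str) -> str:
    return prompt + " Think less and answer concisely."


best_prompt, best_score = optimize_prompt(
    "Answer the question.", [("2+2", "4")], [("3+3", "6")],
    fake_model, fake_critique, fake_rewrite,
)
```

Selecting on the validation set and reporting on a separate test set is what makes the overfitting comparison in the post possible.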