Hacker News new | ask | show | jobs
by eightysixfour 498 days ago
aider found that with R1, the best performance was to use R1 to think through the solution, and use claude to implement the solution. I suspect that, in the near term, we'll need combinations of reasoning models and instruction-following coding models for excellent code output.

My experience is that most of the models focused on reasoning improvements has been that they tend to be a bit worse at following specific instructions. It is also notable that a lot of 3rd party fine-tunes of Llamas and others gain in knowledge based benchmarks while reducing instruction following scores.

I wonder why that seems to be some sort of continuum?

1 comments

Kind of like an ai “thinking fast and thinking slow”.
Sort of? I don't see why thinking slow should inhibit the ability to follow instructions.
I think they're referencing "Thinking, Fast and Slow" - https://en.wikipedia.org/wiki/Thinking,_Fast_and_Slow

"The book's main thesis is a differentiation between two modes of thought: "System 1" is fast, instinctive and emotional; "System 2" is slower, more deliberative, and more logical. "

Yes, I understand the reference. I don't understand their argument that this is a good example of that common mental model for LLMs.

In this case "fast, instinctive, and emotional" models are better at instruction following than "slower, more deliberative, and more logical" models.