by zurfer 477 days ago
Yes. Our main finding was that o3-mini in particular is great on paper but surprisingly hard to prompt compared to non-reasoning models. I don't think it's a problem with reasoning, but rather with this specific model. I also suspect that o3-mini is a rather small model, so it can lack useful knowledge for broad applications. Especially for RAG, it seems that larger, fast models (e.g. GPT-4o) perform better as of today.
1 comment

I suspect you're right here! Excited to get our hands on the non-distilled o3. :)