Hacker News
by LASR 1184 days ago
We have a whole team of folks just watching for these to come out and then go evaluate them.

Short answer: none of them do as well as the OG Davinci-003. Not even close. Even the 3.5 Turbo models from OpenAI don’t do as well.

We throw some sophisticated prompts at them to attempt chain of thought reasoning.

6 comments

That's quite a confusing comment. `davinci-003` is from OpenAI, whereas ChatGPT is some sort of variant more "optimized" for chatting. Said differently, GPT3 or 3.5 is a customized version of `davinci-003`, made for chatting. Please don't ask me about the details, I don't know, but `davinci-003` is not an alternative to ChatGPT
>but `davinci-003` is not an alternative to ChatGPT

What makes you believe that? In my testing davinci does better than gpt-3.5-turbo for most tasks.

I think people, and this article, are suggesting alternatives (competitors) to ChatGPT. `davinci` is obviously not an alternative; ChatGPT is `davinci` made for chatting. As to whether davinci produces better responses than ChatGPT ... maybe? But that's a different question.
It is an alternative. It’s just more expensive.
Do you have a citation for that?
would be interested in that as well
What kind of things have you seen davinci-003 do better than 3.5 turbo?
We need open benchmarks, clearly. Know any projects in that space?
Could you expand on this a bit more? What types of prompts? What are your evaluation criteria?

This actually sounds fascinating. Not unlike birdwatching! ))

That’s interesting - what about 4?