| HN Mirror


by kraemahz 1292 days ago

I'm curious why everyone keeps confusing this model with GPT-3 and using their past experiences with GPT-3 to justify their position. The model is not GPT-3, and at this point GPT-3 is far behind the state of the art. OpenAI calls this model "GPT-3.5".

It is also capable of far more than relaying information; as such, it also serves the purpose of Q/A sites like Stack Overflow. You can give it broken code and ask for bug fixes, and it will often return exactly the correct fix.

Framed as a search engine, it obviously fails on some measures; framed as a research assistant, it exceeds Google by leaps and bounds (Google suffers greatly from adversarial SEO gumming up its results).

1 comment

evrydayhustling 1292 days ago

I don't agree that people are confused (I wasn't) or that they are depending on prior experiences (many of these points aren't rooted in direct experience at all!). OpenAI is choosing to brand this as a fine-tuning of a model that is a minor version of GPT-3.x, so it's a pretty natural shorthand.

Agree with you directionally on the research assistant point, although I think it would be interesting to define that task in more detail so the comparison can actually be made. I'd expect that most research workflows starting with ChatGPT still need to end in search to confirm and contextualize the important parts.


kraemahz 1291 days ago

Between the release of GPT-3 and GPT-3.5 there was Gopher, which raised the bar on TruthfulQA from essentially random (22.6%) in GPT-3's case to 45%. GopherCite then brought the performance up to 80-90%. One has to assume that OpenAI is using state-of-the-art techniques in their new model releases. That LLMs went from choosing answers randomly to producing accurate results on a great many questions (they still suck at math) is lost on anyone who is not aware of that history, and shorthanding 3.5 to 3 obscures it.
