HN Mirror

by sberens 1170 days ago

> other models are in the ballpark

How true is this? From playing around with Bard and Claude, GPT-4 seems to be significantly better, especially around code generation / understanding.

2 comments

letitgo12345 1170 days ago

People really don't seem to understand just how far ahead OAI is in its reasoning abilities (https://crfm.stanford.edu/helm/v0.2.2/?group=reasoning)

Maybe PaLM is near there (it's not evaluated on that page), but nothing else even comes close.

icelancer 1170 days ago

The level of denial people are willing to sink into regarding how good GPT-4 is compared to everything else is truly crazy. Not a single other project comes within an order of magnitude of the quantitative and qualitative results (actual experiential results, not just benchmarks) that GPT-4 delivers.

spaceman_2020 1170 days ago

I feel that there’s significant insecurity among a lot of coders about GPT-4. A lot of them are ignoring the pace of improvement and highlighting the rare cases where it gets things wrong.

sealeck 1169 days ago

I think there are a lot of people writing boilerplate programs who are going to be freed from these menial tasks (i.e. no more Java enterprise application development, thankfully).

AussieWog93 1170 days ago

I've not used GPT-4, so it could be different, but regular old GPT-3.5 gets a _lot_ of things wrong.

snowe2010 1170 days ago

GPT-4 is quite astounding. It might be wrong on occasion, but it will easily point you in the right direction most of the time. It still messes up, but about a twentieth as often as 3.5 did. Honestly, it is like an incredible rubber ducky for me. Not only can I talk to it the way I would to a rubber duck, but I also get fast, mostly informed feedback that unblocks me. If I have a bunch of things competing for my attention, I can ask GPT about one of them, a hard one, go do something else while it types out its answer, and then come back later and move on with that project.

AussieWog93 1170 days ago

Is a 95% reduction in errors an exaggeration, or is it really that much better? Might just need to drop the $20/mo if it's really improved that much.

int_19h 1169 days ago

My favorite part about GPT-4 is that if it generates code that is wrong, and you ask it to verify what it just wrote (without telling it whether it's wrong or not, much less pointing out the specific issue), more often than not it will spot the problem and fix it right away.

And yes, it does indeed make an amazing rubber duck for brainstorming.

spaceman_2020 1170 days ago

GPT-4 is leaps ahead, and it's improving with every new release. The latest March 23 release is significantly better than the previous one and does a LOT of heavy lifting for my code at least.

At the very least, it's a massive productivity booster.

satvikpendem 1170 days ago

I've had decent success with Open Assistant, an open source model. I'd say it's within an order of magnitude of ChatGPT on the prompts I've tried, including reasoning prompts. This, I believe, is due to the overwhelmingly clean data that OA have managed to acquire through human volunteers.

JumpCrisscross 1170 days ago

> How true is this? From playing around with Bard and Claude, GPT-4 seems to be significantly better

I have at most moderate confidence in this hypothesis.
