| HN Mirror


	by nprateem 1241 days ago
	I know, all those frusty old grumpyboots who actually want the thing to return factually accurate answers and valid code. Just be happy with plausible-sounding answers, people. Sheesh!

4 comments

dsabanin 1241 days ago

Did you actually ever receive 100% factually correct answers from anything before for anything other than strictly mathematical statements?

Regardless, I find it silly to focus on the small flaws when we're witnessing a foundational shift in what kind of problems we can solve.

nprateem 1241 days ago

But they're not small flaws if you're relying on it to, e.g., replace a person's job.

If I ask it for the dimensions of a product and it doesn't know them, it should just tell me so instead of inventing figures.

That's the problem. It doesn't tell you when something is wrong, so you can never trust that it's right unless you happen to know the field. That makes it far less useful.

aliqot 1241 days ago

I've had a few instances where it returned bad code or was unable to solve a challenge; however, most were fixed by better or clarifying prompts. In a way, I think there is a two-way "learning" process going on here. I'm training it how to give me what I ask for, and it trains me how to ask for what I want it to give me.

macNchz 1241 days ago

The tricky bit IMO is when you’re at the threshold of being able to identify the errors it makes. I tested some situations a while ago where I asked for some physics calculation functions. I’m an experienced programmer but haven’t really done anything with physics since high school 20 years ago. The code returned looked plausible and would run, but after going through it line by line and looking up the real formulas, it turned out to be super wrong.

dwild 1241 days ago

People can be wrong and can sound quite plausible too.

The key is to verify... and that's true for AI and people too, though for sure that's not something people are used to doing, sadly.

alexandrius 1241 days ago

Right? Devs are so butthurt that they are dismissing this altogether. It’s called denial.
