by travisjungroth 1122 days ago
> That was a great way to show my non-tech family members the limitations of AI and why they shouldn’t trust it.

These are the limitations of the version of ChatGPT you were using at that moment. They are not categorical limitations of AI or even LLMs.

It’s amazing to me how many people are sleeping on AI, mistaking the failure cases of a freemium chatbot for the full capability of the tech, even on HN. LLMs can say “I don’t know”. Even ChatGPT can do it. Ask some super niche historical questions of any version and see what you get. Is it perfect every time? No. But the failure rate is something that can be reduced.

Over the next year, you’ll see more instances of lawyers citing hallucinated cases. There will also be a handful of startups that hook up LLMs to document stores, and they’ll be able to check for this sort of thing and do an even better job.
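To make the "check against a document store" idea concrete, here is a minimal sketch. It assumes the store is just an in-memory set of known citations, and the names `KNOWN_CASES`, `normalize`, and `unverified_citations` are made up for illustration. A real product would presumably query a legal database and do fuzzier matching.

```python
# Hypothetical in-memory "document store" of known case citations.
# A real system would query a legal database instead.
KNOWN_CASES = {
    "brown v. board of education, 347 u.s. 483 (1954)",
    "marbury v. madison, 5 u.s. 137 (1803)",
}


def normalize(citation: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences don't matter."""
    return " ".join(citation.lower().split())


def unverified_citations(citations, store=KNOWN_CASES):
    """Return the citations that cannot be found in the store."""
    return [c for c in citations if normalize(c) not in store]


if __name__ == "__main__":
    draft = [
        "Marbury v. Madison,  5 U.S. 137 (1803)",
        # Deliberately made-up citation, standing in for a hallucinated case.
        "Smith v. Imaginary Airlines, 123 F.4th 456 (2023)",
    ]
    for citation in unverified_citations(draft):
        print("Could not verify:", citation)
```

Anything the store can't confirm gets flagged for a human to look at, rather than being trusted because the model sounded confident.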

2 comments

> LLMs can say “I don’t know”. Even ChatGPT can do it.

That's the problem in my opinion. When you know something is capable of saying "I don't know" but it confidently spits out some hallucinated BS anyway, that's when the average person eats it up.

It is definitely a problem. OpenAI does a lot to warn people, but I’m not really sure it’s enough.
I don't know exactly why, but for some reason this made me think of QAnon, and now I'm thinking of an AI trained on QAnon theories that people can form a community around like they did QAnon, and frankly that's one of the most terrifying things I've thought of in quite a while.
I remember someone built a 4chan bot and posted it to HN. The bot immediately displayed the worst parts of that site.
I made https://AskHN.ai

It doesn't try to answer on its own. Instead, it collects previous discussions of the topic by experts, then answers the question based on that text, a far more reliable approach.
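A rough sketch of that retrieve-then-answer pattern, for anyone curious. This is not the actual AskHN.ai code: the corpus, the keyword-overlap scoring, and the names `COMMENTS`, `retrieve`, and `answer` are all assumptions for illustration. A real system would likely use embeddings for retrieval and an LLM to write the answer from the retrieved text.

```python
import re

# Hypothetical corpus of (author, text) comments; not real AskHN.ai data.
COMMENTS = [
    ("alice", "Postgres handles JSONB indexing well with GIN indexes."),
    ("bob", "For queue workloads, Redis streams worked better for us than lists."),
]


def tokens(text):
    """Split text into a set of lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, comments=COMMENTS, min_overlap=2):
    """Return (score, author, text) for comments sharing enough words with the question."""
    q = tokens(question)
    scored = [(len(q & tokens(text)), author, text) for author, text in comments]
    hits = [s for s in scored if s[0] >= min_overlap]
    return sorted(hits, key=lambda s: s[0], reverse=True)


def answer(question):
    """Answer only from retrieved text; admit ignorance when nothing matches."""
    hits = retrieve(question)
    if not hits:
        return "I don't know: nothing in the collected discussions covers this."
    _, author, text = hits[0]
    return f"According to {author}: {text}"


if __name__ == "__main__":
    print(answer("How do I index JSONB in Postgres?"))
    print(answer("What is the best nuclear strategy?"))
```

The point of the design is the fallback branch: if retrieval comes back empty, the system says so instead of generating something plausible. Plain word overlap is crude (common words can produce false matches), which is one reason to swap in a better retriever.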

How does it qualify experts? I love the discussion here, but if it turns to international nuclear strategy or the minutiae of electrical networks (or presumably anything outside the regular wheelhouse) I notice that the quality goes down but the confidence stays the same.
Under the hood it builds and ranks the expertise of everyone in the network. That said, it doesn’t have knowledge outside the network, so if the network itself has low-quality experts or no data, it’s going to give subpar results.