Hacker News
by atomicnumber3 207 days ago
"we can have a fluent conversation with a super smart AI"

But we can't. I can have something styled as a conversation with a token predictor that emits text that, if interpreted as a conversation, will gaslight you constantly, while at best sometimes being accidentally correct (but still requiring double-checking with an actual source).

Yes, I am uninterested in having the gaslighting machine installed into every single UI I see in my life.


LLMs are severely overhyped, have many problems, and I don't want them in my face any more than the average person does. But we're not in 2023 anymore. These kinds of comments just come off as ignorant.
I dunno, I'm not fully anti-LLM, but almost every interaction I have with an LLM-augmented system still at some point involves it confidently asserting plainly false things, and I don't think the parent is that far off base.
Agreed. Some days I code for 4-6 hours with agentic tools, but 2025 or not, I still can't stomach using any of the big three LLMs for anything but the most superficial research questions (and I currently pay for or get access to all three paid chatbots).

Even if they were right 9 times out of 10 (which is far from certain, depending on the topic) and saved me a minute or two compared to Google plus skimming or reading a couple of websites, that is completely overshadowed by the 1 time in 10 they calmly and confidently lie about whether tool X supports feature Y and send me on a wild goose chase looking through docs for something that simply does not exist.

In my personal experience the most consistently unreliable questions are those that would be most directly useful for my work, and for my interests/hobbies I'd rather read a quality source myself. Because, well, I enjoy reading! So the value proposition for "LLM as Google/forum/Wikipedia replacement" is very, very weak for me.

There are two types of LLM defender: those who claim that it’ll be non-shit soon, just keep believing, and those who claim that it is already non-shit and the complainer is just stuck in (year - 1), where year is the current year.

Given that this has now been going on for a few years, both are wearing thin.

Like, I’m sorry, but the current crop of bullshit generators is not good. They’re just not. I’m not even convinced they’re improving at this point; if anything the output has become more stylistically off-putting, and they’re still just as prone to spouting complete nonsense.

You seem severely confused about how low the probability of being “accidentally correct” is for almost any real-life task that you can imagine.