Hacker News
by drzaiusx11 146 days ago
If only search engine AI output didn't constantly hallucinate nonexistent APIs, it might be a net productivity gain for me...but it's not. I've been bitten enough times by their false "example" output for it to be a significant net time loss vs using traditional search results.
3 comments

Gemini hallucinated a method on a rust crate that it was trying to use and then spent ten minutes googling 'method_name v4l2 examples' and so on. That method doesn't exist and has never existed; there was a property on the object that contained the information it wanted, but it just sat there spinning its wheels convinced that this imagined method was the key to its success.

Eventually it gave up and commented out all the code it was trying to make work. Took me less than two minutes to figure out the solution using only my IDE's autocomplete.

It did save me time overall, but it's definitely not the panacea that people seem to think it is and it definitely has hiccups that will derail your productivity if you trust it too much.

My favorite with ChatGPT is:

"Tell me how to do X" (where X was, for one recent example, creating a Salt stanza to install and configure a service).

I do as it tells me, which seems reasonable on the face of it. But it generates an error.

"When creating X as you described, I get error: Z. Why?"

"You're absolutely correct and you should expect this error because X won't work this way. Do Y instead."

Gah... "I told you to do X, and then I'm going to act like it's not a surprise that X doesn't work and you should do something else."

You're absolutely right
it's not just that you are absolutely correct but you are also absolutely right
It's even worse when an LLM eats documentation for multiple versions of the same library and starts hallucimixing methods from all versions at the same time. Certainly unusable for some libraries which had a big API transition between versions recently.
The library that this happened to me repeatedly on was AWS' CDK, which did have a large delta between v1 and v2, so that may help explain it.
Using ChatGPT and phrasing it like a search seems like a better way? “Can you find documentation about an API that does X?”
It will often literally just make up the documentation.

If you ask for a link, it may hallucinate the link.

And unlike a search engine where someone had to previously think of, and then make some page with the fake content on it, it will happily make it up on the fly so you'll end up with a new/unique bit of fake documentation/url!

At that point, you would have been way better off just... using a search engine?

How is it hallucinating links? The links are direct links to the webpage that they vectorized or whatever as input to the LLM query. In fact, on almost all LLM responses from DuckDuckGo and Google, the links are right there as cited sources that you can click on (I know because I'm almost always clicking on the source link to read the original details, and not the made-up one).
I would imagine links can be hallucinated because the original URLs in the training data get broken up into tokens - so it's not hard to come up with a URL that has the right format (say https://arxiv.org/abs/2512.01234 - which is a real paper but I just made up that URL) and a plausible-sounding title.
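To illustrate the point about format: a minimal sketch, assuming new-style arXiv abstract URLs follow the `YYMM.NNNNN` pattern. The helper name `looks_like_arxiv_url` is made up for this example. It shows that a URL can pass a shape check without anyone having verified that the page behind it exists.

```python
import re

# Assumption: new-style arXiv abstract URLs look like
# https://arxiv.org/abs/YYMM.NNNNN (4 or 5 digits after the dot).
ARXIV_ABS = re.compile(r"^https://arxiv\.org/abs/(\d{2})(\d{2})\.(\d{4,5})$")


def looks_like_arxiv_url(url: str) -> bool:
    """Return True if the URL is well-formed.

    This only checks the shape of the string. It says nothing about
    whether a paper with that identifier actually exists.
    """
    match = ARXIV_ABS.match(url)
    if match is None:
        return False
    month = int(match.group(2))
    return 1 <= month <= 12


# The made-up URL from the comment above passes the format check.
print(looks_like_arxiv_url("https://arxiv.org/abs/2512.01234"))
```

Passing this check is the easy part, which is why a plausible-looking link still needs to be opened before you trust it.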
Yeah, but the current state of ChatGPT doesn’t really do this. The comment you’re replying to explains why URLs from ChatGPT generally aren’t constructed from raw tokens.
You are absolutely right! The current state of ChatGPT was not in my training data.
How do you explain it, then, when it spits out a link that surprisingly contains the subject of your question in the URL, but that page simply doesn't exist and there isn't even a blog under that domain at all?
I’ve used Claude Code to debug and sometimes it’ll say it knows what the issue is, then when I make it cite a source for its assertions, it will do a web search and sometimes spit out a link whose contents contradict its own claim.

One time I tried to use Gemini to figure out 1950s construction techniques so I could understand how my house was built. It made a dubious-sounding claim about the foundation, so I had it give me links and keywords so I could find some primary sources myself. I was unable to find anything to back up what it told me, and then it doubled down and told me that either I was googling wrong or that what it told me was a historical “hack” that wouldn’t have been documented.

These were both recent and with the latest models, so maybe they don’t fully fabricate links, but they do hallucinate the contents frequently.

> maybe they don’t fully fabricate links

Grok certainly will (at least as of a couple months ago). And they weren't just stale links either.

After getting beaten for telling the truth so frequently, who wouldn’t start lying?
I haven't seen this happen in ChatGPT thinking mode. It actually does a bunch of web searches and links to the results.