by pishpash 3 days ago
Gemini (at least the public free version) hallucinates way too much. If it stays like that, it can go very badly for Apple.
2 comments

I used Gemini exclusively via the API but downloaded the app last week for something. Even on max settings, it is ridiculously nerfed!
Unfortunately, even the API variant got RLHF'd pretty hard into being that dumb end-user assistant personality :(

But besides that, I feel like the app variant got worse the day they had that WWDC-style release event recently.

Previously it was a sparring partner that could actually keep up. But now it just doesn't.

Truly a shame. And nothing that could be fixed by local models any time soon, given that you need the size for the (cross-)domain knowledge.

The public version of Gemini is ridiculous. At least half their search "answers" are just wrong. If you then start a follow-up chat, the answers change but are usually still half wrong.

Search would be better without the added AI hallucinations above it. If I want an AI answer I'll go and ask Claude; the quality difference is huge.

> The public version of Gemini is ridiculous. At least half their search "answers" are just wrong.

That's not Gemini, that's AI Mode (in Search). They're different products built by fairly different parts of Google (actually, one is built by DeepMind).

(I don't think it's really comparable to https://gemini.google.com/app; at least in the past you'd get very different results.)

And it's extremely poor marketing by Google to do this - the general perception people have is that Google AI is dumb due to this.
To be fair, this is classic Google.

As I keep saying, they should win this AI stuff, but I have complete faith in their ability to snatch defeat from the jaws of victory.

It really has to be, because think of how fast it has to come up with an answer (i.e. in the time of a regular Google query) and the immense scale of billions of people querying it many times a day, all for free.
Just like search itself, caching does wonders. What do 90% of the people ask anyway but mundane, totally predictable questions?
If anyone knows about caching, it’s Google engineers.
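
To make the caching point concrete, here is a minimal sketch of the idea, not anything Google is known to run: normalize the query text, then keep recent answers in a small LRU cache so repeated mundane questions never reach the expensive model. The names `normalize`, `AnswerCache`, and `slow_answer` are made up for illustration, and `slow_answer` is only a stand-in for a real model call.

```python
from collections import OrderedDict


def normalize(query: str) -> str:
    # Lowercase and collapse whitespace so trivially different
    # phrasings of the same question share one cache key.
    return " ".join(query.lower().split())


class AnswerCache:
    """Tiny LRU cache in front of an expensive answer function (illustrative only)."""

    def __init__(self, compute, capacity=1000):
        self.compute = compute          # expensive function: query -> answer
        self.capacity = capacity        # maximum number of cached answers
        self.entries = OrderedDict()    # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, query: str) -> str:
        key = normalize(query)
        if key in self.entries:
            # Cache hit: mark as most recently used and skip the model.
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        # Cache miss: pay for the expensive call once, then remember it.
        self.misses += 1
        answer = self.compute(key)
        self.entries[key] = answer
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return answer


def slow_answer(query: str) -> str:
    # Hypothetical stand-in for a slow, costly model call.
    return f"answer to: {query}"


cache = AnswerCache(slow_answer, capacity=2)
for q in ["What time is it in Tokyo?", "what time is it in  tokyo?", "weather today"]:
    cache.get(q)
print(cache.hits, cache.misses)  # prints "1 2"
```

The second query differs from the first only in case and spacing, so it is served from the cache. A real system would presumably need much fuzzier matching than this, plus expiry for time-sensitive answers.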