Hacker News
by SilverBirch 735 days ago
This seems to be missing the most obvious optimization that Google are clearly doing. When you see an AI answer to your question on Google, I guarantee that Google hasn't gone off and put your query into an AI. What it has almost certainly done is stick your query into the same processing engine that all the other queries go to, find a flag that says "Oh, I have a pre-computed AI answer to this", and return that. It's probably of the same order of magnitude as serving that little Wikipedia summary that it shows for prominent people. Google isn't in the game of hand-crafting answers to your queries, so the one-off cost of putting that query into the AI is amortized over the billions of answers it serves.
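To make the idea concrete, here is a rough sketch of what "check for a pre-computed answer first" could look like. Everything in it is hypothetical: `PrecomputedAnswerCache`, `normalize`, and `fake_model` are made-up names for illustration, and nothing here reflects how Google actually serves these answers.

```python
from typing import Callable, Dict


class PrecomputedAnswerCache:
    """Hypothetical lookup layer: the model runs once per distinct query."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate            # stand-in for the expensive model call
        self._answers: Dict[str, str] = {}   # normalized query -> stored answer
        self.model_calls = 0                 # counts how often the model actually ran

    @staticmethod
    def normalize(query: str) -> str:
        # Assumption: queries differing only in case or spacing share an answer.
        return " ".join(query.lower().split())

    def answer(self, query: str) -> str:
        key = self.normalize(query)
        if key not in self._answers:
            # Cache miss: pay the one-off generation cost and store the result.
            self.model_calls += 1
            self._answers[key] = self._generate(key)
        return self._answers[key]


def fake_model(query: str) -> str:
    # Placeholder for a real model; returns a canned string.
    return f"summary for: {query}"


cache = PrecomputedAnswerCache(fake_model)
for q in ["Napa River swimming", "napa  river swimming", "NAPA RIVER SWIMMING"]:
    cache.answer(q)
print(cache.model_calls)
```

The three spellings normalize to the same key, so the stand-in model runs only once and the other two requests are plain lookups. A real system would also need some policy for expiring stale answers, which this sketch leaves out.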
2 comments

Caching definitely helps. I agree there's no way Google would be wasting that much compute re-running every search query through Gemini.

That said, it's still worth calling out that their Gemini answers use drastically more resources whenever they are run and cached. We'd have to know Google's caching rules and the average cache hit rate to know how much caching actually reduces the Gemini resource usage.
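As a back-of-the-envelope sketch of why the hit rate matters: if every query pays a lookup cost and only cache misses pay the model cost, the average cost per query is lookup_cost + (1 - hit_rate) * model_cost. The function name and the numbers below are made up for illustration; they are not Google's figures.

```python
def expected_cost_per_query(hit_rate: float, lookup_cost: float, model_cost: float) -> float:
    """Average cost per query when only cache misses invoke the model."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Every query pays for the lookup; only the miss fraction pays for the model.
    return lookup_cost + (1.0 - hit_rate) * model_cost


# Hypothetical units: a lookup costs 1, a model run costs 100.
LOOKUP_COST = 1.0
MODEL_COST = 100.0

for hit_rate in (0.0, 0.5, 0.9, 0.99):
    cost = expected_cost_per_query(hit_rate, LOOKUP_COST, MODEL_COST)
    print(f"hit rate {hit_rate:.2f}: average cost {cost:.2f}")
```

Under these invented numbers, the model cost dominates until the hit rate gets very high, which is why the actual caching rules and hit rate are the missing pieces.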

Yes, but all this compute is great for Google's Cloud Business revenues!
Smort. Monopolies gonna monopoly.
It’s a good thought, but the other day I asked a pretty specific question (where can you get in to swim in the Napa River) and it had a generally correct answer. I then refined it to “within 5 miles of Yountville” and it came back with an even more specific answer. I don’t think this was precomputed, though I could be wrong.