| HN Mirror



	by seneca 125 days ago
	They seem to be a victim of their own success. Their response times are quite bad, and it's widely believed they are doing something to degrade service quality (quantizing?) in order to stretch resources. They just announced that they're cutting their usage limits down during peak hours as well. They're in serious risk of losing their lead with this sort of performance.

6 comments

ACCount37 125 days ago

> it's widely believed they are doing something to degrade service quality (quantizing?) in order to stretch resources

God, I wish this inane bullshit would just fucking die already.

Models are not "degrading". They're not being "secretly quantized". And no one is swapping out your 1.2T frontier behemoth for a cheap 120B toy and hoping you wouldn't notice!

It's just that humans are completely full of shit, and can't be trusted to measure LLM performance objectively!

Every time you use an LLM, you learn its capability profile better. You start using it more aggressively at what it's "good" at, until you find the limits and expose the flaws. You start paying attention to the more subtle issues you overlooked at first. Your honeymoon period wears off and you see that "the model got dumber". It didn't. You got better at pushing it to its limits, exposing the ways in which it was always dumb.

Now, will the likes of Anthropic just "API error: overloaded" you on any day of the week that ends in Y? Will they reduce your usage quotas and hope that you don't notice because they never gave you a number anyway? Oh, definitely. But that "they're making the models WORSE" bullshit lives in people's heads way more than in any reality.


BoneShard 125 days ago

It's possible though - it was a bug: a model pool instance wasn't updated properly and served a very old model for several months; whoever hit this instance would receive a response from a previous version of a model.


hbrn 125 days ago

While it's true that people are naturally predisposed to invent the "secret quantizing" conspiracy regardless of whether the actual conspiracy exists or not, I think there's more to the story.

I've seen Sonnet consistently start hallucinating on the exact same inputs for a couple hours, and then just go back to normal like nothing ever happened. It may just be a combination of hardware malfunction + session pinning. But at the end of the day the effects are indistinguishable from "secret quantizing".


ramesh31 125 days ago

>"They're in serious risk of losing their lead with this sort of performance."

Nobody goes there anymore, it's too crowded.


seneca 125 days ago

You'll notice I specifically said "victims of their own success". Obviously these problems are induced by the fact that they have so many users. Blowing a lead due to inability to handle the demands of success is still a path to losing the lead.


sva_ 125 days ago

It can't be worse than gemini-cli using a Pro account.


seneca 125 days ago

Oh really? Do they have availability problems too?


nsingh2 125 days ago

Gemini CLI has been broken for the past 2-3 days, with no response from Google. Really embarrassing for a multi-trillion dollar company. At this point Codex is the only reliable CLI app, out of the big three.

https://www.reddit.com/r/GeminiCLI/comments/1s49pag/this_is_...


sva_ 125 days ago

Last time I tried it a single prompt ran for over an hour, mostly doing nothing/waiting on availability.


internetter 125 days ago

I can't speak to Gemini, but OpenAI is far worse, for free accounts at least.


danelski 125 days ago

GeminiCLI is absolutely terrible, nothing comparable to the browser access. I've started using the 'AI Pro' tier lately and I get 15-minute response times from Gemini 3 'Flash' on a regular basis.


orphea 125 days ago

> this sort of performance

They've been very proud of it.


faangguyindia 125 days ago

I just use Gemini 3 Flash via the API with a custom agent.

Only people who do not even look at code anymore need anything more than that.
