by benjiro29
5 hours ago
Neuralwatt ... When you reverse-calculate the actual energy usage / price on a per-token basis, the gap is large. I do not have GLM 5.2 numbers because the whole default max setting is overkill. But the GLM 5.1 numbers had it at 12x cheaper than API rates, and about 2.5x more tokens vs. zai's own subscription service. Yes, it's FP8, but let's be honest, do we know for sure that even zai runs at FP16? I learned a long time ago with Claude and Codex how much cheating happens at the model level, even from the big boys.
On top of that, the cloud offering doesn't seem that well run: they randomly blocked a colleague's API key for a couple of days without any heads-up, had a weird rate-limiting bug, and have been deprecating models without redirects on very short notice, all while taking weeks to onboard new models. I assume some of these problems would be addressed if we had an SLA/enterprise contract.
It's a promising idea, though. They offer a $5 trial credit (with an aggressive rate limit), so there's no harm in trying it out.