Hacker News
by yieldcrv 7 days ago
> Almost every model used the canonical provider: Zai for GLM, Deepseek for Deepseek, etc.

> I am never touching Minimax or GLM again. Their APIs had constant outages

Goofy take

You run these on a VPS based on the architecture of that VPS provider, or on your own cluster

2 comments

Sorry I don't understand, you're saying the direct providers aren't the canonical source you'd recommend?

If I were running these on my own machine or GPU, wouldn't the argument then be "Well, you didn't use the real providers"?

For the record I started doing this approach because the Kimi team released this which was shocking to me: https://github.com/MoonshotAI/K2-Vendor-Verifier

yeah boutique providers are a dime a dozen

they host the models on their own cloud machines and you just look at tokens/sec and price of tokens

you'll have to evaluate their APIs independently but that doesn't tend to be the issue

GLM 5.1's smallest model size is 206 GB, and really you're probably wanting to run a version that's ~400 GB. If you want it to be performant, you're not just running it on a VPS.

And just saying "run it on your own cluster" sort of glosses over the cost of such a cluster.
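
For a rough sense of scale, here's a back-of-the-envelope sketch of how many accelerators you'd need just to hold the weights at those two sizes. The 80 GB per device figure and the `min_accelerators` helper are assumptions for illustration, not anything from a vendor spec, and the estimate ignores KV cache, activations, and runtime overhead, so treat it as a floor.

```python
import math


def min_accelerators(weights_gb: float, mem_per_device_gb: float = 80.0) -> int:
    """Lower bound on devices needed just to hold the model weights.

    mem_per_device_gb is an assumed figure (80 GB per accelerator).
    KV cache, activations, and framework overhead are ignored, so a
    real deployment needs more headroom than this.
    """
    return math.ceil(weights_gb / mem_per_device_gb)


# The two model sizes mentioned above, in GB.
for size_gb in (206, 400):
    print(f"{size_gb} GB of weights -> at least {min_accelerators(size_gb)} devices")
```

Under those assumptions that's at least 3 devices for the 206 GB version and 5 for the ~400 GB one, before any serving overhead.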

Ok and omitting it would draw out the other pedants

so it's part of the answer