| HN Mirror


by yieldcrv 7 days ago

> Almost every model used the canonical provider: Zai for GLM, Deepseek for Deepseek, etc.

> I am never touching Minimax or GLM again. Their APIs had constant outages

Goofy take

You run these on a VPS based on the architecture of that VPS provider, or on your own cluster

2 comments

jc4p 7 days ago

Sorry I don't understand, you're saying the direct providers aren't the canonical source you'd recommend?

If I was running these on my own machine or GPU wouldn't the argument then be "Well you didn't use the real providers?"

For the record I started doing this approach because the Kimi team released this which was shocking to me: https://github.com/MoonshotAI/K2-Vendor-Verifier


yieldcrv 7 days ago

yeah boutique providers are a dime a dozen

they host the models on their own cloud machines and you just look at tokens/sec and price of tokens

you'll have to evaluate their APIs independently but that doesn't tend to be the issue


strictnein 7 days ago

GLM 5.1's smallest model size is 206 GB and really you're probably wanting to run a version that's ~400GB. If you want it to be performant, you're not just running it on a VPS.

And just saying "run it on your own cluster" sort of glosses over the cost of such a cluster.
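
As a rough back-of-envelope sketch of what those sizes imply: the 80 GB of memory per GPU and the 20% headroom for KV cache and activations are illustrative assumptions, not figures from this thread, and `gpus_needed` is a hypothetical helper.

```python
def gpus_needed(model_gb: int, gpu_gb: int = 80, overhead_pct: int = 20) -> int:
    """Minimum number of GPUs whose combined memory holds the weights plus headroom.

    gpu_gb and overhead_pct are assumed values; adjust them for real hardware.
    """
    # Work in hundredths of a GB so the ceiling division stays in integers.
    total = model_gb * (100 + overhead_pct)
    per_gpu = gpu_gb * 100
    return -(-total // per_gpu)  # ceiling division


for size_gb in (206, 400):
    print(f"{size_gb} GB of weights -> at least {gpus_needed(size_gb)} GPUs")
```

Under those assumptions the 206 GB build needs at least 4 such GPUs and a ~400 GB build at least 6, before counting interconnect, hosting, or redundancy.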


yieldcrv 7 days ago

Ok and omitting it would draw out the other pedants

so it's part of the answer
