| HN Mirror


	by indrora 98 days ago
	Ostensibly, a mix of VC funding and the fact that they host an endpoint that lets customers run the big (200+ GB) models on the host's infrastructure rather than having to build machines with hundreds of gigs of LLM-dedicated memory.

wongarsu 98 days ago

But on inference they have to compete with other inference providers that just have a homepage, a bunch of GPUs running vLLM, and none of the training cost. Their only real advantage is the performance optimizations that they might have implemented in their inference clusters and not made public.

MarsIronPI 98 days ago

Qwen, at least, IIRC has some proprietary models, specifically the Max series. IIRC these have larger context windows.
