Hacker News
by jrandolf 66 days ago
Thanks to everyone who shared feedback. We’re implementing it now.

Here’s what’s changed:

- We’ve removed the other LLMs for now and are focusing entirely on Qwen 3.5. We’ll bring back additional smaller models later, but most usage was already concentrated on Qwen 3.5.

- Pricing is now around $50. You get roughly 2× the throughput (61 tok/s vs. 31 tok/s, verified in testing), and it’s still unlimited. For context, running at 61 tok/s around the clock works out to about 158M tokens per month. Comparable providers like Novita charge around $3.20 per million tokens, or roughly $500 for that volume, so this comes out to roughly 10% of typical token costs.

- Context size is now capped at 32K tokens. For the vast majority of use cases, this is more than sufficient.
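
For anyone who wants to check the pricing math above, here is a rough sketch of the arithmetic. It assumes a 30-day month, full utilization at 61 tok/s the entire time, and that the $50 figure is a monthly price; the function names are made up for illustration and are not part of any API.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a 30-day month


def monthly_tokens(tok_per_s: float, days: int = DAYS_PER_MONTH) -> float:
    """Tokens generated in a month at a sustained rate (assumes 100% utilization)."""
    return tok_per_s * SECONDS_PER_DAY * days


def metered_cost(tokens: float, usd_per_million: float) -> float:
    """Cost of the same token volume at a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


FLAT_PRICE_USD = 50.0        # assumption: the ~$50 price is per month
METERED_USD_PER_MILLION = 3.2  # the Novita figure quoted above

tokens = monthly_tokens(61)
cost = metered_cost(tokens, METERED_USD_PER_MILLION)
ratio = FLAT_PRICE_USD / cost

print(f"tokens per month: {tokens / 1e6:.1f}M")
print(f"metered cost:     ${cost:.2f}")
print(f"flat / metered:   {ratio:.1%}")
```

The 10% figure only holds if the connection is saturated all month. At lower utilization the effective per-token price rises proportionally.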

1 comment

support@sllm.cloud is bouncing.
Fixed.