by root_axis 218 days ago
> it technically requires less GPU processing to run

Not when you have to scale. There's a reason every LLM SaaS aggressively rate-limits and, even then, still experiences regular outages.