Hacker News
by wongarsu 81 days ago
Yes. I would not consider Kimi a particularly good model relative to its size, and making a SotA model is a lot more expensive. But training costs are explicitly excluded when talking about the cost to serve tokens.