Hacker News
by wongarsu, 81 days ago
Yes. I would not consider Kimi a particularly good model relative to its size, and making a SotA model is a lot more expensive. But training costs are explicitly excluded when talking about the cost to serve tokens.