by lalassu
194 days ago
Disclaimer: I have not tested this yet, and I don't want to overgeneralize. But one thing I've noticed with Chinese models, especially Kimi, is that they do very well on benchmarks but fail on vibe testing. They feel a bit overfitted to the benchmarks and less tuned to real use cases. I hope that's not the case here.
If it had vision and were better at long context, I'd use it so much more.