by ACCount37 300 days ago
Go vibe check Kimi-K2. One of the weirdest models out there now, and it's open weights - with both "base" and "instruct" versions available.

The language it uses is peculiar. It's like the entire model is a little bit ESL.

I suspect that this pattern comes from SFT and RLHF rather than the optimizer, the base architecture, or the pre-training dataset choices, and that the base model itself would perform much more "in line" with other base models. But I could be wrong.

Goes to show just how "entangled" those AIs are, and how easy it is to affect them in unexpected ways with training. Base models have a vast set of "styles" and "language usage patterns" they could draw from - but instruct-tuning turns a certain set of base model features into the "default" persona, shaping the writing style the AI will use down the line.

1 comment

Kimi tends to be very... casual in my usage, like an informal millennial style, without being prompted to do so.