Hacker News
by andai 180 days ago
Cool idea! You mentioned the model struggling a bit with Chinese. Have you tried any Chinese models, e.g. DeepSeek or GLM? I imagine they have a lot more Chinese in their pretraining data. (And their English is certainly fine too!)

I have personally had success using Kimi for Chinese creative writing, on the same assumption: that Moonshot, as a Chinese company, has more/better Mandarin-language pretraining data.