by nojs 209 days ago
Surprisingly, my experience has been the opposite with Qwen: if you can force the thinking trace to English, the results seem better. But that's probably just due to the amount of English training data.