by ai_what 745 days ago
I agree. I personally don't have high hopes for the 0.5B model.

Phi-2 was 2.7B and it was already regularly outputting complete nonsense.

I ran the 0.5B model of the previous Qwen version (1.5) and it reminded me of one of those lorem ipsum word generators.

The other new Qwen models (7B and up) look good though.

1 comment

Phi-2 wasn't instruct/chat finetuned, and it was very upfront about this. "I tried Phi-2 and it was bad" is a dilettante filter.