HN Mirror



	by thot_experiment 554 days ago
	For prompt adherence it still fails on tasks that Gemma2 27B nails every time. I haven't been impressed with any of the Phi family of models. The large context is very nice, though Gemma2 plays very well with self-extend.

2 comments

impossiblefork 554 days ago

It's a much smaller model though.

I think the point is more to demonstrate that such a small model can perform this well than to offer any actual usefulness.


magicalhippo 554 days ago

Gemma2 9B has significantly better prompt adherence than Llama 3.1 8B in my experience.

I've just assumed it's down to how it was trained, but I'm no expert.


CuriousCosmic 554 days ago

Yeah they mention this in the weaknesses section.

> While phi-4 demonstrates relatively strong performance in answering questions and performing reasoning tasks, it is less proficient at rigorously following detailed instructions, particularly those involving specific formatting requirements.


thot_experiment 554 days ago

Ah good catch, I am forever cursed in my preference for snake over camel.
