Hacker News
by realPtolemy 896 days ago
How does it compare to a 7B LLaMA quantized to run on a Raspberry Pi?
2 comments

Probably similar token rates out of the box, although I haven't done a straight comparison. Where they'll differ is in the sorts of questions they're good at. Llama 2 was trained (broadly speaking) for knowledge, Phi-2 for reasoning. And bear in mind that you can quantise Phi-2 down too. The starting point is f16.
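For a rough sense of what quantising buys you, here is a back-of-the-envelope sketch of weight memory at different bit widths. The ~2.7B parameter count for Phi-2 is my assumption (it is not stated in this thread), and the helper names are made up for illustration. It counts weights only, so it ignores quantisation scale overhead, the KV cache and runtime buffers; treat the numbers as lower bounds.

```python
# Rough weight-memory estimate for a model at a given precision.
# Assumptions (not from the thread): Phi-2 has roughly 2.7B parameters,
# and only the weights are counted (no KV cache, no per-block scales).

def weight_bytes(n_params: int, bits_per_weight: int) -> float:
    """Bytes needed to store n_params weights at bits_per_weight each."""
    return n_params * bits_per_weight / 8


def to_gib(n_bytes: float) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return n_bytes / 2**30


MODELS = {
    "Phi-2 (~2.7B params, assumed)": 2_700_000_000,
    "7B model": 7_000_000_000,
}

if __name__ == "__main__":
    for name, n_params in MODELS.items():
        for bits in (16, 8, 4):
            size = to_gib(weight_bytes(n_params, bits))
            print(f"{name}: {bits}-bit weights ~ {size:.2f} GiB")
```

The point is just that going from f16 to 4-bit cuts the weight footprint to about a quarter, which matters a lot on a Pi with a few GB of RAM.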
If you can run a quantized 7B, nothing beats Mistral and its fine-tunes, like OpenHermes 2.5.