by hcarlens 890 days ago
Agreed. Not only do they not compare their model to Phi-2 directly, but the benchmarks they report also don't overlap with the ones in the Phi-2 post[1], which makes it hard for a third party to compare the two without running the benchmarks themselves.

(In turn, the Phi-2 post compares Phi-2 to Llama-2 instead of CodeLlama, which makes it even harder.)

[1]: https://www.microsoft.com/en-us/research/blog/phi-2-the-surp...