by hcarlens
890 days ago
Agreed. Not only do they not compare their model to Phi-2 directly, but the benchmarks they report don't overlap with the ones in the Phi-2 post[1], which makes it hard for a third party to compare the two without running the benchmarks themselves. (In turn, the Phi-2 post compares Phi-2 to Llama-2 rather than CodeLlama, which makes it even harder.)

[1]: https://www.microsoft.com/en-us/research/blog/phi-2-the-surp...