by illnewsthat 1060 days ago
The paper[1] says this in the conclusion:

> [Llama 2] models have demonstrated their competitiveness with existing open-source chat models, as well as competency that is equivalent to some proprietary models on evaluation sets we examined, although they still lag behind other models like GPT-4.

It also seems like they used GPT-4 to measure the quality of responses, which says something as well.

[1] https://ai.meta.com/research/publications/llama-2-open-found...