by rnosov
1063 days ago
The report cites both GPT-3.5 and GPT-4 scores on page 7 [1]. I've checked the numbers, and they compare FreeWilly2 to GPT-3.5. For example, the HellaSwag score of 85.5% corresponds to GPT-3.5.

[1] https://arxiv.org/pdf/2303.08774v3.pdf