|
|
|
|
|
by GaggiX
806 days ago
|
|
The point is that it also beats the baseline on every other size (1B and 3B). So it wouldn't be surprising to see it beat a 7B transformer model like the 6B model. Note 2 on page 5 probably explains why the sizes are different. |
|