by stilist
1147 days ago
I have zero technical understanding of the math or statistics, but looking at the graphs it seems suspicious that supposed jumps happen across unrelated tasks and models at the same scales. For example, in figure 1, the discontinuities are consistently in the 10^22 to 10^24 range. Obviously I'm just going by what the authors have chosen to include, but I'd expect more variation. At best I'd assume it's something about LLMs in general.