by tcdent
850 days ago
Page 8 of the technical paper [1] is especially informative. The first chart (Cumulative Average NLL for Long Documents) shows a deviation from the trend, with prediction accuracy improving (lower NLL) at >=1M tokens. The Gemini 1.0 curve is overlaid on the same chart and supports the experience of 'muddiness'.

[1] https://storage.googleapis.com/deepmind-media/gemini/gemini_...