by zone411
372 days ago
Improves on the Extended NYT Connections benchmark compared to both Gemini 2.5 Pro Exp (03-25) and Gemini 2.5 Pro Preview (05-06), scoring 58.7. The decline observed between 03-25 and 05-06 has been reversed - https://github.com/lechmazur/nyt-connections/