by alphabetting
121 days ago
The agentic benchmarks for 3.1 indicate Gemini has caught up, and the gains from 3.0 to 3.1 are big (Pro goes from 18.0% to 33.2% in the list below). For example, on the APEX-Agents benchmark for long-horizon investment banking, consulting, and legal work:

1. Gemini 3.1 Pro - 33.2%
2. Opus 4.6 - 29.8%
3. GPT 5.2 Codex - 27.6%
4. Gemini Flash 3.0 - 24.0%
5. GPT 5.2 - 23.0%
6. Gemini 3.0 Pro - 18.0%