by sarreph
2 days ago
I had intended to caveat that: I'm sure I'm not the first person to ask about this!

> you still see improvements

This is expected if they are training their models on it, right?

> objectively-bad results

Keen to learn when this has been the case, i.e. across version increments in major models.
I've been enjoying seeing how the quality of individual models differs based on the amount of reasoning effort you give them. If they were baking in a good pelican you wouldn't expect them to differ so much.
(Google Gemini are the only lab that have very clearly paid attention to the quality of SVG animals-riding-vehicles, see their announcement for Gemini 3.1: https://twitter.com/JeffDean/status/2024525132266688757 )