by gwern
925 days ago
I'd say, trying to read this, the biggest problems are:

- tons of visual clutter, all those gradients and lines like the header or hero image
- a floating ToC which insists on jamming in 'recommended links' (?!) the entire time
- no outlines. Every single image or screenshot blends into the actual article.
- a visual summary which is hard to read because it has tiny text and looks like a correlation heatmap instead of a table
- highly inconsistent use of linking. Like, why does 'We have evaluated Gemini across four separate vision tasks:' link only 2 of the 4, and then not to the section in this article?
- highly repetitive screenshots, which add nothing and which, in conjunction with the lack of outlines for the images and the many outlines inside the images, mean that the benchmark sections are a frustrating visual jigsaw puzzle where you have to decode the screenshot again and again to read the tiny text inside it. It would be better to provide one (1) screenshot of each model's UI, which is all I need to get an idea of what it looks like, the implied workflow, and what sort of metadata/options it has, and then for each task simply show the image/prompt and each model's responses as a normal blockquote or text.