Hacker News
by Laremere 757 days ago
This article was posted on August 28, 2023. The giraffe was born July 31, 2023. So presumably none of the models tested included this particular giraffe in their training data. Directly comparing them to GPT-4o, which was released recently, is invalid. I wouldn't be surprised if GPT-4o does better on novel recognition tasks like this, but you'd need a new novel concept in an image to directly compare GPT-4o and the models tested in this post.

GPT-4, predating the image:

"This image features a young giraffe standing in a fenced enclosure. What's unusual is that the giraffe has what looks like an extra set of small horns, which are not typical for giraffes. Giraffes normally have two main horns (ossicones), but this one appears to have an additional pair above the usual two, possibly due to a genetic anomaly or variation. This feature makes the giraffe in the image quite distinctive."

Checking that we've gotten better at dealing with novelty seems like a very hard thing to do. Everyone knows black swans exist, they're famous for it!
Easy enough. Tint the photo purple and ask it again.
It has learnt all sorts of invariances, almost certainly including color tint.

I've gotten some very weird results with 4o on images. It seems entirely possible to me that it would go off the rails if the image wasn't in the training data.

For this specific case, it's really not easy to test at all.

An invariant isn’t the same thing as a purple giraffe. One is an image manipulation applied at training time to make the classifier robust against transformations. The other is a thing that might someday exist in nature. (The most straightforward way is to dump a barrel of wine over the giraffe and take a photo.)
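For concreteness, a training-time color manipulation of the "tint it purple" kind could look roughly like the sketch below. It is pure Python over (r, g, b) tuples; the function name `tint_purple`, the target color, and the `strength` parameter are all made up for illustration and not taken from any particular training pipeline.

```python
def tint_purple(pixels, strength=0.5):
    """Blend each (r, g, b) pixel toward purple by `strength` in [0, 1]."""
    purple = (128, 0, 128)  # assumed target color, chosen for illustration
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    tinted = []
    for r, g, b in pixels:
        # Linear blend per channel: 0 keeps the pixel, 1 replaces it with purple.
        tinted.append(tuple(
            round((1 - strength) * c + strength * p)
            for c, p in zip((r, g, b), purple)
        ))
    return tinted


# A tiny two-pixel "image": one white pixel, one black pixel.
image = [(255, 255, 255), (0, 0, 0)]
print(tint_purple(image, strength=0.5))
```

A classifier trained on many such tinted copies is pushed to ignore overall color cast, which is a different thing from having seen an animal that is actually purple.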
You're thinking of simple image augmentations. These nets learn much more complex invariants, basically to isolate concepts from irrelevant context. The point is you can't practically remove that image from the training data, and the experiment is pointless if it's in there.
Sure you can. Have it generate a new photo. Or dump a barrel of wine over the giraffe.
The main feature is still the same: plain, not spotted.

Maybe striped would be a better test.