Hacker News
by cbowal 757 days ago
Asking ChatGPT with GPT-4o:

> please describe this image [image]

> The image shows a young giraffe standing on a gravelly surface inside a fenced enclosure. The giraffe has a long neck, characteristic of its species, with a short mane running along the top. Its coat is a light brown color with a paler underbelly. The enclosure has a wire mesh fence and appears to be part of a larger outdoor area, with trees and additional fencing visible in the background. The giraffe is looking directly at the camera with its legs slightly apart, giving it a curious and alert appearance.

> anything unusual?

> Yes, the giraffe in the image appears unusual because it does not have the typical spotted pattern that giraffes are known for. Instead, its coat is a uniform light brown color. Additionally, the body proportions and the facial features look slightly different from a typical giraffe, suggesting it might be a unique or hybrid species, or possibly an artistically altered image. This giraffe also seems to have a different build, with a more compact body and shorter neck compared to most giraffes.

4 comments

This article was posted on August 28, 2023. The giraffe was born July 31, 2023. So presumably none of the models tested included this particular giraffe in their training data. Directly comparing them to GPT-4o, which was released recently, is invalid. I wouldn't be surprised if GPT-4o does better on novel recognition tasks like this, but you'd need a new novel concept in an image to directly compare GPT-4o and the models tested in this post.
GPT-4, predating the image:

"This image features a young giraffe standing in a fenced enclosure. What's unusual is that the giraffe has what looks like an extra set of small horns, which are not typical for giraffes. Giraffes normally have two main horns (ossicones), but this one appears to have an additional pair above the usual two, possibly due to a genetic anomaly or variation. This feature makes the giraffe in the image quite distinctive."

Checking that we've gotten better at dealing with novelty seems like a very hard thing to do. Everyone knows black swans exist; they're famous for it!
Easy enough. Tint the photo purple and ask it again.
It has learnt all sorts of invariances, almost certainly including that one.

I've gotten some very weird results with 4o on images. It seems entirely possible to me that it would go off the rails if the image wasn't in the training data.

For this specific case, it's really not easy to test at all.

An invariant isn’t the same thing as a purple giraffe. One is an image manipulation applied at training time to make the classifier robust against transformations. The other is a thing that might someday exist in nature. (The most straightforward way is to dump a barrel of wine over the giraffe and take a photo.)
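To make the "image manipulation" half of that distinction concrete, here is a minimal sketch of a purple tint written as a training-time style augmentation. It is plain Python over a list of RGB tuples; the names `tint_purple` and `random_tint`, the target color, and the strength range are all assumptions for illustration, not anything the models discussed here are known to use.

```python
import random

# Hypothetical target color for the tint (an assumption, not from the thread).
PURPLE = (128, 0, 128)

def tint_purple(pixels, strength=0.5):
    """Blend each (r, g, b) pixel toward PURPLE; strength is in [0, 1]."""
    return [
        tuple(round((1 - strength) * c + strength * p) for c, p in zip(px, PURPLE))
        for px in pixels
    ]

def random_tint(pixels, rng=None, max_strength=0.6):
    """Augmentation-style wrapper: pick a random tint strength per image."""
    rng = rng or random.Random()
    return tint_purple(pixels, rng.uniform(0.0, max_strength))

# A single light-brown pixel, roughly the coat color described above.
coat = [(200, 180, 140)]
print(tint_purple(coat, 0.5))
```

A classifier trained on many such randomly tinted copies can learn to ignore overall color casts, which is a different thing from recognizing a giraffe whose coat really is purple.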
You're thinking of simple image augmentations. These nets learn much more complex invariants, basically to isolate concepts from irrelevant context. The point is you can't remove that image from the training data (not practically) and the experiment is pointless if it's in there.
Sure you can. Have it generate a new photo. Or dump a barrel of wine over the giraffe.
The main feature is still the same: plain, not spotted.

Maybe striped would be a better test.

This was the first thing I did and I intuitively thought it would be able to do it.

I then thought I’d see if it can MAKE an image of a giraffe with no pattern - but it absolutely cannot. Even after 10 or so attempts with different prompts and continually sending it back its own generations with feedback, every giraffe has a pattern.

DALL-E 3 and similar are different models from LLMs, but in the future OpenAI will probably allow generating images directly from GPT-4o without using the DALL-E model (as shown on the demo page). I bet it would be much better at following prompts and generating unexpected things.
Try asking it to generate a giraffe with a short neck. A few months back I tried it and the AI was completely unable to do it.
Note the date of the post above:

>> By Janelle Shane On August 28, 2023

The point of this post is that AI is horrendously bad at anything not in its training set. Asking an AI that does have it in its training set about it totally misses the point.
> AI is horrendously bad at anything not in its training set.

The whole point of AI/ML algorithms is to find generalizations that explicitly apply to data outside the training set. Just because it gets some things wrong doesn't mean that it's bad at anything not in its training set.