| HN Mirror



	by dartos 503 days ago
	Before seeing Claude’s response, did you see where the author said:

	> I asked the model to closely look at the plot, and also uploaded a png of the plot it had generated.

2 comments

Kuinox 503 days ago

I sent the plot to ChatGPT 4o. Here is the conversation:

what do you see

ChatGPT said:

This is a scatter plot with the variables "steps" on the x-axis and "bmi" on the y-axis. The data points are colored by "gender" (red for female and blue for male). Interestingly, the arrangement of the points appears to form a drawing resembling a cartoonish figure or character, likely added for artistic or humorous effect.

If you'd like, I can help analyze or discuss the data further!

You said:

What does the cartoonish figure look like?

ChatGPT said:

The cartoonish figure in the plot resembles a stylized gorilla or monkey-like character. It appears to be waving with one hand raised, while the other arm is resting downward. The face is expressive, with distinct eyes, a nose, and a slightly frowning mouth. The overall pose gives it a somewhat playful or cheeky vibe.


dartos 503 days ago

Sure whatever.

OC seemed to think that Claude did that with just the data and not the image of the scatterplot itself.


smusamashah 503 days ago

LLM responses are random. One person's failure is another's success. When evaluating, we should all do reruns and see how many times it fails or succeeds.

Without the number of reruns, a single result is as good as random.
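A minimal sketch of the rerun idea, assuming you can wrap "ask the model and check the answer" in a function that returns True or False. The names `rerun`, `wilson_interval`, and `fake_trial` are hypothetical, and `fake_trial` is only a random stand-in for a real model call. The Wilson score interval is one common way to put error bars on a success rate; it is not something the article or this thread prescribes.

```python
import math
import random
from typing import Callable, Tuple


def wilson_interval(successes: int, runs: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a success rate (z=1.96 is roughly 95%)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = successes / runs
    denom = 1 + z * z / runs
    centre = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, centre - half), min(1.0, centre + half)


def rerun(trial: Callable[[], bool], runs: int) -> Tuple[int, float, Tuple[float, float]]:
    """Run the same trial `runs` times; return successes, rate, and interval."""
    successes = sum(1 for _ in range(runs) if trial())
    return successes, successes / runs, wilson_interval(successes, runs)


# Hypothetical stand-in for "send the plot to the model and check whether
# the reply mentions the gorilla". Replace with a real call to test a model.
_rng = random.Random(0)


def fake_trial() -> bool:
    return _rng.random() < 0.6


if __name__ == "__main__":
    successes, rate, (low, high) = rerun(fake_trial, 20)
    print(f"{successes}/20 succeeded, rate={rate:.2f}, interval=({low:.2f}, {high:.2f})")
    # A single success on its own says very little:
    print(wilson_interval(1, 1))
```

With one run and one success, the interval spans roughly 0.21 to 1.0, which is the point above: a single anecdote cannot distinguish a model that usually fails from one that always succeeds.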


dartos 502 days ago

Okay?

OC was saying that the article said that Claude recognized the “artistic” lines of the image from just the scatter plot data.

That isn’t what happened.

The author added a png of the plot to the conversation.

Idk why I need to explain that twice.


johnfn 503 days ago

Hm, interesting. The way I tried it was by pasting an image into Claude directly as the start of the conversation, plus a simple prompt ("What do you see here?"). It got the specific image wrong (it thought it was baby yoda, lol), but it did understand that it was an image.

I wonder if the author got different results because they had been talking a lot about a data set before showing the image, which possibly predisposed the AI to think that it was a normal data set. In any case, I think that "Your AI Can't See Gorillas" isn't really a valid conclusion.


vunderba 503 days ago

Please read TFA. The conclusion of the article isn't nearly so simplistic; they're just suggesting that you have to be aware of the natural strengths and weaknesses of LLMs, even multimodal ones, particularly around visual pattern recognition vs. quantitative pattern recognition.

And yes, the idea that the initial context can sometimes predispose the LLM to consider things in a more narrow manner than a user might otherwise want is definitely well known.


johnfn 503 days ago

The title of the article is "Your AI Can't See Gorillas". That seems demonstrably false.

The article says:

> Furthermore, their data analysis capabilities seem to focus much more on quantitative metrics and summary statistics, and less on the visual structure of the data

Again, this seems false, or at best misleading. I had no problem getting AI to focus on the visual structure of the data without any tricks. A fairer statement would be "If you ask an AI a bunch of questions about summary statistics and then show it a scatterplot with an image, then it might continue to focus on summary statistics". But that's not what the concluding paragraph states, and it's not what the title states, either.


8note 503 days ago

you knew that there was a visual gag in there before asking it to look.

if you didn't know it was there, and looked only at the text output, the llm would not have found it and told you it's there.


genewitch 502 days ago

Yeah, the "it gives you the answer when you give it the answer" people have kind of ruined my morning. Oh well.
