HN Mirror


	by Dylan16807 502 days ago
	You don't think "Examine the data" and "Which other conclusions can you draw from the data?" are open-ended? And even when explicitly prompted to look at the plot, they only brush up against the data anomalies rather than properly analyzing the plot.

2 comments

taberiand 502 days ago

I tried it on GPT-4o with an upload of the image and "What do you see?" as the prompt, and it said "monkey". So ymmv; these tools can't be evaluated with just a bunch of gotcha prompts and ignorance of how to use them effectively.

Dylan16807 502 days ago

It's not a gotcha to give it the data points and ask it to analyze. Uploading this data in image form is effectively a leading question tuned to the specific data, and an analysis tool that needs that kind of leading question is not good at its job.

taberiand 502 days ago

I don't know why you would expect it to see a gorilla without an image to look at. Humans can't.

Dylan16807 502 days ago

Without an image? No, not at all. It's supposed to make its own image. And it did make its own image. But it didn't properly analyze the image it made.

taberiand 502 days ago

That's a feature that would need to be implemented. There's no reason to think it could automatically look at the image of the plot it generated, but feeding the generated image back to it is no different from it viewing the image automatically.

Dylan16807 501 days ago

The point of telling it to explore the data is so I don't have to think of every angle myself. Humans can get an understanding from visuals that LLMs can't match, apparently, even without gimmicks.

brookst 502 days ago

What’s the point though? That LLMs tend to be constrained by the constraints in their prompting? That seems unsurprising.

Humans are visual animals. We can spot a chicken in a graph, but we’re unlikely to be able to tell that a different graph is using XY coordinates to encode a message against a one-time pad. But so what?
