by hashemian
417 days ago
To those who argue that LLMs might cheat by using EXIF: I recently saw a post on Twitter (https://x.com/tszzl/status/1915212958755676350) and, out of curiosity, screen-captured the photo and passed it to o3, so there was no EXIF data. You can read the chat here: https://chatgpt.com/share/680a449f-d8dc-8001-88f4-60023323c7... It took 4.5 minutes to guess the location, and the guess was accurate (I checked it using Google Street View). What was amazing about it:
1. The photo did not have ANY text.
2. It picked out elements of the image, like a fountain in a courtyard or the shape of the buildings, and inferred the location from those.
All in all, it's just mind-blowing how this works!
|
4o can do it almost as well in a few seconds and with probably 10-50x fewer tokens: https://chatgpt.com/share/680ceeff-011c-8002-ab31-d6b4cb622e...
o3 burns through what I assume is single-digit dollars just to do some performative tool use to justify and slightly narrow down its initial intuition from the base model.