Hacker News new | ask | show | jobs
by dwaltrip 177 days ago
If they game the pelican benchmark, it’d be pretty obvious.

Just try other random, non-realistic things like “a giraffe walking a tightrope”, “a car sitting at a cafe eating a pizza”, etc.

If the results are dramatically different, then they gamed it. If they are similar in quality, then they probably didn’t.