by dwaltrip
177 days ago
If they gamed the pelican benchmark, it’d be pretty obvious. Just try other random, unrealistic prompts like “a giraffe walking a tightrope” or “a car sitting at a cafe eating a pizza”. If the results are dramatically worse than the pelican results, then they gamed it. If they are similar in quality, then they probably didn’t.