Hacker News
by
acid__
844 days ago
Wow, only 256 tokens per frame? I guess a picture isn’t worth a thousand words, just ~192.
2 comments
gwern
844 days ago
Back in 2020, Google was saying 16x16=256 words:
https://arxiv.org/abs/2010.11929#google
:)
swyx
844 days ago
GPT-4V is also pretty low, but not as low: a 480x640 frame costs 425 tokens, and a 780x1080 frame costs 1105 tokens.
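Those two figures are consistent with the tile-based pricing OpenAI documented for high-detail images: a flat 85 tokens plus 170 tokens per 512px tile, after fitting the image within 2048x2048 and scaling the shortest side down to 768px. Here is a rough sketch of that arithmetic. The function name `gpt4v_image_tokens` is hypothetical, and the "scale down only, never up" behavior is an assumption that happens to reproduce both numbers above.

```python
import math


def gpt4v_image_tokens(width: int, height: int) -> int:
    """Estimate high-detail image token cost (hypothetical helper).

    Assumes: 85 base tokens + 170 per 512x512 tile, with the image only
    ever scaled down (never up) before tiling.
    """
    w, h = float(width), float(height)

    # Step 1: fit within a 2048x2048 square, preserving aspect ratio.
    scale = min(1.0, 2048 / max(w, h))
    w, h = w * scale, h * scale

    # Step 2: shrink so the shortest side is at most 768px.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale

    # Step 3: count the 512px tiles needed to cover the scaled image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles


print(gpt4v_image_tokens(480, 640))   # 1 x 2 tiles
print(gpt4v_image_tokens(780, 1080))  # 2 x 3 tiles after scaling to ~768x1063
```

So the 480x640 frame covers 2 tiles (85 + 2 * 170 = 425) and the 780x1080 frame covers 6 (85 + 6 * 170 = 1105).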