by
moffkalast
763 days ago
Lots of those tokens would have to be pixel patches and sound samples, right?
nojvek
763 days ago
Yep, since it's multimodal. Pictures, text, and audio all go into the same token space.
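A rough sketch of what "pixel patches and sound samples as tokens" could look like, purely for illustration. The helper names (`image_to_patches`, `audio_to_frames`), the patch and frame sizes, and the toy data are all made up here and do not describe any particular model. Real systems would typically go one step further and map each patch or frame to an embedding or a discrete code, which is left out.

```python
def image_to_patches(image, patch):
    """Split a 2D grid of pixel values into flattened patch x patch blocks."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            # Flatten one patch x patch block into a single list of pixels.
            block = [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            patches.append(block)
    return patches


def audio_to_frames(samples, frame):
    """Split a 1D list of samples into non-overlapping fixed-length frames."""
    return [samples[i:i + frame] for i in range(0, len(samples) - frame + 1, frame)]


# Toy inputs (hypothetical): a 4x4 "image" and 8 audio "samples".
image = [[r * 4 + c for c in range(4)] for r in range(4)]
audio = list(range(8))

patches = image_to_patches(image, 2)  # four patches of four pixels each
frames = audio_to_frames(audio, 4)    # two frames of four samples each

# Each patch or frame becomes one item in a single mixed sequence.
sequence = [("image", p) for p in patches] + [("audio", f) for f in frames]
```

So a 4x4 image with 2x2 patches contributes four sequence items and the 8-sample clip contributes two, which is the sense in which a big share of a multimodal token count would not be text.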