by moffkalast 763 days ago
Lots of those tokens would have to be pixel patches and sound samples, right?

Yep, since it's multimodal: pictures, text, and audio all go into token space.
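
For a rough picture of what "go into token space" could mean, here is a minimal sketch, not any particular model's pipeline. The function names, the 4x4 patch size, the 160-sample audio frame, and the byte-level text tokenizer are all assumptions made up for illustration. A real model would also project or quantize each patch and frame before it counts as a token.

```python
# Hypothetical sketch: every name and size here is an illustrative assumption,
# not taken from a real multimodal model.

def image_to_patches(image, patch):
    """Split an H x W grid of pixel values into flattened patch x patch blocks."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            block = [image[top + dy][left + dx]
                     for dy in range(patch) for dx in range(patch)]
            patches.append(block)
    return patches


def audio_to_frames(samples, frame):
    """Cut a list of samples into non-overlapping frames, dropping any remainder."""
    return [samples[i:i + frame]
            for i in range(0, len(samples) - frame + 1, frame)]


def text_to_tokens(text):
    """Stand-in tokenizer: one token per UTF-8 byte."""
    return list(text.encode("utf-8"))


image = [[(x + y) % 256 for x in range(8)] for y in range(8)]  # 8x8 grayscale
samples = [0.0] * 1000                                         # silent audio clip

patches = image_to_patches(image, 4)    # pixel patches
frames = audio_to_frames(samples, 160)  # sound-sample frames
tokens = text_to_tokens("hello")        # text tokens

# All three modalities end up as items in one sequence.
sequence_length = len(patches) + len(frames) + len(tokens)
print(len(patches), len(frames), len(tokens), sequence_length)
```

The point is only that patches and audio frames take up sequence positions the same way text tokens do, which is why a lot of the token count can be pixels and sound.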