by gruez 29 days ago
I thought all the recent models are "multimodal"? Is the image part just sticking an image recognizer in front of the text model?