Hacker News
by ac1spkrbox 583 days ago
Multimodal models are useful for lots of things! They can accomplish a range of tasks, from zero-shot image classification to helping perform Retrieval-Augmented Generation on images. As with many generative models, I find the utility comes not necessarily from outperforming a human, but from scaling a task that a human wouldn't want to do (or won't do cheaply).