by brrrrrm 653 days ago
LLMs can deal with more than text. What's impressive today is nothing tomorrow.
1 comment

The technology that allows an LLM to "see" images and video is completely different, though. It's not what is being trained on Common Crawl.
Not really. Embeddings are embeddings. Check out LLaVA.
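
To make the "embeddings are embeddings" point concrete, here's a rough sketch of the LLaVA-style idea: features from a vision encoder are projected into the same vector space as the LLM's text token embeddings, so the model receives one sequence of same-width vectors. Everything here is a toy stand-in and an assumption for illustration (the sizes, the random weights, the names `patch_features` and `projection`), not LLaVA's actual code.

```python
import random

random.seed(0)

D_MODEL = 8    # hypothetical LLM embedding width
D_VISION = 6   # hypothetical vision-encoder feature width


def rand_matrix(rows, cols):
    """Random matrix standing in for learned weights or features."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Text path: token ids index into an embedding table.
vocab_embeddings = rand_matrix(100, D_MODEL)
token_ids = [5, 17, 42]
text_embeds = [vocab_embeddings[t] for t in token_ids]

# Image path: stand-in patch features, as if from a vision encoder,
# mapped into the LLM's embedding width by a projection matrix.
patch_features = rand_matrix(4, D_VISION)
projection = rand_matrix(D_VISION, D_MODEL)
image_embeds = matmul(patch_features, projection)

# The language model sees a single sequence of D_MODEL-wide vectors.
sequence = image_embeds + text_embeds
print(len(sequence), len(sequence[0]))
```

Once the image patches are projected, nothing downstream needs to know which vectors came from pixels and which came from tokens.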