by Krastan 705 days ago
LMM stands for large multimodal model, so it handles more than just language: in this case it interacts with a UI, and in other cases it uses voice and video.