Hacker News
by twobitshifter 1 hour ago
Right, just give the text LLM access to a vision-specific agent and that problem can be solved. Or, if you really want, even let it call Opus with an image. Seems like you'd still save money, since only the requests that actually contain an image would go to the more expensive model.
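
A rough sketch of that routing idea, in case it helps. Everything here is hypothetical: `Request`, `answer`, `fake_text_model`, and `fake_vision_tool` are made-up names, and the two stub functions stand in for whatever real text model and vision model you would actually call. The point is only that the vision call happens when an image is present and never otherwise.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Request:
    text: str
    image: Optional[bytes] = None  # raw image bytes, if the user attached one


def answer(
    request: Request,
    text_model: Callable[[str], str],
    vision_tool: Callable[[bytes, str], str],
) -> str:
    """Route a request: text-only goes straight to the text model;
    an image is first turned into a description by the vision tool."""
    if request.image is None:
        return text_model(request.text)
    # Only this branch pays for the (assumed more expensive) vision call.
    description = vision_tool(request.image, request.text)
    prompt = f"{request.text}\n\n[Image description from vision tool]\n{description}"
    return text_model(prompt)


# Stand-in stubs so the sketch runs without any real model or API.
def fake_text_model(prompt: str) -> str:
    return f"text-model answer to: {prompt}"


def fake_vision_tool(image: bytes, question: str) -> str:
    return f"an image of {len(image)} bytes"


if __name__ == "__main__":
    print(answer(Request("hi"), fake_text_model, fake_vision_tool))
    print(answer(Request("what is this?", b"abc"), fake_text_model, fake_vision_tool))
```

In a real setup the text model would probably decide for itself when to call the vision tool (via tool calling) rather than relying on a hard-coded `if`, but the cost argument is the same either way.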