Hacker News
by Barbing 280 days ago
I'd figured it performed image recognition on the scene visible to it, then told the language model it could see various ingredients including some combined in a bowl.
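
To make that guess concrete, here's a minimal sketch of the two-stage pipeline I had in mind. Every name in it (`Detection`, `describe_scene`, the `container` field) is made up for illustration, and nothing here calls a real vision model or LLM; it only shows how recognized objects might be flattened into text before a language model ever sees them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """Hypothetical output of an image-recognition step."""
    label: str                       # e.g. "eggs"
    container: Optional[str] = None  # e.g. "bowl" if the item sits inside something


def describe_scene(detections: list[Detection]) -> str:
    """Turn detections into the plain-text description handed to the language model."""
    # Items not inside any container are listed on their own.
    loose = [d.label for d in detections if d.container is None]

    # Items inside a container are grouped by that container.
    grouped: dict[str, list[str]] = {}
    for d in detections:
        if d.container is not None:
            grouped.setdefault(d.container, []).append(d.label)

    parts = []
    if loose:
        parts.append("You can see: " + ", ".join(loose) + ".")
    for container, items in grouped.items():
        parts.append(f"Combined in a {container}: " + ", ".join(items) + ".")
    return " ".join(parts)
```

Under this assumption the language model never touches pixels; it only gets a string like the one `describe_scene` returns and reasons from there.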