Hacker News new | ask | show | jobs
by trojan13 1100 days ago
I am surprised this article does not even mention multimodal LLMs. Because the more kinds of media the LLM can take as in input and interpret the easier the interaction with it gets.