Hacker News
by ACCount37 209 days ago
And? Even if I believed this to be a limitation, I could bolt an adapter onto an LLM so that it takes in and emits non-text data.

That's how a lot of bleeding-edge multimodal models already work. They can take in and emit images, sound, actions and more.