by silverkiwi 502 days ago
The evolution from LLM to Reasoning is simply multi-pass or recursive questioning.
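
To make "multi-pass" concrete, here is a rough sketch of the loop I have in mind. `ask` is a hypothetical stand-in for whatever call sends a prompt to a model and returns text, and the follow-up prompt wording is my own assumption, not how any particular reasoning model does it.

```python
from typing import Callable


def multi_pass(ask: Callable[[str], str], question: str, passes: int = 3) -> str:
    """Ask once, then feed each draft answer back for revision.

    `ask` is a hypothetical function (prompt in, text out); plug in any model.
    """
    if passes < 1:
        raise ValueError("passes must be at least 1")
    answer = ask(question)
    for _ in range(passes - 1):
        # Assumed follow-up format: restate the question and the current draft.
        follow_up = (
            f"Question: {question}\n"
            f"Current answer: {answer}\n"
            "Check the answer and improve it."
        )
        answer = ask(follow_up)
    return answer
```

Each pass costs one more model call, so the compute grows linearly with the number of passes as long as only one modality (text) is involved.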

What’s missing in the terminology is the modality, most often TEXT.

So really we only have Text LLM or Text Reasoning models at the moment.

Your example illustrates the benefits of Multi Modal Reasoning (using multiple modalities with multi-pass).

Good news: this is coming (I’m working on it). Bad news: this massively increases the compute, as each pass now has to interact with each modality. Unless the LLM is fully multimodal (some are), this forces the multi-pass questions to accommodate each one. The number of extra possible paths massively increases. Hopefully we stumble across a nice solution. But the level of complexity massively increases with each additional modality (text, audio, images, video, etc.).
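
A back-of-the-envelope sketch of why the paths blow up. The big assumption here is mine: I treat each pass as picking exactly one modality to query, so a "path" is just the sequence of modalities chosen across passes. Real systems will differ, but the growth pattern is the point. The function names are made up for illustration.

```python
from itertools import product

MODALITIES = ["text", "audio", "image", "video"]


def count_paths(num_modalities: int, passes: int) -> int:
    """Number of modality sequences if each pass picks one modality (assumed model)."""
    return num_modalities ** passes


def enumerate_paths(modalities: list[str], passes: int) -> list[tuple[str, ...]]:
    """List every sequence explicitly; only practical for small inputs."""
    return list(product(modalities, repeat=passes))


if __name__ == "__main__":
    # Fix the number of passes at 3 and add modalities one at a time.
    for m in range(1, len(MODALITIES) + 1):
        print(MODALITIES[:m], count_paths(m, passes=3))
```

Under that assumption, 3 passes over text alone is 1 path, while 3 passes over all four modalities is 64. If a pass can touch several modalities at once, the count grows faster still.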