What I will say is that this is probably the first model after Gemini Live to do some of these things. It feels similar to Gemini Live, which I don't think is exactly what they were going for, but IMO it is still impressive, as I don't think anyone else has matched full-duplex video/audio/tool calling.
The next Gemini releases are coming next week though, so we will see how this matches up!
Hard agree. There are already "voice AI" companies that use normal models with an "interaction" engine on top of them, and they get better results than I've seen in these demos. idk why people are impressed.