Hacker News
by jchw 502 days ago
I happen to have Gemini Pro or whatever, because it's somehow bundled or free with some other Google thing I have (I don't care to ask). Although it is a bit weird talking to a computer as if it's actually a person, I did try Google's equivalent feature in Gemini a couple of times and it seemed to work extremely well. I'm not sure exactly how it compares to OpenAI's, but it holds a natural conversation very well in my experience. I reckon this bodes well for Apple. It seems to me their OpenAI deal is mostly an admission that they couldn't build out their own technologies fast enough, but they certainly don't lack the capital or ambition to do it if it truly is going to be important for them in the future.
2 comments

Google successfully fooled everyone into thinking that "Gemini Live" is the same as Advanced Voice Mode (AVM) in ChatGPT. It's not: Gemini Live is a "stupid" speech-to-text and text-to-speech pipeline, not multimodal like AVM.
OpenAI's approach is certainly more technically interesting, and probably the way to go in the longer term, if the juice is worth the squeeze with this sort of technology. (After being relatively unmoved by LLMs for many other tasks, I found the voice assistant concept a lot more interesting, personally, even though I still don't have any routine uses for it.) That said, it doesn't really matter exactly how it works internally: Gemini Live accomplishes what it sets out to do, in that it feels very natural and works fairly well. I think it's clear there will be benefits to the multimodal approach of running voice directly in and out of a model for this sort of application, but if stringing together other existing technology can get you 80% of the way, there's not really a rush to get there. I don't really find this too surprising, since as far as speech recognition and voice synthesis go, the state of the art today is very good, and computer voice interactions were mostly held back by other things.
Is this different to the free one on ai.google? Because I've tested that one extensively and it is absolutely garbage in every single respect.