I have an android phone which means I use android auto fairly often. The sheer quality destruction it's experienced since transitioning to Gemini is incredible.
I experience this mostly when asking for music. Before gemini, mistakes were common but deterministic. It was easy to understand where the query had gone wrong and so how to fix it. Example:
"Hey google, play Blackstar"
(Plays the album blackstar by David Bowie, not what I wanted)
"Hey google, play "Blackstar by Radiohead"
(Plays the right thing).
Now:
"Hey Google, play Blackstar by Radiohead" can result in playing... something vaguely semantically related with no way to course correct. In this exact instance (happened yesterday!) it played an album by the hip hop due Black Star.
I will admit that there are some superpowers hidden in Gemini that were not present in the previous AI assistant. I recently discovered that Gemini can manipulate the navigation app, and a prompt like "Mute alerts" works, which is kind of cool. However like OP said, it's incredibly verbose, which is super annoying.
I don't really use Google products except Youtube, but their auto-subtitler (closed captions) have really taken a weird turn lately. It used to try sounding out the words so even if it didn't pick the right word, it was at least close to right one. But now it substitutes entire phrases and sentences based on what is most likely to appear in a conversation even if it has no bearing or similarities to what was said.
Pre-gemini, you knew what you would get, basically the structured snippets that would appear at the top of the search results.
Now it's much more verbose.
My biggest gripe is that it basically stopped listening to me, since "upgrading" to Gemini, which is frustrating because I've used it to control the Hue lights for the past decade.
It listens to my partner though, so after it fails to listen to me, I have to ask her to ask Google to adjust the fing lights.
There was an Amazon UXR study that floated around ~8 years ago that said the only things people care about from a voice assistant are music, weather, and alarms.
PMs keep trying to make them "smarter," and it just makes the core user journeys worse.
Surely they think they're inventing cars when we're griping about buggy ships. But it really feels like voice assistants peaked ~10y ago for the things people actually want them for.
Considering the decade of poor results from doing anything other than music, weather and alarms I think we've all learned to avoid using them for anything else.
Similarly, Siri was arguably better a decade ago when it has deterministic intents that you could trigger with certainty, now it is all seemingly random at time.
I wonder how much of it is them trying to minmax their revenue generating marketing algos for content discovery. (Willing to bet it's somewhat relative)
I Fired theartofdoingstuff.com for extremely obnoxious ads - particularly the "Would you like to save this stuff?" one that greys out the rest of the article you're trying to read to... checks notes... ask if you want the article emailed to you? Who the hell thought that was a good idea?
The "ad" as described in my comment isn't really an ad in the typical sense, it's baked into the website. But the real reason is, I'm on my work computer and unable to install browser extensions.
I had Claude whip up a local solution for me using Gemma 4 26b-a4b on my Mac and a Raspberry Pi Zero 2 W. It can do web search (valyu), file reading -- I have business & personal context stored in many markdown files -- weather, Apple reminders, and has cross-session memory. Streamline but capable agent that has been pieced together over a few weeks after using Karpathy's LLM wiki pattern, with bits of Hermes logic. Orpheus-TTS streams the spoken reply back with the first word usually landing in half a second. Voice input is openWakeWord for the wake word plus faster-whisper for speech-to-text, all on-device. I can run it straight on the Mac but I use it with the pi satellite and a cheap USB speakerphone (ConfCall MS13B). You can barge in by just talking over it rather than having to say the wake word again. Pretty handy Google Home replacement.
I got rid of my Alexa devices after they would not shut up. "Alexa what is the weather for today?" "The weather for today will be ... By the way, did you know that you can set alarms? Just say ..."
I do not care about whatever stupid feature you want to build engagement around. Do what I asked you to do and then shut up.
And don't get me started on the Alexa Show that had the audacity to display ads.
> I ordered it because when one superpower fails you, eventually you have to try the other one before they eventually ruin the thing that was working perfectly well before they tried to improve it in order to make enough money to be a trio of planets.
When your options are a few competing BigCos and you don’t have incentive to try to build it for yourself because it straddles the annoying in-between space of “frustrating enough to do something” and “ not frustrating or valuable enough to actually solve the problem.”
I sadly did the same as I didn't know best. The Google speakers started to have issues identifying the orders, so we bought some alexas, which we also use as speakers for our tvs.
Do you have a better recommendation of a smart speaker that can play spotify, youtube music, or tidal?
Gemini made me turn google off completely. There is no provision to not allow your chat data from being shared send for training - even if you pay. Clearly they don’t care about end users and I am guessing they are losing money on paid cunaumer tier as well. But that is not acceptable to me. I just turned off all Google services and finally switched to ddg as the default search engine.
I think that this is a really good article for a reason that none of us are mentioning. This is not a tech blog saying this. This is a cooking/gardening/home blog. I use these terms jokingly, but we are nerds on hacker news and this blog writer is a normie. Regular people are getting fed up with how bad google is getting, not just the nerds.
Companies don't care when nerds complain. We always complain. But when their normal user base starts jumping ship, then they could very well start listening.
> For years I could ask Google what song was playing and it would identify it. It was one of the most useful features it had. You'd hear a song in a store, on television or drifting over from a neighbour's backyard and Google would identify it in seconds.
A huge pet peeve of mine is when I’m in the car and want to know what song is playing on the radio. I run Shazam and my phone mutes the stereo to activate a microphone. I have to disconnect from CarPlay then run Shazam, then reconnect — it’s a passenger only operation.
Song recognition is built into both iOS and Android, the device should always use the internal mic instead of a CarPlay/Android Auto microphone over Bluetooth.
Side note: is there a good “dumb smart speaker” I can have run with a wake word connected to my own API? Speech to Text and Speech to Speech are fairly well supported for local AI workflows now, it would be great to have my own Home device without worrying about where the audio goes.
I’m sure it’s a very niche audience today, but I imagine giving this thing MCP for Wikipedia, a music app, and my recipes would be perfect.
The whole upgrade thing is stupid. For years I was able to tell my phone in the car: "Play this song on spotify" and it did it. Not any more.
"Ok google! Set a timer for 20 minutes." -> "Sorry Bob, I don't have permission to do that".
"Ok google, navigate home" -> "Navigating to Store@Home24!" (half a country away).
I agree. I wasn't smart but it was useful in certain cases. Now it's just lobotomized.
This is a chronic LLM style, particularly the multipart dramatic flourishes and the Haiku paragraphs. But at this stage, there's a good chance that LLM re-diffusion of this style is feeding back into human culture. God save us.
I know nobody did, but seeing as I was too young (and maybe not even alive?) I have always wanted to try it. I'm a Coca Cola enthusiast after all. I wish they'd release a "Throwback Experimental Coke" batch out. I assume it was their attempt to flavor coke without the coca leaves?
> I don't need a four-minute explanation. I don't need context. I don't need a balanced discussion that considers multiple viewpoints and concludes with a summary.
Karen, you mean you don't want those things. Stop confusing want and need.
Fun article, but the sea of pop ups and advertisements battling to distract you from the text ironically exemplifies the point being made about Google Home.
The quality of Gemini varies so much throughout the day, the week and the month that it’s hard to rely on it for anything. It feels like Google is trying to throttle costs but has completely missed the mark.
I don't really use our Google Home devices, but my wife does. Ever since the "update" I've noticed now that Gemini likes to ask her 2-3 follow up questions after each query. The first might be something like, "Would you like to know more about <X>?" My wife would answer "No" and Gemini would respond like, "Is there anything else you would like to discuss on the subject of <X>?" It's gotten to the point where she will start yelling "cancel" and "stop" to get it to shut up.
I talk to chatbots while doing yardwork, often with both hands full. Yesterday I tried to get it to shut up mid conversation and just could not do it with voice only. I had to wait through several minutes of blathering, restarted by every random noise, before I could put the chainsaw down and turn it off.
I was amazed at my own level of anger at that. It was just a voice in my ears but I reacted to it viscerally like it was an assault. It didn't help that it was in the middle of a sequence of it telling one lie after another, like "yes, I can disconnect this conversation." Maybe what I had is a natural reaction to having a lying clueless asshole refuse to go away or shut up, which I haven't otherwise had to deal with lately.
The interface into the LLM is tokens in and out (text, images, audio). And the harness generally doesn't understand what you're passing in. The LLM has nothing to do other than to respond with tokens and empty responses (eg. just a stop token) have been aggressively trained out of it.
> Nobody asked for cars that require IT support and 3 sub-menus to lower the air conditioning. (Screens are cheaper to install than buttons and knobs.)
This doesn't really track. Sibling comment pointed out that the average new car transaction price is massively higher than before knobs were replaced with screens.
But more importantly... screens were put in more expensive cars first, and slowly trickled down to budget cars. It's a very weak argument that it was done for cost reasons. Screens are flashy and impress people during their 5 minute test drive. "Wow! Think of all the things I can do in my car that I couldn't do with a knob for changing fan speed." Sure, living with those screens tends to be a bit less enjoyable than those first impressions lead you to believe, but bright colorful animated screens helped to sell cars. If they're actually less expensive than knobs and buttons, that's just a bonus for the manufacturers.
Also keep in mind, when screens first appeared in (expensive) cars, they weren't actually cheaper than the knobs and buttons they replaced. Technology is, sorry was getting cheaper per unit of performance over time. Screens became commonplace and inexpensive to put in cars, but I suspect they were ten times more expensive than all the knobs and levers they replaced when they started appearing in luxury cars.
But nobody got cheaper cars. The shareholders on the other hand asked for a third yacht, and they got exactly what they wanted (after the car manufacturer fired the team which used to do usability testing for their interface)
The only things these "smart" speakers are good for is timers, adding to lists (especially grocery lists in the kitchen), and playing music. And even then they often stutter...
Agree, but they are very good at playing music. They are still useful even if the don't do anything else, just like a microwave just needs to cook food.
I wondered how long it would be until someone took the obvious step of wiring a voice assistant up to a full blown LLM model. Seems like the thing I failed to consider was whether that was actually a good idea.
I feel like the author has another disappointment coming. Alexa has been annoying in this way for a long time now. Still very useful as well, but yeah, annoying. It has always been too proactive about suggesting things I don't want or need in response to simple questions like "What is the outside temperature?" ("It's currently 46 degrees, with a low of 32 degrees expected overnight. Did you know you can turn your lights on and off according to a schedule? Would you like to schedule turning some lights on or off?") But recent AI-oriented updates have just made it worse. Now its chatty and full of attitude, and having raised a few teenagers let me tell you, that's what we want more of in our lives. Chat. And attitude.
Sorry but this page and site have the most obnoxious number of ads on almost any page I've ever seen. After reading a couple of paragraphs I closed the window, no content is worth subjecting myself to that many ads and trying to navigate which is real content and which isn't.
Don't forget a pihole! Sometimes I really do forget how obnoxious the internet is until I try to look at a website when I'm outside of my ad free setup at home. Which in a way keeps me from browsing around websites on my phone and wasting time in that way.
Having used Google home assistant since it came out for all the things that it's good for, and watched its quality fluctuate, I find I really have to be more careful nowadays when I ask it questions because it can go overboard more easily.
Is there a term in AI research where it underappreciates the specificity that is being asked for?
Perhaps the AI could be default prompted, "you are a kitchen AI assistant and should tend to answer in facts and details and that are relevant to the current moment."
I don't understand how anyone bothers with these things. Other than timers in the kitchen (where hands might be greasy) I've always found it faster to just pull out my phone and Google it.
P.S. I hope that dehydration/headaches question was a poorly chosen example and not something someone over the age of 5 seriously needs an answer to.
I experience this mostly when asking for music. Before gemini, mistakes were common but deterministic. It was easy to understand where the query had gone wrong and so how to fix it. Example:
"Hey google, play Blackstar"
(Plays the album blackstar by David Bowie, not what I wanted)
"Hey google, play "Blackstar by Radiohead"
(Plays the right thing).
Now:
"Hey Google, play Blackstar by Radiohead" can result in playing... something vaguely semantically related with no way to course correct. In this exact instance (happened yesterday!) it played an album by the hip hop due Black Star.
I will admit that there are some superpowers hidden in Gemini that were not present in the previous AI assistant. I recently discovered that Gemini can manipulate the navigation app, and a prompt like "Mute alerts" works, which is kind of cool. However like OP said, it's incredibly verbose, which is super annoying.