by bastawhiz
1253 days ago
While plausible on paper, it's not practical unless they jam an order of magnitude or two more compute into the devices. To get reasonable accuracy (i.e., enough to be able to use for profit) from any casual speech, the current models run far from realtime on a modern MacBook. You're not going to squeeze reasonable accuracy from the tiny processor on the devices in the world today, even if you record and process async as a way to hide from people inspecting traffic.
Edit: it's worth noting that this dramatically increases the cost of the device. They'd need to be able to see a way to recoup those costs if they eat the additional hardware cost. But that's silly for a company that's literally in the business of cloud computing and where the goal of the hardware is to hide what you're doing. When will people start asking why there's a full GPU in their Echo?
Do you really need 100% accuracy here? This isn't like cops setting up a wiretap. Google isn't waiting for you to slip up and admit that you like Funko Pops or whatever. If you're constantly talking about your cat, or about wanting to get a car, that's all they need to target ads to you.
Also, the processing doesn't have to be real time. It doesn't matter that Google learns about your cat 8 hours late because the device is running its ML models in the background while you're asleep. If the device picks up 3 hours of speech per day, it only needs to process at 1/8x speed to keep up, since 3 hours of audio at 1/8x takes 24 hours to get through. On the off chance you have a house party and it's picking up 6 hours of speech, it can always buffer it for later, or drop it altogether (see the paragraph above about how it doesn't need to pick up everything).
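To make the arithmetic concrete, here's a rough back-of-the-envelope sketch. The function names are made up for illustration, and it assumes the device can grind away 24 hours a day at a fixed fraction of real-time speed, which is my simplification and not a claim about any real hardware.

```python
from fractions import Fraction

HOURS_PER_DAY = 24  # assumption: the device can process around the clock


def min_speed_factor(speech_hours_per_day):
    """Smallest fraction of real-time speed that clears one day's audio in 24h."""
    return Fraction(speech_hours_per_day) / HOURS_PER_DAY


def processing_hours(speech_hours, speed_factor):
    """Wall-clock hours needed to process the audio at the given speed factor."""
    return Fraction(speech_hours) / Fraction(speed_factor)


def backlog_after_day(speech_hours, speed_factor):
    """Hours of audio still unprocessed after 24h of continuous work."""
    processed = Fraction(speed_factor) * HOURS_PER_DAY
    return max(Fraction(0), Fraction(speech_hours) - processed)


if __name__ == "__main__":
    speed = min_speed_factor(3)           # typical day: 3 hours of speech
    print(speed)                          # 1/8
    print(processing_hours(3, speed))     # 24
    print(backlog_after_day(6, speed))    # house party: 3 hours left to buffer or drop
```

So at 1/8x a normal day just barely fits, and a 6-hour party day leaves 3 hours of audio that either gets buffered for a quieter day or thrown away.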