Hacker News
by josefresco 726 days ago
> The problem is all of the real world interface challenges, keeping speakers and microphones working outdoors across many different climates and weather conditions, and temperatures and humidity is incredibly expensive.

This sounds like a problem for humans too - are humans just better equipped to deal with bad audio?

3 comments

Yes, but not in the ways you think. Humans can do about as good a job as a computer at understanding poor-quality audio in context, but the compute the computer needs to do it in real time is pretty substantial.

As the commenter says below, a human can intervene in many more ways when equipment malfunctions or customers have special needs that an AI just gets blocked by.

There are literally hundreds of edge cases where a voice-powered drive-through just stops working: high winds, pouring rain, thick accents, broken equipment, out-of-stock items, seasonal items that aren't available. Those are just a few of the ones I encountered personally in the wild while tagging 15,000+ orders.

Humans have had to evolve to communicate with each other effectively in these and far worse conditions: on battlefields, in driving hurricanes, while being stalked by other humans (and animals), yelling across vast distances, over raging rivers, while wounded, sick, etc.

Humans can generally figure it out eventually.

Yeah, this sounds strange to me: every drive-thru has had an intercom system like this working outside for decades. Doesn't seem like an insurmountable challenge to me.

You're not entirely wrong, but these AI systems often need pretty clear audio to work. It's kinda shocking how good we are at working around bad audio in conversation, and I'm certain most people know how bad these intercom systems get. The issue isn't whether they ever need to be fixed, it's how far they can degrade before they must be fixed. And the one thing we can do that AI can't is have face-to-face conversations. If the speaker simply doesn't work, it's a bit of a drag, but you can just pull up to the window and skip the entire audio system. Or just walk inside. Both options eliminate the problem hardware, whereas AI would need additional hardware to do those jobs.

Humans have more workarounds in failure cases, at least. Just the ones I can recall seeing:

1) Sign taped to intercom: "Drive up to window to order". Order at window.

2) Worker standing at the intercom with a notebook and pen and a runner trading sheets between them and the kitchen.

3) "Drive-Thru Closed; Order Inside"

4) Restaurant's phone number posted over intercom.

I'm sure there are a lot more variations I haven't seen. Humans are pretty resourceful when machines fail them, because they still want to get paid.