by doubtfuluser 1519 days ago
I have the feeling that the paper is flawed and missing an important experiment. All their results seem to rely on skills being used. There is indeed a need for Amazon to prevent user tracking in skills (as Apple does, for example, by requiring user consent). But to conclude that Amazon shares the data with advertisers, I would have expected an experiment that eliminates skills as a cause by having personas interact only with Alexa core services. I guess a lot of information for ad targeting could be inferred just from shopping questions or general knowledge questions. If, however, that does not influence the ads served when no skills are used, then it's not necessarily Amazon directly sharing the information, but the skills being able to do so using Amazon's provided tooling.

That's a difference, at least for my interpretation of how “evil” company xyz is.


They run this down as best they can (TLDR they don't think the skills have enough information about their personas to target ads to them, therefore Amazon must be doing the targeting, but they can't 100% rule it out):

> In contrast, skills can only rely on persona’s email address, if allowed permission, IP address, if skills contact non-Amazon web services, and Amazon’s cookies, if Amazon collaborates with the skills, as unique identifiers to reach to personas. Though we allow skills to access email address, we do not log in to any online services (except for Amazon), thus skills cannot use email addresses to target personalized ads. Skills that contact non-Amazon web services and skills that collaborate with Amazon can still target ads to users. However, we note that only a handful (9) of skills contact few (12) advertising and tracking services (Table 1 and Figure 2), which cannot lead to mass targeting. Similarly, we note that none of the skills re-target ads to personas (Section 5.3), which implies that Amazon might not be engaging in data sharing partnerships with skills. Despite these observations, we still cannot rule out skills involvement in targeting of personalized ads.

Additionally, they are trying to imply the common trope of “voice assistants are listening to everything we say all day for ads”, whereas their test methodology was to actively use the top skills for those interests and perform actions.

While I don’t like the sharing of such data for ads, it’s a far cry from Alexa processing voice in the background with zero interaction.

What part of the paper gives you the impression they imply voice assistants are listening to everything? I don’t get that.

The discussion in the paper is nuanced on that point and does not make that claim as far as I read it. Section 2.2 (page 2):

> The content of users’ speech can reveal sensitive information (e.g., private conversations) and the voice signals can be processed to infer potentially sensitive information about the user (e.g., age, gender, health [82]). Amazon aims to limit some of these privacy issues through its platform design choices [4]. Specifically, to avoid snooping on sensitive conversations, *voice input is only recorded when a user utters the wake word*, e.g., Alexa. Further, only processed transcriptions of voice input (not the audio data) is shared with third party skills, instead of the raw audio [32]. However, despite these design choices, prior research has also shown that smart speakers often misactivate and unintentionally record conversations [59]. In fact, there have been several real-world instances where smart speakers recorded user conversations, without users ever uttering the wake word [63].

> What part of the paper gives you the impression they imply voice assistants are listening to everything?

For me, it's this: "Your Echos are Heard"

So, the opening salvo. That's what gives me the impression they imply voice assistants are listening to everything.

I don't refer to voice commands or normal interaction as "echos", so the use of the word "echos" here implies something nefarious. Sure, it's the name of the product, but for me, it reads like something more.

Alexa uses the data we give to it by speaking and performing actions via downloaded skills - is very similar to all ad platforms, conveying user intent into ad profiles.

Saying “process voice for ads” has subtle connotations in the current landscape of privacy discussions.

> Alexa uses the data we give to it by speaking and performing actions via downloaded skills - is very similar to all ad platforms, conveying user intent into ad profiles.

There is an argument that this is more privacy-conscious than other ad platforms. One needs to say the word "Alexa" before Amazon will collect any potential targeting data. There is an active and distinct choice that must be made before every interaction. That isn't true for Google and Facebook. They will collect data in the background while you are doing other things. There is much less transparency about when and how they are collecting their targeting data, and therefore we have much less agency in the process.

To clarify: you are objecting to the phrase “process voice to [serve] ads” in the title, which was provided by the submitter, not the paper authors?
From the abstract:

> We find that Amazon processes voice data to infer user interests and uses it to serve targeted ads on-platform

For example, this sibling comment: https://news.ycombinator.com/item?id=31178067
Unlike other ad platforms, Amazon claims that it does not use voice data for ad targeting. From the paper:

> Amazon has publicly stated that it does not use voice data for targeted advertisements [83], [75].

https://www.nbcmiami.com/news/local/are-smart-speakers-plant...

I'm sure Amazon isn't processing voice data to target ads. Why would they need to?

They're using skill interaction, order history, listening history, etc. to target ads.

"Subtle connotations" are not much to make an objective complaint out of.