by joe_hoyle 2487 days ago
> Siri uses a random identifier — a long string of letters and numbers associated with a single device — to keep track of data while it’s being processed, rather than tying it to your identity through your Apple ID or phone number — a process that we believe is unique among the digital assistants in use today. For further protection, after six months, the device’s data is disassociated from the random identifier.

Interesting, I thought I had heard it widely reported that Apple was keeping audio recordings tagged with your Apple ID for six months before anonymizing them. That looks like it wasn't the case, and Apple was only tagging those recordings with a device ID, presumably so recordings from the same device could be associated with one another.


Yeah, "widely-reported" was The Verge. As John Gruber points out[0], The Verge wasn't wrong, but I can't say their reporting would give the average reader a good grasp on what was really going on. That would include myself: https://news.ycombinator.com/item?id=20724558

[0] https://daringfireball.net/2019/08/apple_siri_privacy

From 2017, about the recording and tokenization steps:

> Siri records your queries too, but she doesn’t catalog them or provide access to the running list of requests. You can’t listen to your history of Siri interactions in Apple’s app universe.

> While Apple logs and stores Siri queries, they’re tied to a random string of numbers for each user instead of an Apple ID or email address. Apple deletes the association between those queries and those numerical codes after six months. Your Amazon and Google histories, on the other hand, stay there until you decide to delete them.

http://themillenniumreport.com/2017/03/not-only-are-alexa-si...

From Wired, “Apple finally reveals how long Siri keeps your data”, in 2013, about later disassociation from the tokens:

> Once the voice recording is six months old, Apple "disassociates" your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes.

> "Apple may keep anonymized Siri data for up to two years," Muller says. "If a user turns Siri off, both identifiers are deleted immediately along with any associated data."

https://www.wired.com/2013/04/siri-two-years/
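To make the timeline in that Wired quote concrete, here is a rough sketch of the retention rule as described: the clip carries a random identifier, the identifier is dropped at six months, and the clip itself is dropped at two years. This is not Apple's implementation. The `Clip` type, the function names, and the exact day counts are assumptions made up for illustration.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed durations: the article says "six months" and "up to 18 more months";
# the day counts here are approximations chosen for illustration.
DISASSOCIATE_AFTER = timedelta(days=183)
DELETE_AFTER = timedelta(days=730)


@dataclass
class Clip:
    recorded_at: datetime
    identifier: Optional[str]  # random token, not an Apple ID or phone number


def new_identifier() -> str:
    """Return a random hex string to stand in for the random identifier."""
    return secrets.token_hex(16)


def apply_retention(clip: Clip, now: datetime) -> Optional[Clip]:
    """Apply the described policy to one clip at time `now`."""
    age = now - clip.recorded_at
    if age >= DELETE_AFTER:
        return None  # older than two years: the clip is no longer kept
    if age >= DISASSOCIATE_AFTER:
        # older than six months: keep the clip, drop the identifier
        return Clip(clip.recorded_at, None)
    return clip  # younger than six months: unchanged
```

Under these assumptions, a clip that is 200 days old comes back with `identifier` set to `None`, and a clip that is 800 days old comes back as `None`.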

I don’t want audio recordings or transcripts to remain on their servers. I don’t even want “smart” Siri. I want stupid Siri, aka proper local speech-to-text plus a fixed set of commands I know/learn/discover.

There is no need to send all of this to them. If I want to suggest a command, let me do it through a simple form.
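For what it's worth, the "fixed set of commands" half of that is small once a local transcript exists. A minimal sketch, assuming some on-device speech-to-text step has already produced a text string; the command phrases and handlers below are hypothetical, not anything Siri actually exposes.

```python
from typing import Callable, Dict

# Hypothetical command table: phrases and responses are made up for illustration.
COMMANDS: Dict[str, Callable[[], str]] = {
    "set a timer": lambda: "timer started",
    "what time is it": lambda: "showing the clock",
    "play music": lambda: "playing music",
}


def normalize(transcript: str) -> str:
    """Lowercase, trim edge punctuation, and collapse repeated whitespace."""
    return " ".join(transcript.lower().strip(" .?!").split())


def dispatch(transcript: str) -> str:
    """Run the handler for an exact known phrase; nothing leaves the device."""
    handler = COMMANDS.get(normalize(transcript))
    if handler is None:
        return "unknown command"
    return handler()
```

Anything outside the table is simply rejected, which is the point: the set of commands is something you can read and learn, and an unrecognized phrase never needs to be sent anywhere.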