by vatys 916 days ago
When I signed up for 23andme many years ago, it was via a friend in another country, who re-mailed it for me under a fake name and paid in cash. For some time I would log in through a locale-specific 23andme sub-domain until they eventually merged it all together.

It wasn't long before they figured out who I was and placed me within my family tree. My fake name now lives among near and distant relatives I was not aware had signed up themselves or their parents/grandparents. They know who I am, who my siblings and cousins and aunts and uncles are, etc. This was always going to happen as soon as I sent them my sample.

I never believed my anonymity trick would truly work; I just wanted to make it sufficiently difficult for when 23andme inevitably sold out, got gobbled up, or turned evil. I learned what I wanted from the service, and have only logged in once a year or so since to see if they updated any findings or disease studies.

While I truly appreciate the concept of bringing privacy and anonymity to this field, it's worth considering we are all quite easy to identify using these samples.

6 comments

> While I truly appreciate the concept of bringing privacy and anonymity to this field, it's worth considering we are all quite easy to identify using these samples.

Yes, as long as they have the data. If a company would process the sample, send me a thumb drive of my information, and not retain a copy, that data can't leak because it doesn't exist.

> not retain a copy

Unfortunately, this is just one step away from a blog post where the CEO apologizes for letting down their customers by keeping copies of all data in an unsecured S3 bucket that was downloaded in its entirety by a 13-year-old "hacker".

If you prominently advertised that you don't retain data, but it turned out that you did and it got leaked, that's a straightforward case of fraud. Given that the services would be advertised over the internet, it probably counts as wire fraud, which means the feds would get involved. On the other hand, if they had permission to keep your data and they got hacked, it becomes a messy tort case where the plaintiffs have to prove that the company didn't try hard enough to secure the data. In other words, the point isn't to guarantee that your data won't be leaked/hacked, it's to make it straightforward to go after you if you decide to lie.

This is why I won't use any genome sequencing service that has a bunch of ancillary services attached (e.g. analyzing your ancestry, or figuring out what diseases you're at risk for), and where you have to request deletion of data. The fact that they provide such services means that your data is getting automatically uploaded to the cloud, probably resulting in multiple copies across different systems/databases/vendors. Even though you can theoretically request deletion, all those copies mean there's a non-negligible chance that there's a copy lying around in a decommissioned S3 bucket that they didn't delete. If the service promises sample -> sequencing machine -> lab computer -> [PGP encrypted email/mailed CD], that cuts the risk considerably.

> I just wanted to make it sufficiently difficult for when 23andme inevitably sold out, got gobbled up, or turned evil.

You might as well add "hacked" to that list given recent events.

Yes, I definitely considered that as well. Basically, I knew that 23andme data would eventually exist outside 23andme, whether that be via hack, acquisition, or eminent domain.

I accepted that and did it anyway, taking steps to at least not be directly associated with my sequence, even if my association can be inferred or derived later. My main concern is that their testing would identify something which in the future would be a "pre-existing condition" and get me denied medical care, but there is certainly a long list of other possible consequences.

At this point I don't trust any company or agency that collects and uses data, or the promises made in any privacy policy, but I also don't lose any sleep over it.

There used to be a way to request full data deletion on their website. Probably too late to do it now for people who are included in the hack, but could still be a good idea to do it asap.

I did the same, sans the cash payment. I REALLY wanted my DNA sequenced but they were the only consumer option at the time. Anonymous sequencing is the way to go. There's just too much opportunity for abuse or incompetence around my most private data.

…if you wanted full anonymity, why did you turn on DNA relative sharing? Why don’t you turn it off now? Or do you mean, you assume they could place your profile within a tree, if they wanted to?

Consider a service which promised to scan your genome, send you the data file, and then delete both the sample and their copy of the file on confirmation of your receipt. This is still vulnerable to dishonesty, but only transiently.

There's nothing logically impossible about such a service, and I'd trust it modulo actual red flags. Too bad afaik nobody's offering it. Once they're archiving their copy I just don't see how they can credibly promise privacy in the longer term.
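To make the delete-on-confirmation idea above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names, the file name, and the choice of a SHA-256 digest as the "confirmation of receipt" are hypothetical and not taken from any real service. It also only removes the file from the filesystem, which is not secure erasure and proves nothing to the customer, so the "vulnerable to dishonesty" caveat still applies.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def confirm_and_delete(service_copy, customer_digest):
    """Delete the service's copy only if the customer's digest matches.

    Returns True when the copy was deleted, False when it was kept
    (for example because the customer's download was corrupted).
    """
    if sha256_of(service_copy) != customer_digest:
        return False
    os.remove(service_copy)
    return True


# Toy run: a temporary file stands in for the genome data file.
with tempfile.TemporaryDirectory() as workdir:
    service_copy = os.path.join(workdir, "genome.dat")
    with open(service_copy, "wb") as fh:
        fh.write(b"ACGT" * 1000)

    # The customer hashes what they received and sends the digest back.
    customer_digest = hashlib.sha256(b"ACGT" * 1000).hexdigest()

    # A mismatched digest must not trigger deletion.
    wrong_digest_kept = not confirm_and_delete(service_copy, "0" * 64)
    # A matching digest does.
    deleted = confirm_and_delete(service_copy, customer_digest)
    still_exists = os.path.exists(service_copy)
```

The point of tying deletion to a digest rather than a plain "yes, got it" is that the service keeps its copy until the customer demonstrably holds an intact file, which keeps the retention window short without risking data loss.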

Last I looked it didn't seem really practical to just buy your own sequencer.

I thought about offering a product like this but the market seems tough given:

1. Most people don’t care about the privacy aspect

2. People who already got a test from 23andme, Ancestry, etc are unaddressable

People are conditioned not to care, but I seriously doubt that is a valid argument for not providing anonymity as a default product considering the risks when genome data is breached.

Surely, it’s not that costly to delete data? The only reason to keep this data is for ulterior motives like monetizing.

I'd pay more for credible privacy. You'd think this could support a small business, even assuming there's no angle for a grand VC-funded startup. (Yes, easy for me to say.)

Nebula actually did use to let you download your data and tell them to delete it. When I was looking last year, though, they'd moved to some new model (which I assume this post was about).

I wish I was that smart when buying 23andme. Bitcoin is also not anonymous, unless you happen to mine it up yourself. Does Nebula accept Monero?

Paying anonymously does not resolve the problem of identity by descent / genetic relatedness for a service that retains your genetic data. As relatives sign up with any identifiable bit of information, your anonymity erodes.
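A toy illustration of that erosion, as a rough Python sketch. The pedigree, the names, and the assumption that every match comes back cleanly labeled "first cousin" are all made up for the example; real services estimate relationships probabilistically from shared DNA segments, so treat this as a cartoon of the logic rather than how any actual matching works.

```python
# Hypothetical toy pedigree: child -> set of parents.
PARENTS = {
    "Mom": {"Gp1", "Gp2"},
    "AuntA": {"Gp1", "Gp2"},
    "Dad": {"Gp3", "Gp4"},
    "UncleB": {"Gp3", "Gp4"},
    "UncleX": {"Gp5", "Gp6"},
    "AuntZ": {"Gp5", "Gp6"},
    "You": {"Mom", "Dad"},
    "Sibling": {"Mom", "Dad"},
    "CousinA": {"AuntA", "UncleX"},
    "CousinB": {"UncleB", "SpouseY"},
    "CousinC": {"AuntZ", "SpouseW"},
}


def grandparents(person):
    """Return the known grandparents of a person in the toy pedigree."""
    result = set()
    for parent in PARENTS.get(person, set()):
        result |= PARENTS.get(parent, set())
    return result


def first_cousins(person):
    """People who share a grandparent with `person` but no parent."""
    mine = grandparents(person)
    my_parents = PARENTS.get(person, set())
    return {
        other
        for other in PARENTS
        if other != person
        and not (PARENTS[other] & my_parents)
        and grandparents(other) & mine
    }


def narrow_candidates(cousin_matches, identified):
    """Intersect the first cousins of every identified match, then drop
    anyone who is already an identified customer."""
    candidates = None
    for relative in cousin_matches:
        cousins = first_cousins(relative)
        candidates = cousins if candidates is None else candidates & cousins
    return (candidates or set()) - identified


# The anonymous sample matches two identified customers as a first cousin.
after_one_match = narrow_candidates(["CousinA"], set())
after_two_matches = narrow_candidates(["CousinA", "CousinB"], set())
# If the sibling also signed up under a real name, one candidate is left.
after_sibling_signs_up = narrow_candidates(
    ["CousinA", "CousinB"], {"CousinA", "CousinB", "Sibling"}
)
```

In this toy tree one cousin match leaves three candidates, a second match on the other side of the family leaves two, and a sibling signing up under their real name leaves one. None of that required the anonymous customer to reveal anything themselves.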