by aniketh 5904 days ago
Wouldn't they be tied to the agreement that you had when you signed up for the data? I always thought that you'd have to agree with new "terms-of-use" whenever they change

Just to give you an example, I just bought at an auction the complete contents of a defunct company. Amongst the stuff I got were their servers and their bookkeeping. I never entered into any agreement with the people whose data is stored on those servers, and yet I have all of it.

A 'survival clause' that would guarantee destruction of your data in case the company goes out of business would be a minimum here.

But instead, 23andme says in their privacy policy:

"We may use Genetic and Phenotypic Information to conduct 23andMe-authorized scientific research and development. Any Phenotypic Information you provide is done on a voluntary basis. We may provide third party organizations access to this information for scientific research, but without your name or any other Account Information."

And that's the kicker, so your full genome gets sold to some 3rd party (you don't even get to know who) and all they do is strip off the 'meta data' regarding your person.

And AOL and Netflix have already shown that anonymization of data is a myth.

Looks like they do have a 'survival clause' in their privacy statement:

Business Transitions

In the event that 23andMe goes through a business transition such as a merger, acquisition by another company, or sale of all or a portion of its assets, your personal information and non-personal information will likely be among the assets transferred. You will be notified in advance via email and prominent notice on our website of any such change in ownership or control of your personal information. We will require an acquiring company or merger agreement to uphold the material terms of this privacy statement, including honoring requests for account deletion.

Good for them, that's a step in the right direction, but once your data is then in the hands of party #2 the whole thing starts all over again.

And what about all those companies that have received copies of your genetic data from 23andme in the meantime, and their survival clauses?

I don't think it is possible to do this in a watertight way.

Edit (long after the edit window expired): how would that survival clause be handled in the case of a bankruptcy?
The potential upside of this type of research is worth the risk of loss of anonymity. I would post my genome to a public website if it would help advance medical science.

[Disclosure: I participated in public health genomics research in grad school.]

I agree with your sentiment in general, not your specific arguments though. First, they're only talking about research, not about selling data. Secondly, I don't think your comparison to AOL and Netflix is valid. AOL users were identifiable by googling their own name etc. How is one supposed to identify you when all there is is your DNA and nothing else, which has never been saved anywhere else at all?
Your DNA is half your parents', and your children's DNA is half yours. Given a sufficiently large number of datapoints you can work out the DNA of those in between. And given a few 'confirmed' identities you could use that more complete picture to work out the identities of the rest, even if you did not know their names.

Your DNA is your identity. It just hasn't been tied to the meta data of your name, address and social security number, but again, with a bunch of confirmed identities of relatives that is a job that is probably doable.

In a far off future, where enough DNA data is publicly (!) available, that may be the case. Until then, to which data set should any company compare my DNA to find out who I am?
23andme is collecting data at a fair rate, there are other companies like them. Pooling the data between all those companies is going to allow you to fill in a bunch of 'blanks'. At some point that will reach critical mass and you can map the remainder.

I'm not good enough at math to give you the percentage of a given population you would need in order to be able to infer the rest; maybe someone else here can do that.

But given a population size 'n', if you get a random distribution of individuals, you know their genes, and you have a graph of relationships (say through Facebook or some other means of tracing links between people), you should be able to make a formula that tells you what kind of 'coverage' you can expect for a given sample size.
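
A rough sketch of what such a coverage formula could look like, under heavily simplified assumptions that are mine and not anything from 23andme: each person ends up in the sample independently with probability p, everyone has the same number k of close relatives, and a person counts as 'covered' if they or at least one of those relatives is in the sample. Under those assumptions coverage is 1 - (1 - p)^(k + 1). The function name and parameters are hypothetical, and real inference from relatives' DNA would be far messier than this.

```python
def expected_coverage(sample_fraction: float, relatives_per_person: int) -> float:
    """Fraction of a population that is 'covered' by a random sample.

    Assumptions (illustrative only):
    - each person is sampled independently with probability sample_fraction
    - every person has exactly relatives_per_person close relatives
    - a person is covered if they, or at least one relative, is sampled
    """
    if not 0.0 <= sample_fraction <= 1.0:
        raise ValueError("sample_fraction must be between 0 and 1")
    if relatives_per_person < 0:
        raise ValueError("relatives_per_person must be non-negative")
    # Probability that neither the person nor any of their relatives is sampled.
    nobody_sampled = (1.0 - sample_fraction) ** (relatives_per_person + 1)
    return 1.0 - nobody_sampled


if __name__ == "__main__":
    # Show how coverage grows with sample size for a fixed family size.
    for fraction in (0.01, 0.02, 0.05, 0.10):
        coverage = expected_coverage(fraction, relatives_per_person=10)
        print(f"sampled {fraction:.0%} -> covered {coverage:.1%}")
```

The point of the sketch is only that coverage grows much faster than the sampled fraction once each sampled person also says something about their relatives.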

I'd argue that this is what makes 23andme interesting. Without that capability, it's a toy for rich people.

First of all, today, this is not full genome information, just your genotype. Second, if we want good molecular medicine, information like this is essential (some might argue the only way) to get good sampling and do the appropriate research.

Addendum, since I can't reply to the comment: they don't quite have the technology on hand today to do full genome analysis. I am not sure there is enough material to do whole genome sequencing with current technology (I could be wrong).

> this is not full genome information, just your genotype.

That's what they give you. But you give them your full genome.

First, to clear up some misunderstandings, your genome is your genotype. They mean the same thing. Also, what 23andme is offering isn't a complete sequence of your genetic code (at least not for $100). The current estimated price for a complete sequence is about $10,000 to $100,000 (it's come down a lot recently, but it is still beyond the finances of most people). What they are actually doing is mapping certain known locations in the human genome, known as SNPs (single-nucleotide polymorphisms), and seeing what variation you have there. Some diseases are due to changes in a single gene (for example, cystic fibrosis is due to a mutation in a membrane channel), so one SNP they map is the location of the typical mutations in this gene. Other diseases have high correlations to certain combinations of nearby SNPs, and so these can be predictive. It is basically like a world map where the genetic code is on the level of city blocks, but all you have on your map is the location of random streets around the world.
Yes, you're right, sorry about the confusion. The point I was trying to make was they give you the 'limited' version, but you give them everything. They may not sequence it all today but they could do so in the future (I'm assuming they keep copies), and they could sell your samples to parties that can sequence it all (today or in the future).
From their privacy policy (https://www.23andme.com/legal/privacy/) it seems they destroy the sample after generating the SNP map (Personal Info section, Genetic Info subsection paragraph 1).