by jacquesm 5904 days ago
> this is not full genome information, just your genotype.

That's what they give you. But you give them your full genome.


First, to clear up some misunderstandings: your genome is your genotype. They mean the same thing. Also, what 23andme is offering isn't a complete sequence of your genetic code (at least not for $100). The current estimated price for a complete sequence is about $10,000 to $100,000 (it's come down a lot recently, but it is still beyond the finances of most people). What they are actually doing is mapping certain known locations in the human genome, known as SNPs (single-nucleotide polymorphisms), and seeing what variation you have there. Some diseases are due to changes in a single gene (for example, cystic fibrosis is due to a mutation in a membrane channel), so one SNP they map is the location of the typical mutations in this gene. Other diseases have high correlations to certain combinations of nearby SNPs, and so these can be predictive. It is basically like a world map where the genetic code is on the level of city blocks, but all you have on your map is the location of random streets around the world.
Yes, you're right, sorry about the confusion. The point I was trying to make was they give you the 'limited' version, but you give them everything. They may not sequence it all today but they could do so in the future (I'm assuming they keep copies), and they could sell your samples to parties that can sequence it all (today or in the future).
From their privacy policy (https://www.23andme.com/legal/privacy/) it seems they destroy the sample after generating the SNP map (Personal Info section, Genetic Info subsection paragraph 1).
You're right, I missed that. They contract out the sequencing, though.

You seem to be pretty knowledgeable here. How much information is still present in those SNP maps (in terms of bits per person)? Would an SNP map still uniquely identify an individual?

It depends on how many SNPs (read as 'snip') they map and how much variation is at each site. It basically boils down to a combinatorial counting problem, although there is the complication that variation across close SNPs might not be independent (especially if they are on the same gene, or are related in some phenotypic way).
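
To put rough numbers on the counting problem, here is a minimal sketch in Python. It rests on assumptions that are not from this thread: every SNP has exactly two alleles, genotypes follow Hardy-Weinberg proportions, and all SNPs are independent (which, as noted above, close SNPs often are not, so this is an upper bound). The function names and the allele frequencies are made up for illustration.

```python
import math

def genotype_entropy_bits(maf: float) -> float:
    """Shannon entropy, in bits, of the genotype at one two-allele SNP.

    Assumes Hardy-Weinberg proportions: with minor allele frequency q and
    p = 1 - q, the three genotypes occur with probability p^2, 2pq and q^2.
    """
    p, q = 1.0 - maf, maf
    probs = [p * p, 2 * p * q, q * q]
    return -sum(x * math.log2(x) for x in probs if x > 0)

def total_bits(mafs) -> float:
    """Upper bound on bits per person: sum of per-SNP entropies (assumes independence)."""
    return sum(genotype_entropy_bits(m) for m in mafs)

def snps_needed(maf: float, population: float) -> int:
    """Number of independent SNPs, all with the same frequency, whose combined
    entropy reaches log2(population) bits."""
    return math.ceil(math.log2(population) / genotype_entropy_bits(maf))

if __name__ == "__main__":
    print(genotype_entropy_bits(0.5))   # 1.5 bits for an evenly split SNP
    print(snps_needed(0.5, 7e9))        # 22 SNPs to cover ~7 billion people
    print(total_bits([0.5, 0.1, 0.01])) # rarer variants contribute fewer bits
```

Under these idealized assumptions, a couple of dozen well-chosen SNPs already carry enough bits to label everyone on the planet, so a map with far more sites than that carries much more than is needed to single out one person. Correlation between nearby SNPs lowers the real figure.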

This is also the basis for DNA forensics/paternity tests. If you sample enough SNP locations you should theoretically have a unique signature (this is where you get the courtroom statistic of 1 in 3 million).
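
As a hedged illustration of where a figure like that could come from, the sketch below multiplies per-site match probabilities. It assumes two-allele SNPs in Hardy-Weinberg proportions, unrelated individuals, and independent sites. None of that is stated in the thread, the function names are hypothetical, and real forensic panels are built differently.

```python
import math

def match_probability(maf: float) -> float:
    """Chance that two unrelated people share the same genotype at one SNP.

    Assumes Hardy-Weinberg genotype frequencies p^2, 2pq, q^2; the match
    probability is the sum of the squared genotype frequencies.
    """
    p, q = 1.0 - maf, maf
    return (p * p) ** 2 + (2 * p * q) ** 2 + (q * q) ** 2

def combined_match_probability(mafs) -> float:
    """Chance of matching at every site, assuming the sites are independent."""
    return math.prod(match_probability(m) for m in mafs)

def snps_for_odds(maf: float, one_in: float) -> int:
    """Smallest number of identical, independent SNPs that pushes the
    combined match probability below 1 in `one_in`."""
    return math.ceil(math.log(1.0 / one_in) / math.log(match_probability(maf)))

if __name__ == "__main__":
    print(match_probability(0.5))   # 0.375 at an evenly split SNP
    print(snps_for_odds(0.5, 3e6))  # 16 such SNPs for odds below 1 in 3 million
```

With evenly split SNPs, 16 independent sites are enough to get below 1 in 3 million in this toy model; less variable or correlated sites would need more.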

A bit after the fact, but apparently they have the option to 'biobank' your saliva, which means they store it for the longer term. It would be interesting to know how many people use that option.