by gww 460 days ago
I am not an expert on ancient DNA, but I will try to explain the paper as best I can. They didn't sequence the whole human genome (~3 billion bases), for several reasons:

1. Contamination with DNA from other flora and fauna.
2. A relatively low proportion of human DNA.
3. The DNA is usually highly degraded, which limits the analysis to short-read sequencing (in this case they used 76 bp reads). The half-life of DNA in bone has been estimated at ~521 years.
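
To get a feel for what a ~521-year half-life implies, here is a minimal sketch that assumes simple exponential decay. The name `surviving_fraction` is made up for illustration, and real degradation also depends on temperature, burial conditions and fragment length, so treat the output as rough intuition only.

```python
HALF_LIFE_YEARS = 521  # figure quoted in the comment above


def surviving_fraction(age_years, half_life=HALF_LIFE_YEARS):
    """Fraction of DNA still intact after age_years, assuming exponential decay."""
    return 0.5 ** (age_years / half_life)


# Illustrative ages only; they are not taken from the paper.
for age in (521, 5_000, 10_000):
    print(f"{age:>6} years: {surviving_fraction(age):.2e} of the DNA intact")
```

Under this toy model very little survives after a few thousand years, which fits with only short fragments being recoverable.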

To mitigate these problems they used several targeted approaches. One isolated mitochondrial DNA, and with it they managed to sequence the whole ~16 kb human mtDNA, with each base covered by 62 sequencing reads on average (62x coverage).
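
As a back-of-the-envelope check on what 62x coverage means, the sketch below uses the usual approximation coverage = reads x read length / target length. It assumes every read is a full 76 bp and takes 16,569 bp (the length of the standard human mtDNA reference) for the "~16 kb"; neither assumption comes from the paper, and the function names are mine.

```python
import math

READ_LENGTH = 76      # bp, from the comment above
MTDNA_LENGTH = 16_569 # bp, assumed value for "~16 kb"


def mean_coverage(num_reads, read_length, target_length):
    """Average number of reads covering each base of the target."""
    return num_reads * read_length / target_length


def reads_needed(coverage, read_length, target_length):
    """Smallest number of full-length reads that reaches the given mean coverage."""
    return math.ceil(coverage * target_length / read_length)


n = reads_needed(62, READ_LENGTH, MTDNA_LENGTH)
print(n, "reads give about", round(mean_coverage(n, READ_LENGTH, MTDNA_LENGTH), 2), "x coverage")
```

Ancient DNA fragments are often shorter than the read length, so the real number of reads was probably higher than this estimate.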

They used another to isolate specific regions containing single nucleotide polymorphisms (SNPs), which are single-base differences in DNA, here ones known to be informative about ancient and present-day human DNA. They targeted 470,724 SNPs, of which about 70% (336,429) were recovered.

They did perform shotgun sequencing on all of the DNA isolated, but due to species assignment issues they again focused on fragments that contain diagnostic SNPs. In these cases they recovered only a small number of SNPs per sample (20,526, 3,734, 124,862, 85,901, 34,756, 41,632, 34,677 and 72,992, as per the legend in figure 3), again due to the relatively low proportion of human DNA and its degradation.
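
To put those numbers side by side, here is a small sketch that only redoes the arithmetic on the counts quoted above. The variable names are mine, and nothing here comes from the paper beyond the counts themselves.

```python
import statistics

TARGETED_SNPS = 470_724
CAPTURED_SNPS = 336_429
# Per-sample SNP counts from shotgun sequencing, as listed above.
SHOTGUN_SNPS = [20_526, 3_734, 124_862, 85_901, 34_756, 41_632, 34_677, 72_992]

capture_fraction = CAPTURED_SNPS / TARGETED_SNPS
print(f"targeted capture recovered {capture_fraction:.1%} of the SNPs")

print("shotgun SNPs per sample:")
print("  min   ", min(SHOTGUN_SNPS))
print("  median", statistics.median(SHOTGUN_SNPS))
print("  max   ", max(SHOTGUN_SNPS))
```

Even the best shotgun sample yields far fewer SNPs than the targeted capture, which is the point of enriching for specific regions first.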

2 comments

That analysis makes me think of matching more than recovery.
"Matching" is exactly how we do DNA sequencing right now. The current technology is called next-generation sequencing (NGS): we amplify (copy) the DNA and then match the reads up computationally to reconstruct the full sequence.
It's quite fascinating. It's as if, in order to figure out the shape of a teacup, we generate thousands of identical copies, smash them all to rather small bits, and then try to count the different types of shards as a first step to piecing together one full copy. Impressive that it works.
> It's as if, in order to figure out the shape of a teacup, we generate thousands of identical copies, smash them all to rather small bits, and then try to count the different types of shards as a first step to piecing together one full copy. Impressive that it works.

Yes, but you've got the order wrong.

The teacup is smashed before all of the identical copies are created.

(I wrote DNA analysis software for 6.5 years)

It's not fascinating; it's an endless source of trouble. We only do it because we don't have sequencers that produce extremely long (chromosome-length), high-quality reads, especially for sequences that contain a lot of repetition. This has been a source of errors and ambiguity for as long as we've used shotgun sequencing.
This is a great analogy. One small change is that there are two ways to reassemble it. One is to blindly put the pieces together to form a teacup (read assembly); the other is to use a picture of the teacup to figure out where the pieces go (read alignment / mapping).
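
A toy sketch of the two approaches, in case it helps. Everything here is illustrative: the sequence and reads are made up, `map_reads` only does exact matching, and `greedy_assemble` is a naive greedy overlap merger, not what real aligners or assemblers do (they tolerate errors, use indexes or graphs, and have to cope with repeats).

```python
def map_reads(reference, reads):
    """Read mapping: use the reference (the "picture") to place each read.

    Exact matches only; returns the first position of each read, or -1.
    """
    return {read: reference.find(read) for read in reads}


def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b, or 0 if shorter than min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=4):
    """Read assembly: no reference, repeatedly merge the pair with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Made-up 20-base "genome" and three overlapping reads, given out of order.
reference = "ACGTTGCAAGCTTAGGCATC"
reads = ["CTTAGGCATC", "ACGTTGCAAG", "GCAAGCTTAG"]

print(map_reads(reference, reads))  # where each shard goes on the picture
print(greedy_assemble(reads))       # what we can rebuild without the picture
```

With repeats in the sequence, the overlap step becomes ambiguous, which is the source of trouble mentioned in the sibling comment.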
Would it be possible to clone an ancient human being from DNA?
Probably not; not nearly enough material remains to make an accurate clone. The article mentions a 70% recovery rate; according to the internet, humans share 98% of their DNA with chimpanzees (and 35% with daffodils), so unless you have 100% or 99.9999% of the DNA, the clone will be imperfect at best and a Thing That Should Not Be at worst.
I think your ethical board would probably stop you first.