by iang
5187 days ago
- Could you make a baby? No, there's more to us than just DNA. For example, methylation (the addition of methyl chemical groups to some bases), which isn't tracked in "normal" DNA sequencing, controls which genes get expressed by which cells. Plus there's a reasonable chance of errors in the sequencing, because the DNA has to be copied repeatedly before the bases can be identified.
- Data size: most bioinformatics data formats are plain ASCII, so even the reference data would be about 3 GB per person (one byte per base). But 1,000 Genomes contains sequencing reads in which each DNA base is sampled multiple times (20-40 is a typical "read depth") so that errors in identifying bases can be minimised. Each base of each of these samples has an associated quality score (about a 6-bit value). Plus there are identifiers for all billion-odd reads per person.
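To put rough numbers on the data-size point, here is a back-of-the-envelope sketch in Python. The read length (100 bases), the identifier size (50 bytes per read), and storing each quality score as one ASCII byte are my assumptions for illustration, not figures from the project; the function name `estimate_bytes` is made up for this example.

```python
# Rough size estimate for one person's sequencing reads stored as plain ASCII.
GENOME_BASES = 3_000_000_000  # approximate length of the human reference genome


def estimate_bytes(read_depth, read_length=100, id_bytes=50):
    """Estimate uncompressed bytes for reads, quality scores and read identifiers.

    read_length and id_bytes are assumed values, chosen only for illustration.
    """
    sampled_bases = GENOME_BASES * read_depth  # every base is sampled read_depth times
    n_reads = sampled_bases // read_length     # about a billion reads at depth 30
    base_and_quality = sampled_bases * 2       # 1 byte per base + 1 byte per quality score
    return base_and_quality + n_reads * id_bytes


if __name__ == "__main__":
    print(f"reference only: {GENOME_BASES / 1e9:.0f} GB")
    for depth in (20, 30, 40):
        print(f"read depth {depth}: about {estimate_bytes(depth) / 1e9:.0f} GB")
```

Under those assumptions, a read depth of 30 gives roughly 900 million reads, which matches the "billion-odd reads per person" above, and the total comes to a couple of hundred GB before any compression.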
|
As for read errors, each one is effectively just a 'mutation', and those are generally fairly harmless. If you stay below, say, 1,000 mutations, which would still take very high accuracy, you have not significantly reduced your chances of success.
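For a sense of what "very high accuracy" means here, a quick sketch of the arithmetic, assuming a genome of about 3 billion bases and treating the 1,000 figure as a budget on errors in the final assembled sequence (not in the raw reads). The function name is hypothetical.

```python
GENOME_BASES = 3_000_000_000  # approximate human genome length (assumption)


def max_error_rate(max_mutations, genome_bases=GENOME_BASES):
    """Highest per-base error rate that keeps expected errors at or below max_mutations."""
    return max_mutations / genome_bases


if __name__ == "__main__":
    rate = max_error_rate(1_000)
    print(f"allowed per-base error rate: {rate:.2e}")
    print(f"that is about 1 error per {round(1 / rate):,} bases")
```

So staying under 1,000 errors means getting no more than about one base wrong in every three million.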