There's no simple answer to your question as it depends on many things - sequencing technology used, library prep and coverage to name a few.
Generally, the error rate is close to zero when aligning short reads to a high-quality reference genome. Provided there's sufficient coverage and a majority of the reads covering a particular nucleotide don't have an error at that position, then the correct base will be called. Errors creep in due to things like systematic errors in library prep (such as a PCR error), and very low coverage over particular loci due to unusual AT/GC content, which makes errors harder to correct for. Repetitive regions can cause issues for short read alignment too, but coding regions generally aren't that repetitive.
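As a rough illustration of why coverage matters here, the sketch below computes the chance that more than half of the reads covering a position are wrong, under a simple binomial model. The 1% per-read error rate and the independence of errors are assumptions for illustration only, not figures for any particular platform, and real variant callers use more sophisticated models.

```python
from math import comb


def majority_error_probability(coverage: int, per_read_error: float) -> float:
    """Probability that strictly more than half of the reads at one position
    carry an error.

    Assumptions (illustrative only): errors are independent between reads,
    and every erroneous read is treated as voting against the correct base,
    which is a pessimistic simplification.
    """
    first_majority = coverage // 2 + 1  # smallest count that is a strict majority
    return sum(
        comb(coverage, k)
        * per_read_error ** k
        * (1 - per_read_error) ** (coverage - k)
        for k in range(first_majority, coverage + 1)
    )


if __name__ == "__main__":
    # Hypothetical coverages; 0.01 is an assumed per-read error rate.
    for cov in (3, 10, 30):
        print(cov, majority_error_probability(cov, 0.01))
```

Under these assumptions the probability falls off very quickly as coverage grows, which is why the remaining errors tend to come from correlated sources (library prep, low-coverage loci) rather than random read errors.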
$200 is very cheap for WGS - guessing it would be at the low end of the accuracy range, as they can't be sequencing to great depth (presumably).
In the genetics world, when you say the words "Gold standard", that usually translates to "Sanger sequencing", which is a high accuracy method of sequencing a small section of DNA, like a single gene. I don't think your statement is very helpful in that context.
Most of the world's whole genome sequencing is done using the Illumina platform. This service is using the BGI platform, which is arguably higher quality than Illumina. Our lab has data showing the error rate with BGI is about 1/6 the error rate of Illumina.
Yes, there are some even better sequencing technologies out there, such as PacBio, which provides longer reads capable of sequencing slightly more of the genome, and the error rates are constantly improving. However, these technologies are much more expensive.
Sensitivity = true-positive-rate = 0.997.
Precision = 0.997 = #true-positives / (#true-positives + #false-positives) = true-positive-rate / (true-positive-rate + false-positive-rate) = 0.997 => true-positive-rate + false-positive-rate = 1 => false-positive-rate = 0.003. [1]
That seems like a very high error rate: about 10 million errors in the three-gigabase genome, and 100 thousand errors in the 30-megabase exome (protein-coding regions). That might be an acceptable rate for population-level analysis if the errors are sufficiently uncorrelated, but I wouldn't want to be making decisions on the basis of it for personalized medicine. For comparison, here's a rough estimate that an individual human genome has 2-3 million SNPs [2].
I thought you could do better than that with 30x coverage, so I might be misinterpreting them, somehow. Or maybe they're using an unconventional sequencing technology which is cheaper but less accurate.
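To make the arithmetic concrete, here is a small sketch of two ways the 0.997 figures could be read. The first applies the 0.003 false-positive rate to every base, as in the estimate above. The second is an alternative reading, and an assumption on my part: it treats precision as a ratio of call counts and takes the number of true variants to be 3 million, the upper end of the estimate in [2]. The function and constant names are hypothetical, chosen only for this example.

```python
GENOME_BASES = 3_000_000_000  # three-gigabase genome, as in the estimate above
EXOME_BASES = 30_000_000      # 30-megabase exome, as in the estimate above
TRUE_VARIANTS = 3_000_000     # assumption: upper end of the 2-3 million SNPs in [2]


def errors_if_per_base(false_positive_rate: float, n_bases: int) -> float:
    """Reading 1: the false-positive rate applies to every base sequenced."""
    return false_positive_rate * n_bases


def false_positives_if_per_call(
    precision: float, sensitivity: float, n_true_variants: int
) -> float:
    """Reading 2 (assumed alternative): precision = TP / (TP + FP) over
    variant calls, so FP = TP * (1 - precision) / precision."""
    true_positives = sensitivity * n_true_variants
    return true_positives * (1 - precision) / precision


if __name__ == "__main__":
    print("per-base, genome:", errors_if_per_base(0.003, GENOME_BASES))
    print("per-base, exome: ", errors_if_per_base(0.003, EXOME_BASES))
    print("per-call, genome:", false_positives_if_per_call(0.997, 0.997, TRUE_VARIANTS))
```

The per-base reading gives roughly 9 million and 90 thousand errors, matching the rounded figures above, while the per-call reading gives a count in the thousands. Which one applies depends on how the vendor defines its metrics, which I can't tell from the product page.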
[0] https://us.dantelabs.com/products/whole-genome-sequencing-wg...
[1] Equations given here: https://en.wikipedia.org/wiki/Sensitivity_and_specificity
[2] https://biology.stackexchange.com/a/51315/37343