Hacker News
by ampdepolymerase 1630 days ago
I recommend reading this review:

https://genomebiology.biomedcentral.com/articles/10.1186/s13...

I guess there are limits to ensemble methods if the underlying accuracy doesn't increase. I don't work on gene sequencing algorithms, but from what I understand of ML ensemble techniques, they assume that the errors of the individual members are independent of each other. The errors for nanopore should be uniform, but I am not sure. Any molecular biologists here care to comment?
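
To make the independence point concrete, here is a rough sketch of a toy model, not anything taken from the review. It assumes each read is simply right or wrong at a given position and that the consensus is a plain majority vote. The function names and the error rates (10% random, or 2% systematic plus 8% random) are made up for illustration and are not measured values for any sequencer.

```python
from math import comb


def majority_error_independent(p: float, n: int) -> float:
    """Chance that a majority vote over n reads is wrong at one position,
    assuming each read errs independently with probability p (n odd)."""
    k_min = n // 2 + 1  # smallest number of wrong reads that outvotes the rest
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def majority_error_mixed(p_sys: float, p_rand: float, n: int) -> float:
    """Toy mixed model (an assumption): with probability p_sys the position
    triggers a systematic error shared by every read, so voting cannot fix it;
    otherwise reads err independently with probability p_rand."""
    return p_sys + (1 - p_sys) * majority_error_independent(p_rand, n)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the shape of the effect.
    for depth in (1, 5, 15, 31):
        indep = majority_error_independent(0.10, depth)
        mixed = majority_error_mixed(0.02, 0.08, depth)
        print(f"depth={depth:2d}  independent={indep:.2e}  mixed={mixed:.2e}")
```

Under these assumptions the independent case keeps improving as depth grows, while the mixed case levels off at the systematic rate no matter how many reads are stacked, which is the limit I am guessing at above.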


I know that the error rate of the Oxford Nanopore sequencer depends on GC content (guanine/cytosine nucleotides), and that the Pacific Biosciences sequencer uses a polymerase that gets worn down during reading. So there is some non-uniformity in the chemistry.
GC-rich regions, as in hairpin loops? How would the sequencer deal with those?
If I'm not mistaken, the nanopore tech unwinds double-stranded DNA during reading, so I don't think hairpins are the issue.