by marsdentech
1956 days ago
As others have said, you're reading a sliding window of k-mers over the target sequence; I think for the MinION k is presently 5. To answer your question directly: it struggles with homopolymer runs not inherently because they're low complexity, but because it's tricky to "clock" how many identical, contiguous k-mers have passed through the pore in a given period of time.

For example, if your target sequence is "GGGGGGG" (i.e. a homopolymer run of 7 Gs), you'd expect to observe three identical, contiguous signals (in current space) for the all-G 5-mer, one signal per "clock cycle" (which corresponds to the dwell time of the k-mer in the pore). If these "clock cycles" were always constant, it would merely be a case of dividing the time spent on the observed all-G 5-mer signal by the time spent on one clock cycle. Sadly, for our purposes, there's enough wobble in any one such "clock cycle" that the calculation won't always yield a reliable result. The upshot: your "GGGGGGG" (7 Gs) target sequence may be registered as "GGGGGG" (6 Gs) or "GGGGGGGG" (8 Gs), or even something else.

Now, for distinguishing two alleles where the difference between them is, say, a doubling in length of an already-very-long homopolymer run, even with the aforementioned "clock wobble", you'd likely be able to see that in MinION data quite clearly. As with all things DNA sequencing (for the time being, at least!), your precise biological question will determine which sequencing technique (or techniques) is best for the job!
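
To make the "clock" arithmetic concrete, here is a rough Python sketch of the idea, not how any real basecaller works. It assumes k = 5 as guessed above, and it models each dwell time as a Gaussian around a mean with a made-up `wobble` fraction; the function names, the Gaussian model and every parameter value are hypothetical choices for illustration only.

```python
import random

K = 5  # k-mer size guessed in the answer above; an assumption, not a spec


def kmers(seq, k=K):
    """Return every k-mer seen by a window sliding one base at a time."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def estimate_run_length(total_dwell, mean_dwell, k=K):
    """Divide time spent on the all-same k-mer signal by one 'clock cycle'.

    The rounded quotient is the number of identical k-mers observed;
    a run of n bases yields n - k + 1 such k-mers, so add k - 1 back.
    """
    n_kmers = max(1, round(total_dwell / mean_dwell))
    return n_kmers + k - 1


def simulate_calls(run_length, mean_dwell=1.0, wobble=0.35,
                   trials=10000, seed=0, k=K):
    """Tally called run lengths when every dwell time wobbles.

    Dwell times are drawn from a Gaussian (an illustrative assumption)
    with standard deviation wobble * mean_dwell, floored at a small
    positive value so no dwell is negative.
    """
    rng = random.Random(seed)
    n_kmers = run_length - k + 1
    counts = {}
    for _ in range(trials):
        total = sum(
            max(0.05, rng.gauss(mean_dwell, wobble * mean_dwell))
            for _ in range(n_kmers)
        )
        called = estimate_run_length(total, mean_dwell, k)
        counts[called] = counts.get(called, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(kmers("GGGGGGG"))   # three copies of the all-G 5-mer
    print(simulate_calls(7))  # tally of called lengths for a true run of 7
```

With `wobble=0.0` every trial calls 7 Gs; as the wobble grows, more trials land on 6 or 8. The absolute timing error grows with the number of k-mers in the run, but a doubling of an already long run shifts the total dwell time by far more than that error, which is why the allele comparison above remains feasible.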