by ufo 4888 days ago
An extra difficulty with genome assembly is that DNA often has lots and lots of repeated junk sequences that can confuse the algorithms. I don't work in bioinformatics, though, so I don't know how they usually get around this.

Repeats aren't necessarily junk (e.g. TAL Effectors http://en.wikipedia.org/wiki/TAL_effector#DNA_recognition). Resolving them requires long reads. PacBio is currently of interest as an alternative to Sanger sequencing for this, although the error rate of PacBio reads is a bit of an issue.
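To make the "requires long reads" point concrete, here is a rough sketch of why a repeat is ambiguous until the window you look through is longer than the repeat itself. Everything in it is made up for illustration: the toy genome, the 7-base repeat `GATTACA`, and the helper name `branching_nodes`. It also uses the k-mer size of a de Bruijn graph as a stand-in for read length, which is a simplification and not how any particular assembler is configured.

```python
from collections import defaultdict

# Toy genome (hypothetical): the 7-base repeat GATTACA appears twice,
# each time surrounded by different flanking sequence.
GENOME = "CCGT" + "GATTACA" + "TGGC" + "GATTACA" + "AGTC"


def branching_nodes(genome, k):
    """Return the (k-1)-mers that have more than one distinct successor
    in the de Bruijn graph built from every k-mer of `genome`."""
    successors = defaultdict(set)
    for i in range(len(genome) - k + 1):
        kmer = genome[i:i + k]
        # Edge from the k-mer's prefix node to its suffix node.
        successors[kmer[:-1]].add(kmer[1:])
    return sorted(node for node, nxt in successors.items() if len(nxt) > 1)


if __name__ == "__main__":
    for k in (4, 8, 9):
        print(k, branching_nodes(GENOME, k))
```

On this toy input the graph still forks at `GATTACA` with k = 8, and the fork only disappears at k = 9, when every node is longer than the repeat and so carries some unique flanking sequence. That is the same reason a read has to span the whole repeat plus a bit of flank on each side to place it unambiguously.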
PacBio is dead, they just don't know it yet. BGI (or somebody, doesn't matter, BGI is just the obvious candidate) would need to buy 50 SMRT sequencers a year just for PacBio to stay in business. That seems unlikely given the lower cost and complexity of Illumina and Life Technologies sequencers.