| HN Mirror



	by comstock 3063 days ago
	I’d agree with you that long reads would be useful if the error rate weren’t so shockingly bad. There is likely value in long reads, but what non-niche research applications are there for highly error-prone reads that justify a valuation of several billion dollars?

3 comments

bayesian_horse 3062 days ago

Virtually all applications can benefit from long reads. There are already hybrid assemblers out there which take Illumina, PacBio and Nanopore reads. The long reads tie the short reads together, whereas the short reads improve the accuracy.

The area where DNA sequencing will first revolutionize clinical practice is in sequencing pathogens for the sake of identification. In these instances nanopore sequencing rules, because it can give answers in minutes.


comstock 3062 days ago

Most clinical applications don’t need long reads. Pathogen identification from short reads is easy. Blood tests for cancer, and NIPT (which will likely be the first big applications) both use fragmented DNA in the blood, so long reads are not useful. Depth (lots of sequencing) and quality are far more important.


maxander 3062 days ago

It's worth noting that those clinical applications were developed when technology didn't allow long reads, so "clinical applications don't need long reads" is at present a truism. There may be potential applications that require long reads that simply couldn't have been invented yet (though I haven't the slightest idea what those would be).


comstock 3062 days ago

Yes, but I would say quality is most important in almost all cases. Well, quality being defined as <1% error rate, which isn’t such a high bar.

The most compelling near-term applications (NIPT etc.) use fragmented DNA, and long reads will have no benefit here.

So, yes, long reads are useful, but you need to have at least reasonable performance in other respects. The same thing has been seen with PacBio, which has not fared well in the market despite having a read length advantage.


bayesian_horse 3062 days ago

How long does it take to get the answer? Even if a big, expensive short-read sequencing machine is in the building, it still takes a day or two to get the necessary data.

With sepsis, every hour counts.


eggie 3063 days ago

The per-base error rate is bad. In the case of PacBio, this error process approximates white noise, so you can deal with it by increasing read coverage. Things are somewhat complicated with the nanopore tech described in this post, as errors may be correlated due to the way the basecalling is done, but in practice it's not nearly as big a problem as you think it is.

For things approaching a read length, the per-base error rate of a single read is simply irrelevant. In practice, with sufficient coverage (e.g. 20x) you simply don't care about the per-base error rate of the reads.
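
A rough sketch of the coverage argument above, under loudly stated assumptions: errors are independent between reads (the white-noise case, which may not hold for correlated nanopore errors), only substitutions are modelled, and the consensus is a simple majority vote. The function name and the 15% error rate are illustrative choices, not figures from this thread.

```python
from math import comb

def consensus_error(per_read_error: float, coverage: int) -> float:
    """Pessimistic bound on a majority-vote consensus being wrong at one position.

    Assumes each read is wrong independently with probability `per_read_error`,
    and that the consensus fails whenever at least half of the reads are wrong
    (i.e. all wrong reads agree on the same wrong base, and ties count as failures).
    """
    p = per_read_error
    k_min = (coverage + 1) // 2  # smallest number of wrong reads that is at least half
    return sum(
        comb(coverage, k) * p**k * (1 - p) ** (coverage - k)
        for k in range(k_min, coverage + 1)
    )

if __name__ == "__main__":
    # Hypothetical 15% per-read error rate at several coverage depths.
    for depth in (1, 5, 10, 20, 40):
        print(depth, consensus_error(0.15, depth))
```

Under these assumptions, a 15% per-read error rate at 20x coverage gives a consensus error below one in a thousand per position, and it keeps falling as coverage grows. Correlated errors or indels would make the real picture worse than this toy model suggests.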


comstock 3063 days ago

That might be the case if the throughput wasn’t so low and the error rate wasn’t so high.


space_fountain 3063 days ago

I feel like there is, or was, an assumption that they'd be able to improve their tech.
