by zubyak 2594 days ago
The program will work on FASTQ files. The sequencing technology produces long reads.

As another commenter said, I don't need very deep sequencing knowledge, because my work will mostly be on the programming side (improving performance, not adding new functionality), but it would still be useful to have a clear picture of the process.

Thanks for your help

1 comment

Unfortunately I don't have many long-read resources to share, but here's a short video about how the MinION nanopore sequencer produces long reads:

https://www.youtube.com/watch?v=Wq35ZXyayuU

At about 1:30 there's a cartoon of the data signals that get processed into sequencing data.

It's been a while since I looked at long-read data, but the last time I did, the individual base calls in FASTQ files (A, C, G, T) had a fairly high error rate, and there were systematic biases in the errors, which made them harder to correct. Most of the processing of these data consists of trying to correct these errors, either by comparing against a known reference sequence or by sequencing the same region many times.
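
Since you'll be on the programming side, here's a minimal Python sketch of what a FASTQ record looks like to a program and how the quality line relates to per-base error. It assumes the common four-lines-per-record layout (no wrapped sequence lines) and Phred+33 quality encoding, where a quality character c maps to Q = ord(c) - 33 and the estimated error probability is 10^(-Q/10). The names read_fastq, error_probabilities, and mean_error_rate are made up for illustration and are not from any particular tool. Also note the quality values are the basecaller's own estimates, so they may not match the real error rate.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str      # header line without the leading "@"
    sequence: str  # base calls (A, C, G, T, sometimes N)
    quality: str   # one quality character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream.

    Assumption: each record is exactly four lines (header, sequence,
    "+" separator, quality). Wrapped multi-line records are not handled.
    """
    while True:
        header = handle.readline()
        if not header:           # end of file
            return
        if not header.strip():   # tolerate blank lines between records
            continue
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header.strip())
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch in: " + header.strip())
        yield FastqRecord(header[1:].strip(), sequence, quality)


def error_probabilities(quality: str, offset: int = 33) -> List[float]:
    """Convert a quality string to per-base error probabilities.

    Assumption: Phred+33 encoding, so Q = ord(char) - 33 and P = 10 ** (-Q / 10).
    """
    return [10 ** (-(ord(char) - offset) / 10) for char in quality]


def mean_error_rate(record: FastqRecord) -> float:
    """Average estimated error probability across all bases in one read."""
    probs = error_probabilities(record.quality)
    return sum(probs) / len(probs) if probs else 0.0


if __name__ == "__main__":
    # Tiny in-memory example; a real file would be opened with open(path).
    sample = io.StringIO("@read1\nACGT\n+\nI5+!\n")
    for rec in read_fastq(sample):
        print(rec.name, rec.sequence, mean_error_rate(rec))
```

For real long-read files, which can be many gigabytes, streaming record by record like this (rather than loading the whole file) is usually the first thing that matters for performance.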