by 08-15 3437 days ago
1. Depends on the technology. On Illumina (cheapest tech and highest throughput), you get the first and last 125 bases of smallish DNA molecules with an acceptable error rate. Pacific Biosciences (lower throughput and more expensive) gives you reads of up to 40,000 bases with a rather horrible error rate.

2. They fail epically. If the repeat is longer than your reads, there is nothing you can do computationally. With paired-end reads (two reads at an approximately known distance), you still can't assemble the repetitive regions themselves, but you can get the contigs around the repeat in the right order.
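
To make point 2 concrete, here is a toy sketch of why repeats defeat short reads. The sequences, the function names, and the simplified read-pair model (same strand, exact distance, no errors) are all my own assumptions for illustration, not how any real assembler works. Two different genomes that share a three-copy repeat yield exactly the same set of reads as long as no read spans the repeat plus a base on each side, and a read pair that straddles the repeat tells them apart again.

```python
from collections import Counter

def read_spectrum(genome, read_len):
    """Multiset of every error-free read of length read_len (toy model)."""
    return Counter(genome[i:i + read_len]
                   for i in range(len(genome) - read_len + 1))

def pair_spectrum(genome, read_len, fragment_len):
    """Multiset of (first read, last read) for every fragment of fragment_len.

    Simplification: both reads are taken from the same strand and the
    fragment length is exact; real paired-end data has neither property.
    """
    return Counter(
        (genome[i:i + read_len],
         genome[i + fragment_len - read_len:i + fragment_len])
        for i in range(len(genome) - fragment_len + 1)
    )

# Hypothetical toy sequences: one repeat, four unique flanking segments.
REPEAT = "ACGTTGCAAC"
A, B, C, D = "TTTTT", "GGGGG", "CCCCC", "AAAAA"

# Same pieces, but B and C are swapped between the repeat copies.
genome_1 = A + REPEAT + B + REPEAT + C + REPEAT + D
genome_2 = A + REPEAT + C + REPEAT + B + REPEAT + D

# Reads that cover the repeat plus at most one flanking base.
short = len(REPEAT) + 1
same_short = read_spectrum(genome_1, short) == read_spectrum(genome_2, short)

# Reads one base longer can span the repeat with a flanking base on each side.
same_long = read_spectrum(genome_1, short + 1) == read_spectrum(genome_2, short + 1)

# Short reads again, but paired across a fragment that straddles the repeat.
same_pairs = pair_spectrum(genome_1, 5, 20) == pair_spectrum(genome_2, 5, 20)

print(same_short, same_long, same_pairs)
```

With the short reads the two genomes are indistinguishable, so no algorithm could pick the right one from those reads alone. The pairs don't reconstruct the repeat either; they only say which unique segments sit on either side of it, which is the contig-ordering point above.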

3. Definitely, but I don't know the details. Plants are often more difficult than animals; they have bigger genomes and often have multiple chromosome sets (polyploidy). Assembly of a wheat genome is more difficult than assembly of the human genome, and I'd argue even the latter isn't actually a solved problem.