Hacker News
by searine 4134 days ago
Except this is totally reinventing the wheel.

If I want to do fast genome alignment, I'll use LASTZ, which is already blazing fast and written in C. If I just want to look for homology, I'll use BLAST.

There isn't much need for improved alignment algorithms. If it's a big job, most bioinformaticians have access to clusters. If it's a small job, who cares about speed?

1 comment

To be fair, to my knowledge, none of LASTZ, BLAST, or BLAT treats IUPAC ambiguous bases (K, Y, R, etc.) in the way the OP was looking for. That's not to blame the tools, since they have a very good reason not to: they build an index on the target first, and treating ambiguous bases properly would increase the size of the index.

That said, I wonder if grep wouldn't be much faster, since this program only looks for exact matches, and those are easily transformed into a regex by replacing each ambiguous nucleotide with an alternation such as (A|T|C) for H.
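
For what it's worth, here is a rough sketch of that regex idea in Python. It is not the OP's program, and the names iupac_to_regex and find_matches are made up for illustration. It assumes the ambiguity codes appear only in the query, not in the target, and it uses character classes like [ACT], which are equivalent to (A|C|T).

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def iupac_to_regex(pattern):
    """Turn an IUPAC query (hypothetical helper) into a regex string."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC[code]  # raises KeyError on a non-IUPAC character
        # Unambiguous bases stay literal; ambiguous ones become a class.
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)

def find_matches(pattern, sequence):
    """Return 0-based start positions of all matches, overlaps included."""
    # The zero-width lookahead lets overlapping hits be reported.
    regex = re.compile("(?=" + iupac_to_regex(pattern) + ")")
    return [m.start() for m in regex.finditer(sequence.upper())]

print(iupac_to_regex("GAYK"))
print(find_matches("AY", "ACATAG"))
```

The string from iupac_to_regex could also be handed to grep -E, with the caveat that grep matches line by line, so a hit spanning a line break in a wrapped FASTA file would be missed unless the sequence is joined onto one line first.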