by a_bonobo 16 days ago
There has been a bit of a 'trend' to rewrite common bioinformatics/comp-bio tools in faster languages (Rust) via LLMs; OP's repo seems to be an early example.

Seqera Labs has a bit of a manifesto: https://rewrites.bio/

Heng Li has an overview here too: https://lh3.github.io/2026/04/17/the-ai-rewrite-dilemma

IMHO it's... OK? Bioinformatics code quality is generally poor: untrained biologists write code that is poorly scoped, but works. (Unguided) LLMs write on that level, too, so not much harm done.

1 comments

How well tested would you say these libraries are? It doesn't sound promising, sadly. If there are comprehensive test suites, that would go a long way to ensuring new, faster tools aren't producing subtly wrong answers. That's a pretty big deal: just because the code compiles or no exception is thrown doesn't mean the analysis was correct.
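To make the "subtly wrong answers" worry concrete, here is a rough sketch of a differential test: run the original and the rewrite on the same random inputs and collect every input where they disagree. Everything named here is hypothetical. `revcomp_reference` and `revcomp_fast` are toy stand-ins for an original tool and its rewrite, and a real check would more likely compare the output files of two command-line programs.

```python
import random

# Hypothetical stand-ins: in practice these would wrap the original tool
# and its rewrite (e.g. by running both binaries), not two Python functions.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
_TABLE = str.maketrans("ACGTN", "TGCAN")


def revcomp_reference(seq: str) -> str:
    """Slow, easy-to-audit reverse complement (plays the 'original')."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def revcomp_fast(seq: str) -> str:
    """Translation-table version (plays the 'rewrite')."""
    return seq.translate(_TABLE)[::-1]


def differential_test(reference, candidate, n_cases=1000, max_len=200, seed=0):
    """Return the random inputs on which the two implementations disagree."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    mismatches = []
    for _ in range(n_cases):
        length = rng.randint(0, max_len)  # includes the empty sequence
        seq = "".join(rng.choice("ACGTN") for _ in range(length))
        if reference(seq) != candidate(seq):
            mismatches.append(seq)
    return mismatches


if __name__ == "__main__":
    bad = differential_test(revcomp_reference, revcomp_fast)
    print(f"{len(bad)} mismatching inputs")
```

An empty result only says the two versions agree on the inputs that were tried, so it complements a curated test suite with known-correct outputs rather than replacing one.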
It's very context-dependent - the Seqera rewrites so far seem to be pretty reliable; most of the work was spent merging the functions of multiple data QC tools into a single program (previously, there was a lot of redundancy that wasted compute). The success of other rewrites that I've seen tends to depend on the author's care/experience and on how useful the tool is. In my experience, bioinformaticians are fairly slow on the uptake of new software, which might actually be an advantage here :-)

In defense of a lot of these bioinformatics-specific rewrites, there are some really dodgy coding practices and bugs in well-used tools, so there is scope for genuine improvement. The most recent release of minimap2 fixed some bugs identified in a rewrite, for example: https://github.com/lh3/minimap2/releases/tag/v2.31