Hacker News
by roye 4615 days ago
Bloom filters (BFs) are getting pretty popular in bioinformatics. The post mentions one example; here is my favorite recent one: http://minia.genouest.org/files/minia.pdf

Very cool – I hadn't seen this paper before (haven't done any real work on de novo assembly or otherwise requiring de Bruijn graphs).

Here's a paper using Bloom filters in metagenomic classification (my relative* area): http://bioinformatics.oxfordjournals.org/content/26/13/1595....

In this vein, a friend and I are researching/implementing a probabilistic key-value store for some bioinformatics applications (one is metagenomic organism and gene identification). It's fast and space-efficient, just like a BF (though obviously less so, since it maps keys to values rather than recording only set membership). Any use cases for that kind of thing in your sub-field? We're always trying to figure out interesting new applications (we aren't ready to write it all up yet, but hope to at some point not too far down the road).

* I'm really just a dabbler / don't have a formal bioinformatics background. My friend is the genetics PhD.

Can you explain "probabilistic key-value store"? Would it be that each gene has some defined probability of belonging to a given organism, or is it probabilistic in the sense of having a defined error rate as BFs do?
The latter - probabilistic in the sense of having a defined error rate. In one of our use cases, the keys are simply kmers and the values are organism or gene IDs.
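To make "a key-value store with a defined error rate" concrete, here is a minimal Python sketch of one generic way to get that behavior: a table that keeps only a short fingerprint of each key next to its value. This is not the design described in this thread (which isn't spelled out here); the class name `ProbabilisticKV`, its parameters, and the example k-mers and IDs are all made up for illustration.

```python
import hashlib


class ProbabilisticKV:
    """Toy fingerprint table (hypothetical; not the commenters' design).

    Stores a short hash of each key instead of the key itself, so a lookup
    can occasionally return a value for a key that was never inserted.
    """

    def __init__(self, num_slots=1024, fingerprint_bits=16):
        self.num_slots = num_slots
        self.fp_mask = (1 << fingerprint_bits) - 1
        # Each slot holds (fingerprint, value) or None when empty.
        self.slots = [None] * num_slots

    def _hash(self, key):
        digest = hashlib.blake2b(key.encode("utf-8"), digest_size=16).digest()
        h = int.from_bytes(digest, "big")
        # Low 64 bits choose the starting slot; high 64 bits give the fingerprint.
        start = (h & ((1 << 64) - 1)) % self.num_slots
        fingerprint = (h >> 64) & self.fp_mask
        return start, fingerprint

    def put(self, key, value):
        start, fp = self._hash(key)
        for i in range(self.num_slots):  # linear probing
            idx = (start + i) % self.num_slots
            slot = self.slots[idx]
            if slot is None or slot[0] == fp:
                # Empty slot, or same fingerprint (treated as the same key).
                self.slots[idx] = (fp, value)
                return
        raise RuntimeError("table is full")

    def get(self, key, default=None):
        start, fp = self._hash(key)
        for i in range(self.num_slots):
            idx = (start + i) % self.num_slots
            slot = self.slots[idx]
            if slot is None:
                return default  # reached a gap: key was (probably) never stored
            if slot[0] == fp:
                return slot[1]  # fingerprint match: may be a false positive
        return default


# Hypothetical usage: k-mers as keys, organism IDs as values.
kv = ProbabilisticKV(num_slots=1024, fingerprint_bits=32)
kv.put("ACGTACGTAC", "organism_1")
kv.put("TTGACCATGA", "organism_2")
print(kv.get("ACGTACGTAC"))
```

The space saving comes from storing a fixed number of fingerprint bits per entry regardless of key length. The cost is that an absent key hits a stored fingerprint with probability of roughly (slots probed) / 2^fingerprint_bits, and two stored keys whose fingerprints collide in the same probe run will overwrite each other, so the error rate is tuned by the fingerprint width and the load on the table.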
Not sure, but I would definitely take a look.
Contact info?
You'll find it here: http://tau.ac.il/~rozovr/