by dekhn 564 days ago
It depends on what sort of model you're implementing.

There's a big difference in implementation (and in result quality) between direct string searching (fixed pattern matching) and probabilistic methods (everything from simple profile methods to hidden Markov models). Finding direct matches is the same as calling a "string.find()" method, while probabilistic methods usually involve dynamic programming, heuristic approximations, floating point matrices, etc.
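To make that contrast concrete, here is a rough sketch in Python. It is only an illustration, not how any particular tool does it: the function names, the pseudocount of 1, and the uniform 0.25 background are all assumptions I picked for the example. The first function is the string.find() style of search; the other two build and scan a simple log-odds profile (a position weight matrix) from a handful of aligned sites.

```python
import math

ALPHABET = "ACGT"


def find_exact(sequence, pattern):
    """Fixed pattern matching: every start index where pattern occurs verbatim."""
    hits = []
    start = sequence.find(pattern)
    while start != -1:
        hits.append(start)
        # Step forward by one so overlapping occurrences are also reported.
        start = sequence.find(pattern, start + 1)
    return hits


def build_log_odds_profile(aligned_sites, pseudocount=1.0, background=0.25):
    """Simple profile: one dict of log2-odds scores per alignment column.

    pseudocount and the uniform background are illustrative assumptions.
    """
    width = len(aligned_sites[0])
    profile = []
    for i in range(width):
        column = [site[i] for site in aligned_sites]
        total = len(column) + pseudocount * len(ALPHABET)
        scores = {}
        for base in ALPHABET:
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        profile.append(scores)
    return profile


def best_profile_hit(sequence, profile):
    """Slide the profile along the sequence; return (start, score) of the best window."""
    width = len(profile)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(profile, window))
        if best is None or score > best[1]:
            best = (start, score)
    return best
```

With made-up sites such as ["TATAAT", "TATGAT", "TACAAT", "TATAAT"], the sequence "GGCGCTACGATGCGC" has no exact occurrence of "TATAAT", so find_exact comes back empty, but the profile still ranks the window starting at index 5 ("TACGAT") highest because each of its bases is plausible for its column. A hidden Markov model goes further by adding dynamic programming over insertions and deletions, which this sketch leaves out.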

But more importantly, techniques like Nucleotide Transformers are much less supervised than existing search techniques. Previously, people had to do a fair amount of labelling and QC work to identify the patterns that underlie general sequence categories; these methods learn them spontaneously from the data. I could imagine building an entire transformer model in COBOL, although it would be cumbersome; building one with a WordPerfect macro would be extremely challenging, if not impossible. Even a profile-based method would be painful (I don't know whether WP macros are Turing complete or usable for general-purpose programming).

I don't think it's particularly fair or nice to imply that the work being done here is the same sort of work that was being done with a promoter search algorithm; I'm an expert in this area and you're being unnecessarily dismissive. The field has come a long way.