by dekhn 784 days ago
Personally, I think it was obvious that LLMs were going to be useful for protein modelling, since the previous generation used HMMs very successfully. Pfam (a library of HMMs for classifying proteins into known families) is one of the most important resources we have because of the power of HMMs to model sequential language.

I suspect we will need to move from sequential modelling to graphical modelling to level-up again, though.

1 comment

> I suspect we will need to move from sequential modelling to graphical modelling to level-up again, though.

Out of curiosity, would you mind elaborating on this?

I don't work in the field, so I'm probably just repeating something Hinton already said, but it seems to me that modelling things in reality that have graph-like structure (like interacting pairs of residues in a 3D protein structure) using sequences with finite context lengths is ultimately going to be less efficient than modelling the graphs directly. My guess is that this work roughly describes what I'm thinking of: https://www.cis.upenn.edu/~mkearns/papers/barbados/jordan-tu...

It could also be that I completely misunderstand context in sequential models, and what I'm describing is already being used, or has been evaluated and found unsuccessful.
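
To make the sequence-vs-graph point a bit more concrete, here's a toy sketch in Python. Everything in it is my own assumption for illustration: the coordinates are made up (a flat "hairpin" of ten residues), the 8.0 distance cutoff is just a commonly used convention for calling two residues "in contact", and `contact_graph` and `long_range_edges` are hypothetical helper names, not anything from a real library.

```python
import math

def contact_graph(coords, cutoff=8.0):
    """Return edges (i, j), i < j, for residue pairs closer than `cutoff` in 3D.

    `cutoff` is an assumed contact threshold, not a value from the thread.
    """
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                edges.append((i, j))
    return edges

def long_range_edges(edges, window):
    """Keep only edges whose separation along the sequence exceeds `window`."""
    return [(i, j) for i, j in edges if j - i > window]

# Made-up hairpin: residues 0..4 run out along x, residues 5..9 run back
# on a parallel strand, so the two chain ends sit next to each other in space.
coords = [(3.8 * k, 0.0, 0.0) for k in range(5)] + \
         [(3.8 * (4 - k), 5.0, 0.0) for k in range(5)]

edges = contact_graph(coords)
far = long_range_edges(edges, window=4)
print(far)
```

Residues 0 and 9 are nine positions apart in the sequence but neighbours in space, so `(0, 9)` is a direct edge in the graph, while a model that only sees a window of 4 positions never has both residues in view at once. That's the intuition, anyway; it says nothing about how real sequence models with long contexts or attention actually behave.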