by heycosmo
2021 days ago
Fascinating! AlphaFold (and its competitors) seem to use MSA (Multiple Sequence Alignment) and this (brilliant) idea of co-evolving residues to build an initial graph of the sections of the protein chain that are likely proximal. This seems like a useful trick for predicting existing biological structures (i.e. ones that evolved) from genomic data. I wonder (as very much a non-biologist): do MSA-based approaches also help us understand "first-principles" folding physics any better, and if so, to what degree? If I write a random genetic sequence (think drug discovery) that has many aligned sequences, without the strong assumption of co-evolution at my disposal, there does not seem any good reason for the aligned sequences to also be proximal. Please pardon my admittedly deep knowledge gaps.
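To make the co-evolution idea concrete, here is a toy sketch of one classic way to score it: mutual information between pairs of alignment columns. This is only an illustration under my own assumptions, not how AlphaFold works internally; the names `mutual_information` and `toy_msa` are made up for the example, and the four-sequence alignment is invented.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    counts_i = Counter(col_i)
    counts_j = Counter(col_j)
    counts_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), n_ab in counts_ij.items():
        p_ab = n_ab / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 1 and 3 always change together
# (D pairs with E, E pairs with D), column 0 is fully conserved, and
# column 2 varies independently of the rest.
toy_msa = [
    "ADKE",
    "ADRE",
    "AEKD",
    "AERD",
]

# Rank all column pairs by mutual information, highest first.
ranked_pairs = sorted(
    combinations(range(len(toy_msa[0])), 2),
    key=lambda pair: mutual_information(toy_msa, *pair),
    reverse=True,
)
```

In this toy case the pair (1, 3) comes out on top, which is the kind of signal that gets read as "these two residues are probably in contact". Real pipelines need far more sequences and corrections for phylogenetic bias, which this sketch ignores.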
Not really. MSA-based approaches, like most structure prediction methods, aim to find the lowest-energy conformation of the protein chain, disregarding folding kinetics and essentially all dynamic aspects of protein structure.
> If I write a random genetic sequence (think drug discovery) that has many aligned sequences, without the strong assumption of co-evolution at my disposal, there does not seem any good reason for the aligned sequences to also be proximal.
I don't think I fully understood this, but I'll give it a shot anyway. If your artificial sequence aligns with others, there's a chance that it will fold like them, depending on the quality and accuracy of the multiple sequence alignment. Since multiple sequence alignments are built under the assumption of homology (all sequences share a common ancestor), it's a matter of how far your sequence lies from the "sequence sampling space" covered by the others.
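As a rough illustration of "how far" a designed sequence sits from an aligned family, one crude proxy is pairwise percent identity over the aligned, non-gap positions. This is a sketch under my own assumptions: the function name `percent_identity`, the gap character `-`, and the toy sequences are all invented for the example, and identity is a much blunter measure than the profile statistics real tools use.

```python
def percent_identity(a, b, gap="-"):
    """Percent of identical residues over positions where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment (equal length)")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical designed sequence, already aligned to a toy family.
query = "MKT-LLV"
family = ["MKTALLV", "MRTALIV", "MKSGLLV"]

identities = [percent_identity(query, member) for member in family]
```

A query with high identity to many family members is more plausibly inside the region the alignment actually sampled; as identity drops, the assumption that it folds like the family gets shakier.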