by Windchaser
13 days ago
It seems obvious that it's at least a sampling problem. Assuming an average protein length of 400 amino acids and 20 possible amino acids at each position, that's 20^400, or about 10^520, possible sequences, which is a mind-bogglingly large number. We haven't even begun to explore the biological universe.
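For anyone who wants to check that figure, here is a quick back-of-the-envelope sketch. It assumes every one of the 400 positions can independently be any of the 20 standard amino acids; the function name is just made up for illustration.

```python
import math

def log10_sequence_space(length: int, alphabet_size: int = 20) -> float:
    """Return log10 of the number of distinct sequences of the given length.

    Assumes each position is chosen independently from `alphabet_size`
    residues, so the count is alphabet_size ** length.
    """
    return length * math.log10(alphabet_size)

# 400 residues, 20 amino acids per position (the assumptions from the post).
exponent = log10_sequence_space(400)
print(f"20^400 is roughly 10^{exponent:.0f}")

# Python integers are arbitrary precision, so the exact count is available too.
digit_count = len(str(20 ** 400))
print(f"The exact number has {digit_count} decimal digits")
```

Working in log10 keeps the number readable; the exact integer is only there as a cross-check.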
And if you take it up a level of abstraction and say there are 4 (ish) basic types of secondary structure (helix, turn, sheet, disordered), then you could argue the structural space is far smaller than the sequence space.
Or, to put it another way, if sequences with 30% identity or lower can share the same fold, that's an awful lot of unique sequences that collapse into a single structure.
And on the flip side, what we don't know is what percentage of sequence space doesn't actually result in a functional fold, i.e. results in instability or in multiple stable or unstable conformations.
So it could be that we are already close to having seen all the possible folds (where a fold is a single stable form; obviously there are quite a lot of disordered states, but I'm not counting those as a 'fold', even if evolution uses unstructured states as well).
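To put rough numbers on the secondary-structure argument, here is a crude sketch. It assumes each of the 400 positions is labelled with one of 4 secondary-structure states, which is only an upper bound on that coarse-grained description and says nothing about how many real folds exist. The function name is made up for illustration.

```python
import math

def log10_count(length: int, states: int) -> float:
    """log10 of the number of strings of `length` over `states` symbols."""
    return length * math.log10(states)

length = 400  # same average protein length assumed earlier in the post

sequence_exp = log10_count(length, 20)  # 20 amino acids per position
structure_exp = log10_count(length, 4)  # 4 secondary-structure states per position

# The gap in exponents is how many orders of magnitude of sequences map,
# on average, onto each secondary-structure string under this crude model.
collapse_exp = sequence_exp - structure_exp

print(f"sequence space:        ~10^{sequence_exp:.0f}")
print(f"secondary-structure:   ~10^{structure_exp:.0f}")
print(f"sequences per string:  ~10^{collapse_exp:.0f} on average")
```

Even this upper bound is hundreds of orders of magnitude below the sequence count, and the number of distinct stable folds should be smaller again, since most such strings won't correspond to anything that actually folds.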