by bglazer
1031 days ago
It’s actually about 3 gigabases (ATCG). There are some recurrent features of the genome whose function we’ve worked out. For example, the TATA box is a classic sequence that typically marks a promoter, sitting just upstream of where transcription of a protein-coding part of the genome starts. The vast majority of the genome doesn’t code for proteins. The function of these non-coding regions is much murkier. Some of them function like scaffolds for proteins to assemble into complexes. These protein complexes then start transcribing the genome into mRNA. So the genome regulates its own expression, in a sense. Many of the sequences that function in this way are known. There are also just a bunch of parts of the genome that probably don’t do anything. And there are many regions that are basically self-replicating sequences: they code for proteins that are capable of inserting their own genetic sequence back into the genome. These are transposons. In short, a lot of very painstaking genetics and molecular biology work has gone into characterizing the function of certain sequences.
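To make the "known sequence" idea concrete, here is a minimal sketch of scanning a DNA string for TATA-box-like motifs. It assumes the commonly cited simplified consensus TATA(A/T)A(A/T); the function name `find_tata_like` and the toy sequence are made up for illustration, and real promoter prediction relies on position weight matrices and surrounding context rather than a bare pattern match.

```python
import re

# Assumption: simplified TATA-box consensus TATA[AT]A[AT].
# The lookahead lets overlapping candidate sites be reported too.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_like(seq: str) -> list[tuple[int, str]]:
    """Return (0-based position, matched motif) for each TATA-like hit."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in TATA_PATTERN.finditer(seq)]


# Hypothetical toy sequence, not real genomic data.
toy = "GGCCTATAAAAGGCGCGTATATATGC"
hits = find_tata_like(toy)
for pos, motif in hits:
    print(pos, motif)
```

A hit here only says the string matches the pattern; on its own it says nothing about whether the site actually functions as a promoter element.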
In humans, even though HERVs (human endogenous retroviruses) don’t reactivate into infectious viruses, they have been implicated in both harmful (senescence during aging [0]) and beneficial (protection from modern retroviruses [1]) activities in the body.
They might make up as much as 8% of the human genome.
0: https://www.cell.com/cell/pdf/S0092-8674(22)01530-6.pdf
1: https://www.microbe.tv/twiv/twiv-956/