Hacker News
by pacificmaelstrm 952 days ago
DNA is a form of code, but it doesn't encode programs. Instead, like an STL or STEP file, it encodes HARDWARE designs.

While you could think of it as encoding infrastructure AND code (as in IaC), you'd need to go beyond that to include the hardware for computing AND physical function (like a whole car + computer) in that conception, which is not what IaC means.

The hardware side of DNA is easy to overlook since we don't yet have the necessary (CAD) design tools to easily understand the shape and mechanics of proteins just from reading a DNA sequence, as we can for macroscale 3D models. But there are hard technological reasons for this.

DNA encodes information, but instead of binary digits grouped into 8-bit bytes (10010110) and 8- to 64-bit words, it uses four bases (A, T, C, G) organized into 3-letter codons, each of which represents one amino acid (or a stop signal).
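As a rough sketch of the codon idea, the snippet below reads a DNA string three letters at a time and looks each codon up in a table. The table is only a small excerpt of the standard genetic code (the full one has 64 entries), and the function name `translate` and the example sequence are made up for illustration.

```python
# Toy excerpt of the standard genetic code (DNA coding-strand codons).
# Only a handful of the 64 codons are listed; "*" marks a stop codon.
CODON_TABLE = {
    "ATG": "M",  # methionine, also the usual start codon
    "GCT": "A",  # alanine
    "AAA": "K",  # lysine
    "GGC": "G",  # glycine
    "TGG": "W",  # tryptophan
    "TAA": "*",  # stop
}


def translate(dna: str) -> str:
    """Map each 3-letter codon to a one-letter amino acid code, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino = CODON_TABLE[codon]  # raises KeyError for codons missing from this toy table
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


print(translate("ATGGCTAAAGGCTGGTAA"))  # prints MAKGW
```

With four possible letters per position, a codon has 4 × 4 × 4 = 64 possible values, which is more than enough to cover the 20 standard amino acids plus stop signals.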

The cell assembles chains of amino acids which are then placed in an "oven" where the string of molecules folds back on itself to assemble a complicated and functional 3D shape.

When we say complicated, we really do mean complicated. Even the fastest modern supercomputers are unable to determine the shape of these proteins based only on the DNA sequence input. Further, we are unable to reliably simulate the way that a folded protein will interact with other molecules.

Fortunately these kinds of problems will someday be easily solved by quantum computers, but for now we are stuck with approximations of questionable accuracy.

But there are very computer-code-like elements to how cells work. Unfortunately it is all spaghetti code. One section of DNA often codes for proteins which bind to one or more other sections of DNA, either increasing or decreasing the production of the proteins encoded at those locations.
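To make the "one gene's protein tunes another gene's output" idea concrete, here is a minimal toy model. Everything in it is an assumption for illustration: the two-gene circuit (protein A activates gene B, protein B represses gene A), the decay rate, and the production formulas are invented and not fitted to any real cell.

```python
from __future__ import annotations


def step(a: float, b: float) -> tuple[float, float]:
    """Advance the hypothetical two-gene circuit by one time step."""
    decay = 0.5                  # fraction of each protein that survives a step (made up)
    produce_a = 1.0 / (1.0 + b)  # more B means less A is made (repression)
    produce_b = a / (1.0 + a)    # more A means more B is made (activation)
    return a * decay + produce_a, b * decay + produce_b


def simulate(steps: int, a: float = 0.0, b: float = 0.0) -> list[tuple[float, float]]:
    """Return the (A, B) protein levels at every step, starting from the given levels."""
    history = [(a, b)]
    for _ in range(steps):
        a, b = step(a, b)
        history.append((a, b))
    return history


for level_a, level_b in simulate(5):
    print(f"A={level_a:.3f}  B={level_b:.3f}")
```

Even with only two genes the levels depend on each other in a loop; real cells chain thousands of such dependencies, which is where the spaghetti comes from.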

Additionally, some DNA sections code not for protein but for RNA strings which are used mechanically by themselves or as part of protein complexes like CRISPR-Cas. RNA is always created as an intermediate step between DNA and protein, but in this case it is used directly as fRNA (functional RNA). RNA can even fold on itself and act similarly to proteins, though it is much more fragile.

The many interactions between protein, DNA and RNA perform a kind of computation but it is very obfuscated.

The following are generalized interactions that take place in a cell (perhaps analogous to machine instructions) written in a kind of pseudocode, to help illustrate the recursive functions involved.

DNA + Protein = RNA;

RNA + Protein = Protein;

Protein = Protein++;

Protein = Protein--;

Protein = RNA++;

Protein = RNA--;

RNA = RNA++;

RNA = RNA--;

RNA = Protein++;

RNA = Protein--;

Protein + RNA = DNA;
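One possible reading of a few of the lines above, sketched as runnable Python. This is an interpretation, not a biological model: the rule names, the `RULES` table, and `apply_rules` are hypothetical, and molecule "amounts" are just integer counters per class.

```python
# Each rule: (label, molecule classes that must be present, class affected, change in count).
# Only four of the pseudocode lines are covered, under an assumed interpretation.
RULES = [
    ("transcription", ("DNA", "Protein"), "RNA", +1),         # DNA + Protein = RNA
    ("translation", ("RNA", "Protein"), "Protein", +1),       # RNA + Protein = Protein
    ("RNA degradation", ("Protein",), "RNA", -1),             # Protein = RNA--
    ("reverse transcription", ("Protein", "RNA"), "DNA", +1), # Protein + RNA = DNA
]


def apply_rules(pool: dict) -> dict:
    """Apply every rule once, in order, to a pool of molecule counts."""
    new_pool = dict(pool)
    for _label, actors, target, delta in RULES:
        # A rule fires only if every molecule class it needs is currently present.
        if all(new_pool.get(actor, 0) > 0 for actor in actors):
            new_pool[target] = max(0, new_pool.get(target, 0) + delta)
    return new_pool


pool = {"DNA": 1, "Protein": 1, "RNA": 0}
print(apply_rules(pool))
```

Note that the output of one rule (RNA, protein) is an input to the others, so the order in which rules fire changes the result. That feedback is the recursion the pseudocode is pointing at.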

Any protein or fRNA can have multiple functions in a cell and affect the production of other proteins and fRNAs by interacting with DNA or RNA or with other proteins involved in the production chain. In addition to this, proteins and fRNA also physically move around other proteins and molecules and make up the structure and machinery of a cell.

Untangling it all is close to impossible currently. There are several billion years' worth of tech debt and zero documentation.

2 comments

This looks like it was written by generative AI but I can't really say for sure.

BTW: protein structure prediction didn't need supercomputers (in the traditional sense), and the PSP problem wasn't solved using supercomputers applying a high-quality physics function to simulate folding. Instead, it was solved using a combination of ML supercomputers, a really good algorithm (transformers), and a couple of really good data sets: the known structures of proteins, and the known relationships between proteins.

Instead of running a simulation on a huge supercomputer to predict a single structure, they trained a model which approximates structure well enough to beat every competitor. From what I can tell, most of the resulting quality doesn't come from their force field but from the distance constraints that are mostly derived from historical relationships between proteins and the coevolution of their sequences.
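For a feel of what a coevolution signal looks like, here is a small sketch of one classic measure: mutual information between two columns of a multiple sequence alignment. This is a generic textbook statistic, not a claim about what the model discussed above computes internally, and the tiny four-sequence columns are invented.

```python
import math
from collections import Counter


def mutual_information(col_i: str, col_j: str) -> float:
    """Mutual information (in bits) between two equal-length alignment columns."""
    n = len(col_i)
    counts_i = Counter(col_i)
    counts_j = Counter(col_j)
    counts_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (x, y), count in counts_ij.items():
        p_xy = count / n
        p_x = counts_i[x] / n
        p_y = counts_j[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Each string is one alignment column: one residue per related protein.
print(mutual_information("AAKK", "DDEE"))  # columns change together
print(mutual_information("AAKK", "DEDE"))  # columns change independently
```

Positions that mutate together across related proteins score high, which hints that they may sit close together in the folded structure. That kind of hint can be turned into a distance constraint.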

Came here to say this. It is extremely over-simplistic to think of DNA as Infrastructure as Code.