I'm profoundly ignorant of neither (PhD in biophysics, software engineer for 20 years). Genomics and programming analogies are cool, but the most important point is this: the understanding that molecular structures can encode information in a replicable way, together with the discovery that entropy applies to data storage and transmission, demonstrates that information is a universal concept, that the genome is a data storage system, and that the enzymes that operate it are operating on information, in a computational way. To me that's a pretty useful comparison.
Software changes over spans of minutes to decades; genomes change over spans of millions of years. Software is written; genomes are not. The complexity of software is constrained by programmers' ability to comprehend it; the complexity of genomes is not. The environment in which software functions is determined by humans; the environment in which genomes function is not.
Those are trivial surface-level differences relative to the central idea of encoding, storing, replicating, and editing digital information, which interfaces with other digital and analog systems.
Not that there's much point to saying so, since you appear to be here for no other reason than to assert that my argument is false because you would prefer it be so, but here's another: software is digital; genomes are not.
FWIW all of these differences still feel extremely surface-level. I'm no expert but I certainly am, so far, aware of everything you've said with regard to how they differ - I'm kinda hoping for more, given the strong assertion you made that one cannot relate the two without being fundamentally ignorant of either topic.
I also think it's somewhat ironic that you're accusing them of only being here to say "you're wrong" but that's what you've done in this thread? I only bring this up because I think we're all after the same thing here - to understand an incredibly interesting topic.
I suspect most of us are really here to learn and discuss. You seem like you have a background in the area, I'm sure we would all benefit from learning about the differences.
If it's the case that the similarity is that DNA and code both encode information, and the differences are based on how they do so, it's hard to see why you think they can't be related at all. You've been relating the two.
If I've given the impression that the difference is merely a question of varying encodings, then I have to agree my arguments have thus far been lacking.
The idea that a genome as expressed in nucleic acid is purely, and only, an informational medium, is fundamentally in error. It does encode information in the sequence of base pairs, this is true. But it is also a physical structure in its own right, and properties of that structure incidental to the encoded information have what recently looks to be at least as important a role in the process of transcription as the sequence itself.
There are, for example, some sequences which will cause RNA polymerase to transcribe the surrounding genes differently or with varying frequency, due to the physical interaction between the molecules involved. (I recently discussed this here in the context of recent research on causes of eye color; it should not be too far back in my comment history.) We also see, for example, that both viral and eukaryotic DNA can be and often are expressed in ways that produce different proteins from the same sequence, again as a result of physical constraints affecting the interaction with the ribosome. This is one reason why "junk DNA" is a bit of a misnomer, and why we more recently see the term fall out of use in favor of "noncoding DNA" - these regions carry no information in their own right, but nonetheless can strongly affect the outcome of transcription because transcription is not only an informatic process. This isn't true of software; there is no general case in which two programs varying only in nonsyntactic ways will be evaluated differently under otherwise identical conditions - we create programming languages as we do in part to ensure that won't happen, and it's also part of the reason why we use transistors instead of vacuum tubes or relays: in order to engineer that kind of variance as much as we can out of existence. What is therefore an accidental property in software is an essential one in gene expression, and cannot be overlooked without reaching an inaccurate conception of how the latter process works.
That's just one example, and it's true that processes like these can be modeled in software to variously imperfect degrees of fidelity and that information-theoretical models can be useful in understanding some aspects of how they work. But that's not the same thing as them working similarly enough that understanding one very well suffices to reason about the other. I definitely can see how it's easy to assume otherwise! It's an assumption I shared, before my own yearlong exposure to the field at a sufficient level of detail to start to understand what I hadn't understood about it before, and considerable reading and study thereafter.
Unfortunately, I was there to provide engineering support to people doing that work, not to do it myself, and the knowledge I've derived from that experience apparently does not extend so far as producing a concise and positive statement of the fundamental difference between the two fields of study - I spent considerably more time teaching informaticists how to program, formally and otherwise, than I spent learning about bioinformatics. That leaves me able to recommend little beyond seeking out similar experience of your own, which I do recommend if the depth of your interest suffices - although I do also have to say working in academia as a nonacademic has very little else to recommend it.
I know there are some folks on HN with formal knowledge and training greatly exceeding my own, and some of whom have probably also had experience teaching the basics in an accessible way. Perhaps one of them might give a more useful answer here than I've been able to.
Genomes are absolutely digital. GATC is no different from 1 and 0. It's just using a different base (pun intended).
Files on disks have end of file markers, just like the start and stop sequences in DNA. Operating systems have cron jobs (themselves digital) that control when other programs execute.
Genomes are much more than just their sequence. Their spatial organisation, their methylation, their folding, their packing, etc., have no equivalents in a filesystem.
False by definition: Digital data is "information represented as a string of discrete symbols each of which can take only one of a finite number of values"
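To make that definition concrete, here's a rough sketch of a nucleotide sequence treated as a string of discrete symbols, packed two bits per base. The particular mapping of bases to bit patterns and the function names are assumptions made up for illustration, not any standard format.

```python
# Hypothetical 2-bit mapping; any assignment of the four bases to 0..3 would do.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode(seq: str) -> int:
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def decode(value: int, length: int) -> str:
    """Unpack `length` bases from an integer produced by encode()."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])  # lowest two bits = last base
        value >>= 2
    return "".join(reversed(bases))


packed = encode("GATC")
print(bin(packed))        # the sequence as a base-2 number
print(decode(packed, 4))  # round-trips back to the original string
```

Note that this only captures the sequence itself. It says nothing about methylation, folding, or packing, which is the objection raised above.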
I agree with you here but I get to a happy conclusion. The (self- or culturally imposed) constraint on computation to be semantically meaningful for humans does not apply to genomes. But this is already useful, because it means we at least have a hint about where to dig more in programming.
There is Theory of Computation and there is Theory of Programming. Your arguments apply to TOP but not to TOC.
Plenty of software is neither written nor comprehensible, I can assure you of that.
Like I don't think you're necessarily wrong, but pointing out the literal differences between the two topics doesn't explain to me why the analogy is wrong and therefore doesn't support your argument.
It's like saying "I'm nothing like my mother; I don't even have long hair"
I love this. It's a little black and white, but the comparison is as between video game worlds and the real world. Only enough to fool the willing eye.
I use a variation of this form as 'persons whose science and religion conflict don't know enough about either one'.
As an enthusiastically former staff engineer at a bioinformatics institute, I'm happy to have been of help! Please feel free to do so without attribution; if nothing else, it'd be a shame at this late date to have my opinions of the caste system in academia disturbed by the novel experience of receiving credit for my contributions to the work of people with letters after their names. :D