by hirenj 669 days ago
I read a great piece from Michael Bronstein about this very topic earlier this year.

https://towardsdatascience.com/the-road-to-biology-2-0-will-...

I think an important point raised here is the distinction between good data and the "relative" data present in a lot of biology. As examples from the article, a protein structure or genome/protein sequence data is good data, but RNA-seq or mass spectrometry data is relative (and subject to sensitivity, noise, etc.). The way I like to think of it is that sequence and structural data look at the actual thing, whereas relative data only gets you a sliver of a snapshot of a process. It's therefore easier to build models that capture relationships between representations of real things than models where you can't really distinguish signal from noise. I spend a fair amount of time these days trying to figure out how to take advantage of good data to gain insights into things where we have relative data.