by dekhn
1031 days ago
Sure, although I'm not aware of anybody contemplating quite the level I believe is necessary to really nail the problem into the ground. When I worked at Google, I proposed that Google build a datacenter-sized sequencing center in Iowa or Nebraska near its data centers, buy thousands of sequencers, run industrial-scale sequencing, and push the data straight to the cloud over fat fiber, then apply machine learning to it for health research. I don't think Google wants to get involved in the physical sequencing part, but they did listen to my ideas, and they have several teams working on applying ML to genomics as well as other health research problems. Part of my job today (working at a biotech) is to manage the flows of petabytes of genomic data into the cloud and make it accessible to our machine learning engineers. The really interesting approaches these days, IMHO, combine genomics and microscopic imaging of organoids, and many folks are trying to set up a "lab in the loop", in which large-scale experiments run autonomously by sophisticated ML systems could accelerate discovery. It's a fractally complex and challenging problem. Statistics has been key to understanding genetics from the beginning (see Mendel, Fisher), so at a big pharma you will see everything from Bayesian bootstrappers using R to deep learners using PyTorch.
[1] https://www.genomicsengland.co.uk/blog/data-representations-...