HN Mirror


by dekhn 1031 days ago

Sure, although I'm not aware of anybody who is contemplating quite the scale I believe is necessary to really nail the problem into the ground. When I worked at Google, I proposed that Google build a datacenter-sized sequencing center in Iowa or Nebraska near its data centers, buy thousands of sequencers, run industrial-scale sequencing, push the data straight to the cloud over fat fiber, and follow that with machine learning for health research. I don't think Google wants to get involved in the physical sequencing part, but they did listen to my ideas and they have several teams working on applying ML to genomics as well as other health research problems. Part of my job today (working at a biotech) is to manage the flow of petabytes of genomic data into the cloud and make it accessible to our machine learning engineers.

The really interesting approaches these days, IMHO, combine genomics and microscopic imaging of organoids, and many folks are trying to set up a "lab in the loop", in which large-scale experiments run autonomously by sophisticated ML systems could accelerate discovery. It's a fractally complex and challenging problem.

Statistics has been key to understanding genetics from the beginning (see Mendel, Fisher), so at a big pharma you will see everything from Bayesian bootstrappers using R to deep learners using PyTorch.

3 comments

2dvisio 1031 days ago

Guys at Verily are working on Terra.bio with the Broad Institute and others. Genomics England in the UK is also experimenting with multimodal and machine learning methods applied to whole genome sequences [1].

[1] https://www.genomicsengland.co.uk/blog/data-representations-...


krab 1031 days ago

But why Google? This is what big pharma are doing. Also, you can outsource the data collection part. See, for example, UK Biobank. Their data are available to multiple companies after some period, which makes it more cost-efficient.


dekhn 1030 days ago

Why Google? Because this is a big data problem and Google mastered big data and ML on big data a long time ago. Most big pharma hasn't completely internalized the mindset required to do truly large-scale data analysis.


panosfilianos 1031 days ago

I have spent the better part of the past year looking obsessively over genomics papers for cancer and I've grown very fond of the field.

Are there any positions at Google or other companies you would suggest I look into? I'm coming from algo trading / ML research with an ML MSc.


asielen 1031 days ago

You could try Calico. They are an Alphabet company that specifically studies aging. They have a good number of machine learning roles. However, biotech typically pays less than finance or software.

https://calicolabs.com/careers/


panosfilianos 1031 days ago

Thanks!
