by plaidfuji 670 days ago
> Get those linguists out of here, more data will replace whatever insights they have! It’s a fun and increasingly popular stance to take. And, to a degree, I agree with it. More data will replace domain experts, the bitter lesson is as true in biology as it is in every other field.

I think it’s fundamentally shifting how people approach R&D in all physical fields. The power of “the ML way” is almost a self-fulfilling prophecy. Once you see ML upend the standard approach in one area, the question is not if but when it will upend your area, and the natural next step is to ask, “how can I massively increase data collection rates so I can feed ML”? It just completely flips all branches of science on their head, from carefully investigating and building first-principles theory, to saying “screw it, I really just wanted to map this design space so I can accurately predict outcomes, why don’t I just build a machine to do that?”

It then becomes a question of how easy it actually is to build an ML-feeding machine (not easy, very problem-specific), ergo the pendulum now swings to physical lab automation.


In grad school (I was in chemical engineering) I took a molecular biology course. We read and reviewed a number of papers in different areas. For my review I proposed a series of experiments to answer questions raised by the paper. It was very logical and well thought out. The problem was that it would have amounted to 3 grad students working full time for at least a year. Once you see the effort involved, you can see why the ML approach is exciting.
How exactly would ML speed up something that takes 3 grad students full time?

Listen. A lot of this shit gets discovered for crazy reasons. For two years these two postdocs were throwing away one fraction from their size exclusion chromatography step. I got into a really heated six-hour argument where I insisted that the postdocs did not understand that in size exclusion chromatography, big shit comes out first (they thought that big shit comes out last). The next day the postdoc apologized, since I was correct.

Oddly, a month or so later, they stopped to take a look at the fraction they were throwing out and it turned out that their molecule was self-assembling into cages. This is important for how the molecule is supposed to work. They got some very important papers out of it. I'm not even thanked.

ML is not going to accelerate this sort of stuff.

A hypothetical good AI would have reviewed the experimental design and pointed out the misconception.
How would the AI know to look there? Trash in the end fractions (the "void volumes") of a chromatography run is common.

Anyways, what is your training set? Probably upwards of 50% of papers are trash. Will the AI have the intuition to know which ones are good? Does the AI listen at the water cooler to grad students griping about Corey yields?

(It's not on the internet. It's the general sentiment that yields reported by the E.J. Corey lab are inflated).