by throwawayiionqz 2032 days ago
This is the cost of training the final architecture with all the refinements enabled by years of research.

These years of research involved trying many different architectures, many of which received as much or more compute time than the final system.

The price of training the final architecture is meaningless. Researching and training AlphaGo was expensive, but it enabled the ideas and development of AlphaZero, which is more computationally tractable.

To have any chance, an academic team would need the same compute resources that the DeepMind protein folding team used over the whole development of the architecture during the last few years, not only the resources used to train the final system. And I bet this funding is not available to most, if not all, academic teams.


Even if you try to account for the overall R&D cost, DeepMind isn't that large an organization by the standards of biomedical research. It's very big and well funded for a computer science research organization, yes, and most CS departments can't match its resources. But the NIH budget is $40 billion, and private pharmaceutical companies do another $80 billion in annual R&D. It's interesting that this kind of breakthrough didn't come from those sectors.
DeepMind is taking advantage of NIH's funding. For example, Anfinsen, who demonstrated that proteins fold spontaneously and reproducibly (https://en.wikipedia.org/wiki/Anfinsen%27s_dogma), ran a lab at NIH. Levinthal (who postulated an early and easily refutable model of protein folding) was funded by NIH for decades. Most of the competitors at CASP are supported by NIH, and its investments have contributed significantly to the modern results.

That said, I think the academic and pharma communities had engineered themselves into a corner and weren't going to see huge gains (even though they are exploring similar ideas) for a number of banal reasons.

That's a good point; this system certainly didn't come from nowhere! The protein datasets they used also mostly came out of various NIH-funded projects.

What I meant to focus on was that I think DeepMind has less of a pure money/scale advantage in this area than in some others. In something like Go or Atari game-playing, there are many academic groups researching similar things, but their resources are laughably small compared to what DeepMind threw at it. So you might argue that they got good results there in part because they directed 1000x the personnel and compute at the problem compared to what any academic group could afford. In biomed though, their peers in academia and industry are also pretty well-funded.

Personally I think a major part of the secret sauce is Google's internal compute infrastructure. When I was an academic, 50% of my time went to building infra to do my science. At Google, petabytes of storage, millions of cores, algorithms, and brains were all easily tappable within a common software repo and cluster infrastructure. That immediately translates to higher scientific productivity.
Has cloud computing changed this?
Mostly? I left Google to work at a biotech startup in a related area and found that the big three cloud providers have built systems that greatly improve computational science. That said, it's still a lot of work to get productive; many in the field are really resistant to changes like version control, continuous integration, testing, and architecting distributed systems for handling complex lab production environments.

Here's an exemplar of how I think it evolved well in a cloud world: https://gnomad.broadinstitute.org/

That project adopts many concepts from Google and others and greatly improved our analytic capabilities for large-scale genomics.

Having recently experienced both, 1000x this.
You hit the nail on the head here.
It seems like spending these government funds on creating new challenges like CASP and ImageNet could have an enormous ROI. Don’t let them try to choose the winner; just let them define the game.
> The price of training the final architecture is meaningless.

The research is the giant shoulders you stand on, the compute cost is the price of the tool you need to do the present-day work.

Both are relevant, but the shoulders of giants are generally more accessible, particularly if we’re talking about published research and not proprietary tech.

A competing team is not starting from the same place the DeepMind team started at 5 or 10 years ago.

To expand on this: after fully reading AlQuraishi's "What Just Happened" post from a couple of years ago, what stood out was this point that he made:

> I don’t think we would do ourselves a service by not recognizing that what just happened presents a serious indictment of academic science. There are dozens of academic groups, with researchers likely numbering in the (low) hundreds, working on protein structure prediction. We have been working on this problem for decades, with vast expertise built up on both sides of the Atlantic and Pacific, and not insignificant computational resources when measured collectively. For DeepMind’s group of ~10 researchers, with primarily (but certainly not exclusively) ML expertise, to so thoroughly route everyone surely demonstrates the structural inefficiency of academic science. This is not Go, which had a handful of researchers working on the problem, and which had no direct applications beyond the core problem itself. Protein folding is a central problem of biochemistry, with profound implications for the biological and chemical sciences. How can a problem of such vital importance be so badly neglected?

In short, academia got utterly schooled by a small group at Google spending a relatively small dollar amount on compute, using techniques that in hindsight are fairly described as "simplistic". There's no way around it.

I don't think AlQuraishi really hits the mark in his critique. The mere fact that hundreds or thousands of people have been working on a problem for decades doesn't account for the fact that the field of machine learning has been growing extremely rapidly over the last decade, the compute power available has grown exponentially, and the people working on the problem simply weren't looking at the problem in the way that the DeepMind people were looking at it.

If you were trying to get across the Atlantic, this would be like getting upset at a group of bridgebuilders for trying to solve the problem by building a bridge across instead of by inventing the airplane. The approaches are that different.

> and the people working on the problem simply weren't looking at the problem in the way that the DeepMind people were looking at it.

> The approaches are that different.

I'm not sure if that analogy applies here. DeepMind wasn't the first group tackling structure prediction with machine learning. Their success lies in the innovations that they implemented (predicting interresidue distances as opposed to contacts, for example).

To be fair, I'm not sure that they are "simplistic" in the sense that, e.g., writing a neural network to recognise cat pictures is now simplistic. I don't know how many people have DeepMind levels of expertise in ML, or could implement what they have done, but I doubt it is many, and they are thinly spread amongst many interesting problems.
> The price of training the final architecture is meaningless.

Meaningless in historical terms, but meaningful in future terms. It's meaningless how long the training took because there were countless resources spent to get to that point. It's meaningful in the future, because we know that training times are fairly short, and iteration can be done fairly quickly.