by chriswarbo 3532 days ago

Another issue with attempting to unify existing results is the focus on good performance, and the higher-level optimisation that researchers/implementors perform by hand to get it. This is partly because of the focus on engineering, as you say; I'd wager it's also due to a 'file drawer effect', where the emphasis is on achieving ever-higher benchmark scores, which rewards tweaking algorithms.

I suppose the alternative, more scientific and less engineering-driven approach would be to treat benchmark scores as experimental observations, and try to form predictive models which take in descriptions of networks and output predicted benchmark scores. In the architecture analogy, this would be like modelling the strength of various materials and shapes. If good predictive models are found, they can be used to design networks which are predicted to have desirable scores, in the same way that buildings can be designed based on predictions of how the materials and geometry will behave.

Of course, to be more useful we'd also want to take into account things like resource usage and training time. The models themselves must also be constrained somehow, to avoid trivial solutions like "run the given network and see how it behaves, give that as our prediction".
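
To make the predictive-model idea a bit more concrete, here's a rough sketch in Python. Everything in it is an assumption for illustration: the observations are made-up numbers (not real benchmark results), a network is described only by a hypothetical (depth, width) pair, and the "model" is a straight line fitted to a crude size proxy. It never runs a network, so it isn't the trivial solution, but it also ignores resource usage and training time.

```python
import math

# Hypothetical observations: (depth, width, benchmark score).
# These numbers are invented for illustration, not measured results.
OBSERVATIONS = [
    (2, 64, 0.71),
    (4, 64, 0.76),
    (4, 128, 0.81),
    (8, 128, 0.85),
    (8, 256, 0.88),
]


def feature(depth, width):
    """Crude size proxy: log10 of an approximate weight count (depth * width^2)."""
    return math.log10(depth * width * width)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def make_predictor(observations):
    """Fit the line to past observations and return a score predictor."""
    xs = [feature(d, w) for d, w, _ in observations]
    ys = [score for _, _, score in observations]
    intercept, slope = fit_line(xs, ys)

    def predict(depth, width):
        # Predict from the description alone; the network is never run.
        return intercept + slope * feature(depth, width)

    return predict


def best_design(predict, candidates):
    """Pick the candidate description with the highest predicted score."""
    return max(candidates, key=lambda c: predict(*c))


predict = make_predictor(OBSERVATIONS)
candidates = [(3, 96), (6, 96), (6, 192)]  # hypothetical untested designs
print(best_design(predict, candidates))
```

A model this simple will just say "bigger is better", which is exactly why you'd want resource usage in the picture too. The point is only the shape of the workflow: observations in, predictor out, designs chosen by prediction rather than by trial and error.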