by Schoolmeister 1943 days ago
"However, the manifold of architecture search generally contains many points for which there is no feasible mapping from software to hardware."

I'm having some trouble understanding how this manifests itself. Can someone help me with this by e.g. providing a toy example?


It means the ML algorithm can propose designs that do well on the objective function (e.g. improved runtime), but can't actually be constructed. They give the example of designs that have more memory than can actually fit on the chip.
Yes, I understand that, but if that is what is meant, I find the wording somewhat strange. They mention not being able to find a "feasible mapping from software to hardware", and later on "some of the constraints may not be properly formulated into the optimization, and so the compiler may not find a feasible software mapping for the target hardware".

So the problem is that there is no software mapping, which I understand to be the mapping of compiler instructions to the underlying hardware. It looks like I'm missing something. Is this the same as saying that the hardware design is not feasible?
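
One way to picture the distinction, as a toy sketch and not anything taken from the paper: suppose the "software mapping" is the compiler picking a tile size for a matrix multiply so that the working set fits in on-chip scratchpad. The function `find_mapping`, the parameters `scratchpad_bytes` and `pe_rows`, and the two rules below are all made up for illustration. Under those assumptions, a hardware config can be perfectly buildable and still leave the compiler with no legal tiling.

```python
from typing import Optional


def find_mapping(scratchpad_bytes: int, pe_rows: int,
                 bytes_per_elem: int = 4) -> Optional[int]:
    """Return a square tile size a (hypothetical) compiler could use, or None.

    Toy rules, assumed for illustration only:
      - the tile edge must be a multiple of the PE array's row count
      - three tiles (A, B and the C accumulator) must fit in scratchpad
    """
    best = None
    for tile in (8, 16, 32, 64, 128):
        if tile % pe_rows != 0:
            continue  # tile does not line up with the PE array
        working_set = 3 * tile * tile * bytes_per_elem
        if working_set <= scratchpad_bytes:
            best = tile  # candidates are ascending, so this keeps the largest fit
    return best


# 64 KiB scratchpad, 16-row PE array: a tiling exists.
print(find_mapping(scratchpad_bytes=64 * 1024, pe_rows=16))
# 2 KiB scratchpad, 32-row PE array: nothing fits, so there is no mapping.
print(find_mapping(scratchpad_bytes=2 * 1024, pe_rows=32))
```

In this toy, the second config is not "unbuildable" in the sense of exceeding the die area. It is a chip you could make but that the compiler cannot target, which may be closer to what the quoted wording is getting at.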

I imagine they have a basic design in Verilog with various tunable parameters (memory size, clock speed, how many instructions to issue at once).

They also have a way to run that hardware in a simulator and see how quickly it could train some network.

The ML optimization problem is to come up with a set of constants that performs well but also compiles into a manufacturable chip. Clearly, setting the clock speed to 9999 GHz isn't that...
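
A toy version of that setup, to make it concrete. Everything here is invented for illustration (the area and runtime formulas, the budgets, the candidate values); it is a sketch of the general idea, not the paper's simulator or search method. The runtime model rewards "more of everything", so a search that only looks at the objective drifts toward configs that can't be built.

```python
import random

AREA_BUDGET_MM2 = 100.0   # made-up die area budget
MAX_CLOCK_GHZ = 3.0       # made-up timing limit


def area_mm2(mem_mb, issue_width):
    # Invented cost model: memory and issue width both cost area.
    return 2.0 * mem_mb + 5.0 * issue_width


def simulated_runtime(mem_mb, clock_ghz, issue_width):
    # Invented runtime model: more of everything is always faster.
    return 1000.0 / (clock_ghz * issue_width * (1.0 + mem_mb / 16.0))


def is_buildable(mem_mb, clock_ghz, issue_width):
    return (area_mm2(mem_mb, issue_width) <= AREA_BUDGET_MM2
            and clock_ghz <= MAX_CLOCK_GHZ)


def random_search(n_trials, seed=0):
    """Return (best ignoring feasibility, best among buildable configs)."""
    rng = random.Random(seed)
    best_any = best_feasible = None
    for _ in range(n_trials):
        cfg = (rng.choice([8, 16, 32, 64, 128]),     # memory in MB
               rng.choice([1.0, 2.0, 3.0, 9999.0]),  # clock in GHz
               rng.choice([1, 2, 4, 8]))             # issue width
        score = simulated_runtime(*cfg)
        if best_any is None or score < best_any[0]:
            best_any = (score, cfg)
        if is_buildable(*cfg) and (best_feasible is None
                                   or score < best_feasible[0]):
            best_feasible = (score, cfg)
    return best_any, best_feasible


best_any, best_feasible = random_search(200)
print("best on objective alone:", best_any)
print("best buildable config:  ", best_feasible)
```

With the 9999 GHz option in the candidate list, the unconstrained winner will usually be one of those configs, which is the "does well on the objective but can't be constructed" case described above.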

But what I don't understand is this: they claim their approach side-steps the "infeasible" configs, which would be a major achievement. However, I don't see how infeasibility is captured in their evaluation function, which mostly measures runtime and area. Neither of those gives a clear negative reward for infeasibility since, for example, as you noticed, unbuildable configs would return high runtime... Area might correlate negatively, but at that point I don't see why some methods work (e.g. an evolutionary algorithm) and others really don't... Did you understand that part?