by gnoway
4470 days ago
I'm actually wondering where the 2.3x number he cites is coming from. I don't believe the Mill team is claiming a 2.3x performance advantage over Haswell while using 2.3x less power, which is how I read that comment.
I watched the replay of the Execution talk here: http://millcomputing.com/docs/execution/ I'd recommend watching all of the talks if you have the time. In this talk, maybe 2/3-3/4 of the way through, Godard made a claim about performance relative to OOO, 'like a Haswell' or Haswell specifically; I can't remember which, and I can't go through the video again right now. He said something to the effect that they would approach performance for {OOO|~Haswell|Haswell} using less power. It was a very general statement, which I took to mean that a Mill family member intended for GP PC desktop use could approach - not match or exceed - the performance of a typical GP PC desktop processor while using less power. Which is certainly not something we've never heard before. And I think the statement is coming from theoretical calculation.
As far as the difference from Itanium: I don't know anything about processor design, but I am pretty certain the belt concept central to the Mill is not applied in the Itanium/EPIC. I think it's likely that the Mill is intended to support more operations per instruction than Itanium. The other thing is that there is not 'The Mill Processor' - it's more of a design scheme and ISA.
|
What we can say is that for equivalent computation capacity (i.e. number of functional units) the Mill will give somewhat better performance at much better power. Internally, the Mill's power budget is essentially the same as that of a DSP with the same function capacity, because they work in much the same way. DSPs have been around for a long time, and the power/performance comparisons with OOO have long been published. For equal process and equal MIPS capacity, the power difference for the core is 8-12x better than OOO, and we expect to do at least as well.
That's for equal compute capacity. Every architecture has a cap on scaling compute capacity. The cap seems to be around 8 pipelines in OOO machines; try to add more and you just slow down everything more than you gain from the extra pipes.
The Mill has caps too. We don't know yet where the diminishing returns point will be in detail, but our sims and engineering expertise suggest that it will be somewhere in the 30-40 pipe region. Such a high-end Mill would swap a good deal - but not all - of its power advantage for more horsepower.
You have the inverse story at the low end of the family: the lowest Mill has only five pipes, and no floating point at all. Not barn-burning performance, but much lower power even than existing non-OOO offerings.
So there's no one number, and no hard measurements anyway. If you doubt our projections then you are entitled to your opinion; in fact there's a fair amount of disagreement even within the Mill team as to what we will see in the actual chip. But the team includes quite a few who have been doing this for years, and in several cases were involved in the creation of the chips that you would compare the Mill against, so their considered opinion should not be rejected out of hand.