by adrian_b 1363 days ago
In any traditional ISA with wide registers and instructions (a.k.a. SIMD instructions), the execution units can be implemented with any desired width, regardless of the architectural width of the registers and instructions.

Obviously, it only makes sense for the width of the execution units to be a divisor of the architectural width; otherwise they would not be used efficiently.

Thus it is possible to choose various compromises between the cost and the performance of the execution units.
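A rough sketch of that trade-off, under a deliberately simplified model that is my own assumption and not how any particular CPU is documented to work: each wide instruction is split into as many passes through one execution unit as are needed to cover the architectural width. The function name and the widths are hypothetical.

```python
def passes_and_waste(arch_bits, unit_bits):
    """Simplified model: one wide instruction is executed as several
    passes through a narrower execution unit.

    Returns (passes per instruction, unit bits left idle in the last pass).
    """
    passes = (arch_bits + unit_bits - 1) // unit_bits  # ceiling division
    wasted_bits = passes * unit_bits - arch_bits
    return passes, wasted_bits

# A 512-bit architectural width on execution units of various widths.
# 192 is included as an example of a width that does not divide 512.
for unit_bits in (512, 256, 128, 192):
    passes, wasted = passes_and_waste(512, unit_bits)
    print(f"{unit_bits:4d}-bit unit: {passes} passes, {wasted} bits idle")
```

In this model the narrower divisors trade more passes per instruction for cheaper hardware, while a non-divisor width leaves part of the last pass idle.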

However, if the ISA specifies e.g. 32 512-bit registers, then even the cheapest implementation must include at least that many physical registers of that width (2 KiB of register state in total), even if the execution units are much narrower.

What is new in ARM SVE/SVE2, and what gives that vector extension the name "Scalable", is that the register width is not fixed by the ISA and may differ between implementations.

Thus a cheap smartphone CPU may have 128-bit registers, while an expensive server CPU for scientific computation applications might have 1024-bit registers.

With SVE/SVE2, it is possible to write a program without knowing what the register width of the target CPU will be.
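To illustrate the idea, here is a minimal Python simulation of a vector-length-agnostic loop. It is not SVE code: `lanes` is a stand-in for the hardware register width, and `whilelt` and `vla_add` are hypothetical helpers, the first loosely modelled on the SVE WHILELT predicate instruction. The point is only that the loop never hardcodes the width.

```python
def whilelt(start, n, lanes):
    """Predicate with one flag per lane: True while start + lane < n."""
    return [start + lane < n for lane in range(lanes)]

def vla_add(a, b, lanes):
    """Add two equal-length lists, 'lanes' elements per iteration.

    'lanes' stands in for the hardware vector length; the loop body
    never assumes a particular value for it.
    """
    n = len(a)
    out = [0] * n
    i = 0
    while i < n:
        pred = whilelt(i, n, lanes)
        # One simulated "vector instruction": inactive lanes are skipped,
        # so no separate scalar tail loop is needed.
        for lane, active in enumerate(pred):
            if active:
                out[i + lane] = a[i + lane] + b[i + lane]
        i += lanes  # advance by however many lanes the "hardware" has
    return out

a = list(range(10))
b = [10 * x for x in a]
# The same code run with three hypothetical register widths.
results = [vla_add(a, b, lanes) for lanes in (4, 8, 32)]
print(results[0] == results[1] == results[2])
```

The predicate handles the final partial vector, which is what lets the same loop run unchanged whether the array is shorter or longer than one register.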

Nevertheless, the scalability feature is not perfect: some programs can still be made faster by assuming a specific register width at compile time, and such a program may then run slower than it could on a CPU whose registers are in fact wider than assumed.
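A back-of-the-envelope sketch of the second half of that trade-off, counting only loop iterations. The element count and widths are hypothetical, and the model deliberately ignores whatever per-iteration gain motivated fixing the width in the first place, so it shows only the cost on wider hardware.

```python
def vector_iterations(n_elems, lanes_used):
    """Loop iterations needed to cover n_elems, lanes_used per iteration."""
    return -(-n_elems // lanes_used)  # ceiling division

def lanes_per_register(register_bits, elem_bits=32):
    """Number of elements of elem_bits that fit in one register."""
    return register_bits // elem_bits

N = 4096            # hypothetical number of 32-bit elements
ASSUMED_BITS = 128  # register width assumed at compile time

for hw_bits in (128, 256, 512):
    # Code built for 128 bits uses only 4 lanes, whatever the hardware has.
    fixed = vector_iterations(N, lanes_per_register(ASSUMED_BITS))
    # Length-agnostic code uses every lane the hardware provides.
    agnostic = vector_iterations(N, lanes_per_register(hw_bits))
    print(f"{hw_bits}-bit CPU: fixed-width {fixed} iterations, "
          f"length-agnostic {agnostic}")
```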