| HN Mirror

	by prattmic 2170 days ago
	While CPUID itself is far from elegant, I don't think ARM is fundamentally different. e.g., the ID_AA64ISAR1_EL1 register (ARM wins no naming awards here) indicates the presence of certain instructions. https://developer.arm.com/docs/ddi0595/h/aarch64-system-regi...

saagarjha 2170 days ago

aarch64 is actually worse as far as I can tell, because reading from that register is restricted and cannot be done from userspace (hence "EL1"), so doing this portably is a huge pain. On Linux I think some of this is exported via /proc/cpuinfo and the auxiliary vector (getauxval), so that's what most people use…

Edit: Also there's a ton of them, this lists a couple: https://www.kernel.org/doc/html/latest/arm64/cpu-feature-reg...
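
For reference, here is a minimal sketch of the two Linux userspace routes mentioned above, assuming a Linux-style /proc layout. The function names are made up for illustration. AT_HWCAP = 16 is the auxiliary-vector key from <elf.h>; the meaning of each bit in its value is architecture-specific, so this only extracts the raw bitmask rather than decoding it.

```python
import struct

AT_NULL = 0    # terminator entry of the auxiliary vector
AT_HWCAP = 16  # key whose value is the architecture-specific feature bitmask


def parse_cpuinfo_features(text):
    """Return one set of feature names per 'Features'/'flags' line.

    arm64 kernels label the line 'Features'; x86 kernels label it 'flags'.
    """
    per_cpu = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("features", "flags"):
            per_cpu.append(set(value.split()))
    return per_cpu


def parse_auxv(data, word_size=8):
    """Decode raw /proc/self/auxv bytes into a {key: value} dict.

    Assumes native byte order; word_size must be 8 (64-bit) or 4 (32-bit).
    """
    fmt = "=QQ" if word_size == 8 else "=II"
    pair = 2 * word_size
    usable = len(data) - len(data) % pair  # ignore any trailing partial entry
    entries = {}
    for key, value in struct.iter_unpack(fmt, data[:usable]):
        if key == AT_NULL:
            break
        entries[key] = value
    return entries
```

On a Linux machine, feeding the contents of /proc/cpuinfo to `parse_cpuinfo_features` and the bytes of /proc/self/auxv to `parse_auxv` should give the per-CPU feature names and the raw `AT_HWCAP` mask respectively. Neither touches the ID registers directly; both rely on what the kernel chooses to export.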

stephencanon 2170 days ago

It makes good sense for the OS to abstract this for a few reasons:

- You'll get into trouble when migrating a process between cores in heterogeneous environments (having cores that support different ISA extensions is relatively uncommon today, but Intel has already announced some, and people have gotten into trouble in the past with e.g. reading cacheline sizes when big and little cores used different values).

- This can be a pretty slow operation on some platforms, and the system can make it more efficient by caching it (and can provide additional capabilities information that may not have existed on older CPUs so that there's a uniform interface).

saagarjha 2170 days ago

I remember when that came up. I still don't understand how you are supposed to support a process that queries the kernel for some feature ("can I use SVE2?"), gets an affirmative response, starts using it, and midway gets migrated to another core that doesn't support those instructions. Either you're going to have the same extensions on all the cores, or lie about it by presenting the lowest common denominator to each of them…although I guess if you did want to do that, you'd want the kernel to be able to fake this from the start.
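
The "lowest common denominator" option amounts to a set intersection over per-core feature lists. A toy sketch, with invented core names and feature sets rather than any real chip:

```python
def advertised_features(per_core):
    """Features present on every core: what the kernel could safely report."""
    if not per_core:
        return set()
    return set.intersection(*(set(f) for f in per_core.values()))


def hidden_features(per_core):
    """Features some core has but that get masked for not being universal."""
    everything = set().union(*per_core.values())
    return everything - advertised_features(per_core)


# Hypothetical big.LITTLE-style layout, for illustration only.
cores = {
    "big0": {"fp", "asimd", "sve", "sve2"},
    "big1": {"fp", "asimd", "sve", "sve2"},
    "little0": {"fp", "asimd"},
}
print(sorted(advertised_features(cores)))  # ['asimd', 'fp']
print(sorted(hidden_features(cores)))      # ['sve', 'sve2']
```

Under this policy the big cores' extra extensions are simply never visible to userspace, which is the cost of keeping migration transparent.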

fwip 2170 days ago

One approach could be that the OS pins the process or threads to cores that support the features that were queried for.

i.e: If a process asks "can I use SVE2" and gets a yes, the OS doesn't move it to any core for which that isn't true.
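
From userspace, the same restriction can be approximated on Linux with CPU affinity. A rough sketch, assuming the per-core feature table is already known (the table below is invented and the helper names are hypothetical). `os.sched_setaffinity` is the real Python wrapper around the Linux call and is not available on every platform.

```python
import os


def cores_supporting(feature, per_core):
    """Return the sorted CPU numbers whose feature set includes `feature`."""
    return sorted(cpu for cpu, feats in per_core.items() if feature in feats)


def pin_to_feature(feature, per_core, pid=0):
    """Restrict `pid` (0 = calling process) to cores that have `feature`.

    Returns the chosen CPU list; raises RuntimeError if no core qualifies.
    """
    cpus = cores_supporting(feature, per_core)
    if not cpus:
        raise RuntimeError(f"no core supports {feature!r}")
    os.sched_setaffinity(pid, cpus)  # Linux-only
    return cpus


# Hypothetical layout: CPUs 0-1 are little cores, 2-3 are big cores.
layout = {
    0: {"fp", "asimd"},
    1: {"fp", "asimd"},
    2: {"fp", "asimd", "sve2"},
    3: {"fp", "asimd", "sve2"},
}
print(cores_supporting("sve2", layout))  # [2, 3]
```

A kernel doing this automatically would apply the equivalent of `pin_to_feature` at the moment the feature query is answered.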

saagarjha 2170 days ago

Right, but as I mentioned here: https://news.ycombinator.com/item?id=23835764 (and last month: https://news.ycombinator.com/item?id=23481049) every process is going to ask this…

fwip 2170 days ago

Ah, that's a very good point.

guenthert 2170 days ago

Alternatively, an invalid instruction fault is signaled, handled by the OS, which activates the core capable of that instruction and migrates the process there. Not cheap, but I would expect that to happen rarely, if the lesser core is only used during low-power operation.
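
As a toy model of that trap-and-migrate idea (pure simulation: the core names, feature sets, and `IllegalInstruction` exception are invented stand-ins for a hardware fault and a scheduler decision):

```python
class IllegalInstruction(Exception):
    """Stand-in for the undefined-instruction fault a real core would raise."""


def execute(core_features, needed):
    """Pretend to run one instruction that requires the `needed` extension."""
    if needed not in core_features:
        raise IllegalInstruction(needed)


def run(program, cores, start):
    """Run `program` (a list of required extensions), migrating on faults.

    Returns (final_core, migrations). Re-raises IllegalInstruction when no
    core supports an instruction.
    """
    current, migrations = start, 0
    for needed in program:
        try:
            execute(cores[current], needed)
        except IllegalInstruction:
            capable = [c for c in cores if needed in cores[c]]
            if not capable:
                raise
            current, migrations = capable[0], migrations + 1
            execute(cores[current], needed)  # retry on the capable core
    return current, migrations


cores = {"little": {"base"}, "big": {"base", "sve2"}}
print(run(["base", "sve2", "base"], cores, "little"))  # ('big', 1)
```

In this model the process never moves back to the little core once it has faulted, so how "rarely" the migration happens depends entirely on how rarely the extension is used.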

saagarjha 2170 days ago

Yeah, but that means that every process is going to be forced to run on the big core the moment that you call into libc's vectorized string functions…

mjevans 2170 days ago

There are similar issues for any suspendable and/or hot-swappable system. An extension to process signals should be created to inform processes that a system-context change has occurred. Clock timestamps, features, all assumptions should be invalidated.

saagarjha 2170 days ago

Yeah, but then programs have to know about that. Up until now the general assumption has been that context switches are largely transparent…

sroussey 2170 days ago

I think Intel is going the "lie about it" route for their first “hybrid” chip.

Symmetry 2170 days ago

Nah, they just disable all the features that aren't shared by the little cores.

saagarjha 2170 days ago

That's the "lie about it" route I mentioned above.
