| HN Mirror



	by als0 2006 days ago
	> I would not have expected Rosetta 2 to be missing AVX instructions

	AVX can’t be emulated by Rosetta due to Intel patents.

6 comments

phire 2006 days ago

Regardless of the patent issue, emulating AVX on the M1 (which only has 4-wide SIMD) would actually be significantly slower than forcing the x86 application to use its SSE fallback path and emulating that.

Diggsey 2006 days ago

Emulating AVX via Rosetta should be just as fast as re-compiling the original without AVX support and then emulating it. Emulating larger SIMD instructions is very easy: you just use multiple smaller SIMD instructions.

On the other hand, disabling AVX for all Intel machines would make those programs significantly slower, so it's clear why there is reluctance to do that...
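
A minimal sketch of the "multiple smaller SIMD instructions" idea from this comment, modeled in plain Python rather than with real intrinsics. The names `add_f32x4` and `add_f32x8` are hypothetical, lists stand in for 128-bit and 256-bit registers, and nothing here claims to describe how Rosetta 2 actually translates code.

```python
LANES_128 = 4  # float32 lanes in one 128-bit register (SSE or NEON width)


def add_f32x4(a, b):
    """Model of a single 4-wide SIMD add (hypothetical helper)."""
    assert len(a) == len(b) == LANES_128
    return [x + y for x, y in zip(a, b)]


def add_f32x8(a, b):
    """Model of an 8-wide AVX add, emulated as two independent 4-wide adds."""
    assert len(a) == len(b) == 2 * LANES_128
    lo = add_f32x4(a[:LANES_128], b[:LANES_128])  # lower 128 bits
    hi = add_f32x4(a[LANES_128:], b[LANES_128:])  # upper 128 bits
    return lo + hi  # concatenate the two halves back into one 8-lane value


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [10.0] * 8
result = add_f32x8(a, b)
print(result)
```

Lane-wise operations split cleanly like this. Instructions that move data across the 128-bit boundary would need more than two host instructions, which this sketch does not model.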

phire 2006 days ago

No. For many algorithms, AVX isn't a 2x speedup over SSE. Especially when lanes are conditionally masked.

Often you are happy to get a 1.25x speed up with AVX. Sometimes it actually goes slower.

If you were to emulate code that only gets a 1.25x speedup from AVX on the M1, you would end up with all the disadvantages of going to 8-wide, but with none of the speedup.

That 1.25x speedup is halved and the emulated AVX code actually runs at about 0.625x the speed of the emulated SSE code path.
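
The arithmetic in this comment can be written out as a rough model. It assumes every 8-wide operation costs exactly two 4-wide operations on the host and ignores all other overhead, which is a simplification; the function name and parameters are hypothetical.

```python
def emulated_avx_relative_speed(native_speedup, avx_width=8, host_width=4):
    """Speed of the emulated AVX path relative to the emulated SSE path.

    native_speedup: how much faster the AVX path is than the SSE path
                    on real x86 hardware (1.25 in the comment above).
    Assumption: each AVX op is split into avx_width / host_width host ops.
    """
    host_ops_per_avx_op = avx_width / host_width
    return native_speedup / host_ops_per_avx_op


relative = emulated_avx_relative_speed(1.25)
print(relative)
```

Under the same assumption, the AVX path would need at least a 2x native speedup just to break even with the SSE path when emulated on a 4-wide host.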

a-dub 2006 days ago

plus doesn't the M1 have specialized hardware? what's the neuro-engine or whatever it is that they call it for speeding up ML? i imagine at its core it's a bunch of instructions for doing vector operations.

side bar: is there documentation for the instruction set or abi for that hardware?

mhh__ 2006 days ago

> side bar: is there documentation for the instruction set or abi for that hardware?

No. Apple stans, food for thought.

I would hope there will be something soon although it's Apple so not much.

Edit: Still basically no, but https://github.com/geohot/tinygrad/tree/master/ane has got the instruction format apparently.

You can attack an instruction set blind (https://recon.cx/2012/schedule/events/236.en.html).

More edit: Bingo, patent: https://patents.google.com/patent/US20190340491A1/en?oq=2019...

DerekL 2005 days ago

The Apple Neural Engine is separate from the CPU; it's not additional registers and instructions for the CPU, like a vector unit. You go through the Core ML framework to use it, just like you go through Metal or OpenGL to use the GPU.

mhh__ 2006 days ago

The value of SIMD on a CPU these days is really a middle-ground where you value latency above throughput, so you probably would have the same trade off as getting the data to and from a GPU

a-dub 2006 days ago

i thought the M1 was a SoC with a unified memory architecture?

mhh__ 2006 days ago

That definitely changes the calculus, but as I've mentioned in a different comment there doesn't seem to be literally any microarchitectural documentation to read, so I (don't own an M1) have nothing to go off unfortunately.

I'll make a wild guess that getting data to the neural engine is still probably not quick because I assume it's some kind of statically scheduled type affair (exposed pipeline?). We literally seem to know almost nothing about it sadly.

https://jobs.apple.com/en-gb/details/200205070/neural-engine...

Even the job listing gives away next to nothing, other than poor English "Knowledge in compiler is a plus" ;)

mhh__ 2006 days ago

https://news.ycombinator.com/item?id=14523587

Discussion of Intel patents from a few years ago

trollian 2006 days ago

They're implemented by other emulation systems like bochs: https://sourceforge.net/p/bochs/news/

colejohnson66 2006 days ago

bochs is probably just too small a fish for Intel to care. But Apple is a big money maker for Intel, so they’d have an incentive to push back legally. The patents aren’t on the implementation, but the function, so I’d wager that bochs is actually infringing. But IANAL, so take it with a grain of salt.

HexagonalKitten 2006 days ago

Can't be emulated efficiently, or at all? Has it been tested or is it based on this comment from Intel?

"Emulation is not a new technology, and Transmeta was notably the last company to claim to have produced a compatible x86 processor using emulation (“code morphing”) techniques. Intel enforced patents relating to SIMD instruction set enhancements against Transmeta’s x86 implementation even though it used emulation"

varispeed 2006 days ago

Can't they rename it to ABX and switch things here and there? Or get someone to do a clean-room implementation of the ISA specification? You can't patent an API.

mhh__ 2006 days ago

The patents aren't on the API but what the API does.

Like it or not they are patented https://patents.google.com/patent/US7499962 (FMA, for example)

xoa 2006 days ago

>Like it or not they are patented https://patents.google.com/patent/US7499962 (FMA, for example)

Yeah, and AVX is also relatively new, intro'd in 2008 and first shipped in a chip in 2011. AVX2 wasn't until 2013. So even with R&D and patents happening years beforehand it'll still be a good long while before they expire (that FMA example being a case in point, not until end of 2026).

Granted in Apple's specific case that's actually not a bad thing. Precisely because AVX is so new, many Macs supported up until the last version or two of macOS didn't have it. So AVX isn't at all a widely expected dependency for the kind of older software that may never get an ARM port and in turn most needs Rosetta 2.

Const-me 2004 days ago

If FMA is patented, how come ARM has equivalent instructions, called MLA? And GPUs have FMA as well.

loeg 2006 days ago

Can you elaborate on what patents, and how they prevent emulating AVX but allow emulating the rest of x86?

colejohnson66 2006 days ago

Every new feature set (the many SSEs, AVXs, and others) is patented. But newer processor features (like AVX) don’t “renew” the patents on the older features. So when AVX-512 was patented, it didn’t change the expiration dates for AVX2 and prior.

The base requirements for x86-64 mandate SSE2, IIRC. Those patents expired this year, so Apple is now able to release an x86-64 “emulator” without negotiating patents.

tempay 2006 days ago

This leaves me wondering if Apple waited until now for ARM Macs solely because of the patent situation.

loeg 2005 days ago

Huh. Is AMD paying licensing fees for the privilege of implementing the AVX and AVX2 instruction sets? It seems weird and anticompetitive that patents are granted for what amounts to an API.

(Do you have links or identifiers for the specific patents, by any chance?)

colejohnson66 2005 days ago

The situation with AMD is a bit complicated. Basically, there was an antitrust lawsuit led by AMD years ago, and as part of the settlement, Intel and AMD would share patents to allow collaboration. I can’t recall the actual specifics though.

As for patents, I don’t, but someone else here linked the one for FMA [0]. It’s a bit more complicated than just an API, but it seems broad enough that anything implementing that API would be covered. But IANAL.

[0]: https://patents.google.com/patent/US7499962
