| HN Mirror



	by the8472 2333 days ago
	Don't fancy x86 addressing modes provide most of those multiplications and offsets with very little IPC penalty?


cfallin 2333 days ago

Yeah, this should be roughly the same overhead as an ADD:

    LEA rDest, [rBase + 8*rPtr]

(The "load effective address" instruction computes an effective address like a load or store would, but just gives the address without doing a memory access.)
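As a rough illustration of the arithmetic that LEA performs here, the sketch below computes `base + 8*index` in C. The function name `decompress` and the reading of `rPtr` as a 32-bit compressed value are assumptions made for the example, not something stated in the thread. A compiler targeting x86-64 will often emit a single LEA for this, or fold it into the addressing mode of a following load, though that depends on the compiler and the surrounding code.

```c
#include <stdint.h>

/* Hypothetical helper: rebuild a full 64-bit address from a base and a
 * 32-bit value scaled by 8, mirroring LEA rDest, [rBase + 8*rPtr]. */
static inline uint64_t decompress(uint64_t base, uint32_t compressed) {
    /* Widen first so the shift cannot overflow 32 bits. */
    return base + ((uint64_t)compressed << 3);
}
```

With a scale of 8, a 32-bit value can reach offsets up to just under 32 GiB from the base rather than 4 GiB.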


the8472 2333 days ago

AIUI, mov supports these addressing modes directly [0], and if I read the instruction tables correctly, then at least on Skylake the latency and throughput are the same for all addressing modes [1].

[0] http://www.c-jump.com/CIS77/ASM/Addressing/lecture.html#R77_...
[1] https://www.agner.org/optimize/instruction_tables.pdf (page 238)


verwaest 2329 days ago

Decompression isn't the problem, compression is. Compression is just a mov. Now we need additional shifts.
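One way to read this point is sketched below in C. It assumes an unshifted scheme where the base is 4 GiB aligned, so compressing is a plain 32-bit truncation, versus a shifted scheme where addresses are 8-byte aligned and the offset is divided by 8. The function names and both layouts are assumptions for illustration only; the thread does not spell out the exact scheme.

```c
#include <stdint.h>

/* Unshifted sketch: with a 4 GiB-aligned base, the low 32 bits are the
 * offset, so compression is a truncation (a 32-bit mov on x86-64). */
static inline uint32_t compress_plain(uint64_t addr) {
    return (uint32_t)addr;
}

/* Shifted sketch: assumes addr is 8-byte aligned and lies within
 * 32 GiB above base. The subtract and shift are the extra work. */
static inline uint32_t compress_shifted(uint64_t base, uint64_t addr) {
    return (uint32_t)((addr - base) >> 3);
}
```

The shifted variant adds work on every pointer store, whereas on the load side the scaling can be absorbed by the addressing mode.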


verwaest 2329 days ago

Also we'll probably lose some cache benefits from compression due to larger alignment.
