by exyi 10 days ago
It's 3 cycles for float multiplication (and 1 for shift right):

https://uops.info/table.html?search=mulss&cb_lat=on&cb_tp=on...

https://uops.info/table.html?search=shr&cb_lat=on&cb_tp=on&c...

In throughput it's even less of a difference: 2 per cycle vs 3 per cycle.

2 comments

Shift right isn't even relevant here: if you shift before converting to float, all your values end up 0, and if you want to divide afterwards, it's no longer a simple shift.
Exactly. Although if you do >> 8 while working with uint8, it will be the fastest :)
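To make the point about shifting before conversion concrete, here is a minimal sketch. It assumes the values are uint8 (modelled as Python ints in 0..255) and that the intended scale factor is 1/256, chosen only because it matches `>> 8`; the thread does not state the actual divisor, and the function names are hypothetical.

```python
# Sketch under stated assumptions: uint8 inputs, scale factor 1/256.

def shift_then_convert(x: int) -> float:
    """Shift right by 8 first, then convert to float."""
    return float(x >> 8)

def convert_then_scale(x: int) -> float:
    """Convert to float first, then multiply by 1/256."""
    return float(x) * (1.0 / 256.0)

values = range(256)  # every possible uint8 value
shifted = [shift_then_convert(v) for v in values]
scaled = [convert_then_scale(v) for v in values]

# Every uint8 value is below 2**8, so shifting first discards all the bits.
print(all(s == 0.0 for s in shifted))

# Scaling after conversion keeps the fractional part.
print(scaled[128], scaled[255])
```

Once the value is a float there is no shift instruction to apply, so the scaling has to be a float multiply (or divide), which is where the latency comparison above comes in.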
> It's 3 cycles for float multiplication (and 1 for shift right):

3x faster

> In throughput it's even less of a difference: 2 per cycle vs 3 per cycle.

50% faster