Hacker News
by brucehoult 84 days ago
Of course it is. Emulating parallel operations on 4 or 8 or 16 or 32 elements one at a time using scalar instructions is expected to be slow.
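
To make the cost concrete, here is a minimal sketch in Python of what lane-by-lane emulation amounts to. The function name `emulated_vector_add` and the 32-bit lane width are assumptions for illustration, not taken from any particular emulator, and the count covers only the adds; a real emulation would presumably also pay for per-lane loads, stores, and loop overhead.

```python
def emulated_vector_add(a, b):
    """Emulate one N-lane vector add with scalar operations, one lane at a time.

    Hypothetical helper for illustration. Returns the result lanes and the
    number of scalar adds performed.
    """
    if len(a) != len(b):
        raise ValueError("both operands must have the same number of lanes")
    result = []
    scalar_ops = 0
    for x, y in zip(a, b):
        # Each lane costs its own scalar add (wrapped to an assumed 32-bit lane).
        result.append((x + y) & 0xFFFFFFFF)
        scalar_ops += 1
    return result, scalar_ops


# What would be a single vector instruction becomes one scalar add per lane.
for lanes in (4, 8, 16, 32):
    _, ops = emulated_vector_add(list(range(lanes)), [1] * lanes)
    print(f"{lanes} lanes: 1 vector add -> {ops} scalar adds")
```

The scalar work grows linearly with the lane count, which is why the slowdown is expected rather than surprising.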