Hacker News new | ask | show | jobs
by adgjlsfhk1 381 days ago
That is paralelizable. Each of the 5 registers has no depence on the value of the others.
1 comments

But when you split from 4 registers to 5, the bits for a given destination register may come from two different source registers.
That only means you need a couple of instructions to do each one [1]. The five output registers are each dependent on (at most) a pair of the four input registers, but not on each other.

[1] a left shift, a right shift, and an OR. Or just one 2->1 funnel shift instruction if your ISA has that e.g. arm64. And then ANDing with a 51 bit mask.

    and  out0, in0, 0x7FFFFFFFFFFFF
    extr out1, in1, in0, #51
    extr out2, in2, in1, #38
    extr out3, in3, in2, #25
    lsr  out4, in3, #12
    
    and  out1, out1, 0x7FFFFFFFFFFFF
    and  out2, out2, 0x7FFFFFFFFFFFF
    and  out3, out3, 0x7FFFFFFFFFFFF