| HN Mirror

All those class variables are already in __slots__ so in theory it shouldnt matter. Your advice is good

     self.shift_index -= 16
     shift_byte = (self.shift >> self.shift_index) & 0x5555
     shift_byte = (shift_byte + (shift_byte >> 1)) & 0x3333
     shift_byte = (shift_byte + (shift_byte >> 2)) & 0x0F0F
     self.shift_byte = (shift_byte + (shift_byte >> 4)) & 0x00FF


 but only for exactly 2-4 milliseconds per 1 million pulses :) Declaring local variable in a tight loop forces Python into a cycle of memory allocations and garbage collection negative potential gains :(

    SWAR                           :     0.288 seconds  ->    0.33 MiB/s
    SWAR local                     :     0.284 seconds  ->    0.33 MiB/s

This whole snipped is maybe what 50-100 x86 opcodes? Native code runs at >100MB/s while Python 3.14 struggles around 300KB/s. Python 3.4 (Sigrok hardcoded requirement) is even worse:

    SWAR                           :     0.691 seconds  ->    0.14 MiB/s
    SWAR local                     :     0.648 seconds  ->    0.14 MiB/s

You can try your luck https://github.com/raszpl/sigrok-disk/tree/main/benchmarks I will appreciate Pull requests if anyone manages to speed this up. I give up at ~2 seconds per one RLL HDD track.

This is what I get right now decoding single tracks on i7-4790 platform:

    fdd_fm.sr 0.9385 seconds
    fdd_mfm.sr 1.4774 seconds
    fdd_fm.sr 0.8711 seconds
    fdd_mfm.sr 1.2547 seconds
    hdd_mfm_RQDX3.sr 1.9737 seconds
    hdd_mfm_RQDX3.sr 1.9749 seconds
    hdd_mfm_AMS1100M4.sr 1.4681 seconds
    hdd_mfm_WD1003V-MM2.sr 1.8142 seconds
    hdd_mfm_WD1003V-MM2_int.sr 1.8067 seconds
    hdd_mfm_EV346.sr 1.8215 seconds
    hdd_rll_ST21R.sr 1.9353 seconds
    hdd_rll_WD1003V-SR1.sr 2.1984 seconds
    hdd_rll_WD1003V-SR1.sr 2.2085 seconds
    hdd_rll_WD1003V-SR1.sr 2.2186 seconds
    hdd_rll_WD1003V-SR1.sr 2.1830 seconds
    hdd_rll_WD1003V-SR1.sr 2.2213 seconds
    HDD_11tracks.sr 17.4245 seconds <- 11 tracks, 6 RLL + 5 MFM interpreted as RLL
    HDD_11tracks.sr 12.3864 seconds <- 11 tracks, 6 RLL + 5 MFM interpreted as MFM