by mesuvash
87 days ago
Fair point. I've updated the animation to address this. The grid now uses the correct non-uniform centroids (optimal for the arcsine distribution in 2D), so you'll see grid lines cluster near the edges, where unit-circle coordinates actually concentrate, rather than being evenly spaced. The spacing also changes with bit depth.

On the second quantization step: the paper's inner-product variant uses (b-1) bits for the MSE quantizer shown here, then applies a 1-bit QJL (Quantized Johnson-Lindenstrauss) encoding of the residual to make dot-product estimates unbiased. I chose to omit QJL from the animation to keep it digestible as a visual, but I've added a note calling this out explicitly.
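
For anyone who wants to reproduce the non-uniform grid, here is a minimal sketch of a scalar Lloyd-Max iteration for the arcsine density on [-1, 1]. It is not the animation's actual code or the paper's codebook. The function name `arcsine_lloyd_max` and the iteration count are my own assumptions. It relies on the substitution x = cos(theta) with theta uniform on [0, pi], which gives each cell's centroid in closed form.

```python
import math

def arcsine_lloyd_max(bits, iters=200):
    """Hypothetical helper: Lloyd-Max levels for the arcsine density on [-1, 1]."""
    k = 2 ** bits
    # Start from evenly spaced levels and let the iteration move them.
    levels = [-1 + (2 * i + 1) / k for i in range(k)]
    for _ in range(iters):
        # Cell boundaries are midpoints between neighbouring levels.
        edges = [-1.0] + [(levels[i] + levels[i + 1]) / 2 for i in range(k - 1)] + [1.0]
        new_levels = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            # With x = cos(theta) and theta uniform, the conditional mean of x
            # on [lo, hi] is (sin(t_lo) - sin(t_hi)) / (t_lo - t_hi).
            t_lo, t_hi = math.acos(lo), math.acos(hi)  # t_lo > t_hi
            new_levels.append((math.sin(t_lo) - math.sin(t_hi)) / (t_lo - t_hi))
        levels = new_levels
    return levels

if __name__ == "__main__":
    for b in (1, 2, 3):
        print(b, [round(v, 4) for v in arcsine_lloyd_max(b)])
```

With 1 bit the two levels land at plus and minus 2/pi, the mean absolute value of an arcsine-distributed coordinate. With more bits the levels stop being evenly spaced, and the spacing depends on the bit depth.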
|
I'm not sure if it's my own misunderstanding or if the paper [0] has something of an error. Section 3.1 starts out to the effect of "let x be on the unit hypersphere" (but I'm fairly certain that in practice it isn't). Neither Algorithm 1 nor Algorithm 2 shows a normalization step prior to rotating x. Algorithm 2, line 8 shows that the scalar returned is actually the magnitude of the residual, without accounting for QJL.
Anyway, I'm pretty sure the authors inadvertently omitted that detail, which really had me confused for a while there.
[0] https://arxiv.org/abs/2504.19874
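
To make the normalization point concrete, here is a rough 2D sketch of the step I suspect is implied. It is not the paper's Algorithm 1 or 2: a plain rotation by a fixed angle stands in for the random rotation, the codebook is a hypothetical list of levels on [-1, 1], the names `encode` and `decode` are mine, and QJL is left out. The norm is stored separately, the unit vector is rotated and quantized, and decoding rescales by the stored norm.

```python
import math

def encode(x, angle, codebook):
    """Assumed flow: store the norm, normalize, rotate, quantize per coordinate."""
    norm = math.hypot(x[0], x[1])
    if norm == 0.0:
        return 0.0, [0, 0]  # zero vector: indices are irrelevant once scaled by 0
    ux, uy = x[0] / norm, x[1] / norm  # now on the unit circle
    c, s = math.cos(angle), math.sin(angle)
    rotated = (c * ux - s * uy, s * ux + c * uy)
    # Snap each rotated coordinate to the nearest codebook level.
    idx = [min(range(len(codebook)), key=lambda i: abs(codebook[i] - v))
           for v in rotated]
    return norm, idx

def decode(norm, idx, angle, codebook):
    """Look up the levels, undo the rotation, and rescale by the stored norm."""
    qx, qy = codebook[idx[0]], codebook[idx[1]]
    c, s = math.cos(angle), math.sin(angle)
    ux, uy = c * qx + s * qy, -s * qx + c * qy  # inverse rotation (transpose)
    return [norm * ux, norm * uy]

if __name__ == "__main__":
    levels = [-1 + 2 * i / 256 for i in range(257)]  # placeholder codebook
    norm, idx = encode([3.0, 4.0], 0.3, levels)
    print(norm, idx, decode(norm, idx, 0.3, levels))
```

Without the division by the norm, a codebook designed for coordinates in [-1, 1] would be applied to values of arbitrary scale, which is why the missing step matters.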