by anonymoushn
1029 days ago
Do you have a cheap permutation function for large n? It seems like you still have to do it in two passes if you do this. One reason cited in TFA for the half-shuffled files approach is that it's easy to rotate old data out of and new data into the half-shuffled files.

The downside is that it still needs one random read per element, so cache-friendly hierarchical algorithms, like the one described in the post, are probably still faster for on-disk data.
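
For reference, one common way to get a cheap permutation of indices for large n is a small Feistel network with cycle walking. The thread does not say which function was meant, so treat this as a sketch under that assumption; `permute_index`, `_round_value`, the BLAKE2b round function, and the round count are all illustrative choices, not anything from TFA.

```python
import hashlib


def _round_value(value: int, key: bytes, rnd: int, half_bits: int) -> int:
    # Hypothetical round function: a keyed hash truncated to half_bits bits.
    data = key + bytes([rnd]) + value.to_bytes(8, "little")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little") & ((1 << half_bits) - 1)


def permute_index(i: int, n: int, key: bytes, rounds: int = 4) -> int:
    """Map i in [0, n) to a unique position in [0, n), using O(1) memory."""
    if not 0 <= i < n:
        raise IndexError("index out of range")
    # Smallest even-width bit domain that covers [0, n).
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1
    x = i
    while True:
        left, right = x >> half_bits, x & mask
        for r in range(rounds):
            # A Feistel round is invertible for any round function.
            left, right = right, left ^ _round_value(right, key, r, half_bits)
        x = (left << half_bits) | right
        # Cycle walking: re-apply until the value lands back inside [0, n).
        if x < n:
            return x
```

Reading element `permute_index(i, n, key)` for i = 0, 1, 2, ... visits every element exactly once in a scrambled order without materializing the permutation. Each lookup is still a random read, which is the downside mentioned above.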