by shwestrick
929 days ago
On modern multicore hardware this will be memory-bound; the amount of computation per byte is pretty small (just a few arithmetic instructions on average). My intuition is that the single scan will be faster because it incurs far fewer cache misses. And yes, numerical accuracy could definitely be a problem. I suspect it wouldn't be too difficult to work around, but I can't say for sure off the top of my head.
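
To make the accuracy concern concrete, here is a rough sketch. It assumes that "scan" means a running prefix sum and that the worry is rounding error piling up in a long single-precision accumulator; neither is spelled out above. The helper names (`f32`, `naive_scan`, `kahan_scan`) are made up for illustration, and Kahan compensation is just one possible workaround, not necessarily the one I'd end up using.

```python
import math
import struct


def f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_scan(values):
    """Inclusive prefix sum, rounding every partial sum to float32."""
    out, total = [], 0.0
    for v in values:
        total = f32(total + f32(v))
        out.append(total)
    return out


def kahan_scan(values):
    """Inclusive prefix sum in float32 with Kahan compensation.

    `comp` carries the low-order bits lost by the previous addition
    and feeds them back into the next one.
    """
    out, total, comp = [], 0.0, 0.0
    for v in values:
        y = f32(f32(v) - comp)
        t = f32(total + y)
        comp = f32(f32(t - total) - y)
        total = t
        out.append(total)
    return out


if __name__ == "__main__":
    # Hypothetical input: a long run of identical small values.
    data = [0.1] * 100_000
    # Reference: sum of the float32-rounded inputs, accumulated accurately.
    reference = math.fsum(f32(v) for v in data)
    print("naive error:", abs(naive_scan(data)[-1] - reference))
    print("kahan error:", abs(kahan_scan(data)[-1] - reference))
```

The compensated version stays a single pass over the data, so it should not change the memory-bound picture much; it only adds a few arithmetic instructions per element.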
[1] https://people.xiph.org/~tterribe/pubs/gpusurf.pdf Section III.A