by emcq
792 days ago
The 1.4s is _after_ the kernel has already loaded the file into RAM. Because this workload is mostly I/O bound, skipping the read time is not a fair comparison. On an M3 Mac you might get less than 100ms if the dataset were stored in RAM. If you account for the time to load from disk, the C implementation would be more like ~5s, as reported in the blog post [1]. Speculating that their laptop's SSD is in the 3GB/s range, perhaps there is another second or so of optimization left there (which would roughly work out to the 1.4s in-memory time). Because the rows are variable-width, reading them will be more difficult on a GPU than on a CPU.

[1] https://www.dannyvankooten.com/blog/2024/1brc/
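
For what it's worth, here is a rough back-of-the-envelope sketch of the disk-bound estimate. The ~13 GB file size is my assumption (it is not stated above), the 3 GB/s figure is the speculated SSD throughput, the 5 s total is the blog post's number as quoted above, and `min_read_seconds` is just a hypothetical helper name.

```python
def min_read_seconds(file_bytes: float, throughput_bytes_per_s: float) -> float:
    """Lower bound on wall time if the run is limited only by sequential read speed."""
    return file_bytes / throughput_bytes_per_s


# Assumption: the input file is roughly 13 GB (not stated in the comment).
FILE_BYTES = 13e9
# Speculated SSD sequential read throughput from the comment.
SSD_BYTES_PER_S = 3e9
# End-to-end time including disk reads, as quoted from the blog post.
REPORTED_TOTAL_S = 5.0

floor_s = min_read_seconds(FILE_BYTES, SSD_BYTES_PER_S)
headroom_s = REPORTED_TOTAL_S - floor_s

print(f"read-bound floor: {floor_s:.1f}s, possible headroom: {headroom_s:.1f}s")
```

Under those assumptions the read-bound floor is a bit over 4s, so the headroom comes out to something under a second. A different file size or SSD would shift this considerably.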