Hacker News
by AnotherGoodName 1085 days ago
Complexity? You mmap it in and then read the multi-terabyte file as if it were an array.

The opposite approach, actual file I/O, sucks in terms of complexity. I get that you can write bespoke code that performs better, but mmap is a one-liner to turn a file into an array.
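To make the "one-liner" concrete, here is a minimal sketch using Python's standard `mmap` module. The helper name `read_byte_at` is made up for illustration; the only point is that once the file is mapped, indexing it looks like indexing an array.

```python
import mmap


def read_byte_at(path: str, offset: int) -> int:
    """Return the byte at `offset` by indexing the mapped file like an array."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the OS pages data in lazily on access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset]
```

Note that mapping an empty file raises `ValueError`, so real code would probably want to check the file size first.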

2 comments

You need to handle the exceptions/signals every time a disk read fails. With classic I/O, you know when the read will happen. But with memory-mapped files, the failure can surface at any point while you are reading from the mapped range.

As for why disk reads fail, yes, that's a thing. It's less common on internal storage (bad sectors), but more common on removable USB devices or network drives (especially over Wi-Fi).
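A rough sketch of the classic I/O side of that argument, assuming a POSIX system (`os.pread` is not available on Windows). The helper `read_chunk` is hypothetical; what matters is that the failure can only surface at one known call site, where an ordinary `try`/`except` catches it.

```python
import os
from typing import Optional


def read_chunk(fd: int, offset: int, length: int) -> Optional[bytes]:
    """Read up to `length` bytes at `offset`; return None if the read fails."""
    try:
        # The only place an I/O error can be raised is this call.
        return os.pread(fd, length, offset)
    except OSError as exc:
        print(f"read at offset {offset} failed: {exc}")
        return None
```

With a mapped file there is no such call site. On Linux a failed page-in typically arrives as a SIGBUS at whatever instruction touched the page, which a `try`/`except` around the access will not catch.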

Multi-terabyte? Better hope you have lots of spare RAM for all those page structures the kernel has to keep.
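For a sense of scale, a back-of-the-envelope estimate, assuming 4 KiB pages and 8-byte page-table entries (as on x86-64) and counting only the lowest-level entries. The function name is made up, and the figure applies only to pages that have actually been touched; huge pages would shrink it considerably.

```python
def page_table_bytes(mapped_bytes: int, page_size: int = 4096, entry_size: int = 8) -> int:
    """Estimate leaf page-table memory for a fully touched mapping."""
    # Round up to whole pages, then charge one entry per page.
    pages = -(-mapped_bytes // page_size)
    return pages * entry_size


one_tib = 2**40
print(page_table_bytes(one_tib) / 2**30, "GiB of page-table entries per TiB touched")
```

Under those assumptions, each terabyte that actually gets touched costs about 2 GiB of page-table entries, so the overhead is real but depends on how much of the file you read.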