


	by zneveu 2419 days ago
	Another easy solution I didn't see mentioned is to create swap space on Linux. This obviously isn't the fastest solution, but setting up 128GB of swap space lets me mindlessly load most datasets into memory without a single code change.

5 comments

olavgg 2419 days ago

Adding a 280GB Optane as swap is very efficient and cheap. It is still a lot slower than RAM, though, but much, much faster than NVMe SSDs.


p1esk 2418 days ago

Are you talking about Optane NVDIMM or NVMe?


yummypaint 2419 days ago

Do you have a sense of how the performance would compare to chunking? My naive expectation is that loading data from disk to swapped memory involves writing that data to disk (even if it will only be read).


wongarsu 2419 days ago

Chunking can speed up processing even if the dataset fits into memory, because you are interleaving disk reads and computation, and the OS is likely to prefetch the next chunk into the read cache while you're still busy computing.

On other problems chunking doesn't work at all, and just mmapping or dedicating giant amounts of swap are better strategies. It depends on the problem at hand.
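
For anyone who wants to see what chunking looks like in practice, here is a minimal sketch in Python. The function name `count_lines_chunked` and the 1 MiB default chunk size are made up for illustration, and counting newlines just stands in for whatever per-chunk computation your problem needs. It only works this simply because the computation can be done one chunk at a time.

```python
def count_lines_chunked(path, chunk_size=1 << 20):
    """Count newline bytes in a file without loading it all into memory.

    Hypothetical helper: reads the file in fixed-size chunks so that only
    one chunk is resident in the process at any time.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # at most chunk_size bytes
            if not chunk:               # empty bytes object means end of file
                break
            total += chunk.count(b"\n")  # the per-chunk computation
    return total
```

Peak memory stays around `chunk_size` regardless of file size. Whether the OS actually prefetches the next chunk while you compute depends on the kernel's readahead behavior, so treat any speedup as something to measure.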


edoo 2418 days ago

Using mmap with the right flags should let you load and process huge files as well as if they were in RAM.
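
A rough sketch of the mmap route, using Python's standard `mmap` module. The helper name `count_byte_mmap` is hypothetical, and a read-only mapping (`ACCESS_READ`) is just one reasonable choice of flags, not necessarily the ones meant above. The kernel pages the file in on demand, so the code treats it like one big bytes object even if it is larger than RAM.

```python
import mmap

def count_byte_mmap(path, needle=b"\n"):
    """Count occurrences of a single byte in a file via a read-only mapping.

    Hypothetical helper. Note that mmap raises ValueError on an empty file,
    so callers should handle that case separately.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            pos = mm.find(needle)
            while pos != -1:
                count += 1
                pos = mm.find(needle, pos + 1)  # resume just past the last hit
            return count
```

How close this gets to in-RAM speed depends on the access pattern and on how much of the file is already in the page cache.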


Rafuino 2418 days ago

Plus, swap performance is being improved bit by bit, so it's not as much of a dirty word as it was before.

https://lwn.net/Articles/717707/


oxplot 2418 days ago

This! It's simple, performant depending on the application, and costs nothing in extra development.
