Hacker News new | ask | show | jobs
by jdmoreira 3781 days ago
"if the data set becomes too large to fit in memory on a single machine?"

You just use mmap. If it become bigger than the disk capacity of single machine, you can still mount a distributed filesystem and use mmap.

1 comments

Slightly off topic: http://yourdatafitsinram.com/ and more seriously https://www.youtube.com/watch?v=SiSY1b0am5w which is a really good talk about the challenges of sysadmining for scientists. What do you do when 2TB of RAM isn't enough?
You add an index and smaller-block loader. The size and structure of a "smaller-block" depends on the project and how you are using it.

In more extreme situations, you to learn all about materialized views, cache invalidation, etc.