by bob1029 1750 days ago
I was faced with this sort of issue yesterday: 1 file per instance of user state, all in one directory. After a year of typical usage we would be well into a seven-figure file count. I was concerned that Windows might choke on this in bad ways in production.

My solution was to use a nested directory approach where each directory can contain up to 4096 files or subdirectories. The paths for the first 3 items in the sequence would look like:

  /0/0/0/0.item
  /0/0/0/1.item
  /0/0/0/2.item
The 4096th & 4097th identities would live at:

  /0/0/0/4095.item
  /0/0/1/0.item
Implementing the path scheme is a series of trivial shifts and bitmasks (0xFFF) over each item's identity, 12 bits per path level, since 4096 = 2^12.
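
For anyone who wants to see the masking spelled out, here is a minimal sketch in Python. It assumes a zero-based integer identity and three directory levels above the file, which matches the example paths above. The names item_path, BITS_PER_LEVEL and DIR_LEVELS are made up for illustration and are not from my actual code.

```python
BITS_PER_LEVEL = 12                    # 2**12 = 4096 entries per directory
MASK = (1 << BITS_PER_LEVEL) - 1       # 0xFFF
DIR_LEVELS = 3                         # directory components above the file


def item_path(identity: int) -> str:
    """Map a zero-based identity to a nested path such as /0/0/1/0.item."""
    limit = 1 << (BITS_PER_LEVEL * (DIR_LEVELS + 1))
    if not 0 <= identity < limit:
        raise ValueError(f"identity must be in [0, {limit})")

    # The lowest 12 bits name the file; each higher 12-bit group names
    # one directory level, most significant group first.
    leaf = identity & MASK
    dirs = [
        (identity >> (BITS_PER_LEVEL * level)) & MASK
        for level in range(DIR_LEVELS, 0, -1)
    ]
    return "/" + "/".join(str(d) for d in dirs) + f"/{leaf}.item"


if __name__ == "__main__":
    for identity in (0, 1, 2, 4095, 4096):
        print(identity, item_path(identity))
```

With three directory levels plus the file name, this layout addresses 2^48 identities before it runs out of bits.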

The only thing I don't like about this is the lock around ensuring the directory exists, but it's not in an especially hot path. Updates to existing items are where things get tight for us, not creation.

1 comment

This is a viable solution, and an old one. My only comment would be that if I were you I'd make the filename the whole value (so your 4097th item would be /0/0/1/4096.item). From experience, it makes it (marginally) easier to change the directory structure later, should you ever need to, because you don't have to rename the files. It's not a big deal, but it's been convenient.
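
A rough sketch of that variant in Python, assuming the same 12-bits-per-level split and a zero-based integer identity. The function name item_path_full_name is hypothetical. The directories are derived exactly as before; only the file name changes to carry the full identity.

```python
def item_path_full_name(identity: int, dir_levels: int = 3) -> str:
    """Nested path whose file name is the full identity, e.g. /0/0/1/4096.item."""
    if identity < 0:
        raise ValueError("identity must be non-negative")

    # Directory components still come from 12-bit groups above the lowest one.
    dirs = [(identity >> (12 * level)) & 0xFFF for level in range(dir_levels, 0, -1)]

    # The file name keeps the whole value, so re-bucketing later only moves
    # files between directories and never has to rename them.
    return "/" + "/".join(str(d) for d in dirs) + f"/{identity}.item"


if __name__ == "__main__":
    print(item_path_full_name(4095))
    print(item_path_full_name(4096))
```

Since the name alone identifies the item, changing the fan-out or the number of levels later becomes a pure move operation.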