by saturncoleus 3690 days ago
Those are all really good points actually. In response to them individually:

- The problem with generating IDs is that it isn't known ahead of time how many there will be. That forces a solution that is suboptimal in all circumstances.

- The reason for rejecting UTF-8 is mostly backwards compatibility with existing software. Being able to use encoded UTF-8 strings that exceed the million is possible, but it really burns a lot of bridges along the way. The point about Boyer-Moore is really cool; I had no idea that was a goal!

- Having the length in the folder structure does grow exponentially, but only at the topmost level. It will be uniform under each length dir. This is an acceptable price to pay when typing "ls ./dir", since removing the prefix would make it hard to read quickly:

    0/
    0.jpg
    1/
    1.jpg
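
To make the "exponential only at the top level" point concrete, here is a rough sketch. The layout is an assumption for illustration only (one top-level dir per decimal digit count, with the id underneath), and `length_dir` is a made-up helper name, not the actual scheme:

```python
from collections import Counter


def length_dir(n: int) -> str:
    """Hypothetical layout: '<digit count>/<id>', e.g. 42 -> '2/42'."""
    s = str(n)
    return f"{len(s)}/{s}"


# Count how many ids land under each top-level length dir for ids 0..999.
counts = Counter(length_dir(n).split("/")[0] for n in range(1000))

for length, total in sorted(counts.items()):
    print(f"{length}/ holds {total} ids")
```

Each length dir holds about ten times as many ids as the one before it, which is the exponential part. Everything inside a given length dir has the same width, so the listing there stays uniform.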