by mathias_10gen 5472 days ago
FYI - Your _id trick is similar to the ObjectID type that MongoDB uses by default.

"A BSON ObjectID is a 12-byte value consisting of a 4-byte timestamp (seconds since epoch), a 3-byte machine id, a 2-byte process id, and a 3-byte counter. Note that the timestamp and counter fields must be stored big endian unlike the rest of BSON. This is because they are compared byte-by-byte and we want to ensure a mostly increasing order."

http://www.mongodb.org/display/DOCS/Object+IDs#ObjectIDs-BSO...
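To make the quoted layout concrete, here is a minimal Python sketch that packs the four fields into 12 bytes. The function name `make_object_id` is hypothetical and this is not MongoDB's driver code. The quote only fixes the byte order of the timestamp and counter, so the little-endian process id below is an assumption.

```python
import struct


def make_object_id(timestamp, machine_id, process_id, counter):
    """Pack the fields from the quoted layout into a 12-byte value.

    Hypothetical helper for illustration only, not MongoDB's implementation.
    """
    if len(machine_id) != 3:
        raise ValueError("machine_id must be exactly 3 bytes")
    # 4-byte timestamp (seconds since epoch), big endian as the quote requires.
    ts = struct.pack(">I", timestamp & 0xFFFFFFFF)
    # 2-byte process id; little endian is an assumption, the quote does not say.
    pid = struct.pack("<H", process_id & 0xFFFF)
    # 3-byte counter, big endian as the quote requires.
    ctr = (counter & 0xFFFFFF).to_bytes(3, "big")
    return ts + machine_id + pid + ctr


# Same machine and process: a later counter or timestamp compares greater
# when the raw bytes are compared byte-by-byte.
first = make_object_id(1000, b"abc", 42, 1)
second = make_object_id(1000, b"abc", 42, 2)
later = make_object_id(1001, b"abc", 42, 0)
```

Because the timestamp comes first and is big endian, plain byte-wise comparison sorts newer values after older ones, which is the "mostly increasing order" the quote mentions.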


My _id is different from ObjectID. It does begin with a timestamp, but one that uses 2010 as its epoch. The timestamp is followed by a kind of user ID and a piece of a random identifier that also appears elsewhere in the document.
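A rough sketch of what such a key could look like, in Python. The thread does not give the actual encoding, so the field widths, the `-` separator, the hex formatting, and the names `EPOCH_2010` and `make_custom_id` are all assumptions made for illustration.

```python
import calendar

# Seconds between the Unix epoch and 2010-01-01T00:00:00Z.
EPOCH_2010 = calendar.timegm((2010, 1, 1, 0, 0, 0))


def make_custom_id(unix_seconds, user_id, random_part):
    """Build a string key that starts with a timestamp counted from 2010.

    Hypothetical format: the real key layout is not described in the thread.
    """
    seconds_since_2010 = int(unix_seconds) - EPOCH_2010
    if seconds_since_2010 < 0:
        raise ValueError("timestamp is before the 2010 epoch")
    # Fixed-width hex keeps string order equal to time order.
    return f"{seconds_since_2010:08x}-{user_id}-{random_part}"


older = make_custom_id(EPOCH_2010 + 10, "u1", "abcd")
newer = make_custom_id(EPOCH_2010 + 255, "u1", "abcd")
```

The point of the timestamp prefix is the same as with ObjectID: keys created later sort after keys created earlier, so new inserts land near the same end of the index.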
Interesting, so the _id trick mentioned by FooBarWidget is not the real reason for the speedup?
It is. I was using a totally random string key, not the default ObjectID.
So if you stick with the default _id, the claim of "Your indexes must fit in memory" is no longer valid?
That totally depends on your workload. In my case, my working set happens to be roughly the most recently inserted data. If you have to regularly access lots of random documents with no locality whatsoever, then your working set is very large and should fit in memory.

With my workload, sticking with the default _id means the _id index doesn't have to fit into memory. I can't actually use the default _id for various reasons, but that's a whole different discussion.

Useful information, thanks!