

by jiggawatts 2307 days ago

> The raw array solution doesn't work if you can't fit the entire thing in memory.

Of course it works. You don't have to literally load the whole file into memory as an array. That's not how file I/O works! Just fseek() to the desired multiple-of-20 address and read 20 bytes.
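A minimal sketch of the seek-and-read approach described above, in Python rather than C. It assumes the file is a sorted sequence of raw 20-byte SHA-1 digests with nothing else in each record. That layout, the binary search, and the names `RECORD_SIZE`, `contains`, and `build_demo_file` are illustrative assumptions, not the original project's format or code.

```python
import hashlib
import os
import tempfile

RECORD_SIZE = 20  # assumed layout: one raw SHA-1 digest per record


def contains(path, digest):
    """Binary-search a sorted file of fixed-width records without loading it."""
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path) // RECORD_SIZE
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * RECORD_SIZE)      # jump to a multiple-of-20 offset
            record = f.read(RECORD_SIZE)   # read exactly one record
            if record == digest:
                return True
            if record < digest:
                lo = mid + 1
            else:
                hi = mid
    return False


def build_demo_file(passwords):
    """Write the sorted digests of a few passwords to a temp file (demo only)."""
    digests = sorted(hashlib.sha1(p.encode()).digest() for p in passwords)
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"".join(digests))
    return path
```

Each lookup touches roughly log2(N) records, so memory use stays constant however large the file is.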

> You could try mmap'ing the file

You'd only try that if you haven't read the documentation for mmap, just like a bunch of Rust programmers did. They started on 64-bit machines and never noticed that mmap is limited to a ~256MB window on most 32-bit architectures. Certainly less than the 11GB needed for this problem!

> There's nothing over engineered about a B-Tree.

Yes there is. This guy hand-rolled 400+ lines of low-level data structure manipulation code just for the B-Tree manipulation. How much do you want to bet that he's got zero memory safety issues or other unsafety in that thing?

Would you link this to your web app? Really, a C++ library?

> That gives you answers in microseconds vs milliseconds

I don't want to call him a liar... but now I have to. At the very least, he's made fatal mistakes in his benchmarking.

There is no way that he can do random password lookups from on-disk structures in under 50 microseconds, unless he's got an Intel Optane SSD, and even then I would be highly suspicious.

He's either cached the entire data structure in memory (at which point why bother with such a complex solution), or he's used the same set of passwords over and over for testing (which will falsely show a much lower latency than you would get with real passwords).

So answer me this: How would this solution work on a typical auto-scale cluster of stateless web server VMs? Do you replicate 5-12GB to every VM every time this database changes? Or do you put one copy on a network share? Congratulations, you now have 1ms latency minimum! That fancy microsecond optimisation has just gone out the window...

To be blunt: If your password-testing is the bottleneck to the point that a 50 microsecond latency is needed, then you're running a service that gets 20,000 password resets per second. At this point you're bigger than Office 365 and G Suite... combined, and you likely have much more important scaling concerns.


stryku2393 2306 days ago

That's exactly why I publish my pet projects. To learn from others.

You're right. The benchmarks are bad. I benchmarked the best and the worst case within the original file, so I only looked up the first and the last hash.

I totally missed that if I look for the same hash over and over again, I end up reading the same parts of the B-tree files, so they can be easily cached. That's probably the reason it seemed so fast.

No lies here, I just missed this (:

I'll rewrite benchmarks and update the results.


haberman 2307 days ago

> You'd only try that if you haven't read the documentation for mmap, just like a bunch of Rust programmers did. They started on 64-bit machines and never noticed that mmap is limited to a ~256MB window on most 32-bit architectures.

Do you have a reference for that? I've never heard of this and I can't imagine the reason for such a limitation.


unilynx 2307 days ago

Having done a lot with mmap on 32bit systems I can't remember such a technical limitation either..

..although in practice it might be hard to find 256MB of contiguous memory in a 4GB virtual memory space due to fragmentation of other allocations and shared libraries, especially with ASLR.


jiggawatts 2307 days ago

https://stackoverflow.com/questions/5518084/memorymappedfile...

There's the hard limit of 2GB for most versions of 32-bit Windows and 4GB for any operating system.

Couple that with the requirement for a contiguous address space as well as various page table entry (PTE) limits, and you get all sorts of "soft" limits way before 2GB. From what I've heard, 256MB is relatively safe to map, but anything much larger than that is increasingly likely to fail.

Correctly written code should be able to work with moveable "windows" into the file as small as 32MB to be properly robust, especially if the process memory is already fragmented.
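A rough sketch of the moveable-window idea, using Python's `mmap` module. The function names `iter_windows` and `count_newlines` and the newline-counting task are made up for illustration. The 32MB default simply mirrors the figure mentioned above. Window offsets are kept at multiples of `mmap.ALLOCATIONGRANULARITY`, which the module requires.

```python
import mmap
import os


def iter_windows(path, window_size=32 * 1024 * 1024):
    """Yield (offset, view) pairs that cover the file one mapped window at a time."""
    gran = mmap.ALLOCATIONGRANULARITY
    # Round the window down to a multiple of the granularity (at least one unit),
    # so every window offset is a valid mmap offset.
    window_size = max(gran, window_size - window_size % gran)
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            length = min(window_size, size - offset)
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as view:
                yield offset, view
            # The previous window is unmapped here, before the next one is mapped.
            offset += length


def count_newlines(path, window_size=32 * 1024 * 1024):
    """Count b'\\n' bytes while mapping only one window at a time."""
    total = 0
    for _, view in iter_windows(path, window_size):
        pos = view.find(b"\n")
        while pos != -1:
            total += 1
            pos = view.find(b"\n", pos + 1)
    return total
```

Counting single bytes sidesteps the main complication of windowing: a multi-byte pattern can straddle two windows, so a real search would need to overlap adjacent windows by the pattern length minus one.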

Lots of software crashes with large files on 32-bit machines because of this. E.g.: https://www.monetdb.org/pipermail/users-list/2009-January/00...

As a more recent example, ripgrep had issues on 32-bit platforms because of a bug in the way the underlying mmap library worked in Rust.

Even on 64-bit platforms you can run into trouble. For example: https://jira.mongodb.org/browse/SERVER-15070

In that example, Windows Server 2008 R2 has an 8 TB limit. You could hit that if using a tool like ripgrep to do "forensic analysis" of disk images from a SAN, where virtual disks typically have a 16 TB limit. So if you mount a SAN snapshot and open the disk as a file to scan it, you will hit this limit!

Programmers make all sorts of invalid assumptions...


burntsushi 2306 days ago

ripgrep doesn't require memory maps, and if they fail to open, it will fall back to a more traditional buffering strategy: https://github.com/BurntSushi/ripgrep/blob/50d2047ae2c0ce2ed...

ripgrep has always had a fast traditional buffering strategy using `read` calls for searching, because I knew that mmap couldn't be used in every case.

Anyway, this has been fixed for a couple years at this point, so if you're still experiencing a problem, then please file a new bug report.

> As a more recent example, ripgrep had issues on 32-bit platforms because of a bug in the way the underlying mmap library worked in Rust.

This is false. The bug you're thinking about is probably https://github.com/BurntSushi/ripgrep/issues/922, which was not caused by an underlying bug in memmap. memmap did have an underlying bug with respect to file offsets, but ripgrep did not use the file offset API. The bug was caused in ripgrep itself, since I made the classic mistake of trying to predict whether an mmap call would fail instead of just trying mmap itself. That bug was fixed on master before the Windows bug was even reported: https://github.com/BurntSushi/ripgrep/commit/93943793c314e05...
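The "just try mmap and fall back" pattern can be sketched like this in Python. This is not ripgrep's code: the function name `contains_bytes`, the `allow_mmap` switch, and the chunk size are assumptions for illustration. The idea is to attempt the mapping, treat any failure as a signal to use plain `read` calls, and never try to guess in advance whether the mapping will succeed.

```python
import mmap


def contains_bytes(path, needle, allow_mmap=True, chunk_size=1 << 16):
    """Return (found, strategy); strategy is "mmap" or "read".

    Assumes a non-empty needle.
    """
    with open(path, "rb") as f:
        if allow_mmap:
            try:
                # Length 0 asks for the whole file; this raises if it cannot be mapped.
                with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
                    return view.find(needle) != -1, "mmap"
            except (OSError, ValueError):
                pass  # fall through to buffered reads
        tail = b""
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False, "read"
            buf = tail + chunk
            if needle in buf:
                return True, "read"
            # Keep the last len(needle) - 1 bytes so a match that straddles
            # two chunks is still seen on the next iteration.
            tail = buf[-(len(needle) - 1):] if len(needle) > 1 else b""
```

In Python, mapping an empty file raises `ValueError`, so files that report a size of zero (as procfs entries on Linux typically do) end up on the `read` path.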

> You'd only try that if you haven't read the documentation for mmap, just like a bunch of Rust programmers did.

This isn't exclusive to Rust programmers. C tools make the same mistake all the time. Memory maps aren't just problematic with large files on 32-bit systems; they also don't work with virtual files on Linux. Try, for example, `ag MHz /proc/cpuinfo` and see what you get. Crazy how, you know, sometimes humans make mistakes even if they are a C programmer!

And the implication that I (or the author of memmap) never read the docs for `mmap` is just absurd.

If you're going to be snooty about stuff like this, then at least get the story correct. Or better yet, don't be snooty at all.


jiggawatts 2305 days ago

We've spoken before about this issue and at the time ripgrep was just erroring on large files on 32-bit platforms, it didn't fall back. You were using the Rust crate "mmap" at the time, you removed it temporarily as a fix, and now you're using the much improved "memmap" crate. Good stuff! I do use your tool occasionally, and it's useful, albeit the CPU fan noises annoy my co-workers.

The specific issue making the "mmap" crate incorrect was that it used a "usize" instead of "u64" for some of the functions, limiting it to 4GB files on 32-bit platforms. I believe it's this line of code: https://github.com/rbranson/rust-mmap/blob/f973ae1969b4b7e80...
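To make the offset-width point concrete, here is a small Python illustration. It is not taken from either crate, and the helper names are made up. It only shows which file offsets a 32-bit unsigned integer (what `usize` is on a 32-bit target) can hold, and what a truncating conversion would do to larger ones.

```python
GIB = 1 << 30
U32_MAX = (1 << 32) - 1  # largest value a 32-bit unsigned integer can hold


def fits_in_32_bits(offset):
    """True if the file offset is representable as a 32-bit unsigned integer."""
    return 0 <= offset <= U32_MAX


def truncate_to_32_bits(offset):
    """Keep only the low 32 bits, as a truncating conversion would."""
    return offset & U32_MAX


print(fits_in_32_bits(4 * GIB - 1))          # last addressable byte offset
print(fits_in_32_bits(4 * GIB))              # one past the 4 GiB boundary
print(truncate_to_32_bits(11 * GIB) // GIB)  # where an 11 GiB offset would wrap to
```

A 64-bit offset type avoids the problem regardless of pointer size, because file offsets and memory addresses are separate quantities.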

Now, I'm not a mindreader, but to me this feels an awful lot like its author made a tacit assumption that mmap() is a "memory operation" that is tied to the architecture's pointer size. In similar conversations, heck, in this very discussion people were incredulous that a file can be bigger than memory and be processed.

I absolutely believe that people do not read much past the function declarations, and it might be a "snooty attitude" but experience unfortunately has shown it to be an accurate attitude.

I'm also not accusing you of incorrectly using mmap(), buuuuut... having a quick flip through your current code I see that you still have the attitude that "mmap() takes a filename and makes into a slice that the kernel magically reads in for me on demand".

This is just not true, not even on 64-bit platforms. On smaller devices with only 2-4GB of memory, it's entirely possible to simply run out of page table entries (PTEs). It's possible the memory space simply gets too fragmented. It's possible the kernel has other limits for processes. It's possible that the file is some virtual device with an enormous reported size. Etc, etc, etc...

The correct usage of mmap() is to use moderately-sized sliding windows of, say, 128MB at a time or whatever.

But, having said that: Your code is now correct in the sense that it won't crash, it won't have unsafety, it'll run on 32-bit just fine, and will probably work for all practical scenarios that people want to use a grep tool for. I also know that you have specific optimisations for "the whole file fits in a byte slice", so there's benefits to using the simple approach instead of a sliding window.

However, if this was a database engine that required mmap() to work, it would be absolutely incorrect. But it isn't a database engine, so no big deal...


burntsushi 2305 days ago

> You were using the Rust crate "mmap" at the time, you removed it temporarily as a fix, and now you're using the much improved "memmap" crate.

I don't understand why you're saying this. Could you point me to the place in the commit history where I used the `mmap` crate? The second commit in ripgrep's history is what introduced memory map support and it used the `memmap` crate: https://github.com/BurntSushi/ripgrep/commit/403bb72a4dd7152...

> albeit the CPU fan noises annoy my co-workers

ripgrep is happy to be told to run more slowly with `-j1`.

> I absolutely believe that people do not read much past the function declarations, and it might be a "snooty attitude" but experience unfortunately has shown it to be an accurate attitude.

This sounds to me like "I'm right so I can be as much of an arse as I want." Just don't be snooty about this. Sometimes I can read a man page thoroughly and still come away with misconceptions. Sometimes the docs are just bad. Sometimes it's just very dense. Sometimes there's a small but important detail that's easy to miss. Or sometimes I'm just not smart enough to comprehend everything. Instead of getting up on your holier-than-thou perch, maybe tone it down a notch next time.

> I'm also not accusing you of incorrectly using mmap(), buuuuut... having a quick flip through your current code I see that you still have the attitude that "mmap() takes a filename and makes into a slice that the kernel magically reads in for me on demand".

Not really. Especially since ripgrep's man page explicitly calls out memory maps as potential problem areas, and even gives users the option to avoid the issue entirely if they like:

> ripgrep may abort unexpectedly when using default settings if it searches a file that is simultaneously truncated. This behavior can be avoided by passing the --no-mmap flag which will forcefully disable the use of memory maps in all cases.

But, invariably, one of the nice things about memory mapping a file is precisely that it "mmap() takes a filename and makes into a slice that the kernel magically reads in for me on demand." And it generally pretty much works.

> However, if this was a database engine that required mmap() to work, it would be absolutely incorrect. But it isn't a database engine, so no big deal...

It's good enough that SQLite actually provides an option to use memory mapped I/O (noting pertinent downsides): https://sqlite.org/mmap.html Lucene also provides it as an option: https://lucene.apache.org/core/6_3_0/core/org/apache/lucene/... They likely both do the windowing you're talking about, but as the SQLite docs mention, that's not enough to stop it from crashing and burning.

At that level, it's good enough for ripgrep and it sure as hell is good enough for a random fun project like the one the OP posted. Absolutely no reason to get on your soapbox and snub your nose.


e12e 2306 days ago

> never noticed that mmap is limited to a ~256MB window on most 32-bit architectures

That's interesting, but processing more than 4GB of data on a 32-bit system seems pretty niche these days? Where do you find them outside of industrial embedded applications? I think even my long discarded smart watch ran 64-bit.

Now, vfat filesystems will bite you (especially on removable media), but that's also fixable with some other filesystem, like exFAT or ZFS.
