by fwlr 822 days ago
“The Java Virtual Machine […] leverages the protected memory access signal mechanism both for correctness (e.g., to handle the truncation of memory mapped files) and for performance.”

Where by “protected memory access signal mechanism”, they mean SIGBUS/SIGSEGV, i.e., a segfault.

This is probably because the JVM is doing “zero cost access checks”, which is where you do the moral equivalent of:

    try {
      writeToFile()
    } catch(err) {
      if (err == SYSTEM_CRASH_IMMINENT) {
        changeFilePermissions()
        retry
      }
    }
…because it’s faster than checking file permissions before every write. (It’s a common pattern in systems programming, so it’s not quite as crazy as it sounds.)

I guess my opinion on this is that if you write your program to intentionally trigger and ignore kill(10) / kill(11) from the host OS, for the sake of a speed boost, you can’t really get too mad when the host OS gets fed up and starts sending kill(9) instead.

I also wonder what happens in the (extremely rare) case where the signal the JVM is trapping is a real segfault, and not an operating system signal.

2 comments

This isn't about files, this is about plain pages of RAM[0]. It is a basic CPU operation to trap on unmapped pages, and OSes rightfully expose this useful feature (in addition to using it themselves), allowing processes to do many things, from lazily-computed memory regions to removing significant amounts of overhead doing a thing the CPU will inevitably do itself anyway.
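To make the "lazily-computed memory regions" idea concrete, here is a minimal C sketch, not anything taken from the JVM. It maps one page with no access rights, lets the first read fault, and has the signal handler fill the page before the read is retried. All names (lazy_page_demo, on_lazy_fault, the fill value 42) are made up for illustration. It assumes a POSIX system where si_addr carries the faulting address, and it calls mprotect from a signal handler, which works on Linux and macOS in practice but is not on POSIX's async-signal-safe list.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *lazy_page;              /* hypothetical lazily filled region */
static size_t lazy_page_size;
static volatile sig_atomic_t lazy_fault_count;

/* Fault handler: if the faulting address lies inside our page, make the page
   accessible and fill it. Returning re-runs the faulting instruction. */
static void on_lazy_fault(int sig, siginfo_t *info, void *ctx) {
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)lazy_page;
    if (addr < base || addr >= base + lazy_page_size) {
        /* Not a fault we expected: restore the default action so the
           retried instruction terminates the process as usual. */
        signal(sig, SIG_DFL);
        return;
    }
    mprotect(lazy_page, lazy_page_size, PROT_READ | PROT_WRITE);
    memset(lazy_page, 42, lazy_page_size);    /* "compute" the contents */
    lazy_fault_count++;
}

/* Returns the first byte read (42) if exactly one fault was taken, else -1. */
int lazy_page_demo(void) {
    lazy_page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *mem = mmap(NULL, lazy_page_size, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    lazy_page = mem;
    lazy_fault_count = 0;

    /* Install for both signals, since the platform may deliver either. */
    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_lazy_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    volatile unsigned char *p = lazy_page;
    int first = p[0];    /* faults once; the handler fills the page */
    int second = p[1];   /* page is now readable, so no second fault */

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(mem, lazy_page_size);
    return (lazy_fault_count == 1 && second == 42) ? first : -1;
}
```

The common path (the second read) pays nothing for the check; only the first access goes through the handler. The address test at the top of the handler is what separates an expected fault from a genuine bug.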

I believe the "truncation of memory mapped files" part refers to the case where the Java process memory-maps a file (Java provides memory-mapping operations in its standard library, and probably also uses them itself), and afterwards some other unrelated process truncates the file, resulting in the OS quietly making (parts of) the mappings inaccessible. Here the process couldn't even check the permissions before reading (never mind how utterly hilariously inefficient that would be, defeating the purpose of memory-mapping) as the mappings could change between the check and subsequent read anyway.
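The truncation case can be reproduced in a few lines of C. This is only a sketch under assumptions: the function name truncated_mapping_demo and the temp-file path are invented, the truncation is done by the same process rather than an unrelated one, and the handler is installed for both SIGBUS and SIGSEGV because which one arrives depends on the platform. Recovery uses sigsetjmp/siglongjmp, which is one simple way to get out of the faulting read and not necessarily what the JVM does.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;

/* Jump back to the recovery point instead of retrying the faulting read. */
static void on_mapping_fault(int sig) {
    (void)sig;
    siglongjmp(recover_point, 1);
}

/* Returns 1 if reading the mapping after truncation raised a fault signal,
   0 if the read succeeded, -1 if the setup failed. */
int truncated_mapping_demo(void) {
    char path[] = "/tmp/trunc-demo-XXXXXX";   /* hypothetical temp file */
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);

    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, (off_t)len) != 0) { close(fd); return -1; }

    void *mem = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) { close(fd); return -1; }
    volatile char *map = mem;

    struct sigaction sa, old_bus, old_segv;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_mapping_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old_bus);
    sigaction(SIGSEGV, &sa, &old_segv);

    int result;
    if (sigsetjmp(recover_point, 1) == 0) {
        /* Stands in for "some other process truncates the file". */
        if (ftruncate(fd, 0) != 0) {
            result = -1;
        } else {
            char c = map[0];   /* the page is now past end of file */
            (void)c;
            result = 0;        /* reached only if no signal was raised */
        }
    } else {
        result = 1;            /* arrived here via siglongjmp */
    }

    sigaction(SIGBUS, &old_bus, NULL);
    sigaction(SIGSEGV, &old_segv, NULL);
    munmap(mem, len);
    close(fd);
    return result;
}
```

Note that the mapping call itself succeeded and nothing in the process changed the mapping's protections, which is why a check before the read cannot help here.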

[0]: https://bugs.java.com/bugdatabase/view_bug?bug_id=8327860, "I've managed to narrow this down to this small reproducer:" section

And it's worth noting that while man mmap on macOS doesn't say what happens when the protections are violated (that is, if you try to read, write, or execute in violation of the set protections), the related function mprotect has this to say in macOS 14.3 (what I have available):

> When a program violates the protections of a page, it gets a SIGBUS or SIGSEGV signal.

(The Linux man pages for mmap and mprotect indicate that SIGSEGV would be signaled.)

So the existing use and assumption (that a violation arrives as SIGSEGV or SIGBUS) are consistent with what the mmap and mprotect documentation says to expect.

You are of course completely correct.

However, I still stand by my pseudocode - I claim that it will give a fairly accurate impression of the basic concept of zero-cost access checks to a reader who isn’t familiar with low-level systems programming. (That said, I have updated my comment to make it clear it’s more of a metaphor than a literal description.)

A talk at FOSDEM this year [0] describes how the OpenJDK JVM relies on triggering SIGSEGVs in order to efficiently implement thread-local safepoint checks - I wonder if that would also be affected?
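For readers who haven't seen the trick, here is a rough C sketch of a polling-page style safepoint check. It is loosely modeled on the general idea only and is not HotSpot's code: the names (poll_page, safepoint_poll, safepoint_poll_demo) are invented, everything runs in one thread, and the handler just sets a flag where a real VM would park the thread. It also calls mprotect inside a signal handler, which works on Linux and macOS in practice but is not guaranteed async-signal-safe by POSIX.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *poll_page;              /* hypothetical polling page */
static size_t poll_page_size;
static volatile sig_atomic_t at_safepoint;

/* A fault on the polling page means "a safepoint was requested". */
static void on_poll_fault(int sig, siginfo_t *info, void *ctx) {
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)poll_page;
    if (addr < base || addr >= base + poll_page_size) {
        signal(sig, SIG_DFL);   /* unexpected fault: let it crash normally */
        return;
    }
    at_safepoint = 1;           /* a real VM would block the thread here */
    /* Disarm the page so the retried read succeeds. */
    mprotect(poll_page, poll_page_size, PROT_READ);
}

/* The check itself is a single load with no branch. */
static void safepoint_poll(void) {
    (void)*(volatile unsigned char *)poll_page;
}

/* Returns 1 if the poll was silent while disarmed and trapped once armed. */
int safepoint_poll_demo(void) {
    poll_page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *mem = mmap(NULL, poll_page_size, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    poll_page = mem;
    at_safepoint = 0;

    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_poll_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    safepoint_poll();                              /* page readable: no trap */
    int before = at_safepoint;

    mprotect(poll_page, poll_page_size, PROT_NONE); /* "request a safepoint" */
    safepoint_poll();                              /* traps into the handler */
    int after = at_safepoint;

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(mem, poll_page_size);
    return before == 0 && after == 1;
}
```

The appeal is that the fast path is one load from a readable page, and the cost of the signal is paid only when a safepoint is actually requested.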

[0]: https://mostlynerdless.de/blog/2023/07/31/the-inner-workings...

> I also wonder what happens in the (extremely rare) case where the signal the JVM is trapping is a real segfault, and not an operating system signal.

Just an educated guess, but the JVM knows if a thread may expect a segfault at a given point or not. If no thread expects one, then I assume the segfault handler just writes out that a segfault happened with some useful info, and terminates the program. I mean, I’m sure about the effect as I have caused a JVM to segfault a couple of times with native memory, so it handles it as expected.