by blucoat 3663 days ago
>SIGSEGV is a very important signal. It happens when your program tries to access memory that it does not have. An appropriate reaction might be to

    allocate more memory
    read some data from disk into that memory
    do something with garbage collection (but what? I'm confused about this still.)
What? Are there any Real World Programs which do anything other than print a stacktrace and exit? I don't think this person gets what a segfault is.
4 comments

While I agree that it'd be unusual for regular applications to have to resort to using SEGVs to implement features, low-level systems code, especially VMs, often does so for performance reasons. The HotSpot JVM, for instance, uses SEGVs to force a thread into a safepoint. The JIT inserts a read instruction at backward branches, among other places, which tries to read from a page in memory called the polling page. That page is mapped during normal operation of the application. When the VM needs to bring threads to a "safe point", say to perform a GC, it does so by unmapping the polling page. This causes each of the active threads to fault on the read and enter the SEGV handler, which notices that the faulting address falls within the polling page and executes the appropriate "safe point" actions. Libc implementations use a similar technique to commit pages for a thread's stack lazily.
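
To make the polling-page idea concrete, here is a minimal sketch in C. It is not HotSpot's code: the names (`polling_page`, `safepoint_poll`, `safepoint_request`) are made up for illustration, it assumes Linux (other systems may deliver SIGBUS instead), it uses `mprotect(PROT_NONE)` in place of literally unmapping the page, and it relies on `mprotect` being callable from a signal handler, which POSIX does not promise.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile char *polling_page;           /* hypothetical name */
static size_t page_size;
static volatile sig_atomic_t safepoints_hit;  /* counts handled safepoints */

/* A fault inside the polling page means "enter the safepoint". */
static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)polling_page;
    if (addr < base || addr >= base + page_size) {
        /* Genuine crash: restore the default action and return, so the
           faulting instruction re-executes and kills the process. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    safepoints_hit++;   /* a real VM would park the thread for the GC here */
    /* Make the page readable again so the retried read succeeds.
       Assumption: mprotect is safe to call here (true in practice on Linux). */
    mprotect((void *)polling_page, page_size, PROT_READ);
}

void safepoint_init(void) {
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, page_size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);
    polling_page = p;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;     /* needed to receive si_addr */
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, NULL);
    assert(rc == 0);
}

/* What a JIT would emit at a backward branch: one cheap read. */
static inline void safepoint_poll(void) {
    char c = *polling_page;
    (void)c;
}

/* VM side: make the next poll in every thread fault. */
void safepoint_request(void) {
    mprotect((void *)polling_page, page_size, PROT_NONE);
}
```

The point of the design is that `safepoint_poll()` costs a single load with no branch while no safepoint is pending; all the expensive work happens only on the rare fault.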
Windows uses page faults in the stack guard page to lazily commit stack pages. Compilers allocating large structures on the stack need to generate loops touching each allocated page in turn to guarantee the allocation. On Windows the lazy allocation can be done entirely in user code - it doesn't need to be an OS feature. I believe pthreads uses the same technique on Linux; very far from sure though.

Generational GC can use segfaults to detect writes to older generations and mark pages that need scanning for references to younger allocations. They can also act as a way of triggering a safe point without polluting the branch prediction cache: unmap a page when you want an interrupt, and periodically touch the page in code that needs interrupting (loops etc.). Virtual machines for languages like Java can and do use these techniques.
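
A rough sketch of the write-detection half of this, in C. Everything here is illustrative rather than taken from a real collector: the four-page "old generation", the `dirty` array and the function names are assumptions, and it assumes Linux semantics (SIGSEGV on a write to a read-only page, `mprotect` usable inside the handler).

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define OLD_GEN_PAGES 4          /* hypothetical, tiny "old generation" */

static volatile char *old_gen;
static size_t page_size;
static volatile sig_atomic_t dirty[OLD_GEN_PAGES];  /* pages to rescan */

/* First write to a protected old-generation page lands here. */
static void on_write_fault(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)old_gen;
    if (addr < base || addr >= base + OLD_GEN_PAGES * page_size) {
        signal(SIGSEGV, SIG_DFL);   /* not ours: crash normally on retry */
        return;
    }
    size_t page = (addr - base) / page_size;
    dirty[page] = 1;                /* remember it for the next minor GC */
    /* Unprotect just this page so the retried write goes through and
       later writes to it cost nothing. */
    mprotect((void *)(old_gen + page * page_size), page_size,
             PROT_READ | PROT_WRITE);
}

void old_gen_init(void) {
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, OLD_GEN_PAGES * page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);
    old_gen = p;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, NULL);
    assert(rc == 0);
}

/* Called after a collection: forget old dirty bits, re-arm the barrier. */
void old_gen_protect(void) {
    for (size_t i = 0; i < OLD_GEN_PAGES; i++)
        dirty[i] = 0;
    mprotect((void *)old_gen, OLD_GEN_PAGES * page_size, PROT_READ);
}
```

After `old_gen_protect()`, a store such as `old_gen[page_size + 8] = 42` faults once, sets `dirty[1]`, and then completes; only pages flagged in `dirty` need scanning for references into the young generation.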

If you had a green-threaded program and one of the threads segfaulted, you would probably want to catch SIGSEGV and kill just that green thread (not the OS thread running it).

I've also seen it used to implement a distributed malloc. When a segfault occurs, the handler messages the program's peers, asking if they have the data for that address. If so, the peer sends the page and the handler maps in a new page for that address with the correct data in it. This is essentially implementing a page fault handler in user space (for some network-backed memory).
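
A small C sketch of that user-space page fault handler, with the network left out. `fetch_page_from_peer` is a hypothetical stand-in that just fills the page with its own index; a real system would do network I/O there, and would need to be far more careful about what it calls from inside a signal handler. Linux is assumed.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_PAGES 8           /* hypothetical size of the shared region */

static volatile unsigned char *region;   /* reserved, initially inaccessible */
static size_t page_size;
static volatile sig_atomic_t pages_fetched;

/* Hypothetical stand-in for "ask the peers for this page". */
static void fetch_page_from_peer(size_t page_index, unsigned char *dst) {
    for (size_t i = 0; i < page_size; i++)
        dst[i] = (unsigned char)page_index;
}

/* User-space "page fault handler": populate the page on first touch. */
static void on_fault(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)region;
    if (addr < base || addr >= base + REGION_PAGES * page_size) {
        signal(SIGSEGV, SIG_DFL);   /* not ours: crash normally on retry */
        return;
    }
    size_t page = (addr - base) / page_size;
    unsigned char *start = (unsigned char *)region + page * page_size;
    mprotect(start, page_size, PROT_READ | PROT_WRITE);  /* make it usable */
    fetch_page_from_peer(page, start);                   /* fill with data */
    pages_fetched++;
}

void remote_init(void) {
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    /* Reserve address space only; any access faults until populated. */
    void *p = mmap(NULL, REGION_PAGES * page_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);
    region = p;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, NULL);
    assert(rc == 0);
}
```

After `remote_init()`, reading `region[3 * page_size + 10]` faults once, pulls in page 3, and yields 3; later reads of the same page do not fault again.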

Why would you only want to kill that green thread? On any thread implementation I'm aware of, an unhandled segfault kills the whole process. Anything else is a disaster waiting to happen.
I've read that one of the original Unix shells (Thompson's or Bourne's) used a combination of sbrk()/brk() system calls and SIGSEGV to do dynamic memory allocation for itself. I can't find a reference to this via Google, as any information about old shells and SIGSEGV is swamped by modern people talking about bash and bad programs, or trapping SIGSEGV in scripts or some such. The "heirloom sh" code doesn't have anything like that, but it's clearly been tinkered with, as it uses sigaction(), a BSD innovation.

So, feel free to ignore this vague memory.