by scott_s 5410 days ago
On modern systems, there are two kinds of addresses: "virtual" addresses and physical addresses. Virtual addresses are tracked by the operating system, and they can span the entirety of the addressable address space. So, on a 32-bit system that isn't playing any high-memory tricks, that's 0 to 2^32 - 1, or 4 GB of address space.

But your system may not have 4 GB. So the operating system has a data structure called a page table that has the virtual to physical mapping for each process. The processor accesses this table (it caches it in something called a TLB) so that it can convert the virtual address to the physical address.
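
A minimal sketch of that lookup, assuming a 4 KiB page size and a made-up three-entry page table. The names `page_table`, `tlb`, and `translate` are hypothetical, and real hardware does this in silicon rather than with a dictionary; the point is only that a cached translation skips the page-table walk.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (4 KiB is common, not universal)

# Hypothetical per-process page table: virtual page number -> physical frame number.
page_table = {0: 7, 1: 3, 2: 9}

tlb = {}        # toy cache of recently used translations
tlb_hits = 0
tlb_misses = 0


def translate(virtual_address):
    """Convert a virtual address to a physical one, checking the TLB first."""
    global tlb_hits, tlb_misses
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page in tlb:
        tlb_hits += 1
    else:
        tlb_misses += 1
        # Walk the page table; a KeyError here stands in for a page fault.
        tlb[page] = page_table[page]
    return tlb[page] * PAGE_SIZE + offset


first = translate(4100)   # virtual page 1, offset 4: TLB miss
second = translate(4200)  # same page again: TLB hit
```

The second call lands on the same virtual page as the first, so it is served from the cache without touching `page_table`.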

Here's an example using small numbers. Your program has a pointer to data. That pointer may have value 800. Let's assume that physical addresses on your system only run from 0 to 400. So the processor has to convert the value 800 to a value between 0 and 400. It's the operating system's job to maintain that valid mapping.
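
The same small numbers as a toy calculation. The page size of 100 and the choice of physical frame 2 are assumptions made purely for illustration; nothing above fixes them.

```python
PAGE_SIZE = 100         # assumed toy page size
PHYSICAL_LIMIT = 400    # physical addresses in this toy stay below 400

# Hypothetical mapping: virtual page 8 currently lives in physical frame 2.
toy_page_table = {8: 2}


def to_physical(pointer):
    """Map a virtual pointer value to a physical address in the toy machine."""
    page, offset = divmod(pointer, PAGE_SIZE)
    frame = toy_page_table[page]
    physical = frame * PAGE_SIZE + offset
    assert 0 <= physical < PHYSICAL_LIMIT
    return physical


start_of_page = to_physical(800)  # page 8, offset 0
inside_page = to_physical(837)    # page 8, offset 37
```

The program only ever sees 800 and 837; the values 200 and 237 exist only on the physical side of the mapping.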

Why does this matter, and why is it so tied up with paging to and from disk? Let's say the OS pages out the page containing that data. Then, later, it's paged back in, but in a different physical location in memory. Your program still has the pointer value 800, and it still works correctly because the operating system keeps track of where in physical memory 800, for your process, maps to.
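
A rough sketch of that page-out and page-in sequence. `ToyProcessMemory` and its methods are hypothetical, the page size of 100 is assumed, and a real OS would handle the page fault itself instead of raising an error.

```python
PAGE_SIZE = 100  # assumed toy page size


class ToyProcessMemory:
    """Hypothetical model of one process's page table plus swap space."""

    def __init__(self):
        self.page_table = {}  # virtual page -> physical frame (resident pages only)
        self.frames = {}      # physical frame -> page contents
        self.swap = {}        # virtual page -> page contents (paged out)

    def map_page(self, page, frame, contents):
        self.page_table[page] = frame
        self.frames[frame] = contents

    def page_out(self, page):
        frame = self.page_table.pop(page)
        self.swap[page] = self.frames.pop(frame)

    def page_in(self, page, frame):
        self.map_page(page, frame, self.swap.pop(page))

    def read(self, pointer):
        page, offset = divmod(pointer, PAGE_SIZE)
        if page not in self.page_table:
            raise LookupError("page fault")  # a real OS would page in here
        return self.frames[self.page_table[page]][offset]


mem = ToyProcessMemory()
mem.map_page(8, frame=2, contents=["hello"] + [None] * 99)
before = mem.read(800)

mem.page_out(8)
mem.page_in(8, frame=0)  # comes back at a different physical location
after = mem.read(800)
```

The pointer value 800 never changes; only the entry in `page_table` does, which is why the read returns the same data before and after.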

People in the Windows world often say "virtual memory" when they mean "swap space" because Windows would call the amount of swap space "virtual memory size." But virtual memory is the technique described above. Read the Wikipedia entry from above, or read an operating systems textbook for a full discussion of it.

2 comments

That's not entirely correct. The MMU generally handles virtual to physical memory address translation, and the OS is only ever involved if there is a page fault. Outside of OS architecture and very specific, deliberate applications, virtual/physical memory is completely transparent. When I hear "virtual memory" I assume it refers to swap space unless otherwise noted, because the technical meaning has such a specific domain.

That's why I noted that the CPU caches the mappings in the TLB. On modern processors, the MMU is integrated with the rest of the processor, so I didn't see the need to introduce another TLA. It's a part of the processor just as much as, say, the floating point unit is. The whole point of my discussion with small pointer values was to demonstrate that the virtual to physical mapping is transparent.

When I hear "virtual memory," I think of the computer science meaning. However, I am a researcher in high performance computing systems.

Awesome explanation.

Still, an SSD makes this all better. When I was using virtual memory on an Ubuntu server and put the swap partition on an SSD, everything worked great. On a rotating platter, not so much.

Oh, no question, SSDs are an improvement. I'm just clarifying the systems concepts involved. SSDs everywhere may change some assumptions in the operating system. For quite a while now, we've considered paging to "disk" as performance death, and have gone through contortions to make good decisions about which pages should be paged out. If paging to "disk" everywhere gets several orders of magnitude cheaper, we may want to do less up-front processing trying to determine good victim pages.
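
To give a flavor of what "determining good victim pages" can look like, here is a sketch of the textbook second-chance (clock) policy. This is one classic approach chosen for illustration, not a claim about what any particular operating system does; the frame layout and the `choose_victim` name are hypothetical.

```python
def choose_victim(frames, hand):
    """Second-chance (clock) victim selection.

    frames: list of dicts with 'page' and 'referenced' keys (hypothetical layout).
    hand: index at which the scan starts.
    Returns (victim_index, new_hand).
    """
    while True:
        frame = frames[hand]
        if frame["referenced"]:
            # Recently used: clear the bit and give the page a second chance.
            frame["referenced"] = False
            hand = (hand + 1) % len(frames)
        else:
            # Not used since the last sweep: evict this one.
            return hand, (hand + 1) % len(frames)


frames = [
    {"page": 8, "referenced": True},
    {"page": 3, "referenced": False},
    {"page": 5, "referenced": True},
]
victim, hand = choose_victim(frames, hand=0)
```

Even this simple policy needs a reference bit per frame and a sweep on every eviction, which is the kind of bookkeeping that may be worth less if a bad choice stops being so expensive.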