| > You’re hand waving away way too much complexity. Please do build this system. Keep in mind that addressing 63 bits of memory with huge tables on will use up > 2 Tera worth of PTEs which translate to what, 16 Terabit worth of memory? This is simply an order of magnitude more than dedicated machines ship with. You’re certainly not getting an FPGA with that.

The page table is itself stored in virtual memory, is a tree structure and it can be fully sparse, i.e. you only need to populate the PTEs that you use, basically.

Keep in mind, as long as you enable memory overcommit or use MAP_NORESERVE in mmap, you can allocate 127 TiB (~2^47 bytes) worth of virtual address space on Linux x86-64, today, at zero cost. With 4K pages! In fact, I just did that on my laptop and memory usage has remained exactly as it was before. And on POWER10 you can map 2^51 bytes of virtual address space today, also at zero cost.

> I think you’re failing to appreciate how large 2^63 bytes is.

No, I do appreciate it. It's 65536 times larger than the maximum address space you can allocate today on Linux x86-64 at zero cost. With 4K pages. Or a factor of 2048 larger than POWER10 can do today, also at zero cost.

In fact, with 1G HugePages, the maximum theoretical number of PTEs needed for 2^64 bytes of address space would be LESS than the number of PTEs needed for the 2^47 bytes you can allocate today, on Linux x86-64, with 4K pages, at zero cost (which I just did, on my laptop).

The maximum amount of virtual address space you can allocate is only limited by how many bits the CPU and MMU are designed to address.

If you don't believe me, you can try it yourself:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    // 127 TiB
    size_t size = 127ULL * 1024 * 1024 * 1024 * 1024;
    printf("allocating %zu bytes of virtual address space...\n", size);
    // Only address space is requested here; no page is ever touched.
    void *p = malloc(size);
    if (p == NULL) {
        perror("malloc");
        exit(1);
    }
    printf("success: %p\n", p);
    // Keep the process alive so its memory usage can be inspected.
    sleep(3600);
}
Be sure to run 'echo 1 > /proc/sys/vm/overcommit_memory' as root, then compile and run the program:
$ gcc -o alloc alloc.c -Wall
$ ./alloc
allocating 139637976727552 bytes of virtual address space...
success: 0xf29bb2a010
Then observe how memory usage on your system hasn't changed. |
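For anyone who wants to try the MAP_NORESERVE route mentioned in the quoted comment instead of changing the overcommit setting, here is a minimal sketch. The helper names `reserve_address_space` and `release_address_space` are made up for illustration; the only real calls are `mmap` and `munmap`. How large a request succeeds depends on the kernel, the overcommit mode, and any address-space limits on the process, so treat the size you pass as something to experiment with.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: reserve `bytes` of virtual address space without
 * asking the kernel to set aside RAM or swap for it up front.
 * Returns NULL if the mapping could not be created. */
void *reserve_address_space(size_t bytes) {
    assert(bytes > 0); /* mmap rejects zero-length mappings */
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: hand the reservation back. Returns 0 on success. */
int release_address_space(void *p, size_t bytes) {
    return munmap(p, bytes);
}
```

Physical pages and the page-table entries that describe them only come into existence when a page inside the returned range is first touched.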
Just to be clear, even if a PTE was just 1 pointer long (it's not), covering 63 bits of address space with 1 GiB pages would require at least 64 GiB (2^33 entries at 8 bytes each) just for the page tables. And those page tables ARE getting materialized if you're doing a binary search over that much data.
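The arithmetic behind these figures is easy to check. The sketch below assumes 8-byte entries (as on x86-64) and counts only the leaf level of a fully populated table, ignoring the much smaller upper levels; `leaf_pte_count` and `leaf_table_bytes` are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one page-table entry is 8 bytes, as on x86-64. */
enum { PTE_BYTES = 8 };

/* Leaf entries needed to map 2^addr_bits bytes completely
 * using pages of 2^page_shift bytes. */
uint64_t leaf_pte_count(unsigned addr_bits, unsigned page_shift) {
    /* Keep the shift in range so the byte count below cannot overflow. */
    assert(addr_bits > page_shift && addr_bits - page_shift <= 60);
    return UINT64_C(1) << (addr_bits - page_shift);
}

/* Bytes occupied by those leaf entries alone. */
uint64_t leaf_table_bytes(unsigned addr_bits, unsigned page_shift) {
    return leaf_pte_count(addr_bits, page_shift) * PTE_BYTES;
}
```

Under these assumptions, `leaf_table_bytes(63, 30)` comes out to 2^36 bytes, i.e. 64 GiB, which is the same order of magnitude as the figure above. It also confirms the comparison in the quoted comment: `leaf_pte_count(64, 30)` is 2^34, which is smaller than `leaf_pte_count(47, 12)` at 2^35. Both numbers describe a fully populated table, not a sparse one.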
I'm not as imaginative as you to see a world in which you can sparsely map in 2^63 elements (9 exabytes at 1 byte per element) on one CPU, where the problem you're solving is a binary search through that data, which is going to cause about log(n) pages to be mapped in to satisfy the search. 1 exabyte is probably the amount of RAM that Google has collectively worldwide. Now sure, maybe you're talking about mapping files on disk, but again, 1 exabyte is a shit ton. It's probably several clusters' worth of machines for storage. And even with 1 GiB pages, you're talking about 1 billion PTEs total, and each lookup is going to need to materialize ~9 PTEs to search. And all of that is again a moot point because no CPU like that exists or will exist any time soon.
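To put a rough number on "about log(n)", here is a small simulation that counts how many distinct pages a single binary search probes. It is only a sketch: it treats the data as a flat array of 1-byte elements starting at offset 0, it does not model the upper page-table levels, and `distinct_pages_touched` is a name invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: run a binary search for byte offset `target` over
 * [0, 2^addr_bits) and count the distinct pages (2^page_shift bytes each)
 * that the probes land on. */
unsigned distinct_pages_touched(unsigned addr_bits, unsigned page_shift,
                                uint64_t target) {
    assert(addr_bits >= 1 && addr_bits <= 63 && page_shift < addr_bits);
    uint64_t seen[64]; /* a search over 2^63 slots makes at most 64 probes */
    unsigned n_seen = 0;
    uint64_t lo = 0, hi = UINT64_C(1) << addr_bits; /* half-open [lo, hi) */
    assert(target < hi);

    while (lo < hi) {
        uint64_t mid = lo + (hi - lo) / 2;
        uint64_t page = mid >> page_shift;

        /* Record the page if this probe is the first to land on it. */
        unsigned i = 0;
        while (i < n_seen && seen[i] != page) i++;
        if (i == n_seen) seen[n_seen++] = page;

        if (target < mid) hi = mid;
        else if (target > mid) lo = mid + 1;
        else break;
    }
    return n_seen;
}
```

For a target at offset 0, `distinct_pages_touched(63, 30, 0)` returns 34 and `distinct_pages_touched(60, 30, 0)` returns 31, so the count tracks the address width minus the page shift. Once the search interval shrinks below one page, the remaining probes stay on pages that were already touched.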