
by vlovich123 1325 days ago

The system you describe simply doesn’t exist, standards or no. A 64-bit kernel can’t hand out 64-bits worth of addresses because no CPU built today supports it.

A 48-bit index to an array can represent >240TBytes of RAM minimum - if your records are > 1 byte, you have significantly higher storage requirements. The largest system I could find that’s ever been built was a prototype that has ~160TiB of RAM [1]. Also remember: to make the algorithm incorrect, the sum of two numbers has to exceed 64 bits - that means you’d need >63 bits of byte-addressable space. That simply isn’t happening.
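For readers who want the overflow spelled out, here is a minimal sketch in C. It uses unsigned 64-bit indices so that the wraparound is well defined; the function names `mid_naive` and `mid_safe` are illustrative and do not come from the Go code under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Naive midpoint: lo + hi wraps around once the sum exceeds UINT64_MAX,
 * so the result can land outside [lo, hi]. */
uint64_t mid_naive(uint64_t lo, uint64_t hi) {
    return (lo + hi) / 2;
}

/* Overflow-safe midpoint: hi - lo cannot overflow when lo <= hi,
 * and lo + (hi - lo) / 2 never exceeds hi. */
uint64_t mid_safe(uint64_t lo, uint64_t hi) {
    assert(lo <= hi);
    return lo + (hi - lo) / 2;
}
```

The two functions agree until `lo + hi` needs a 65th bit, which requires both indices to sit near or above 2^63.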

Now of course you might be searching through offline storage. 2^63 bytes is ~9 exabytes for an array where each element is 1 byte. Note that now we’re talking scales of about the aggregate total storage capacity of a public hyperscaled cloud. Your binary search simply won’t even finish.

So sure. You’re technically right except you’d never find the bug on any system that your algorithm would ever run on for the foreseeable future, so does it even matter?

As an aside, at the point where you’re talking about 48-bits worth of addressable bytes you’re searching, you’re choosing a different algorithm because a single lookup is going to take on the order of hours to complete. 63-bits is going to take ~27 years iff you can sustain 20gib/s for comparing the keys (sure binary search is logarithmic but then you’re not going to be hitting 20gib/s). Remember - data doesn’t come presorted either so simply getting all that data into a linearly sorted data structure is similarly impractical.

1 comments

wizeman 1325 days ago

> The system you describe simply doesn’t exist, standards or no. A 64-bit kernel can’t hand out 64-bits worth of addresses because no CPU built today supports it.

"Today" being the important part. That could change tomorrow. I could implement a 64-bit CPU right now that would support it (on an FPGA). It's not an inherent limitation, it's just an optimization that current CPUs do because we don't need to use the full 64-bit address space, usually.

Also, address space doesn't necessarily correspond 1-to-1 with how much memory there is.

For example, according to the AddressSanitizer whitepaper, it dedicates 1/8th of the virtual address space to its shadow memory. It doesn't mean that you need to have 2 exabytes of addressable storage to use AddressSanitizer, or that it reads or writes to all that space.
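As a rough illustration of why the shadow region is 1/8th of the address space: AddressSanitizer maps every 8 bytes of application memory to 1 shadow byte with a shift and an offset. The sketch below assumes the offset commonly used on x86-64 Linux (`0x7fff8000`); that constant and the function name `shadow_address` are assumptions for illustration, and other platforms use different values.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shadow offset for x86-64 Linux; it differs on other platforms. */
#define SHADOW_OFFSET 0x7fff8000ULL
/* 8 application bytes map to 1 shadow byte, i.e. a shift by 3. */
#define SHADOW_SCALE 3

/* A 47-bit user address space therefore needs 2^44 bytes of shadow. */
static_assert(((1ULL << 47) >> SHADOW_SCALE) == (1ULL << 44),
              "shadow region is 1/8 of the address space");

/* Computes where the shadow byte for an application address lives. */
uint64_t shadow_address(uint64_t app_addr) {
    return (app_addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}
```

Only the shadow pages that correspond to memory the program actually touches ever get backed by physical RAM.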

As I said, memory overcommit and memory compression (and also page mapping in general, as well as memory mapping storage and storage compression and storage virtualization, etc) allow you to address significantly more memory (almost infinitely more) than what you actually have.

There are other tricks with memory, page mapping and pointers which could break your code if it's not standards-compliant. This could happen for security reasons or because of new compiler or kernel optimizations or new features.

So I agree that this isn't a problem right now, unless you're doing something very esoteric, but if you want to have standards-compliant code and be more future-proof then you cannot rely on that.

There is also the point that the Go code that we're discussing has nothing to do with arrays, memory or address spaces, because it's a generic binary search function that works for any function "f" passed as an argument.

For example, it can be used to do a binary search for finding the zero of a mathematical function (i.e. for finding which value of `x` results in `y` becoming zero in the equation `y=f(x)`) and this has nothing to do with address spaces.
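A minimal sketch of that kind of predicate-driven search, written in C rather than Go. The names `search`, `predicate` and `square_reaches` are hypothetical, and the predicate is assumed to be monotonic (false for small inputs, true from some point on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*predicate)(uint64_t i, void *ctx);

/* Returns the smallest i in [0, n) for which pred(i) is true,
 * or n if pred is false everywhere. No array is involved. */
uint64_t search(uint64_t n, predicate pred, void *ctx) {
    assert(pred != NULL);
    uint64_t lo = 0, hi = n;
    while (lo < hi) {
        uint64_t mid = lo + (hi - lo) / 2; /* overflow-safe midpoint */
        if (!pred(mid, ctx))
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Example predicate: true once f(x) = x*x - target is no longer negative.
 * The caller must keep n small enough that x*x does not overflow. */
int square_reaches(uint64_t x, void *ctx) {
    uint64_t target = *(uint64_t *)ctx;
    return x * x >= target;
}
```

Calling `search(1000, square_reaches, &target)` with `target` set to 49 finds the point where `x*x - 49` crosses zero. Here the index range is just the domain of the function, so its size is not tied to any amount of memory.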


vlovich123 1325 days ago

> I could implement a 64-bit CPU right now that would support it (on an FPGA). It's not an inherent limitation, it's just an optimization that current CPUs do because we don't need to use the full 64-bit address space, usually.

You’re hand waving away way too much complexity. Please do build this system. Keep in mind that addressing 63bits of memory with huge pages on will use up > 2 Tera worth of PTEs which translate to what, 16 Terabit worth of memory? This is simply an order of magnitude more than dedicated machines ship with. You’re certainly not getting an FPGA with that.

> For example, according to the AddressSanitizer whitepaper, it dedicates 1/8th of the virtual address space to its shadow memory. It doesn't mean that you need to have 2 exabytes of addressable storage to use AddressSanitizer, or that it reads or writes to all that space.

I think you’re failing to appreciate how large 2^63 bytes is.

> As I said, memory overcommit and memory compression (and also page mapping in general, as well as memory mapping storage and storage compression and storage virtualization, etc) allow you to address significantly more memory (almost infinitely more) than what you actually have.

See point above. Such a system is just not likely to exist in your lifetime.

> but if you want to have standards-compliant code and be more future-proof then you cannot rely on that.

All code has a shelf life. What’s the date you’re working on here? I’m willing to bet it’s not an issue by the end of this century.


wizeman 1325 days ago

> You’re hand waving away way too much complexity. Please do build this system. Keep in mind that addressing 63bits of memory with huge pages on will use up > 2 Tera worth of PTEs which translate to what, 16 Terabit worth of memory? This is simply an order of magnitude more than dedicated machines ship with. You’re certainly not getting an FPGA with that.

The page table is itself stored in virtual memory, is a tree structure and it can be fully sparse, i.e. you only need to populate the PTEs that you use, basically.
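To make the tree structure concrete, here is a sketch of how a 48-bit virtual address is split into table indices under x86-64 4-level paging with 4K pages. The struct and function names are hypothetical. Each level uses 9 bits to index a 512-entry table, and a table only has to exist if some address below it is actually mapped.

```c
#include <assert.h>
#include <stdint.h>

/* Four 9-bit indices plus a 12-bit page offset cover 48 bits. */
static_assert(4 * 9 + 12 == 48, "4-level paging covers 48 address bits");

typedef struct {
    unsigned pml4;   /* bits 47..39: index into the top-level table */
    unsigned pdpt;   /* bits 38..30 */
    unsigned pd;     /* bits 29..21 */
    unsigned pt;     /* bits 20..12: index of the PTE itself */
    unsigned offset; /* bits 11..0: byte offset inside the 4K page */
} va_parts;

/* Splits a virtual address into its per-level table indices. */
va_parts split_va(uint64_t va) {
    va_parts p;
    p.offset = (unsigned)(va & 0xfff);
    p.pt     = (unsigned)((va >> 12) & 0x1ff);
    p.pd     = (unsigned)((va >> 21) & 0x1ff);
    p.pdpt   = (unsigned)((va >> 30) & 0x1ff);
    p.pml4   = (unsigned)((va >> 39) & 0x1ff);
    return p;
}
```

Touching a single page in an otherwise empty address space needs at most one table per level, so the page-table cost follows the pages in use rather than the size of the reservation.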

Keep in mind, as long as you enable memory overcommit or use MAP_NORESERVE in mmap, you can allocate 127 TiB (~2^47 bytes) worth of virtual address space on Linux x86-64, today, at zero cost. With 4K pages!

In fact, I just did that on my laptop and memory usage has remained exactly as it was before.
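For the `MAP_NORESERVE` route, a minimal sketch for Linux might look like the following. The helper names `reserve_address_space` and `release_address_space` are hypothetical, and whether a very large request succeeds depends on the kernel's overcommit settings and any per-process address-space limits.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserves `size` bytes of virtual address space without asking the
 * kernel to set aside swap or RAM for it. Returns NULL on failure. */
void *reserve_address_space(size_t size) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Releases a reservation made above. Returns 0 on success. */
int release_address_space(void *p, size_t size) {
    assert(p != NULL);
    return munmap(p, size);
}
```

Physical pages are only allocated when a page in the mapping is first written, so a reservation that is never touched costs almost nothing.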

And on POWER10 you can map 2^51 bytes of virtual address space today, also at zero cost.

> I think you’re failing to appreciate how large 2^63 bytes is.

No, I do appreciate it. It's 65536 times larger than the maximum address space you can allocate today on Linux x86-64 at zero cost. With 4K pages. Or a factor of 4096 larger than POWER10 can do today, also at zero cost.

In fact, with 1G HugePages, the maximum theoretical number of PTEs needed for 2^64 bytes of address space would be LESS than the number of PTEs needed for the 2^47 bytes you can allocate today, on Linux x86-64, with 4K pages, at zero cost (which I just did, on my laptop).
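The arithmetic behind that comparison, as a small sketch (the function name `leaf_entries_log2` is made up for illustration): the worst-case number of leaf entries is the address-space size divided by the page size, so in log2 terms it is a subtraction.

```c
#include <assert.h>

/* log2 of the number of leaf page-table entries needed to map
 * 2^addr_bits bytes using pages of 2^page_bits bytes, assuming
 * every page in the range is populated (the worst case). */
unsigned leaf_entries_log2(unsigned addr_bits, unsigned page_bits) {
    assert(addr_bits >= page_bits);
    return addr_bits - page_bits;
}
```

With 1 GiB pages (30 bits) a full 2^64 space needs 2^34 entries, while 2^47 bytes with 4K pages (12 bits) needs 2^35, which is twice as many.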

The maximum amount of virtual address space you can allocate is only limited by how many bits the CPU and MMU are designed to address.

If you don't believe me, you can try it yourself:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main() {
        // 127 TiB
        size_t size = 127ULL * 1024 * 1024 * 1024 * 1024;

        printf("allocating %zu bytes of virtual address space...\n", size);

        void *p = malloc(size);

        if (p == NULL) {
            perror("malloc");
            exit(1);
        }

        printf("success: %p\n", p);

        sleep(3600);
    }

Be sure to do 'echo 1 > /proc/sys/vm/overcommit_memory' as root and then run the program:

  $ gcc -o alloc alloc.c -Wall
  $ ./alloc
  allocating 139637976727552 bytes of virtual address space...
  success: 0xf29bb2a010

Then observe how memory usage on your system hasn't changed.


vlovich123 1316 days ago

Yes you can allocate that sparsely. So? If you're doing a binary search, you have to touch those pages so the sparseness is pretty irrelevant. Try doing a binary search over a memory space like that and see where you get.

Just to be clear, even if a PTE was just 1 pointer long (it's not), covering 63 bits of address space with 1 GiB pages would require >64 GiB (2^33 entries at 8 bytes each) just for the page tables. And those page tables ARE getting materialized if you're doing a binary search over that much data.

I'm not as imaginative as you to see a world in which you can sparsely map in 2^63 elements (9 exabytes if 1 byte per element) on one CPU and then the problem you're solving is a binary search through that data, which is going to cause about log(n) pages to be mapped in to satisfy the search. 1 exabyte is probably the amount of RAM that Google has collectively worldwide. Now sure, maybe you're talking about mapping files on disk, but again, 1 exabyte is a shit ton. It's probably several clusters worth of machines for storage. And even with 1 GiB pages, you're talking about 1 billion PTEs total and each lookup is going to need to materialize ~9 PTEs to search. And all of that is again a moot point because no CPU like that exists or will exist any time soon.
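One way to put numbers on how many pages a single lookup touches is to simulate the probe sequence without allocating anything. The sketch below is a hypothetical model, not a measurement: it searches the index range [0, n) for a target index and counts probes and the distinct pages those probes land on, assuming 1-byte elements.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned probes; /* number of midpoints examined */
    unsigned pages;  /* number of distinct pages those midpoints fall in */
} search_cost;

/* Simulates a lower-bound binary search for `target` over indices
 * [0, n), with pages of 2^page_bits bytes and 1-byte elements. */
search_cost binary_search_cost(uint64_t n, uint64_t target, unsigned page_bits) {
    assert(page_bits < 64);
    uint64_t seen[64]; /* at most 64 probes for a 64-bit index range */
    search_cost cost = {0, 0};
    uint64_t lo = 0, hi = n;
    while (lo < hi) {
        uint64_t mid = lo + (hi - lo) / 2;
        uint64_t page = mid >> page_bits;
        int already_seen = 0;
        for (unsigned i = 0; i < cost.pages; i++) {
            if (seen[i] == page) {
                already_seen = 1;
                break;
            }
        }
        if (!already_seen)
            seen[cost.pages++] = page;
        cost.probes++;
        if (mid < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return cost;
}
```

Under this model the number of distinct pages per lookup is bounded by the number of probes, which grows with log2(n). The total footprint across many lookups with different keys is a separate question that this sketch does not answer.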
