Hacker News new | ask | show | jobs
by giovannibonetti 486 days ago
Storage can be further reduced if we think that, with a 64-bit processor, probably a 32-bit address space is enough for most applications (that require less than 4 GB of RAM).

Maybe we can go even deeper with 16-bit near/relative pointers. Perhaps data-oriented design fits well in this situation? With blocks of 64k elements and uint16 indices to address elements inside of them.

5 comments

> with a 64-bit processor, probably a 32-bit address space is enough for most applications

This is related to the Java virtual machine's use of 32-bit "compressed ordinary object pointers (OOPs)" on a 64-bit platform. The pointer is 8-byte-aligned though, so it can address 32 GiB of memory. There is also a non-compressed-OOPs mode that can address more than 32 GiB.

I see that you're being silly, but the problem is conflating pointers and indices. 16 bit indices are fine. 16 bit pointers are terrible.

Separately, 32 bit pointers are a good optimization in many situations. Java will save lots of space by using 32 bit pointers until you go above 32GB or 64GB or more (depending on what you set as minimum object alignment).

16 bit relative pointers are about as useful/terrible as 16 bit indices.
A relative pointer with limited range is something that should rarely exist, and all too easily it can end up with a base that is no longer guaranteed to be adjacent. There's a lot of bonus landmines compared to just having a 64k element limit.
Something like Rust's borrow checker can make sure that your references / pointers only live as long as they should.
By "landmine" I mean situations where it gets very unpleasant to use because the data isn't guaranteed to be close enough to fit into one of those pointers. Even if it's true when the code is first written, minor updates might change that. I don't think something like a borrow checker can do much to help with that.
Yes, you lose flexibility. There's a price to pay for the increased performance.
So essentially, you are using indices rather than pointers.

Indices have advantages: they can be more compact, and they are position-independent. But you need a base pointer, and a bit more computation. It also requires you to have your own allocator, as your typical heap allocator will just give you an arbitrary pointer to your object.

Re: data-oriented data structures... random (not XOR)...

I have been running multiple iterations of entire ipv4 space port scan. As of now only scanning top 25 used ports (nmap, masscan have toplists). I wanted to be able to do simple data analysis _very_ efficiently time-wise, being OK to sacrifice memory for that.

If you want (time-wise) O(1) item insert (into SORTED btw), fetch, lookup, and port status check to simply be a matter of bitshifting (similarly with counting), then:

1. Bitfield array to store status of up to 32 ports (1 bit for state (open/closed) => 32 bit bitfield)

2. ...that's it. Each scan result is to be found at `bitfields[(unsigned int) ipv4_address]`

In C:

``` // bitfield for port status, for each IP

  struct port_field {
    bool p1:1;
    bool p2:1;
    bool p3:1;
    bool p4:1;
    bool p5:1;
    // in C, gotta write it out - of course we could use a macro to generate this
    ...
    bool p32:1;
  };
```

This will use 16 GiB of memory for the whole mapped space:

```

  #define NUM_IPV4 (unsigned long) 4294967296L
  // ...
  // sizeof(struct port_field) => 4 bytes
  struct port_field *ip_space = calloc(NUM_IPV4, sizeof(struct 
port_field)); ```

When scanning (or reading in scan results):

```

  in_addr_t u32_ip; // unsigned 32 bit int
  struct port_field *p_target_bitfield;
  int *p_target_int;

  // ... to insert:
  if (!(u32_ip = inet_addr(token))) { // `token` is string with actual ip (e.g. from text file)
    printf("line %lu: IPv4 address not valid: %s\n", line_count, s_ip);
  } else {
    p_target_bitfield = &(ip_space[u32_ip]); // unsigned int ipv4 as 'index'
    p_target_int = (int *) ((void *) p_target_bitfield); // cast bitfield* to int\*

    // set bit at port_index:
    *p_target_int = ((1 << port_index) | *p_target_int);
    // now, port identified by `port_index` is at `(1 << port_index) | *p_target_int)`
    // where `p_target_int` is pointer to port status bitfield cast into signed int32
```

It works - pretty nifty :) i'm sure i could make it much more pretty tho.

But a kind of 'columnar-bitfieldish' in-memory O(1) for everything:)*

Wouldn't a columnar store be 25 512MB arrays? And that's probably a better layout for doing analysis with.

Also you might as well set the number of addresses to 224<<24.

> Wouldn't a columnar store be 25 512MB arrays?

Hm I guess you're right - I'm misusing the term. And yeah! Will experiment with it; what's neat is that it's not that much code to mess around with these basic notions...

Re: 224 << 24 - you're right; so many unusable actual addresses. It's just kind of neat to actually map out whole ipv4 space to memory. But yes lots of it unneeded, I'll see if I can add minimum-computation-possible mapping translation so that everything still stays ~kind of O(1).

Thank you for your comments! :)

edit P.S. 25 x 512MiB arrays - actually thank you, I thought of doing sth like that at first, but now forget why didn't start experimenting with that sort-of-actual-columnar-store from the beginning.. anyway, nice to quickly mess around with multiple base data layouts (I'll try that one next I think), would recommend anyone wanting to attain base knowledge on e.g. data layouts for data analysis...

So an 8086 with 64 address pins? Or did I miss the /s?