Re: data-oriented data structures... random (not XOR)...

I have been running multiple iterations of a port scan over the entire IPv4 space. As of now I'm only scanning the top 25 most-used ports (nmap and masscan have top lists). I wanted to be able to do simple data analysis _very_ efficiently time-wise, and I'm OK with sacrificing memory for that.

If you want (time-wise) O(1) item insert (into SORTED, btw), fetch, and lookup, with the port status check simply being a matter of bitshifting (similarly with counting), then:

1. A bitfield array to store the status of up to 32 ports (1 bit for state (open/closed) => 32-bit bitfield)
2. ...that's it.

Each scan result is to be found at
`bitfields[(unsigned int) ipv4_address]`

In C:

```
#include <stdbool.h>

// bitfield for port status, for each IP
struct port_field {
bool p1:1;
bool p2:1;
bool p3:1;
bool p4:1;
bool p5:1;
// in C, gotta write it out - of course we could use a macro to generate this
...
bool p32:1;
};
```

This will use 16 GiB of memory for the whole mapped space (2^32 addresses x 4 bytes each):

```
#define NUM_IPV4 4294967296ULL // 2^32; ULL so the value fits even where long is 32 bits
// ...
// sizeof(struct port_field) => 4 bytes
struct port_field *ip_space = calloc(NUM_IPV4, sizeof(struct port_field));
```

When scanning (or reading in scan results):

```
in_addr_t u32_ip; // unsigned 32-bit int
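struct port_field *p_target_bitfield;
uint32_t *p_target_int; // needs <stdint.h>; unsigned so that bit 31 is safe to shift into
// ... to insert:
// `token` is a string with the actual IP (e.g. from a text file).
// inet_addr() signals failure with INADDR_NONE, not 0 (so 255.255.255.255 is
// rejected too; inet_pton() avoids that ambiguity).
if ((u32_ip = inet_addr(token)) == INADDR_NONE) {
    printf("line %lu: IPv4 address not valid: %s\n", line_count, token);
} else {
    u32_ip = ntohl(u32_ip); // host byte order, so the index is the numeric address
    p_target_bitfield = &(ip_space[u32_ip]); // unsigned int ipv4 as 'index'
    // cast bitfield* to uint32_t* (relies on the compiler packing all 32 bits into one word)
    p_target_int = (uint32_t *) ((void *) p_target_bitfield);
    // set bit at port_index:
    *p_target_int |= (1u << port_index);
    // now, the port identified by `port_index` is open iff
    // `(1u << port_index) & *p_target_int` is nonzero,
    // where `p_target_int` is a pointer to the port status bitfield cast to uint32_t
}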
```

It works - pretty nifty :) I'm sure I could make it much prettier, though. But it's a kind of 'columnar-bitfieldish' in-memory structure with O(1) for everything :)
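
For what it's worth, the same idea can be sketched without the bitfield struct and the pointer cast, using one plain `uint32_t` per address. This is only a minimal sketch: the helper names (`ip_to_index`, `port_set_open`, `port_is_open`, `count_open`) are made up for illustration, and it assumes a POSIX system for `inet_pton` and `ntohl`. Note that counting over a range is a linear pass, with each per-address check being a shift and a mask.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers (not from the comment above): one uint32_t per
   address, bit i = state of the i-th port in the top-ports list. */

/* Parse dotted-quad text into a host-order table index.
   Returns false if the text is not a valid IPv4 address. */
static bool ip_to_index(const char *text, uint32_t *index_out)
{
    struct in_addr addr;
    if (inet_pton(AF_INET, text, &addr) != 1)
        return false;
    /* Host byte order, so indices follow the numeric address order. */
    *index_out = ntohl(addr.s_addr);
    return true;
}

/* Mark the port at port_index as open for the given address. */
static void port_set_open(uint32_t *table, uint32_t ip, unsigned port_index)
{
    assert(port_index < 32);
    table[ip] |= UINT32_C(1) << port_index;
}

/* Check a single port: one shift and one mask. */
static bool port_is_open(const uint32_t *table, uint32_t ip, unsigned port_index)
{
    assert(port_index < 32);
    return (table[ip] >> port_index) & 1u;
}

/* Count addresses in [first, last] that have the given port open.
   The 64-bit loop counter avoids wrap-around when last is UINT32_MAX. */
static uint64_t count_open(const uint32_t *table, uint32_t first,
                           uint32_t last, unsigned port_index)
{
    assert(port_index < 32);
    uint64_t n = 0;
    for (uint64_t ip = first; ip <= last; ip++)
        n += (table[ip] >> port_index) & 1u;
    return n;
}
```

With a `table` obtained from `calloc` as above, a scan line would go through `ip_to_index(token, &idx)` and then `port_set_open(table, idx, port_index)`; the helpers work the same on a small test array.
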
Also, you might as well set the number of addresses to `224UL << 24`, since everything from 224.0.0.0 upward is multicast or reserved space.
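
To put a number on that saving, here is a small sketch of the arithmetic, assuming 4 bytes per address as in the struct above. The function names `slots_below` and `table_bytes` are hypothetical, introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of table slots needed to cover every address below
   first_octet.0.0.0 (256 stands for the whole IPv4 space). */
static uint64_t slots_below(unsigned first_octet)
{
    assert(first_octet <= 256);
    /* Widen before shifting: 224 << 24 does not fit in a signed 32-bit int. */
    return (uint64_t)first_octet << 24;
}

/* Table size in bytes at 4 bytes (32 port bits) per address. */
static uint64_t table_bytes(uint64_t slots)
{
    return slots * sizeof(uint32_t);
}
```

Under these assumptions, `table_bytes(slots_below(224))` comes to 14 GiB versus 16 GiB for `slots_below(256)`, so stopping at 224.0.0.0 saves 2 GiB.
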