Hacker News new | ask | show | jobs
by judofyr 1182 days ago
> As a personal challenge, I strived to explicitly limit the amount of memory needed for solving each AoC problem to something that fits on the stack (typically a few MBs at most).

If the purpose is to "use limited amount memory" I would suggest to use a GeneralPurposeAllocator and setting "enable_memory_limit" and "requested_memory_limit": https://github.com/ziglang/zig/blob/8f481dfc3c4f12327499485e.... If the purpose is to "only use the stack", then "allocating a huge chunk and using it with a bump allocator" feels a bit like cheating to be honest...

Another potential challenge is to pre-allocate instead: Have an _initialize_ phase which is allowed to allocate memory and then an _execution_ phase where you're using the allocated memory. This pattern is very common in high-performance programs.

1 comments

Thanks for the pointers!

> use a GeneralPurposeAllocator and setting "enable_memory_limit" and "requested_memory_limit"

Interesting! I hadn't looked at GeneralPurposeAllocator too closely, but yes these seem like the right way to do things instead of abusing FixedBufferAllocator as I did.

> If the purpose is to "only use the stack"...

Not really, I just had to decide on some arbitrary upper bound on the mem usage, and the default stack size (8MiB) seemed like a decent choice. In retrospect, this challenge only took shape because my solution to Day1 used a FixedBufferAllocator backed by a buffer on the stack, and I realized how easy Zig made it to track allocs. I didn't fiddle too much with the general structure of the solution after that, and made it a "challenge" to see how far I could take it.

> Another potential challenge is to pre-allocate instead

Ah, that sounds much more difficult. This is also what TigerBeetle is doing [1]. But one thing I didn't understand even from that post, how would one deal with data structures that really depend on the input data, like the hashsets in TFA? Simplest way I can think of is to have an arbitrary upperbound on the allocated memory and then keep checking before every operation on any dynamic structure. That sounds tedious. Is there a better way?

[1]: https://tigerbeetle.com/blog/a-database-without-dynamic-memo...

I think you might be able to abuse an ArenaAllocator wrapping a FixedBufferAllocator. I haven't tested it, but IIRC, Zig's ArenaAllocator deallocates in reverse order once it's reset (it uses a singly linked list to keep track of allocations, so that's the natural way to do deallocation), so it might play nicely with the underlying FixedBufferAllocator's requirements.

If this is correct, it's likely an undocumented implementation detail, so probably not something you should rely on always being the case.

>Is there a better way?

Just let the kernel handle it. The virtual memory and mapped memory abstraction the kernel has makes your program's implementation simpler.

Zig is a low-level language. You might not have an MMU. You might not have a kernel. You might be the kernel.
Maybe if we went back a few decades. But in 2023 having access to a MMU and a kernel is normal.
No it is not for many applications. The entire embedded software engineering field is an example.
>The entire embedded software engineering field is an example.

This is simply false. Most devices are moving in this direction. Practically all phones people use are using a kernel that supports virtual memory. TVs now come with Linux too. All sorts of random devices use Linux now that powerful and cheap chips exist.

TigerBeetle writes to disk for long-term storage. Data over time is the part you can't fit into memory (eventually). :)
> TigerBeetle writes to disk for long-term storage

But how does it determine when it should write to disk? Does every write to a potentially OOM operation get preceeded by a check? Take the case of a HashAggregate. The DB clearly cannot know at compile time how many unique keys will be present in the hashtable; it needs to resize at runtime. So does that mean all the hashtables are still using some form of Bump/Arena allocators backed by the pre-allocated memory?

Maybe I should just read the source code :)

> But how does it determine when it should write to disk?

You write fixed sized number of key-value pairs to the file at a time. This is how LSM trees work, you chunk your data up into N sorted keys per chunk. I don't myself understand all the specifics but this is the gist.

> Does every write to a potentially OOM operation get preceeded by a check?

If you allocate memory upfront and don't allocate any more memory, you can't OOM after the initial allocation. That's what TigerBeetle does.

Zig has some nice standard library containers for adding items while asserting that there's capacity. If we miscalculate, it is caught during tests because assertions fail.