| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by exDM69 782 days ago

> Aw. Why not?

Because the global allocator API is defined as free(pointer address) and this used free(buffer handle).

It would require a reverse lookup structure from address to buffer handle, e.g. red-black tree. Maintaining it would no longer be O(1).

> the fact that the number of bits is constant doesn't make this O(1), it just makes it O(n) where n := 64

O(n) when n is constant is equal to O(1).

But lzcnt uses a fixed number of clock cycles so it's not really O(n) either.

4 comments

jsheard 782 days ago

GlobalAlloc is also required to be thread safe, and TLSF doesn't even attempt to handle that. I suppose you could get away with it on single-threaded platforms like microcontrollers or (vanilla) WebAssembly but it wouldn't generalize unless you wrapped it in a mutex, and then the performance would be terrible.

link

exDM69 782 days ago

Correct me if I'm wrong but can't you put a Rc<RefCell<OffsetAllocator>> in your std::alloc::Allocator implementation for interior mutability?

This make a non-thread safe allocator, with the (desired) side effect of making anything allocating from it (e.g. Vec) a !Send.

Or is there a requirement that Allocators must be Sync and Send?

link

LegionMammal978 782 days ago

This is about GlobalAlloc, the trait that actually exists in stable Rust. Only static variables can be marked as the #[global_allocator], and statics are always required to be Send + Sync. Of course, if you're really sure that only a single thread will ever exist, you can pinky-promise it by putting a couple unsafe impls on the allocator type. But then the language won't protect you if you do try to allocate or deallocate from multiple threads.

(More practically, you can just write a GlobalAlloc that forwards to an instance of a thread-local allocator. Though that runs the risk of thread-local deinitialization running in an unexpected order.)

link

couchand 782 days ago

Any uc worth programming on has interrupts!

link

TrueDuality 782 days ago

Even with interrupts there is almost always a single execution unit, and thus single-threaded. An interrupt is just a jump instruction to a predefined handler. The code that was running is entirely stopped until the system decides to resume from that point.

link

couchand 782 days ago

That doesn't mean that naive single-threaded code is necessarily interrupt-safe. If the allocator is interrupted and the interrupt service routine needs to make use of the allocator's data structures for any reason there could very much be a conflict.

This particular algorithm is focused on GPUs, so I'm not clear on the applicability. However I did want to clear up the idea that there are no concurrency concerns for single-threaded library code, an assumption that's been the source of many bugs.

link

loeg 782 days ago

The presence of interrupts is basically equivalent to not being single threaded, though. There are true single-threaded environments, it's just limited to userspace processes that don't handle signals.

link

gpderetta 781 days ago

It is even worse: re-entrancy safe is a different property than thread-safe. For example you can't use mutexes to guarantee mutual exclusions with interrupts.

Some algorithms are both thread safe and reentrancy safe, but generally this is not the case.

link

orlp 782 days ago

> It would require a reverse lookup structure from address to buffer handle, e.g. red-black tree. Maintaining it would no longer be O(1).

Well, another solution is enlarging each allocation by 8 bytes and storing the buffer handle in front of the returned allocation. So `free(ptr)` would get the handle as `*((ptr as *const u64) - 1)`.

link

exDM69 782 days ago

That works but...

This kind of allocators are usually used for suballocating GPU buffers, so hiding a few bytes of metadata "in-band" can mess up your alignment requirements and not all kinds of GPU memory are even accessible by CPU. Due to false sharing cache problems you would probably want a full cache line (64 bytes) to store the metadata.

For a CPU-only memory allocator your idea could work quite well. It can also be implemented on top of this code without any modifications.

link

Sharlin 782 days ago

OTOH, I think this would be a good fit for the "Store" proposal [1] which uses handles rather than addresses to refer to allocations.

[1] https://github.com/matthieu-m/storage/blob/main/etc/rfc.md

link

pbalcer 782 days ago

> It would require a reverse lookup structure from address to buffer handle, e.g. red-black tree. Maintaining it would no longer be O(1).

Not necessarily. If you are able to map a block of normal memory in a known location relative to the GPU memory that you are managing with an "offset allocator", it should be possible to directly calculate the metadata location for each "offset". This is how most allocators find arenas/buckets (whatever you want to call them) for an allocation by a pointer.

Something like this:

  +-------------------------+ 0x000000 (start of managed memory)
  | Metadata                | 
  |                         |
  |                         |
  | ...                     |
  +-------------------------+ 0x001000 
  | Padding ...             |
  +-------------------------+ 0x010000
  | GPU Memory Block        |
  |                         |
  |                         |
  |                         | ~2MB block
  |                         |
  |                         |
  |                         |
  +-------------------------+ 0x210000

With this layout, to get to a metadata for an allocation, all you need to do is to align down the allocation pointer and calculate the appropriate location in the metadata page.

This obviously won't work in all scenarios, but it's a simple and practical way around a map lookup.

link

exDM69 782 days ago

A good idea but not O(1).

If you have a large allocation you have to mark every "page" in metadata table which makes allocation O(size in bytes / page size).

The overhead might still be practically acceptable, even if it is not constant time.

link