by xtrapolate 2937 days ago
> "The library mustn’t call malloc() internally. It’s up to the caller to allocate memory for the library. What’s nice about this is that it’s completely up to the application exactly how memory is allocated. Maybe it’s using a custom allocator, or it’s not linked against the standard library."

OP's approach will indeed work for most "minimalist"/single-header libraries, but I personally feel it pollutes the API you're exposing to your users.

Depending on the specific situation, I may sometimes choose to expose a MODULE_CreateObject() and a MODULE_CreateObjectEx(custom_allocator, custom_deallocator).

Internally, MODULE_CreateObject() calls MODULE_CreateObjectEx(), passing the module's default allocator and deallocator (i.e., HeapAlloc and HeapFree). This strikes me as a more balanced approach.

One caveat here is that you must enforce consistency across usage: you don't want some API calls to use malloc() for allocation whilst others use HeapFree() for deallocation. That would be a recipe for disaster.

To ensure that, I would often set the allocators and deallocators once, when the object is first created. They may be set through the object's initialization function, and they persist as part of the object itself.
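
A minimal sketch of the pattern described above, under stated assumptions: the callback signatures, the struct layout, and MODULE_DestroyObject() are hypothetical (the comment only names MODULE_CreateObject() and MODULE_CreateObjectEx()), and malloc()/free() stand in for HeapAlloc/HeapFree to keep the example portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback types; the comment does not give exact signatures. */
typedef void *(*MODULE_AllocFn)(size_t size);
typedef void (*MODULE_FreeFn)(void *ptr);

typedef struct MODULE_Object {
    MODULE_AllocFn alloc;   /* fixed at creation, used for any later allocation */
    MODULE_FreeFn  dealloc; /* always paired with the allocator above */
    int            value;   /* placeholder payload */
} MODULE_Object;

/* Extended constructor: the caller decides where memory comes from. */
MODULE_Object *MODULE_CreateObjectEx(MODULE_AllocFn alloc, MODULE_FreeFn dealloc)
{
    if (alloc == NULL || dealloc == NULL)
        return NULL;

    MODULE_Object *obj = alloc(sizeof *obj);
    if (obj == NULL)
        return NULL;

    /* The pair is stored once and persists as part of the object. */
    obj->alloc = alloc;
    obj->dealloc = dealloc;
    obj->value = 0;
    return obj;
}

/* Simple constructor: forwards the module defaults (malloc/free here). */
MODULE_Object *MODULE_CreateObject(void)
{
    return MODULE_CreateObjectEx(malloc, free);
}

/* Destruction uses the stored deallocator, never a hard-coded one,
 * so allocation and deallocation cannot get mismatched. */
void MODULE_DestroyObject(MODULE_Object *obj)
{
    if (obj != NULL) {
        MODULE_FreeFn dealloc = obj->dealloc;
        assert(dealloc != NULL);
        dealloc(obj);
    }
}
```

Because the function pointers live in the object, every later call on it can reach the same allocator pair without the caller passing it again.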

2 comments

> I personally feel it pollutes the API you're exposing to your users.

Pollution is in the eye of the beholder. There are many circumstances where a project or subset of a project needs to work without a heap, they just don't necessarily overlap with the "application layer code in a virtual memory process" world your intuition is calibrated against.

And sometimes this stuff needs to read a JSON object or decode base64 or utf8 too, and can't because the library is too thick.

> "Pollution is in the eye of the beholder. There are many circumstances where a project or subset of a project needs to work without a heap, they just don't necessarily overlap with the "application layer code in a virtual memory process" world your intuition is calibrated against."

That's an argument in favor of offloading allocation/deallocation to the library's users, which is exactly the core of my, and OP's, proposals. We're saying the same thing here - developers should be able to determine/control how memory is allocated and deallocated.

> "And sometimes this stuff needs to read a JSON object or decode base64 or utf8 too, and can't because the library is too thick."

I'm losing you here. I honestly feel that my proposal is all about keeping the API as simple as humanly possible, without compromising the library's flexibility when it comes to the scenarios you mentioned earlier.

In your case:

  BASE64DECODER_Decode(...)
  BASE64DECODER_DecodeEx(..., allocator, deallocator)
  BYTE * BASE64DECODER_GetDecodedBuffer(handle)
  BASE64DECODER_Free(handle)

But what if I don't have a heap? Not even a wrappable heap.

I could be an OS bootstrapping layer, a signal handler, an ISR, a process control project operating under strict "No dynamic allocation!" rules, a thunking layer to get legacy code modes (BIOS says hi!), ...[1]

You're imagining a world where everything is Node or Python or Java, or at the worst C on top of the well-defined standard library. And I'm telling you that the world is bigger than that.

And more specifically, that those weird layers sometimes need library code too.

[1] (Edited to add) A malware payload, a tracing layer, a compiler-generated stub, a benchmarking hook that can't handle heap latency, ...

> "You're imagining a world where everything is Node or Python or Java, or at the worst C on top of the well-defined standard library. And I'm telling you that the world is bigger than that."

Why do you keep putting words in my mouth?

> "But what if I don't have a heap? Not even a wrappable heap."

I'm forced to repeat myself over again. At no point does my proposed API force you to rely on a heap. On the contrary, it lets you rely on whatever solution works best for you, in your specific case.

In your custom kernel project, your custom allocator() can return a buffer from a memory pool you handle yourself. Your custom deallocator() will reclaim that memory back into your custom memory pool.

In a different project, say a desktop app for Windows 10, the allocator() will simply call malloc(), and the deallocator() will call free().

This way, your allocator() can do whatever. Your deallocator() can do whatever. How is this restrictive in any way, shape, or form?
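
For the memory-pool case, one possible shape is a fixed-block pool with a free list. This is only a sketch: the names pool_allocator() and pool_deallocator(), the block size, and the block count are all made up for illustration, and the code is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE  64u /* hypothetical: every request gets one fixed-size block */
#define POOL_BLOCK_COUNT 8u  /* hypothetical capacity */

typedef union PoolBlock {
    union PoolBlock *next;                   /* free-list link while the block is unused */
    long long        align_ll;               /* members below only force alignment */
    long double      align_ld;
    unsigned char    bytes[POOL_BLOCK_SIZE]; /* caller-visible storage */
} PoolBlock;

static PoolBlock  g_pool[POOL_BLOCK_COUNT];  /* statically reserved, no heap involved */
static PoolBlock *g_free_list = NULL;
static int        g_pool_ready = 0;

/* Thread every block onto the free list. */
static void pool_init(void)
{
    g_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        g_pool[i].next = g_free_list;
        g_free_list = &g_pool[i];
    }
    g_pool_ready = 1;
}

/* Hand out one block, or NULL if the request is too big or the pool is empty. */
void *pool_allocator(size_t size)
{
    if (!g_pool_ready)
        pool_init();
    if (size > POOL_BLOCK_SIZE || g_free_list == NULL)
        return NULL;

    PoolBlock *block = g_free_list;
    g_free_list = block->next;
    return block;
}

/* Reclaim a block by pushing it back onto the free list. */
void pool_deallocator(void *ptr)
{
    if (ptr == NULL)
        return;

    PoolBlock *block = ptr;
    /* Sanity check: the pointer must come from this pool. */
    assert(block >= g_pool && block < g_pool + POOL_BLOCK_COUNT);
    block->next = g_free_list;
    g_free_list = block;
}
```

A pair like this could be passed wherever the API takes an allocator and a deallocator, while a desktop build passes malloc() and free() instead.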

> In your custom kernel project, your custom allocator() can return a buffer from a memory pool you handle yourself. Your custom deallocator() will reclaim that memory back into your custom memory pool.

I don't have either. I have a statically allocated buffer big enough for one frame of data, and I need to guarantee that it never gets used twice. My code does not have a custom allocator. It does not allocate.

  void * your_custom_allocator(size_t size)
  {
      // Handle your locks.
      // Sanity checks, assertions, bounds checks, etc...

      void * result = g_your_buffer + g_position;

      // "Commit Memory" from your buffer.
      g_position += size;

      // Some more code...

      return result;
  }

Now, you can happily use:

  BASE64DECODER_DecodeEx(..., your_custom_allocator, your_custom_deallocator);

What else do you need? You seem very disturbed by the use of the word "allocator" here; feel free to rename it to whatever works for you.

> In your custom kernel project, your custom allocator() can return a buffer from a memory pool you handle yourself. Your custom deallocator() will reclaim that memory back into your custom memory pool.

You realize you're arguing that a custom, probably buggy heap implementation isn't a heap, right?

> In your custom kernel project, your custom allocator() can return a buffer from a memory pool you handle yourself.

We're done. "It's OK, you can just write your own heap-like API!" is just not remotely responsive to the kind of problems I'm talking about, and that you think it is is sorely tempting me to put more words in your mouth.

If you don't think these libraries are useful, that's fine. Don't use them. Don't presume to understand the application realm before you've worked in it.

I also work in the resource-constrained / embedded native space and have had to work within the kinds of constraints you're describing. I think you're severely misunderstanding what the comment you're responding to is proposing.

But what's the alternative to passing custom allocators and deallocators if you want to tightly control the way a library manages memory? If you're running with such constraints, presumably you want to be in control of memory management and not just leaving the library to do its own thing.

At this level, surely it's better to leave it to the "user" of the library - who can always write a wrapper for all their used libraries to the same style API for use in the rest of the program

This is true for libraries that have a lot of variable sized types. There are however a lot of interesting problems where you don't need many of these apart from, say, one big buffer you work with. A good example of this is a parser, or this TLS library: http://bearssl.org/. It makes integrating with a library maybe a bit more tedious but it comes with so much more control. And you could always build a layer on top that does malloc for when you don't need the control. It's great for code that will be used in many different scenarios.
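
To make the "one big buffer, plus an optional malloc layer on top" idea concrete, here is a rough sketch. A hex decoder stands in for the parser because it is short; the names LIB_hex_decode() and LIB_hex_decode_alloc() are hypothetical and have nothing to do with BearSSL's actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Map one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Core routine: never allocates. The caller owns `out`.
 * Returns 0 on success, -1 on malformed input or insufficient space. */
int LIB_hex_decode(const char *in, size_t in_len,
                   uint8_t *out, size_t out_cap, size_t *out_len)
{
    assert(in != NULL && out != NULL && out_len != NULL);

    if (in_len % 2 != 0 || in_len / 2 > out_cap)
        return -1;

    for (size_t i = 0; i < in_len / 2; i++) {
        int hi = hex_value(in[2 * i]);
        int lo = hex_value(in[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    *out_len = in_len / 2;
    return 0;
}

/* Optional convenience layer for callers that do have a heap.
 * Returns a malloc()ed buffer the caller must free(), or NULL on error. */
uint8_t *LIB_hex_decode_alloc(const char *in, size_t in_len, size_t *out_len)
{
    size_t cap = in_len / 2;
    uint8_t *out = malloc(cap != 0 ? cap : 1);
    if (out == NULL)
        return NULL;

    if (LIB_hex_decode(in, in_len, out, cap, out_len) != 0) {
        free(out);
        return NULL;
    }
    return out;
}
```

The heap-free caller uses only the first function with a static or stack buffer; the second function can live in a separate translation unit so it never gets linked into builds that have no malloc().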
Choose the size representation that best fits your use case:

  // Put this in header to help user calculate allocation needs but hide size from user
  size_t LIBNAME_alloc_size(param1, param2, ...);

  // Put this in the header to hide the size from user code but allow inlined size calculations
  extern const size_t LIBNAME_ALLOC_X;

  // Put this in the header to make size known to user (for static const allocation)
  #define LIBNAME_ALLOC_Y ((size_t)42)
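
A sketch of how those size declarations might be used from the caller's side, with several assumptions: LIBNAME_ctx, LIBNAME_init(), and LIBNAME_add() are hypothetical, LIBNAME_alloc_size() takes no parameters here, and the value 64 replaces the placeholder 42 purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- public header (hypothetical) ---- */
#define LIBNAME_ALLOC_Y ((size_t)64)     /* compile-time size, usable for static storage */
extern const size_t LIBNAME_ALLOC_X;     /* size hidden from user code, read at run time */
typedef struct LIBNAME_ctx LIBNAME_ctx;  /* opaque to callers */

size_t       LIBNAME_alloc_size(void);   /* size computed by the library on request */
LIBNAME_ctx *LIBNAME_init(void *storage, size_t storage_size);
int          LIBNAME_add(LIBNAME_ctx *ctx, int x);

/* ---- implementation ---- */
struct LIBNAME_ctx {
    int total;
    int count;
};

/* Compile-time check that the published macro is large enough. */
typedef char LIBNAME_size_check[(sizeof(struct LIBNAME_ctx) <= LIBNAME_ALLOC_Y) ? 1 : -1];

const size_t LIBNAME_ALLOC_X = sizeof(struct LIBNAME_ctx);

size_t LIBNAME_alloc_size(void)
{
    return sizeof(struct LIBNAME_ctx);
}

/* Build the context inside caller-provided, suitably aligned storage.
 * Returns NULL if the storage is missing or too small. */
LIBNAME_ctx *LIBNAME_init(void *storage, size_t storage_size)
{
    if (storage == NULL || storage_size < sizeof(struct LIBNAME_ctx))
        return NULL;

    LIBNAME_ctx *ctx = storage;
    memset(ctx, 0, sizeof *ctx);
    return ctx;
}

/* Accumulate a value and return the running total. */
int LIBNAME_add(LIBNAME_ctx *ctx, int x)
{
    assert(ctx != NULL);
    ctx->total += x;
    ctx->count += 1;
    return ctx->total;
}
```

With the macro form, a caller without a heap can reserve storage statically, for example `static union { long long align; unsigned char bytes[LIBNAME_ALLOC_Y]; } storage;`, and pass `storage.bytes` to LIBNAME_init(). With the function or extern-constant form, the caller has to obtain the size at run time, which keeps the layout private but rules out plain static arrays.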