Not an embedded systems developer so an honest question. What do you do instead of malloc? Have a large array on stack and manage memory within that manually?
I mostly allocate static areas in the BSS segment. That way, I know at compile time that I allocated my memory correctly, assuming that I have my stack under control.
Then I follow my two rules of embedded development:
- no recursion
- everything has to be O(1)
If I'm honest, I can't remember a project where I had to use even a pool allocator, which you would usually need for things like reorderable queues, lists, or trees. Right now I can't come up with a proper use case. If you do need to, say, compute a variable-length sequence of actions based on an incoming packet, then I would structure my code so that:
a) only the current action and the next action get computed (so that there is no pause in between executing them)
b) compute the next action when I switch over (basically with a ping-pong buffer)
c) verify real-time invariants
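To make points a) to c) a bit more concrete, here is a minimal sketch of the ping-pong idea. The `Action` struct, its fields, and the `ActionPingPong` class are made up for illustration; a real project would have its own action type and its own way of reacting to a missed deadline.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical action type; the fields are placeholders for illustration.
struct Action {
    std::uint8_t opcode;
    std::uint16_t arg;
};

// Two statically sized slots: one is being executed, the other holds the
// precomputed next action. Swapping is O(1) and never allocates.
class ActionPingPong {
public:
    // The action currently being executed.
    const Action& current() const { return slots_[active_]; }

    // Fill the inactive slot with the next action (point b).
    void prepare_next(const Action& next) {
        slots_[1 - active_] = next;
        next_ready_ = true;
    }

    // Switch over. Returns false if no next action was prepared in time,
    // which is the invariant violation the caller has to handle (point c).
    bool swap() {
        if (!next_ready_) return false;
        active_ = 1 - active_;
        next_ready_ = false;
        return true;
    }

private:
    std::array<Action, 2> slots_{};
    std::size_t active_ = 0;
    bool next_ready_ = false;
};
```

The point of returning `false` from `swap()` rather than blocking is that the caller finds out immediately when the next action was not ready, instead of stalling the pipeline.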
My most used structure is the ring buffer to smooth out "semi-realtime" stuff, and if the ring buffer overflows, well, the ring buffer overflows and it has to be dealt with. If I could have more memory I would just make the ring buffer bigger.
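For reference, a ring buffer along these lines could look roughly like the sketch below. The `RingBuffer` name and interface are my own choices, and it is single-threaded as written; sharing it with an interrupt handler would need atomics or interrupt masking on top.

```cpp
#include <array>
#include <cstddef>

// Fixed-capacity ring buffer. Storage lives inside the object, so a global
// instance ends up in static memory and nothing is ever heap-allocated.
template <typename T, std::size_t N>
class RingBuffer {
public:
    // O(1). Returns false when full so the caller decides how to handle
    // the overflow (drop, count, raise an error, ...).
    bool push(const T& value) {
        if (count_ == N) return false;
        data_[(head_ + count_) % N] = value;
        ++count_;
        return true;
    }

    // O(1). Returns false when empty.
    bool pop(T& out) {
        if (count_ == 0) return false;
        out = data_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::array<T, N> data_{};
    std::size_t head_ = 0;   // index of the oldest element
    std::size_t count_ = 0;  // number of stored elements
};
```

Making the buffer bigger is then just a matter of changing `N`, and the cost shows up at link time rather than at run time.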
BSS is the section of your program's address space where all the uninitialized and zero-initialized static memory lives, so a global like std::array<u64, 1024> foo{}; would be placed in BSS by the toolchain.
BSS is also usually not included in the actual executable size (as it's marked as NOLOAD in the linker script), so the C runtime startup code has to zero it before main() runs if you want to rely on it being zero.
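A small sketch of what that looks like in source. The `u64` alias, the `bar` array, and the 16 KiB budget are assumptions for illustration, and which section each object actually lands in depends on the toolchain and linker script.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using u64 = std::uint64_t;  // assumed alias; u64 is not a standard type

// Zero-initialized with static storage duration: toolchains typically place
// this in .bss, so it costs RAM but no space in the flashed image.
std::array<u64, 1024> foo{};

// A nonzero initializer typically moves the object to .data instead, which
// does take up image space because the initial values must be stored.
std::array<u64, 4> bar{1, 2, 3, 4};

// The footprint is known at compile time and can be checked against a budget.
// The 16 KiB figure is a made-up budget for illustration.
static_assert(sizeof(foo) + sizeof(bar) <= 16 * 1024,
              "static RAM budget exceeded");
```

With GNU binutils, running `size` on the resulting binary reports the text, data, and bss totals separately, which is a quick way to check where things ended up.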
It's not done in the code itself; it's done in the toolchain configuration. We specify a mapping of memory ranges (based on the hardware and how we want to use it) and the linker assigns addresses to variables within those ranges as appropriate. The GNU toolchain calls these files "linker scripts", armcc "scatter files". Those are the keywords to look up for examples and documentation.
It's quite common in hard real-time systems, especially in aeronautics, to only allow malloc on startup if it's allowed at all. There are many problems with malloc() and especially free() - they typically don't have any maximum latency guarantees, and even worse, what happens when you can't get memory (e.g., due to leakage or poor packing)?
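One way to get the "allocate only at startup" behavior without touching malloc() at all is a bump allocator over a static buffer that refuses requests once initialization is done. This is only a sketch under my own assumptions: the `StartupArena` name, the 4096-byte size, and the `lock()` convention are all invented here, not taken from any particular standard or codebase.

```cpp
#include <cstddef>

// Bump allocator over a static buffer. Allocation is only allowed until
// lock() is called at the end of startup, and there is no free(), so there
// is no fragmentation and no unbounded search at run time.
class StartupArena {
public:
    // align must be a power of two no larger than alignof(std::max_align_t).
    // Returns nullptr when the arena is locked or out of space.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        if (locked_) return nullptr;
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > kSize || size > kSize - start) return nullptr;
        used_ = start + size;
        return buffer_ + start;
    }

    // Call once initialization is finished; later requests fail.
    void lock() { locked_ = true; }

    std::size_t used() const { return used_; }

private:
    static constexpr std::size_t kSize = 4096;  // arbitrary, for illustration
    alignas(std::max_align_t) unsigned char buffer_[kSize] = {};
    std::size_t used_ = 0;
    bool locked_ = false;
};
```

Running out of memory can then only happen during startup, where it is easy to detect and the system has not started doing anything time-critical yet.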
In many systems this isn't a problem. The number of engines, flaps, etc., don't change at run-time :-). If they change, you're on the ground in maintenance mode and can reboot.
Very small embedded systems tend to have a lot of short-lived items on the stack, and anything that lives longer than a function call exists in static memory at a fixed address. Memory pools are pretty common as well. Small systems tend to avoid a traditional heap, because they can get into trouble pretty easily.
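For anyone who hasn't seen one, a memory pool in this sense can be as simple as the sketch below: a fixed number of equally sized slots plus a free list. The `Pool` name and interface are my own, and the slots are default-constructed once and reused rather than constructed and destroyed per allocation.

```cpp
#include <array>
#include <cstddef>

// Fixed-size block pool: N slots of T in static storage, threaded together
// by an index-based free list. acquire() and release() are both O(1).
template <typename T, std::size_t N>
class Pool {
public:
    Pool() {
        // Chain every slot to the next one; the value N marks the end.
        for (std::size_t i = 0; i < N; ++i) next_[i] = i + 1;
    }

    // Returns nullptr when the pool is exhausted.
    T* acquire() {
        if (free_head_ == N) return nullptr;
        std::size_t i = free_head_;
        free_head_ = next_[i];
        return &slots_[i];
    }

    // p must have come from acquire() on this pool and must not be
    // released twice; neither condition is checked here.
    void release(T* p) {
        std::size_t i = static_cast<std::size_t>(p - slots_.data());
        next_[i] = free_head_;
        free_head_ = i;
    }

private:
    std::array<T, N> slots_{};
    std::array<std::size_t, N> next_{};
    std::size_t free_head_ = 0;
};
```

Because every block has the same size, the pool cannot fragment, which is the main way a general-purpose heap gets a small system into trouble.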
Function variables, with scope and lifetime limited to the call, get their place on the stack as usual. Everything else -- i.e., constants, static function variables, and anything with higher scope -- is allocated its own memory at compile time. We have no heap. We use no variable-length arrays or other, more dynamic data structures. Anything that needs to grow and shrink does so within its own fixed-length buffer.
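As an illustration of "grow and shrink within its own fixed-length buffer", something like the following is usually enough. `FixedVector` is a hypothetical name for this sketch, not a standard container.

```cpp
#include <array>
#include <cstddef>

// Vector-like container with a compile-time capacity. It grows and shrinks
// only inside its own buffer and reports failure instead of reallocating.
template <typename T, std::size_t Capacity>
class FixedVector {
public:
    // Returns false when the buffer is already full.
    bool push_back(const T& value) {
        if (size_ == Capacity) return false;
        data_[size_++] = value;
        return true;
    }

    // Returns false when there is nothing to remove.
    bool pop_back() {
        if (size_ == 0) return false;
        --size_;
        return true;
    }

    // No bounds check: i must be less than size().
    T& operator[](std::size_t i) { return data_[i]; }
    const T& operator[](std::size_t i) const { return data_[i]; }

    std::size_t size() const { return size_; }
    static constexpr std::size_t capacity() { return Capacity; }

private:
    std::array<T, Capacity> data_{};
    std::size_t size_ = 0;
};
```

The worst-case memory use is `sizeof(FixedVector<T, Capacity>)` no matter what happens at run time, so it can be accounted for at compile time like any other static object.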
I'm not sure how clear this explanation is :)