Hacker News new | ask | show | jobs
by _xnmw 760 days ago
Why? Why would someone reach for a linked list in a kernel, driver, or embedded system?
6 comments

No memory allocation/reallocation, preallocated resources managed in e.g. a free list. Also for things like packetized networks, lists are handy for filling as you progress down the stack while using fixed sized packet buffers, or reassembling fragments.

In embedded world, memory often needs to be exactly controlled, and allocation failures are fatal without a more complex MMU. In kernel world, I believe the main reason is that allocations can block.

In kernels, it's usually hard to get general-purpose allocation working reliably in all contexts. And you need that for resizable vectors. With lists, you just need to be able to grab an element-sized block. Quite often, it's even done with the memory page granularity.

In addition, a lot of data structures might be shared across multiple cores. Linked lists can be traversed and mutated concurrently (although with a bit of care).

I wonder how much of that is due to the kernel history, and the influence of C idioms, and not because of some inherent design superiority.

I'd be convinced once I see pure Rust kernels geared towards modern machines suddenly using linked lists everywhere. Otherwise I'm leaning towards it being a side-effect of the language choice and culture.

Also because I've seen the same kind of reasoning applied to compilers (e.g. "of course you need linked lists in compilers, they are extremely graph traversal heavy"). But one look at modern compilers implemented in Rust paint a very different picture, with index-based vectors, data-oriented design and flattened ASTs everywhere.

Getting a general memory allocator working in kernel contexts is a hard task. You need to make sure it can't block and is re-enterable, that it doesn't result in fragmentation, and that it can be used from multiple threads.

It can be solved (or worked around), but it's understandable that people don't _want_ to do that.

Intrusive lists are really powerful for those kinds of scenarios, and technically are linked lists. They're widely used in the kernel, IIRC.
O(n) iteration but pretty much guaranteed O(1) for every other operation. If that's the semantic you need, then linked lists are your friend.
Any time you have a computer interacting with the outside world in an asynchronous fashion you basically have to have some form of buffering which takes the form of a queue/fifo. A linked list is the most performant/natural way of modeling a queue in our ubiquitous computing infrastructure.

I/e in a DMA-based ethernet driver, the ethernet MAC receives packets asynchronously from the processor, perhaps faster than the processor can ingest them. So the mac interrupts the processor to give it new packets, and the processor can't sit processing the packets in the interrupt context, so it needs to put them into some ordered list for processing later when it has downtime. In a true embedded system, the memory for this list is going to be fixed or statically allocated, but you still don't really want to have an array-style list with fixed indexing, as you'll have to manage what happens when the index wraps around back to 0 etc, so instead you just construct a linked list in that pre-allocated memory.

I wouldn't say linked lists aren't really used in high-level applications, as I said they're used all over the place whenever you have external asynchronous communication, it's just that modern high-level frameworks/libs totally abstract this away from most people writing high level code.

Easier to avoid allocation errors, e.g. in the Linux kernel. I think Alice Ryhl mentioned it here - https://www.youtube.com/watch?v=CEznkXjYFb4
How do linked list prevent allocation errors? If anything it would seem to make them worse.

My experience in embedded, everything is hardcoded as a compile time constant, including fixed size arrays (or vectors of a fixed capacity)

Intrusive linked lists eliminate the allocation entirely. With a vector<Obj>, you have the Obj allocation and then potential vector-related reallocations. With an intrusive linked list, you only have the Obj allocation. So your code that adds/removes list entries does no additional allocation at all, it reuses a pointer or two that was allocated as part of the original Obj allocation. Often the list manipulation happens at a time when allocation failures are inconvenient or impossible to handle.
In more complex embedded software you are likely to see free lists used to manage pools of preallocated resources (like event structs etc) or pools of fixed sized memory buffers.
In embedded, you often need message queues.

A common way to implement these is to have an array of messages, sized for the worst case scenario and use this as the message pool.

You keep the unused messages in a single linked "free-list", and keep the used messages in a double linked queue or fifo structure.

That way you get O(1) allocation, de-allocation, enqueue and dequeue operations for your message queue.

Another example for this paradigm are job queues. You might have several actuators or sensors connected to a single interface and want to talk to them. The high level "business" logic enqueues such jobs and an interrupt driven logic works on these jobs in the background, aka interrupts.

And because you only move some pointers around for each of these operations it is perfectly fine to do so in interrupt handlers.

What you really want to avoid is to move kilobytes of data around. That quickly leads to missing other interrupts in time.