Hacker News new | ask | show | jobs
by andreasvc 3348 days ago
Why would one ever prefer a linked list over a dynamic array? I know about the asymptotic performance, but if you take actual memory performance into account (locality, cache, pointers are slow, etc), it seems dynamic arrays should be simpler and perform better.
10 comments

> Why would one ever prefer a linked list over a dynamic array?

(Singly) linked lists are trivial to implement in an immutable way (the article talks about the difficulty of doubly-linked lists, implemented in a mutable way).

It's trivial to have linked lists share a tail; this makes datastructures like cactus stacks really easy.

Linked lists are amenable to reasoning by induction (e.g. in Coq, Agda, Idris, etc.).

Linked lists are amenable to lazy algorithms (e.g. iterators/generators).

Linked lists can be heavily optimised, e.g. using stream fusion.

That's off the top of my head. Note that in many cases it might be preferable to use linked lists in the source, but have them compiled to some other representation like arrays (or, in the case of fusion, a single tight loop).

* Also, lists are a reasonable way of managing unrelated types.
What you say doesn't apply to the kernel, which uses doubly linked lists.
Because with intrusive lists, insertion/removal can't fail due to memory allocation failure - with dynamic arrays, it can.

In the kernel, where you have to handle every single error path, this is a big deal.

There is a lot of reason why you can prefer a linked-list over an array, even with the hardware changing they are still a lot more efficient in some cases :

  * In multi-threaded context insertion and deletion can be done lock-free which is not the case for arrays ;

  * For intrusive list, insertion and deletion can't fail due to memory allocation ;

  * If you do only insertion at the tail and removal at the head these operation need only a pointer change which is more efficient than an array ;

  * Merging two list is only one pointer operation ;

  * Intrusive list have better memory allocation properties ;

  * ...
If you look at the usage of linked-list in the kernel you will see that they are used for reasons.

The problem is that, on new hardware, linked list should be used in less cases that in the past and some people have generalized this to linked-list should never be used. This is false, if you care about performances you should keep thinking and testing what is the best data-structure in different cases. If you don't care about performances, you should use an abstraction who implement what you need and don't car about the underlying data-structure.

> In multi-threaded context insertion and deletion can be done lock-free which is not the case for arrays

I just love when multithreaded code frees the element I currently iterate over in a different thread.

> For intrusive list, insertion and deletion can't fail due to memory allocation ;

As long as you store your elements on the stack or use static values you mean? Heap allocation happens at some point, you just move the location where it happens.

> If you do only insertion at the tail and removal at the head these operation need only a pointer change which is more efficient than an array

Use a circular buffer. Until the array is full you can just modify the start/end pointers.

More important than any of the sibling arguments is that we're talking about a socket control block. I don't control where it is allocated or freed, so I can't dictate its location in memory.

Even if I did control it, there are probably several lists a socket might be a part of. Storing the socket in a dynamic array will only work for one of those lists. It's similar to a sparse database index - you can only have one of them. The rest of the lists have to be linked.

Not to mention, when items are shuffled around in a dynamic array, you lose the ability to have a consistent pointer to them.

Insertion and deletion are prime examples, often requiring you to move large chunks of memory. The same operation just requires pointer changes (after finding where you want to insert or delete). I think in this example it would be common to be removing structs from the center of the list.
Insertion and deletion _were_ prime examples. It's not necessarily the case with today's hardware, especially until you reach some threshold of the number of elements (usually hundreds of thousands).

You say "the same operation just requires pointer changes (after finding where you want to insert or delete)" like it's some trivial thing that doesn't enter the equation. But finding where you want to insert or delete in a non-contiguous linked list is not a consideration that should be an afterthought in brackets.

Furthermore, when you have found the location, it's still not _just_ pointer changes. You actually have to allocate the memory for the node, too.

Traversing pointers and allocating memory is a lot slower than working with a contiguous block of memory (assuming you don't have to re-allocate the contiguous block of memory, of course).

Either way, don't take my word for it. If you're a C++ programmer, just benchmark std::vector against std::list, or watch this video - https://youtu.be/YQs6IC-vgmo

You are right; all your arguments are valid in user space programming with normal (certain kinds) of data. But to be honest, you sound like someone who hasn't tried kernel or embedded development. You must have some experience with it to understand why these things don't always hold true.

> Furthermore, when you have found the location, it's still not _just_ pointer changes. You actually have to allocate the memory for the node, too.

This is not normally true. In embedded development and game engine memory managers, data nodes already include list pointers, and there are several ways the data nodes can come to exist that don't involve a separate allocation per node. Sometimes it's allocations of blocks of nodes. Sometimes it's slab allocations (to use the terminology from the article) - a memory manager that pre-allocates constant sized memory blocks.

Kernel lists don't normally involve hundreds of thousands of elements, and using dynamic arrays for small lists, the separate alloc & realloc time may not be even close to amortized away.

> just benchmark std::vector against std::list

This is unconvincing! Kernel & embedded code do not normally use std::list. There are very good reasons people warn against kernel development in C++! And performance benchmarks are the lowest item on the priority & constraint list. The least bad thing that can happen is going slower. Fragmentation is a bigger priority, and dynamic arrays have much worse fragmentation patterns because the allocation size changes over time.

Don't forget that there are valid uses of linked lists even in 2017. If there are more insertions / removals than scans, an array makes no sense.

Reallocating a big dynamic array is also a no-go in many time-constrained situations.

> You actually have to allocate the memory for the node, too.

The way this is done is by intrusive linking (the data must include list pointers). The data usually knows in which lists it is linked, so there is no point in having a version of the data without a list head.

Almost any insertion requires a scan.
Well, given that the pointers are contained in the nodes themselves, if a piece of code is already looking at a node when deciding if an insertion is needed, then no additional scan is required.
Many do, but there are also insertions in unsorted lists or insertions after a given element...
Cache means inserts into arrays are often faster than linked lists. Remember RAM is really really big and slow.
Not if the list is allocated within a slab, as described in the article. Imagine list elements allocated continuously (as in an array) but the ordering is decided by the pointers within each node, instead of assuming a fixed order. All the elements still are in the same cache line.
Linked lists take more memory than arrays, so even in your example it's not as obvious as your making it out to be. Further, your ordering assumes no deletes.
>... inserts into arrays are often faster than linked lists.

don't forget, deletes, as well.

Yea, it's really seeks being horribly slow on out of cache linked lists being horribly slow, and then inserts and deletes being fast once you find the right node.
I agree that in userspace, linked lists rarely are the way to go. Chasing pointers is too slow.

We're talking about kernelspace, though, and different rules apply there. Memory allocation works very differently, and you have to handle every possible error path.

I was trying to highlight that the need to find the target for insertion or deletion was still a notable operation, not as an afterthought. Sorry it came out that way! That said, you still might need to perform a similar operation to find the target for deletion when using an array. The same is true for allocations (as you pointed out).

I'm don't know how the slab cache allocation would benchmark against allocating larger chunks, I was just trying to give a classical answer to the question.

That's all well-and-good for benchmarks that don't need to delete elements, but suboptimal for kernel data-structures that don't require O(1) find but are mostly round-robinned like multiple sockets. Using reallocated arrays instead linked lists also increases memory bandwidth and puts kernel data-structures at increased risk of random bit flips (most machines lack Chipkill ECC), in addition to cache evictions.
> Insertion and deletion are prime examples, often requiring you to move large chunks of memory.

on modern cpu's with huge caches, i seriously doubt this claim.

fwiw, i have played around with synthetic problems where: i insert sequence of random integers into a sorted sequence, then remove those elements one by one as determined by a random sequence of positions. and almost always, vectors outperform lists by at least couple of orders of magnitude.

or to put it another way, it is almost never about either lists / vectors or something else, and boils down to couple of 'rules' e.g. access data predictably (avoid trashing the cache), keep data compact (more cache utilization) etc. etc.

> i insert sequence of random integers into a sorted sequence, then remove those elements one by one as determined by a random sequence of positions.

Which means that you scan the list twice for each node. First for insertion and then for deletion. Starting at maybe a hundred that would be much more efficient with a balanced search tree. They have a good Red-black tree in Linux.

But if it's only arrays vs lists - with linked lists, pointers to items are never invalidated. With dynamic arrays, they are.

Dynamic arrays are fine for plain old data arrays. But it's harder for data that has "identity" and is frequently mutated. And if a node is in multiple lists, I guess arrays are out. You'd need to store the data redundantly. Extra bad if you need atomic mutations.

Well on a 64-bit machine each node of a singly-linked list of integers will be 16 bytes (8 byte pointer + 4 byte integer + 4 byte for padding). That's 300% overhead. Becomes 500% for doubly-linked lists.

So it's not surprising that linked lists perform very poorly in such circumstances.

Because (doubly) linked lists can do something an array cannot? Like join operations in O(1)? Or removing an arbitrary element in O(1) once you have its handle?
if it's unordered (like you'd expect a list of sockets to be) you can get O(1) removal for an array by swapping the end item into the removed slot
...unless something else holds a pointer to the handle you're swapping with.
This is already a problem if you're removing an item, regardless of whether it's done with swap remove or not.
For the item you're removing, you are presumably removing it because all handles have been closed. But the item you're potentially swapping with may certainly have open handles.
True.
kernels don't tend to give userspace raw pointers into kernel memory
But kernels tend to give raw pointers to other kernel code.
a good chunk of the fun in kernel code is maintaining the various layers of self-referential lists and dealing with the cleanup
If you are using a slab allocator like in this case, you are effectively splitting the middle. Allocation of the nodes is contiguous in the slab, so locality of the nodes overall is likely better than malloc()ing each one. Allocation is cheap since you are likely just grabbing the head of a free list in the slab (or incrementing a counter if you never intend to free intermediate data), and you get less heap fragmentation/ higher utilization than a pure linked list. You still get fast insertion and removal anywhere in the list. There's a bit of storage overhead for pointers and pointer jumping on access, but ideally your list isn't super fragmented, so close to linear cache reads. If the list is long lived / has certain turnover patterns, some of these effects fall away. You could implement your own defrag/resort, but that assumes being able to relocate the data.

Also in some cases the linked list header may be embedded in other structures, so you'd have to pull the header out to go with a purely vector-based approach for when you need to shift (assuming you can't move the entire structure). And if the structure needs to reference itself in the list for quick removal, that's a problem if its index is changing. Otherwise you could just search the vector for references at linear cost.

> Why would one ever prefer a linked list over a dynamic array?

I thought this same thing when I started doing embedded console game development and saw a lot of linked lists going on. I came to realize there are a bunch of really good reasons to use linked lists.

The biggest difference in practice is that linked list pointers are commonly fields inside the data that you're linking, as opposed to being a separate data structure. With a dynamic array, it's usually a separate data structure. Neither of those is absolute or always, but that's how it normally plays out.

With that in mind:

Sometimes you're loading a block of pre-formatted binary data off disk or from a network that has space for linked list pointers. Rather than allocate anything else, you can fill in the list pointers.

Sometimes you have a lot of small lists to manage. If you're allocating space for your list data, and the dynamic array doesn't get big enough for amortizing to take over, then you're doing double allocations, one for the data and one for the dynamic array & potentially first couple of reallocs.

Fragmentation is worse with a dynamic array (when it's a separate data structure), and re-ordering events are much costlier in arrays than with linked lists.

> Why would one ever prefer a linked list over a dynamic array?

Consistent performance. Every insert and every delete requires the same amount of time, at the cost of a higher average traversal time. You can ameliorate some of the deletion reordering cost if you can change the order of the array, but not all use-cases allow for this.

Also, intrusively linked lists (that is, the same item represented in multiple lists) are just as poor, performance wise, when implemented with arrays.

I did a quick test of this at one point, comparing C++ std::list with a linked list; the performance of the dynamic array was significantly higher for most operations (about one order of magnitude for my test), but the worst-case was much, much worse than the linked list (somewhere close to two orders of magnitude).

No error handling code when adding an element to the set. Also, a list maintains elements in order, unlike an array pointing to objects that keep track of their position in said array.