by torusle 806 days ago

> Linked lists are taught as fundamental data structures in programming courses, but they are more commonly encountered in tech interviews than in real-world projects.

I beg to disagree.

In kernels, drivers, and embedded systems they are very common.

10 comments

saghm 806 days ago

Most people who take data structures courses or perform tech interviews don't end up working on kernels, drivers, or embedded systems though. To me, it sounds like the point being made is that there are a large number of programmers who have learned about linked lists but haven't run into many cases where they needed them in the real world, and I think it's accurate.

dmitry_dygalo 806 days ago

This was my intention

SoftTalker 806 days ago

Agree, I can't recall using anything more complicated than lists/arrays or hash tables (key/value stores) in practice, in many years of (mostly web application) programming. And even those I'm not coding from scratch, I'm using classes or functions that my programming language gives me. For anything more complicated than that, I'm using a database, which of course is using many data structures under the covers but I don't directly touch those.

sumtechguy 806 days ago

I used to use them all the time. However, now? I would be hard pressed to not use one of the many built in vector/list/dict/hash items in many languages now. I would have to be truly doing something very low level or for speed to use one.

josephg 806 days ago

As a counterpoint, I’ve been working on collaborative text editing. I ended up implementing a custom b-tree because we needed a few features that I couldn’t find in any off the shelf library:

- My values represent runs of characters in the document.

- Inserts in the tree may split a run.

- Runs have a size: 0 if the run is marked as deleted, or the number of characters otherwise. The size changes as we process edits.

- Every inserted character has an ID. I need to be able to look up any run by its ID, and then edit it. (Including querying the run’s current position in the tree and editing the run’s size).

It’s an interesting data structure problem, and it took a few weeks to reach a good solution (and a few more weeks later to rewrite it in safe Rust, improving performance in the process).

I love this stuff. I think it’s pretty rare to find a reason to code your own collection types these days, but it certainly comes up from time to time!

sumtechguy 805 days ago

> I love this stuff. I think it’s pretty rare to find a reason to code your own collection types these days, but it certainly comes up from time to time!

Absolutely! That is one of the places you want to use that style of programming. As the base classes and built in structs do not really cover it yet.

Also, as a counterpoint, sometimes the built-in ones have some very interesting degenerate cases. I had one in an old library that basically doubled its memory footprint every time you exceeded its buffer. That was the point at which to change it to a fixed allocation or something else. If I had had no idea of the fundamentals, I would have been totally in the weeds, with no idea why it was doing that.

samatman 806 days ago

You need a linked list to write hello world in any Lisp, though.

Seems like the glaring exception to the rule!

lispm 806 days ago

As source code, but not necessarily as running code.

SBCL:

    * (defun hello-world () (write-string "hello world"))
    HELLO-WORLD

    * (disassemble #'hello-world)
    ; disassembly for HELLO-WORLD
    ; Size: 36 bytes. Origin: #x100311C85C                        ; HELLO-WORLD
    ; 5C:       AA0A40F9         LDR R0, [THREAD, #16]            ; binding-stack-pointer
    ; 60:       4A0B00F9         STR R0, [CFP, #16]
    ; 64:       EAFDFF58         LDR R0, #x100311C820             ; "hello world"
    ; 68:       570080D2         MOVZ NARGS, #2
    ; 6C:       29EC80D2         MOVZ TMP, #1889
    ; 70:       BE6B69F8         LDR LR, [NULL, TMP]              ; WRITE-STRING
    ; 74:       DE130091         ADD LR, LR, #4
    ; 78:       C0031FD6         BR LR
    ; 7C:       E00120D4         BRK #15                          ; Invalid argument count trap

The actual code for this example is machine code (which references a string, which is a vector), here without linked lists.

kazinator 806 days ago

No, you don't need to use linked lists to send a string to the standard output port in most Lisps. You just call a function.

samatman 805 days ago

(write-string "hello world") has linked list semantics, the fact that compilers can be smart enough to ignore this is not the point I'm making.

(write-string (cadr '(write-string "hello world"))) also has to work, so it's pretty easy to materialize that semantics at any point.

kazinator 805 days ago

> has linked list semantics

Nope; it has linked list syntax (that certainly isn't ignored even by very good compilers). Syntax isn't semantics.

The semantics is that a function write-string is called, with a string as its argument.

The second expression has linked list processing in its semantics because you stuck in a cdr, as well as a quote which makes a piece of the program available as run-time list datum. (This is semantics that could be easily optimized away in the executable form, but I would say that it has linked list processing in its abstract semantics.)

samatman 803 days ago

> Nope; it has linked list syntax (that certainly isn't ignored even by very good compilers)

We're looking at the same string and seeing different things. You're seeing `(write-string "hello world")` as a program, I'm seeing it as an expression.

It has linked list semantics, which you can preserve until runtime like this `'(write-string "hello world")`. Note that I didn't change the string, I changed its context. If the original were living in a string, and you called read on it, it would become a linked list. If you called eval on that list, it would become a function call. This is basic stuff which I'm well aware you know, so I'm not sure what all the quibbling is about.

You literally need a linked list to write a program in a language in which the code becomes linked lists. And you're going to have a bad time writing Lisp if you don't get the hang of cons cells, early and often.

Is "code is data" true, or false? You're trying to have it both ways here.

The "large number of programmers who have learned about linked lists but haven't run into many cases where they needed them in the real world" includes approximately zero programmers who have wielded Lisp in anger, is my point. I thought that was pretty clear from context, but I guess not.

ComputerGuru 806 days ago

Really only because they’re so goddamn easy. I find myself using linked lists a lot less since adopting Rust for embedded code (even with no_std and no allocator, but especially when alloc-only std data structures are within reach).

wesnerm2 806 days ago

Linked lists were heavily used in application software before the appearance of standard libraries and Java, which is when dynamically sizable array-based lists became common. There also wasn't a gap between the performance of linked lists and arrays before CPUs became significantly faster than RAM.

hi-v-rocknroll 806 days ago

Modern processor and cache performance lend themselves to vectors and SSA. Linked lists just don't scale well outside of niche uses.

ziddoap 806 days ago

>In kernels, drivers, and embedded systems they are very common.

Out of all the programmers in the world, what percentage of them do you think work in the kernel/driver/embedded spaces?

sfink 806 days ago

First, my only guess is that everyone's guesses are going to be wildly wrong. People who work in such spaces will greatly overestimate. People who don't will greatly underestimate. (This is mostly due to how many comments I've read on HN that implicitly assume that most people's problems and perspectives are the same as the commenter's.)

Second, linked lists are useful in a lot more places than that. Probably a better proxy would be low-level coders. You almost always want a linked list somewhere when you're dealing with memory addresses and pointers. Maybe not for the primary collections, but there are always secondary ones that need to maintain a collection with a different membership or ordering, and vectors of pointers don't have many clear advantages over intrusive linked lists for those secondary collections.

josephg 806 days ago

Yeah, intrusive collections in C are the biggest use I’ve seen. I played with a physics engine a few years ago (chipmunk2d) which made heavy use of intrusive linked lists to store all the objects in the world model. I suspect there are some clever data structures out there that might have better performance, but the intrusive linked list approach was simple and fast!

9659 806 days ago

1%

coldtea 806 days ago

More like 0.01% -- if we consider enterprise programmers, web programmers, and application/game programmers which I'd expect to be the largest groups...

hi-v-rocknroll 806 days ago

Yep. There aren't many software developers I know who have ever touched {Linux, macOS, FreeBSD, Windows} kernel code except for embedded devs, driver devs, security researchers, hobbyists, and SREs/PEs.

The % who have touched kernel bits, written a triangle engine scene renderer, written a compiler, touched server metal in production, worked on ASICs, and can put together ML/AI building blocks shrinks way, way down to a handful of living humans.

9659 805 days ago

If the value really is 0.01%, then the education pipeline needs to be revised. 'Blue collar' programmer positions should be the majority.

coldtea 804 days ago

This is not about blue collar vs white collar. After all, corporate programmers and web programmers can both be blue collar, and systems programmers can be white collar (if we're using "blue collar" to mean smaller salaries and fewer perks; otherwise programming is a white collar job anyway).

This is about how many work in kernels/embedded systems/etc vs more common programming gigs. And that's less about how many are trained to do so, but rather how many are needed.

TheCondor 806 days ago

There are plenty of good uses for linked lists and their variants. LRU lists come to mind; I wouldn't bet that it's the most efficient way to implement them, but they're pretty darn good. Then obviously things like breadth-first search need some type of queue data structure. It often comes down to memory pressure: if you've got gigs to spare, then allocating a contiguous block of memory for a list of something isn't a big deal; if memory is tight and maybe fragmented, linked lists can and will get it done. They have their places.

I did start to encounter some fresh grads with degrees that said "computer science" on them who couldn't answer some basic linked list questions. I was beginning to think it was a bad set of questions until I hit those kids. If you claim to know "computer science" and don't know what a linked list is, especially beyond some textbook stuff, I'm probably not interested.

_xnmw 806 days ago

Why? Why would someone reach for a linked list in a kernel, driver, or embedded system?

sratner 806 days ago

No memory allocation/reallocation, preallocated resources managed in e.g. a free list. Also for things like packetized networks, lists are handy for filling as you progress down the stack while using fixed sized packet buffers, or reassembling fragments.

In the embedded world, memory often needs to be exactly controlled, and allocation failures are fatal without a more complex MMU. In the kernel world, I believe the main reason is that allocations can block.

cyberax 806 days ago

In kernels, it's usually hard to get general-purpose allocation working reliably in all contexts. And you need that for resizable vectors. With lists, you just need to be able to grab an element-sized block. Quite often, it's even done with the memory page granularity.

In addition, a lot of data structures might be shared across multiple cores. Linked lists can be traversed and mutated concurrently (although with a bit of care).

dist1ll 806 days ago

I wonder how much of that is due to the kernel history, and the influence of C idioms, and not because of some inherent design superiority.

I'd be convinced once I see pure Rust kernels geared towards modern machines suddenly using linked lists everywhere. Otherwise I'm leaning towards it being a side-effect of the language choice and culture.

Also because I've seen the same kind of reasoning applied to compilers (e.g. "of course you need linked lists in compilers, they are extremely graph traversal heavy"). But one look at modern compilers implemented in Rust paints a very different picture, with index-based vectors, data-oriented design and flattened ASTs everywhere.

cyberax 805 days ago

Getting a general memory allocator working in kernel contexts is a hard task. You need to make sure it can't block and is reentrant, that it doesn't result in fragmentation, and that it can be used from multiple threads.

It can be solved (or worked around), but it's understandable that people don't _want_ to do that.

kevingadd 806 days ago

Intrusive lists are really powerful for those kinds of scenarios, and technically are linked lists. They're widely used in the kernel, IIRC.

akira2501 806 days ago

O(n) iteration and search, but pretty much guaranteed O(1) insertion and removal at a node you already hold. If those are the semantics you need, then linked lists are your friend.

vineyardlabs 806 days ago

Any time you have a computer interacting with the outside world in an asynchronous fashion you basically have to have some form of buffering which takes the form of a queue/fifo. A linked list is the most performant/natural way of modeling a queue in our ubiquitous computing infrastructure.

E.g., in a DMA-based Ethernet driver, the Ethernet MAC receives packets asynchronously from the processor, perhaps faster than the processor can ingest them. So the MAC interrupts the processor to give it new packets, and the processor can't sit processing the packets in the interrupt context, so it needs to put them into some ordered list for processing later when it has downtime. In a true embedded system, the memory for this list is going to be fixed or statically allocated, but you still don't really want to have an array-style list with fixed indexing, as you'll have to manage what happens when the index wraps around back to 0, etc. So instead you just construct a linked list in that pre-allocated memory.

I wouldn't say linked lists aren't really used in high-level applications. As I said, they're used all over the place whenever you have external asynchronous communication; it's just that modern high-level frameworks/libs totally abstract this away from most people writing high-level code.

dmitry_dygalo 806 days ago

Easier to avoid allocation errors, e.g. in the Linux kernel. I think Alice Ryhl mentioned it here - https://www.youtube.com/watch?v=CEznkXjYFb4

SJC_Hacker 806 days ago

How do linked lists prevent allocation errors? If anything, they would seem to make them worse.

In my experience in embedded, everything is hardcoded as a compile-time constant, including fixed-size arrays (or vectors of a fixed capacity).

sfink 806 days ago

Intrusive linked lists eliminate the allocation entirely. With a vector<Obj>, you have the Obj allocation and then potential vector-related reallocations. With an intrusive linked list, you only have the Obj allocation. So your code that adds/removes list entries does no additional allocation at all, it reuses a pointer or two that was allocated as part of the original Obj allocation. Often the list manipulation happens at a time when allocation failures are inconvenient or impossible to handle.
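
A minimal C sketch of the intrusive idea described above. The names `struct obj`, `list_push`, `list_pop`, and `OBJ_FROM_LINK` are hypothetical, chosen only for illustration. The link lives inside the object, so adding or removing an entry is just pointer writes and never calls an allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Link embedded inside each object; the list never allocates its own nodes. */
struct list_node {
    struct list_node *next;
};

/* Hypothetical object type; the `link` field is what makes it "intrusive". */
struct obj {
    int id;
    struct list_node link;
};

/* Recover the enclosing object from a pointer to its embedded link. */
#define OBJ_FROM_LINK(p) \
    ((struct obj *)((char *)(p) - offsetof(struct obj, link)))

/* Push onto the front of the list: two pointer writes, no malloc. */
static void list_push(struct list_node **head, struct obj *o)
{
    assert(o != NULL);
    o->link.next = *head;
    *head = &o->link;
}

/* Pop the front object, or return NULL if the list is empty. */
static struct obj *list_pop(struct list_node **head)
{
    struct list_node *n = *head;
    if (n == NULL)
        return NULL;
    *head = n->next;
    n->next = NULL;
    return OBJ_FROM_LINK(n);
}
```

The objects themselves can come from anywhere (stack, static storage, a pool), which is why list manipulation stays safe at points where an allocation failure could not be handled.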

sratner 806 days ago

In more complex embedded software you are likely to see free lists used to manage pools of preallocated resources (like event structs etc) or pools of fixed sized memory buffers.

torusle 806 days ago

In embedded, you often need message queues.

A common way to implement these is to have an array of messages, sized for the worst case scenario and use this as the message pool.

You keep the unused messages in a singly linked "free-list", and keep the used messages in a doubly linked queue or FIFO structure.

That way you get O(1) allocation, de-allocation, enqueue and dequeue operations for your message queue.

Another example of this paradigm is job queues. You might have several actuators or sensors connected to a single interface and want to talk to them. The high-level "business" logic enqueues such jobs, and interrupt-driven logic works on these jobs in the background, aka interrupts.

And because you only move some pointers around for each of these operations it is perfectly fine to do so in interrupt handlers.

What you really want to avoid is to move kilobytes of data around. That quickly leads to missing other interrupts in time.
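
The pool-plus-free-list scheme above could be sketched in C roughly as follows. Everything here is an assumption for illustration: the names, `POOL_SIZE`, and the `payload` field are hypothetical, and the queue is a singly linked FIFO with a tail pointer rather than the doubly linked queue mentioned above, which is enough for O(1) enqueue and dequeue at the ends.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4  /* assumed worst-case number of in-flight messages */

struct msg {
    struct msg *next;   /* links the message into the free list or the queue */
    int payload;        /* stand-in for real message contents */
};

static struct msg pool[POOL_SIZE];  /* statically allocated message pool */
static struct msg *free_list;       /* singly linked list of unused messages */
static struct msg *q_head, *q_tail; /* FIFO of messages waiting to be handled */

/* Thread every pool slot onto the free list. */
static void pool_init(void)
{
    free_list = NULL;
    q_head = q_tail = NULL;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* O(1) allocation: take the first free message, or NULL if exhausted. */
static struct msg *msg_alloc(void)
{
    struct msg *m = free_list;
    if (m != NULL)
        free_list = m->next;
    return m;
}

/* O(1) release: put the message back on the free list. */
static void msg_free(struct msg *m)
{
    assert(m != NULL);
    m->next = free_list;
    free_list = m;
}

/* O(1) enqueue at the tail. */
static void msg_enqueue(struct msg *m)
{
    assert(m != NULL);
    m->next = NULL;
    if (q_tail != NULL)
        q_tail->next = m;
    else
        q_head = m;
    q_tail = m;
}

/* O(1) dequeue from the head, or NULL if the queue is empty. */
static struct msg *msg_dequeue(void)
{
    struct msg *m = q_head;
    if (m != NULL) {
        q_head = m->next;
        if (q_head == NULL)
            q_tail = NULL;
    }
    return m;
}
```

Each operation touches only a couple of pointers and never copies the message body. On real hardware, the calls shared between interrupt handlers and the main loop would also need a critical section (for example, briefly masking interrupts), which this sketch leaves out.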

dmitry_dygalo 806 days ago

I'd say most developers don't write kernels/drivers or embeds, at least from what I've seen. I am not saying that there are not many devs like this, but rather that there are fewer kernel devs than web devs.

hi-v-rocknroll 806 days ago

I beg to disagree^2. Tasks, threads, and processes are often structured as rings where there is always a "next" to maintain simplicity of task switching. The overall architecture of resources is modelable as cyclic graphs but implemented as rings, deques, single LLs, and other data structures.
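
As a rough illustration of the "there is always a next" property, here is a hedged C sketch of a circular singly linked ring. `struct task`, `ring_insert`, and `ring_next` are hypothetical names, not taken from any particular kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task descriptor; `next` always points at some task in the ring. */
struct task {
    int id;
    struct task *next;
};

/* Insert `t` right after `current`; with an empty ring, `t` points at itself.
   Returns the (possibly new) current task. */
static struct task *ring_insert(struct task *current, struct task *t)
{
    assert(t != NULL);
    if (current == NULL) {
        t->next = t;
        return t;
    }
    t->next = current->next;
    current->next = t;
    return current;
}

/* Round-robin step: picking the next task is a single pointer hop,
   with no end-of-list case to handle. */
static struct task *ring_next(struct task *current)
{
    assert(current != NULL);
    return current->next;
}
```

Because the ring never ends, the switching code needs no wrap-around branch, which is the simplicity the comment above is pointing at.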

davexunit 806 days ago

I don't do any of those things and I still use lists constantly. Kinda strange to learn that many others don't use them at all it seems.

nequo 806 days ago

What kinds of things are you using them for usually? Is it mostly in C/C++?

waynesonfire 806 days ago

Linked lists shine when you can perform an O(1) remove operation given only a reference to an object on the list. This is very common when using C structs and not possible in Java, for example.
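
A small C sketch of that O(1) removal, assuming a doubly linked list with a sentinel head. `struct conn` and the `dlist_*` helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Doubly linked node embedded in the object; a sentinel head keeps
   removal free of special cases for the first and last element. */
struct dnode {
    struct dnode *prev, *next;
};

struct conn {            /* hypothetical object stored on the list */
    int fd;
    struct dnode node;
};

/* An empty list is a sentinel that points at itself. */
static void dlist_init(struct dnode *head)
{
    head->prev = head->next = head;
}

/* Insert `n` just before the sentinel, i.e. at the tail. */
static void dlist_push_back(struct dnode *head, struct dnode *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* O(1) removal: only the node itself is needed, no search and no head pointer. */
static void dlist_remove(struct dnode *n)
{
    assert(n != NULL && n->next != NULL && n->prev != NULL);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Count the elements by walking until we return to the sentinel. */
static size_t dlist_length(const struct dnode *head)
{
    size_t len = 0;
    for (const struct dnode *p = head->next; p != head; p = p->next)
        len++;
    return len;
}
```

Any code holding a `struct conn *` can call `dlist_remove(&c->node)` directly, without knowing which list the object is on or where it sits in that list.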

adgjlsfhk1 806 days ago

These are usually cases where you want to use a (hash) Set. If you're OK with changing everyone's indexing, the indexing didn't matter.