by alkonaut 2273 days ago
Why is there such little emphasis on “traditional” testing, that is, regular unit tests? At least some portion of the code base is surely suitable for normal unit tests. For example data structures, scheduling algorithms, file systems, ...
5 comments

I think the main reason is that the Linux kernel (like the *BSD kernels) is written in a programming language (C) that doesn't make it easy to do that. Code is often built directly on top of other kernel subsystems without any dependency injection whatsoever. This means that it's still possible to do unit testing of parts of the kernel, but it takes a crazy amount of effort, such as overriding symbols, overriding include paths and providing stub headers, etc.

I am well aware that it's also possible to have dependency injection in C by using structs with function pointers, but I think we can all agree that it's a lot less pleasant to use than C++ abstract base classes, Go interfaces or Rust traits. This is why the Linux kernel only tends to use this sparingly (e.g., inode operations).
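
For anyone who hasn't seen the pattern, here is a minimal sketch of dependency injection in C using a struct of function pointers. Every name in it (`clock_ops`, `is_expired`, `fake_now_ms`) is made up for illustration; none of this is a kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "interface": a function pointer plus an opaque context. */
struct clock_ops {
    uint64_t (*now_ms)(void *ctx);
    void *ctx;
};

/* Unit under test: depends only on the injected clock, not on a global. */
bool is_expired(const struct clock_ops *clk, uint64_t deadline_ms)
{
    return clk->now_ms(clk->ctx) >= deadline_ms;
}

/* Test double: reports whatever time the test stored behind ctx. */
static uint64_t fake_now_ms(void *ctx)
{
    return *(uint64_t *)ctx;
}

void test_is_expired(void)
{
    uint64_t fake_time = 100;
    struct clock_ops clk = { .now_ms = fake_now_ms, .ctx = &fake_time };
    bool expired;

    expired = is_expired(&clk, 150);   /* 100 < 150: not expired yet */
    assert(!expired);

    fake_time = 200;                   /* "advance" the clock */
    expired = is_expired(&clk, 150);   /* 200 >= 150: expired */
    assert(expired);
}
```

It works, but compare the boilerplate (the struct, the `void *ctx`, the cast in the fake) with a one-line interface or trait declaration and you can see why people don't reach for it by default.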

This is probably the reason. I work at a place where most software is written in C, and I see the same thing here: literally all tests are either manual or integration tests. Unfortunately, this also means that it takes about half a day to 'run the tests'.
I work on IOS-XR, a router OS written in C.

I agree: UT is a pain to write precisely because you need to spend quite a bit of effort to stub out your dependencies. And when you do stub things out, they usually end up being “dumb” stubs where the function just returns EOK. Thankfully, there has been a recent effort in XR to leverage the Cmocka test framework to make stub functions a bit smarter.
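
To show the difference between a "dumb" stub and a slightly smarter one, here is a hand-rolled sketch in plain C. It does not use Cmocka (whose `will_return()`/`mock()` pair serves a similar purpose), and it is not XR code: `hw_write`, `stub_push_result`, `write_with_retry` and the value of `EIO_ERR` are all invented for the example. The stub hands out queued return codes first and falls back to EOK, so a test can force the failure paths.

```c
#include <assert.h>
#include <stddef.h>

#define EOK      0
#define EIO_ERR  5   /* hypothetical error code, chosen for this sketch */

#define STUB_QUEUE_MAX 8
static int stub_results[STUB_QUEUE_MAX];
static size_t stub_count;  /* number of queued results */
static size_t stub_next;   /* next queued result to hand out */
static int stub_calls;     /* how many times the stub was called */

void stub_reset(void)
{
    stub_count = 0;
    stub_next = 0;
    stub_calls = 0;
}

void stub_push_result(int rc)
{
    assert(stub_count < STUB_QUEUE_MAX);
    stub_results[stub_count++] = rc;
}

/* Stub standing in for a real dependency in the test build. */
int hw_write(int value)
{
    (void)value;
    stub_calls++;
    if (stub_next < stub_count)
        return stub_results[stub_next++];
    return EOK;  /* queue empty: behave like the "dumb" stub */
}

/* Unit under test: retries a failed write exactly once. */
int write_with_retry(int value)
{
    int rc = hw_write(value);
    if (rc != EOK)
        rc = hw_write(value);
    return rc;
}

void test_retry_paths(void)
{
    int rc;

    stub_reset();
    stub_push_result(EIO_ERR);   /* first attempt fails, second succeeds */
    rc = write_with_retry(42);
    assert(rc == EOK);
    assert(stub_calls == 2);

    stub_reset();
    stub_push_result(EIO_ERR);   /* both attempts fail */
    stub_push_result(EIO_ERR);
    rc = write_with_retry(42);
    assert(rc == EIO_ERR);
    assert(stub_calls == 2);
}
```

With the always-EOK stub, the retry branch in `write_with_retry` would never run in UT at all.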

Even if you have great UT, there is a bigger issue: the UT only tests and validates your code, but does not validate interactions with other components. With a system as complex as IOS-XR, there are non-trivial situations that you simply cannot trigger with UT.

This is where IT shines, imo: you can bring up a full router and test all known interactions at the system level. The test runtime is much longer, of course, but in my experience, it’s worth the wait to avoid hitting the issue down the line.

You don’t need a dependency injection framework to write unit tests, you just need cleanly separable units with well defined interfaces.
Note that I am not saying you need a dependency injection framework (like Google Guice/Dagger for Java); I’m merely talking about dependency injection as a concept.

Abstract base classes, interfaces and traits allow you to add dependency injection with relatively little code. In C it is simply more of a hassle, which is why folks don’t tend to do it.

And, in the absence of a dependency injection framework, it's likely that the units are not cleanly separable - because, without a DI framework, all classes are (presumably?) instantiating their dependencies directly.

Unless I've missed something? I've only ever worked in Java so maybe things are different in C-world.

> without a DI framework, all classes are (presumably?) instantiating their dependencies directly. ... I've only ever worked in Java so maybe things are different in C-world,

Well, for one thing, there are no classes in C. :) It is possible but unfun to emulate them with function pointers. Iiuc, little of the Linux kernel is written in that style.

Also, FYI, for many years we did DI without frameworks, using the factory pattern and other techniques. It wasn't always fun but it can certainly be done without Spring or whatever the new thing on the block is.

> Well, for one thing, there are no classes in C. :) It is possible but unfun to emulate them with function pointers. Iiuc, little of the Linux kernel is written in that style.

object-structs with function-pointers-for-methods are super-common in the Linux kernel and basically used everywhere for everything where modules can plug something into the kernel (e.g. virtually all drivers have at least one of these).

Thanks for the correction. I was going off the little bit of Linux code I've read, which seems to call most functions directly. And also another comment on this story. I don't know what to think now.
This is a major advantage of NetBSD's Rump kernels that are used for automated testing. Some people have tried doing the same for Linux but I'm not sure if any such efforts are still in progress.

  > it takes a crazy amount of effort
I agree with basically everything you've said but I don't buy that it takes a crazy amount of effort to do anything. You have C. If it's hard to do in C, you have a Makefile. If it's hard to do with a Makefile, you can run a script during the build process. Anything can be streamlined.

  > it's also possible to have dependency injection in C by using structs with function 
  > pointers, but I think we can all agree that it's a lot less pleasant to use than C++ 
  > abstract base classes
I hate function pointers, and void* context pointers even more, so I wrote macros to do binary search and sorting so I didn't have to pass a void* to qsort(3) and bsearch(3) (also, bsearch(3) doesn't tell you the insertion point of a missing element)

If you want to sort an array:

  int arr[] = {5, 10, 15, 17, 20};
  size_t size = sizeof(arr) / sizeof(*arr);
  QSORT(arr, size, arr[a] < arr[b]);
If you want to find the value 5 in that array:

  ssize_t index;
  BSEARCH_INDEX(index, size, arr[index] - 5);
  // Now 'index' has the result.
With regards that anything can be streamlined: sure, but it’s also about the amount of investment that would take. You could spend days or weeks to automate all of this for C. Meanwhile for Go there exists a tool called ‘mockgen’ (https://github.com/golang/mock) that can automatically stamp out mocks for any interface type declared in code. Not just for the ones in your codebase, literally arbitrary ones: interfaces that are part of the Go standard library, ones that are declared in third-party dependencies.

The fact that you hate function pointers and void* context pointers is an exact confirmation of my premise: people think it’s too much of a hassle.

  > With regards that anything can be streamlined: sure, but it’s also about the amount of investment 
  > that would take.
Yes, I can't deny there is more up-front cost in C for some things.

  > The fact that you hate function pointers and void* context pointers is an
  > exact confirmation of my premise: people think it’s too much of a hassle.
My point was that there's usually a better way to get around a language's (in this case, C) limitations, and it's not necessarily macros every time. At least for the problem of abstract base classes, I rather liked your hinting of the linker swapping out the desired implementation for test binaries. That makes sense, since I think I've never seen an abstract base class (which is abstract for testing purposes) have more than one implementation per binary.

As for mocks, the fact that they're hard to do in C may be a feature in disguise...

The Linux kernel project predates what you call traditional unit testing practices†. Its success in the first decade and a half, coupled with the pragmatics of hardware testing, made the workflow what it is now.

† Regression testing was certainly known then, but it was not a dogmatic movement yet.

I understand there are hurdles due to legacy, language, low level etc. But if 1 or 2% of a huge code base is easily testable, shouldn’t it be tested? At least if/when regressions are found in functionality that is easily testable (pure functions etc.) it would seem prudent to add regression tests to prevent the thing from happening again. Even in a 30 year old C code base.
You could test data structures (which tend to be implemented using macros), but any nontrivial subsystem of a monolithic kernel lives in a web of dependencies with other subsystems. This is especially true of Linux.

To solve this, you would need to either use a hierarchical decomposition of subsystems or do some crazy mocking to run subsystems outside of the full kernel.

There is a recent project (KUnit) to add a unit testing framework to the Linux kernel, but it remains to be seen how much adoption it will get.

NetBSD has their rump kernel tech, which specifically exists to run chunks of kernel code independently. It can do a lot of neat things (I liked the "run netbsd drivers on other OSs" trick, personally), but one of the big uses they've mentioned is that it helps testing and development.
A lot of the Linux kernel is implemented as modules, which does allow said modules to be run under different OSes. You could probably use the module interface to implement unit testing.
I think many kernel developers view that as a problem as well. KUnit hopefully will change this when it becomes more widely used. I recommend more kernel developers check it out, it's quite nice even in its current form!
How do I test a driver for which I do not possess the hardware?
You write the software in such a way that, instead of reading and writing registers or memory directly, it goes through some set of functions. In normal operation you pass the driver a set of real functions that read and write real registers. In testing you pass functions that do other things. This makes it quite easy to exercise the features of the hardware that are rarely seen in the wild. For example, most IO adapters and NICs have some kind of signal that they are overheating. Most Linux drivers simply ignore these conditions, or malfunction when they are raised, because the author of the driver never got a chance to manually exercise that feature.
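
A rough sketch of what that could look like in C, for an imaginary NIC. The register offsets, the bit names and the `reg_ops`/`nic_poll` names are all assumptions made up for this example; no real device or Linux driver API is being described.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register layout for an imaginary NIC. */
#define REG_STATUS       0x00u
#define REG_CTRL         0x04u
#define STATUS_OVERHEAT  (1u << 3)
#define CTRL_TX_ENABLE   (1u << 0)

/* The set of functions the driver uses instead of touching registers. */
struct reg_ops {
    uint32_t (*read32)(void *ctx, uint32_t offset);
    void (*write32)(void *ctx, uint32_t offset, uint32_t value);
    void *ctx;
};

/* Driver logic: disable TX when the device reports overheating.
 * Returns true if it had to throttle. */
bool nic_poll(const struct reg_ops *io)
{
    uint32_t status = io->read32(io->ctx, REG_STATUS);
    if (!(status & STATUS_OVERHEAT))
        return false;

    uint32_t ctrl = io->read32(io->ctx, REG_CTRL);
    io->write32(io->ctx, REG_CTRL, ctrl & ~CTRL_TX_ENABLE);
    return true;
}

/* Fake device for tests: the "registers" are a plain array. */
struct fake_nic {
    uint32_t regs[2];
};

static uint32_t fake_read32(void *ctx, uint32_t offset)
{
    struct fake_nic *dev = ctx;
    return dev->regs[offset / 4];
}

static void fake_write32(void *ctx, uint32_t offset, uint32_t value)
{
    struct fake_nic *dev = ctx;
    dev->regs[offset / 4] = value;
}

void test_nic_poll(void)
{
    /* Overheating: the driver must clear the TX enable bit. */
    struct fake_nic hot = { .regs = { STATUS_OVERHEAT, CTRL_TX_ENABLE } };
    struct reg_ops io = { fake_read32, fake_write32, &hot };
    bool throttled = nic_poll(&io);
    assert(throttled);
    assert((hot.regs[REG_CTRL / 4] & CTRL_TX_ENABLE) == 0);

    /* Normal temperature: the control register must be left alone. */
    struct fake_nic cool = { .regs = { 0, CTRL_TX_ENABLE } };
    io.ctx = &cool;
    throttled = nic_poll(&io);
    assert(!throttled);
    assert(cool.regs[REG_CTRL / 4] == CTRL_TX_ENABLE);
}
```

In the real driver the same `reg_ops` would be filled with functions that do the actual MMIO; the overheat path gets exercised without ever cooking a card.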

This is basic design for unit testing but it's impossible in Linux because Linux lacks a zero-cost abstraction that would let you mock out a device. C only has costly abstractions such as tables of function pointers.

You can compile an object file in isolation and provide mocked implementations for all imported functions.
You test the logic in unit tests with any hardware interaction mocked. Even if you can't test a lot of the driver, there is surely some logic (data structure manipulation, buffer construction, etc) that you have factored out into testable functions.
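
As a sketch of what "logic factored out into testable functions" might mean for buffer construction: the 4-byte header format below is invented for the example and does not correspond to any real device. The point is only that the function touches no hardware, so it can be tested like any other pure function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDR_LEN 4u

/* Builds a hypothetical frame header: type, flags, big-endian payload length.
 * Returns the number of bytes written, or 0 if the buffer is unusable. */
size_t build_header(uint8_t *buf, size_t buf_len,
                    uint8_t type, uint8_t flags, uint16_t payload_len)
{
    if (buf == NULL || buf_len < HDR_LEN)
        return 0;

    buf[0] = type;
    buf[1] = flags;
    buf[2] = (uint8_t)(payload_len >> 8);
    buf[3] = (uint8_t)(payload_len & 0xffu);
    return HDR_LEN;
}

void test_build_header(void)
{
    uint8_t buf[HDR_LEN] = { 0 };
    size_t n;

    n = build_header(buf, sizeof buf, 0x07, 0x01, 0x0102);
    assert(n == HDR_LEN);
    assert(buf[0] == 0x07 && buf[1] == 0x01);
    assert(buf[2] == 0x01 && buf[3] == 0x02);   /* length is big-endian */

    n = build_header(buf, 3, 0x07, 0x01, 0);    /* buffer too small */
    assert(n == 0);
}
```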
What if there's no mock for the hardware? Whoever wrote the driver didn't supply one. Is it better to not accept drivers unless there are mocks? Do you know how few drivers Linux would have in that case?

FWIW, I agree 100% with you. It's just simply not the way the world works.

You write your own simple mocks?
I should add that AWS is the only thing I’ve ever mocked that had third party mock tools available; everything else I’ve ever worked on required us to write our own. I’ve never written device drivers, so I’m not arguing that it’s easy or common to do, just that it’s what I would do, at least as much as possible.
Write hardware emulation for VM with any behaviour you want to test.
Then I end up with a driver that conforms to emulated hardware and not its real counterpart.
It seems like that's a concern with any testing strategy that mocks out some part of the system. Obviously there's no getting around actually testing against the hardware, but it seems like it could still be useful for the same reason tests with mock implementations are useful generally.
That's how unit tests work.
You port the driver to Rust, obviously. Then run the driver in a docker container that communicates with a serverless unit testing framework written in node.js via JSON commands.