KUnit and Assertions
KUnit has been seeing a lot of use and development recently. It's the kernel's new unit test system, introduced late last year by Brendan Higgins. Its goal is to enable maintainers and other developers to test discrete portions of kernel code in a reliable and reproducible way. This is distinct from the forms of testing that rely on the behavior of the system as a whole and so do not always produce identical results.
Lately, Brendan has submitted patches to make KUnit work conveniently with "assertions". Assertions are like conditionals, but they're used in situations where only one possible condition should be true. It shouldn't be possible for an assertion to be false. And so if it is, the assertion triggers some kind of handler that the developer then uses to help debug the reasons behind the failure.
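To make the idea concrete, here is a minimal user-space sketch of an assertion that calls a handler when its condition turns out to be false. It is an illustration only: CHECK_INVARIANT, report_failure, failure_count and divide are hypothetical names invented for this example, and none of this is KUnit's or the kernel's actual code.

```c
#include <stdio.h>

/* Hypothetical names throughout: CHECK_INVARIANT and report_failure are
 * illustrative only and are not part of KUnit or the kernel. */

static int failure_count;   /* how many assertions have failed so far */

/* Handler invoked when an assertion is false: it records where the
 * failure happened so the developer can investigate. */
static void report_failure(const char *expr, const char *file, int line)
{
    failure_count++;
    fprintf(stderr, "assertion failed: %s (%s:%d)\n", expr, file, line);
}

/* Evaluate cond; if it is false, pass its source text to the handler. */
#define CHECK_INVARIANT(cond) \
    do { \
        if (!(cond)) \
            report_failure(#cond, __FILE__, __LINE__); \
    } while (0)

/* A function whose callers are never supposed to pass a zero divisor. */
static int divide(int a, int b)
{
    CHECK_INVARIANT(b != 0);
    return b != 0 ? a / b : 0;   /* fall back to 0 instead of crashing */
}
```

Calling divide(6, 3) returns 2 and leaves failure_count untouched, while divide(1, 0) reports the failed assertion, increments failure_count and returns the fallback value.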
Unit tests and assertions are to some extent in opposition to each other: a unit test could trigger an assertion when the intention was only to exercise the code being tested. Likewise, if a unit test does trigger an assertion, it could mean that the underlying assumptions made by the unit test can't be relied on, and so the test itself may not be valid.
In light of this, Brendan submitted code for KUnit to be able to break out of a given test, if it triggered an assertion. The idea behind this was that the assertion rendered the test invalid, and KUnit should waste no time, but proceed to the next test in the queue.
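One way to picture "break out of the test and move on to the next one" is the user-space sketch below, which uses the standard C setjmp/longjmp pair. This is an assumption made for illustration, not a description of how KUnit implements the feature, and every name in it (TEST_ASSERT, run_tests, test_passes, test_fails_early and so on) is hypothetical.

```c
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical mini test runner; none of these names come from KUnit. */

static jmp_buf abort_point;      /* where a failed assertion jumps back to */
static int steps_after_failure;  /* stays 0 if aborting works as intended */

/* Leave the current test immediately if the condition is false. */
#define TEST_ASSERT(cond) \
    do { \
        if (!(cond)) { \
            fprintf(stderr, "assertion failed: %s\n", #cond); \
            longjmp(abort_point, 1); \
        } \
    } while (0)

static void test_passes(void)
{
    TEST_ASSERT(1 + 1 == 2);
}

static void test_fails_early(void)
{
    TEST_ASSERT(2 + 2 == 5);     /* false: control leaves the test here */
    steps_after_failure++;       /* never reached */
}

typedef void (*test_fn)(void);

/* Run every test in order and return how many were aborted. */
static int run_tests(const test_fn *tests, int count)
{
    /* volatile keeps these locals well defined across longjmp */
    volatile int aborted = 0;
    volatile int i;

    for (i = 0; i < count; i++) {
        if (setjmp(abort_point) == 0)
            tests[i]();          /* normal path: run the test */
        else
            aborted++;           /* reached via longjmp from TEST_ASSERT */
    }
    return aborted;
}
```

Given the array { test_fails_early, test_passes }, run_tests reports one aborted test, the second test still runs, and steps_after_failure stays at zero because the failing test never continues past its assertion.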
There was nothing particularly controversial in this plan. The controversial part came when Frank Rowand noticed that Brendan had included a call to BUG() in the event that the unit test failed to abort when instructed to do so. That particular situation never should happen, so Brendan figured it didn't make much difference whether there was a call to BUG() in there or not.
But Frank said, "You will just annoy Linus if you submit this." He pointed out that the BUG() was a means to produce a kernel panic and hang the entire system. In Linux, this was virtually never an acceptable solution to any problem.
At first, Brendan just shrugged, since as he saw it, KUnit was part of the kernel's testing infrastructure and, thus, never would be used on a production system. It was strictly for developers only. And in that case, he reasoned, what difference would it make to have a BUG() here and there between friends? Not to mention the fact that, as he put it, the condition producing the call to BUG() never should arise.
But Frank said this wasn't good enough. He said that whether or not you felt that KUnit belonged in production systems, it almost certainly would find its way into production systems in the real world. That's just how these things go. People do what isn't recommended. But even if that were not the case, said Frank, non-production systems likewise should avoid calling BUG(), unless crashing the system were the only way to avoid actual data corruption.
Brendan had no serious objection to ditching the call to BUG(). He was just posing questions, because it seemed odd to him that there would be any problem, and he was fine with removing it.
So the feature remains, while the error handling will change. An interesting thing about this particular debate is that it underscores the variety of conflicts that can emerge with so many debugging and error-handling aspects of the kernel. All sorts of conflicts and race conditions might emerge.
For example, a developer might write a new driver and want to test how it behaves under heavy load. So they'll run a memory-intensive process while using their driver, only to discover that the kernel's out-of-memory (OOM) killer kills the process generating the load, before the key test situation can be triggered within the driver.
It's amazing to consider the sheer quantity of testing and debugging features that have encrusted themselves on every aspect of the Linux kernel development process. Even Git, the revision control system created by Linus Torvalds specifically to host kernel development, is itself a debugging tool that makes it possible to identify and, if necessary, revert changes that turn out to cause a problem. In addition to everything else, there is also a wide array of automated systems running within a variety of private enterprises. Some of those load up running systems with particular workloads; some read the source code directly, looking for patterns. It's impossible to know the full variety and extent of testing that the Linux kernel receives on a daily basis.