Types of Testing
The difference between unit, functional, and system tests
October 19, 2018
Over the years, I've seen many discussions of testing get hung up on what the different kinds of testing are, what they're for, and which ones are more important. I'm not going to try to define these terms for anyone else but, as context for other things I'm likely to write in the future, I'm going to explain how I tend to think about them.
- A unit test is about testing many code paths through a single coherent body of code - such as a library or small module within a larger system. A good unit test ensures that the code being tested will continue to function correctly even when other code starts calling it differently.
- A functional test is about testing fewer but longer code paths through multiple components - but not necessarily the whole system. A good functional test ensures that the same operation will continue to work end to end, even if other operations involving the same components might fail (hopefully they're covered by other functional tests).
- A system test is about two things. First, in keeping with the theme developed so far, it's about testing many long code paths, with high levels of concurrency and repetition (and sometimes plain old wall-clock time) to catch problems that unit and functional tests can't. Second, it's about testing the real system - real hardware, real scale, real workloads, etc. If it's unreal, it's only because the test is working the system harder than any user is ever likely to (though they could).
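As a rough illustration of the first two definitions, here is a minimal Python sketch. Everything in it (parse_quantity, Inventory, receive_shipment) is a hypothetical stand-in invented for this example, not code from any real system. The unit test pushes many inputs through one small function; the functional test follows a single operation through two components.

```python
def parse_quantity(text):
    """Hypothetical unit: parse '3' or '3 kg' into (count, unit)."""
    parts = text.split()
    if len(parts) not in (1, 2):
        raise ValueError(f"expected '<count> [unit]', got {text!r}")
    count = int(parts[0])  # int() raises ValueError for non-numeric text
    if count < 0:
        raise ValueError("count must not be negative")
    unit = parts[1] if len(parts) == 2 else "each"
    return count, unit


class Inventory:
    """Hypothetical second component: an in-memory stock table."""

    def __init__(self):
        self._stock = {}

    def add(self, item, count):
        self._stock[item] = self._stock.get(item, 0) + count

    def count(self, item):
        return self._stock.get(item, 0)


def receive_shipment(inventory, item, text):
    """One operation that crosses both components."""
    count, _unit = parse_quantity(text)
    inventory.add(item, count)
    return inventory.count(item)


def test_parse_quantity_unit():
    # Unit test: many short paths through a single body of code.
    assert parse_quantity("3") == (3, "each")
    assert parse_quantity("  3 kg ") == (3, "kg")
    for bad in ["", "abc", "-1", "1 2 3"]:
        try:
            parse_quantity(bad)
        except ValueError:
            continue
        raise AssertionError(f"{bad!r} should have been rejected")


def test_receive_shipment_functional():
    # Functional test: one longer path, parser plus inventory together.
    inventory = Inventory()
    assert receive_shipment(inventory, "bolts", "3") == 3
    assert receive_shipment(inventory, "bolts", "2 box") == 5


if __name__ == "__main__":
    test_parse_quantity_unit()
    test_receive_shipment_functional()
```

A system test has no equivalent at this scale, since by the definition above it needs the real system under real load.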
The key point is that all three matter. The importance of system tests should be obvious, but by their nature they tend to be very time-consuming and resource-intensive. Unit and functional tests are supposed to provide orthogonal cross-sectional views of the code, and do so quickly, to avoid those costs. People's time and attention being finite, there often seems to be a tension between the two, but forgoing either is a bad idea. Either way, you're likely to end up with errors being caught in system test that should have been caught sooner and more cheaply.
The one remaining question is: how real should the environment be for functional tests? Should it be more like unit tests, with lots of things mocked or stubbed out? Or should it be more like system tests, on real (usually meaning distributed) hardware? That question is why I haven't mentioned integration tests yet. Some people use the term as a synonym for system test. Some people use it for a set of functional tests at the more realistic end of that spectrum. I don't use it, because of that ambiguity.
I don't think there has to be only one answer to the question of how real a functional test should be, but in keeping with the intent that they be quick and cheap I tend toward "as real as they can be on a single system" as the default answer. Maybe ordering/timing conditions that would occur in a more realistic environment won't happen on a single system, but even a full system test can miss those. Doing the extra work to force ordering and timing is an important part of uncovering race conditions, and it's entirely possible even in a quick-turnaround single-system functional test.
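To make "forcing ordering" concrete, here is a small Python sketch of one way it might be done on a single machine. The Counter class and its after_read hook are hypothetical, invented for this example. The idea is simply that the test pauses one thread between its read and its write, so the bad interleaving happens on every run instead of once in a blue moon.

```python
import threading


class Counter:
    """Hypothetical component with an unprotected read-modify-write."""

    def __init__(self, after_read=None):
        self.value = 0
        # Test hook (an assumption of this sketch): called between the
        # read and the write so a test can hold a thread at that point.
        self._after_read = after_read or (lambda: None)

    def increment(self):
        current = self.value
        self._after_read()
        self.value = current + 1


def lost_update_under_forced_ordering():
    first_has_read = threading.Event()
    second_is_done = threading.Event()
    calls = []

    def hook():
        calls.append(1)
        if len(calls) == 1:
            # Only the first caller is paused, right after its read.
            first_has_read.set()
            second_is_done.wait(timeout=5)

    counter = Counter(after_read=hook)
    first = threading.Thread(target=counter.increment)
    first.start()

    first_has_read.wait(timeout=5)  # first thread has read value == 0
    counter.increment()             # second increment runs to completion
    second_is_done.set()            # let the first thread do its stale write
    first.join(timeout=5)
    return counter.value


if __name__ == "__main__":
    # Two increments happened, but the forced ordering loses one of them.
    print(lost_update_under_forced_ordering())
```

Two increments ran, yet the counter ends at 1 rather than 2. The same pair of increments without the hook would almost always produce 2, which is exactly why this kind of bug slips past tests that leave timing to chance.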