A few words about (unit) testing

Purpose of testing

I think every developer knows something about different types of testing (unit, integration, functional / acceptance, regression, etc.). Add to this performance testing, black-box testing and you get sizable family of possible tests to write. If you start differentiating UI testing as a separate functional testing, the family grows even more. First question most non-technical fellows will ask – why bother? They would claim that testing increases development cost (by increasing time) without bringing much value. To me the answer is: confidence. If the product is not tested, I have no confidence when introducing fixes & improvements. Yes, there are tests that are faster to write and more complex one (how do you test if the web page looks as expected? if the style is correctly applied, etc). There is always a trade off, and in some applications (or should I say in most applications), you will have to have some kind of manual testing. However, almost always unit testing + a bit of integration testing (test interactions between elements in your code) is a right thing to do.

I especially don’t like differentiating regression testing, because to me, any test is a regression test itself.

Testing is great because:

  • It protects against unwanted regression.
  • Is the most powerful documentation. People underestimate the power of the tests. When I start reading the method, I usually start with reading tests for it. If I wonder how it behaves in border case, I look in the test. If I am not sure of behavior in specific condition – I would add the test for this condition (so that the next person reading the code and having the same question could simply take a look at the test).
  • It allows us to be confident, when doing refactoring. It should also give us power to remove old code. We live in the world of source control systems, and people are still afraid of removing old, not used code. The problems with old code is that it increases maintainability cost and creates confusion.

And remember: “testing can only show the presence of errors, not their absence” (Dijkstra)

This article comments on couple of issues we often forget about.

Bug? Start with failing test!

I cannot stress it out enough. Whenever you implement a bug fix, you are often trying to reproduce it. Why not reproducing it as a unit test? By doing so you might prevent further regression. By writing a unit test you are making it obvious to other developers what is the border case / bug scenario, because they can simply look into the test. It also makes code review easier (you are doing code reviews, aren’t you?) – you don’t need to write a comment describing specific scenario in details – write a general comment, and describe a specific scenario as a unit test. Code speaks a thousand of words.

I will never trust the test I have not seen failing. (Martin Fowler)

This is a very true sentence. It is not about tests that are already in the codebase. You should never trust the test that you have written and it passed, without seeing it failing first. There are many reasons for that, but the primary one is that people make mistakes. Let’s say you are fixing a bug, and after that you are writing a regression test for a bug. Do the other way round. Start with the test that shows the bug, and then, and only then, write a code that fixes it. Sometimes I catch myself to do the opposite – fix a bug and then write a test (this only happens when I think the bug is obvious). What I do in such scenarios is that after I fix the bug and write a test, I revert the fix, run the test to make sure it actually shows the bug (meaning that it is failing), and only then apply the fix again. You would be surprised how many times the test actually passes even without a bugfix, because of some teeny tiny omission. It is easy, especially in big codebase, to forget about something, or to not know about something.

Always write a failing test first, and then make a fix. Doing other way round doesn’t guarantee that the test actually tests your bug fix, or any of your code.
This article is written from the point of view of platform building, however I think most of the rules apply to all types of projects

TDD

Writing about TDD would be a separate story. But basically, if you are following the point above – writing failing unit test before writing a code that implements it, you made a great step towards TDD. TDD – Test Driven Development (or it’s slightly more customer oriented younger brother: Behavior-driven development) is about stating expectations and only then implementing a change. It leads to more maintainable code (because you work in small chunk of code, and by making it easily testable you are making it actually greatly extensible) that is well tested. Most of the problems mentioned in this article can be avoided by applying TDD. Give TDD a chance!

Code coverage – a holy grail or great lie?

Code coverage has many variations. Function, Line (statement) coverage, branch coverage, condition, etc. Examples below will be focused on the “line coverage”, but some of them are still valid when branch coverage is in mind.
On one side, code coverage is great – it forces the most stubborn developers to write tests. However, there is also harm done – we get too confident when we see high code coverage and also developers who are forced to write tests, do not generally do a great job when writing them.

Whenever changing a method, you have to change or add the test

If you don’t, it means that your change is not tested. You can claim the coverage is still the same (see examples below), however the truth is, that your code is not truly tested. I like to ask developers, what will happen, if I remove the line that they just have added. Would tests still pass? I don’t like untested code – I often do refactoring and remove unused code, or improve code that is not very readable or inefficient (it is often my own code!).

Let’s say you have following code:

public static bool IsPrimeBuggy(int n) {
    for(int i = 2; i < Math.Sqrt(n); i++) {
        if(n % i == 0) {
            return false;
        }
    }

    return true;
}

There are couple of issues with this code, but let’s focus on one. Let’s say we have following test cases:

[TestCase]
public void IsPrime()
{
    Assert.True(PrimeHelper.IsPrimeBuggy(2));
    Assert.True(PrimeHelper.IsPrimeBuggy(3));
    Assert.True(PrimeHelper.IsPrimeBuggy(5));
}

[TestCase]
public void IsNotPrime()
{
    Assert.False(PrimeHelper.IsPrimeBuggy(6));
}

The moment you realise, that it doesn’t work correctly for 4, you are tempted to change for loop to

for(int i = 2; i <= Math.Sqrt(n); i++) {

All of your tests will still be passing, you coverage number won’t change, will be 100%, so you are ready to submit your code to repository, aren’t you?
To me however, you could as well not introduce the change at all! What you should have done was to add a test:

Assert.False(PrimeHelper.IsPrimeBuggy(4));

Always do border case testing

This truth is well known, yet often not applied to real world. In above IsPrimeBuggy implementation, we didn’t cover scenarios for n = 0 and n = 1. Adding tests for those scenarios might not affect code coverage, but will make it clear what is the expected output. This is important as a documentation, but also this is a must have from correctness point of view.

Transitiveness of testing

Another argument I often hear is that “this code is covered” even when there are no explicit tests for it. It gets even worse, if the code is covered in tests of another assembly.
Let’s say you write utility function, that returns the longest element of the string enumerable (don’t judge me here by O() complexity of this method):

public static string Longest(this IEnumerable<string> elements)
{
    return elements.OrderByDescending(x => (x ?? string.Empty).Length).FirstOrDefault();
}

and that you have another function, that uses it, for instance:

public static string LongestOfTwoCollections(IEnumerable<string> elements1, IEnumerable<string> elements2)
{
    return elements1.Concat(elements2).Longest();
}

and test for this method:

[TestCase]
public void LongestOfTwoCollections()
{
    Assert.AreEqual("foo", EnumerableExtensions.LongestOfTwoCollections(new[] { "foo" }, new string[0]));
}

There are multiple issues with this code, if this is your only test:

  • If you decide to remove the LongestOfTwoCollections method, your Longest method will not be tested at all.
  • LongestOfTwoCollections is not thoroughly tested – again, even though code coverage is 100%, we are not even testing if it can return element from second collection.
  • Code coverage for Longest is 100%, even though it is not unit tested at all! It is only tested, as I call it, transitively – because the fact if it is tested depends on the test of other method. The author of other method might realise, that there is a more efficient way to implement given function, and he rewrites an implementation of the function. Because the function was tested, all he needs to ensure is that all tests are still passing. However, the implementation of the original method is not tested any longer. This is a serious issue, and is very often neglected by developers.
  • What should be the result if collection is empty?

I usually assume, that each public method should be tested.

Each test is a regression test

This has been already stated in the introduction, but each test you write automatically becomes a regression test. If you cover border cases, you place you and your team in a great position of well maintained code. Code cleanups, refactorings and optimizations will be relatively safe to do.

Don’t underestimate the power of assertions

Developers spend lot of time running a debug build of your product, instead of retail (release) build. They often run it with debugger attached, and they understand the system. I like putting assertions for invariants, especially where complex interactions occur. Thing to bear in mind that assertions should never replace tests. Sometimes however, it is impossible to test all cases (or you might think you tested them all, but it might be possible that some paths in your method are still not covered). By making assertions on things that should always be true (invariants), you are essentially running more testing in your applications – because you test it on real data. One can ask: why would you put Debug.Assert instead “if” statement and casting exception? There are two reasons for that: if assertion truly checks the invariant, covering given scenario in unit test and verifying exception can be extremely hard, and you would probably be writing dead code (code that is never executed). Second reason is that assertions can introduce performance hit. I found myself writing assertion that was comparing content of two arrays, or iterating through string to ensure that some sequence of characters was not found.

See more

Takeouts

  • Always write tests that shows the bug you are fixing. Do it before fixing the bug.
  • Write those bugs-showing tests before actually fixing the issue!
  • Write unit tests for your code. Do not rely on other code calling your code as a test. That code can be changed or removed, and your code will not be tested any more
  • Use code coverage with caution – it is a great tool, but it also can lie to your face
  • Test border cases. Always.
  • Treat failing test as failing build. Having a failing test in a code base impacts other developers. They cannot reliably write new code and run tests. If possible, consider making tests part of the build in your continuous integration server. If it is possible, have a “trunk” branch that is always stable, and changes are merged automatically to it only when they passed build on CI server.
  • Consider having code coverage gating. Keep in mind it is not ideal, but it forces the most stubborn ones.

You can find code samples in the github project

Leave a Reply