Wednesday, March 10, 2010

TDD Tidbits: Your unit tests suck (and so does your design)!

TDD is a design process but there are skills required outside of its discipline that you need to master as well. One such skill is writing proper unit tests. Prior to TDD I thought I was unit testing. I really wasn't for a number of reasons. The simple answer is that my design wasn't permitting unit testing and what I was calling "unit" tests were actually integration tests. Classes were tightly coupled, methods were more procedural than OO, I didn't do a good job of separating responsibilities/concerns, etc.

A unit test has some simple requirements to fulfill. Here's a list off the top of my head.
  • The test must be fast
  • The test must be repeatable
  • The test must be predictable
  • Only one assertion per test
  • The test must isolate the behavior under observation
Easy, right? You'd think so. I'd heard these in some shape or form before I began to do TDD but they never really hit home. I've seen a couple of blog posts with lists like this one, so I'll try to describe my own thoughts on each of the items above. I could go on for hours about them, so I'll try to keep it brief.

DISCLAIMER: The following code examples are extremely naive and serve only to demonstrate.

The test must be fast

Unit tests are meant to be run frequently. You have them local to your development workstation and they (should) run as part of your continuous integration. Unit tests are regression tests and your first line of defense. They report to you when something is no longer behaving the way it's supposed to. You need that feedback immediately and there's absolutely no reason for them to be slow.

For any system of significant complexity you don't want to wait 30 or even 10 minutes to find out that a bug was introduced. Unit tests should execute in milliseconds and a whole suite of them should run within seconds. For example, a project I had at work had around 300 tests that would execute in 4 to 5 seconds. That's the type of speed you should be aiming for.

Having tests that run in the timespan of minutes immediately introduces context switching. What happens when you kick off a full system build for a project that may take a minute or two or more to complete? You open up your browser and see what's going on in the Twitterverse, Facebook, StackOverflow, etc. You don't want that to happen for your testing. It's good to keep focused on the task and to plug along without too much interruption.

The test must be repeatable

This one is simple. I can run a unit test as many times as I want and get the same result every time. Take the following as an example of a test that is not repeatable. This is a domain of rabbits. You have some rabbits and you add them to a collection of rabbits. It's a cruel domain and no two rabbits may have the same name. When you run this test a second time it will fail since it's picking up state from the last run. "Thumper" will already be in the database so adding him again will cause an error.


using System;
using System.Linq;
using NHibernate;
using NHibernate.Linq;
using NUnit.Framework;

[TestFixture]
public class RabbitFixture
{
    [Test]
    public void Should_create_rabbit()
    {
        var rabbitRepo = new RabbitRepository();

        rabbitRepo.Add(new Rabbit("Thumper"));

        var thumper = rabbitRepo.GetByName("Thumper");

        Assert.IsNotNull(thumper);
    }
}

public class Rabbit
{
    public Rabbit(string name)
    {
        Name = name;
    }

    public string Name { get; private set; }
}

public class RabbitRepository
{
    // Session factory setup is omitted here; assume it points at a real database.
    private readonly ISessionFactory _sessionFactory;

    public void Add(Rabbit rabbit)
    {
        using (var session = _sessionFactory.OpenSession())
        {
            // The uniqueness rule: this is what trips on the second run.
            if (Exists(rabbit))
            {
                throw new Exception("Rabbit of the same name already exists!");
            }

            session.Save(rabbit);
        }
    }

    private bool Exists(Rabbit rabbit)
    {
        return GetByName(rabbit.Name) != null;
    }

    public Rabbit GetByName(string name)
    {
        using (var session = _sessionFactory.OpenSession())
        {
            return session.Linq<Rabbit>()
                .FirstOrDefault(r => r.Name == name);
        }
    }
}

There are various tricks to make this test repeatable but not without violating some of the other guidelines like test isolation and speed. Unit tests shouldn't rely on external state and likewise they should not create any state that outlives the test itself. Otherwise the test is brittle and it will be hard to keep it repeatable.
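For contrast, here's a rough sketch of what a repeatable version could look like. It assumes NUnit and swaps the database for a hypothetical InMemoryRabbitRepository (my name, not something from a library), so every test starts with an empty collection and leaves nothing behind.

```csharp
using System;
using System.Collections.Generic;
using NUnit.Framework;

public class Rabbit
{
    public Rabbit(string name)
    {
        Name = name;
    }

    public string Name { get; private set; }
}

// Hypothetical in-memory stand-in for the database-backed repository.
public class InMemoryRabbitRepository
{
    private readonly Dictionary<string, Rabbit> _rabbits =
        new Dictionary<string, Rabbit>();

    public void Add(Rabbit rabbit)
    {
        if (_rabbits.ContainsKey(rabbit.Name))
        {
            throw new InvalidOperationException("Rabbit of the same name already exists!");
        }

        _rabbits.Add(rabbit.Name, rabbit);
    }

    public Rabbit GetByName(string name)
    {
        Rabbit rabbit;
        return _rabbits.TryGetValue(name, out rabbit) ? rabbit : null;
    }
}

[TestFixture]
public class InMemoryRabbitFixture
{
    [Test]
    public void Should_create_rabbit()
    {
        // A fresh repository per test: nothing survives from a previous run.
        var rabbitRepo = new InMemoryRabbitRepository();

        rabbitRepo.Add(new Rabbit("Thumper"));

        Assert.That(rabbitRepo.GetByName("Thumper"), Is.Not.Null);
    }

    [Test]
    public void Should_reject_duplicate_name()
    {
        var rabbitRepo = new InMemoryRabbitRepository();
        rabbitRepo.Add(new Rabbit("Thumper"));

        Assert.Throws<InvalidOperationException>(
            () => rabbitRepo.Add(new Rabbit("Thumper")));
    }
}
```

Run it once or a hundred times and Thumper is always new to the first test. Whether the real repository talks to the database correctly is still worth checking, but that belongs in an integration test.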

The test must be predictable

1 + 1 will always equal 2, right? Conceptually, your test should do the same. The same inputs should always yield the same expected output. I don't think this needs much more explanation.
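Where predictability usually slips is through hidden inputs like the system clock or a random number. Here's a minimal sketch, assuming NUnit, of pinning one of those down. IClock, FixedClock and Greeter are names I made up for illustration.

```csharp
using System;
using NUnit.Framework;

// Hypothetical abstraction over the system clock.
public interface IClock
{
    DateTime Now { get; }
}

// Test double that always reports the same moment.
public class FixedClock : IClock
{
    private readonly DateTime _now;

    public FixedClock(DateTime now)
    {
        _now = now;
    }

    public DateTime Now
    {
        get { return _now; }
    }
}

public class Greeter
{
    private readonly IClock _clock;

    public Greeter(IClock clock)
    {
        _clock = clock;
    }

    public string Greet()
    {
        // The time of day is now an explicit input, not a hidden one.
        return _clock.Now.Hour < 12 ? "Good morning" : "Good afternoon";
    }
}

[TestFixture]
public class GreeterFixture
{
    [Test]
    public void Should_say_good_morning_before_noon()
    {
        var nineAm = new DateTime(2010, 3, 10, 9, 0, 0);
        var greeter = new Greeter(new FixedClock(nineAm));

        Assert.That(greeter.Greet(), Is.EqualTo("Good morning"));
    }
}
```

Had Greeter read DateTime.Now directly, the same test would pass in the morning and fail after lunch.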

Only one assertion per test

This is one of the more misunderstood practices of unit testing. Don't confuse the word "assertion" with the Assert function that is called within a unit test. In this instance, they are not the same. You may call the Assert function multiple times to assert a single behavior. Take the following example.


using NUnit.Framework;

[TestFixture]
public class RectangleFixture
{
    [Test]
    public void Should_resize_rectangle()
    {
        Rectangle rectangle = new Rectangle(40, 20);

        rectangle.ReduceSizeByPercent(50);

        Assert.AreEqual(20, rectangle.Length);
        Assert.AreEqual(10, rectangle.Width);
    }
}

public class Rectangle
{
    public Rectangle(int length, int width)
    {
        Width = width;
        Length = length;
    }

    public double Length { get; private set; }

    public double Width { get; private set; }

    public void ReduceSizeByPercent(int percent)
    {
        // Reducing by N percent leaves (100 - N) percent of each dimension.
        double remaining = (100 - percent) / 100.0;

        Length *= remaining;
        Width *= remaining;
    }
}

There are two calls to Assert to verify that the rectangle's size was reduced by 50%. That's what we're talking about when we say a test makes only one assertion: it asserts one behavior. Don't hate yourself if you call Assert more than once, and don't feel obliged to split those calls into separate tests.
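The flip side is knowing when a second test is warranted. Here's a rough sketch, assuming NUnit and a stripped-down rectangle with two behaviors I invented for the occasion (Rotate and IsSquare). Each behavior gets its own test, no matter how many Assert calls it takes.

```csharp
using NUnit.Framework;

public class Rectangle
{
    public Rectangle(int length, int width)
    {
        Length = length;
        Width = width;
    }

    public int Length { get; private set; }

    public int Width { get; private set; }

    // Hypothetical behavior #1: swap the two dimensions.
    public void Rotate()
    {
        int oldLength = Length;
        Length = Width;
        Width = oldLength;
    }

    // Hypothetical behavior #2: report whether both sides match.
    public bool IsSquare
    {
        get { return Length == Width; }
    }
}

[TestFixture]
public class RectangleBehaviorFixture
{
    [Test]
    public void Should_swap_dimensions_when_rotated()
    {
        var rectangle = new Rectangle(40, 20);

        rectangle.Rotate();

        // Two Assert calls, one behavior.
        Assert.That(rectangle.Length, Is.EqualTo(20));
        Assert.That(rectangle.Width, Is.EqualTo(40));
    }

    [Test]
    public void Should_be_square_when_sides_match()
    {
        // A different behavior, so it gets a test of its own.
        var rectangle = new Rectangle(20, 20);

        Assert.That(rectangle.IsSquare, Is.True);
    }
}
```

If the squareness check were crammed into the rotation test, a failure wouldn't tell you which behavior broke.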

The test must isolate the behavior under observation

Now that we're warmed up let's look at the one guideline to rule them all. This is the one that will have the greatest impact on the design of your code. You have to isolate the behavior you wish to test.

You can read this another way: a unit test should fail for exactly one reason, namely that the implementation of the behavior being tested is incorrect. I have read blogs that mandate that a unit test cannot touch the file system, a database, a network, etc. That's not just because of how slow those can be. It's because your unit test may now fail for reasons tied to an external dependency that have absolutely nothing to do with what you're unit testing. When a unit test goes red it shouldn't be because you didn't install a database, it shouldn't be because you neglected to include a configuration file, and it shouldn't be because your network cable isn't plugged in. Those have nothing to do with the unit test at hand.

Following this guideline will teach you how to detach your class from its external dependencies. The external dependencies aren't just things like databases and LAN access. They can be other classes that have rules of their own that need to be satisfied. You do not want your test to fail if some rule in a class that you consume isn't satisfied. That's outside of the scope of the behavior you're testing.

This is where people start to introduce language like "mocking" or "faking" your external dependencies. These are ways of allowing you to truly isolate the behavior you're trying to test. You're replacing pieces of your class that would cause the test to fail for reasons outside of the class's cares or awareness.
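To make "faking" concrete, here's a minimal hand-rolled sketch, assuming NUnit and no mocking library. IRabbitRepository, RabbitRegistrar and FakeRabbitRepository are hypothetical names for illustration. The registrar's rule gets tested without a database anywhere in sight.

```csharp
using System;
using NUnit.Framework;

// Hypothetical seam between the behavior under test and its storage.
public interface IRabbitRepository
{
    bool Exists(string name);
    void Save(string name);
}

// The class under test only knows about the interface.
public class RabbitRegistrar
{
    private readonly IRabbitRepository _repository;

    public RabbitRegistrar(IRabbitRepository repository)
    {
        _repository = repository;
    }

    public void Register(string name)
    {
        if (_repository.Exists(name))
        {
            throw new InvalidOperationException("Rabbit of the same name already exists!");
        }

        _repository.Save(name);
    }
}

// Hand-rolled fake: canned answer in, recorded call out.
public class FakeRabbitRepository : IRabbitRepository
{
    public bool ExistsResult { get; set; }

    public string SavedName { get; private set; }

    public bool Exists(string name)
    {
        return ExistsResult;
    }

    public void Save(string name)
    {
        SavedName = name;
    }
}

[TestFixture]
public class RabbitRegistrarFixture
{
    [Test]
    public void Should_save_rabbit_when_name_is_free()
    {
        var repository = new FakeRabbitRepository { ExistsResult = false };
        var registrar = new RabbitRegistrar(repository);

        registrar.Register("Thumper");

        Assert.That(repository.SavedName, Is.EqualTo("Thumper"));
    }

    [Test]
    public void Should_reject_rabbit_when_name_is_taken()
    {
        var repository = new FakeRabbitRepository { ExistsResult = true };
        var registrar = new RabbitRegistrar(repository);

        Assert.Throws<InvalidOperationException>(
            () => registrar.Register("Thumper"));
        Assert.That(repository.SavedName, Is.Null);
    }
}
```

If either test goes red, the only suspect is Register. A mocking framework can generate fakes like this for you, but the idea is the same.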

When you first start to really isolate the code you're trying to unit test, the first thought that may pop into your mind is "I'm just writing it this way so that I can test it!" Yes, you are, but there is a truth you do not yet realize (and it may take some time to sink in). Well-designed code is easily testable and easily testable code is well designed. Writing proper unit tests can expose major (or minor) issues in your design.

For me, this is where I learned the most lessons and what made me truly appreciate things like the SOLID principles. It leads you there and, in my opinion, you would have to go way out of your way to succeed in isolating your tests but avoid good design. It puts bumpers on your bowling lane so to speak.

Summary

Unit testing in and of itself is huge to the design process. Without even engaging in TDD, it is going to highlight deficiencies in your design. There is a lot of talk that TDD isn't about testing, it's about design. Unit testing is also about design so it's a perfect marriage.

Tuesday, March 9, 2010

TDD Tidbits: Red, Green, REFACTOR!!!

So I was a TDD noob at one point. I had to start somewhere so I picked up some recommended reading, Kent Beck's Test-Driven Development: By Example. I'll be honest; I didn't like it. I actually returned the book. Was I stupid? No! It just didn't seem to add up.

In Mr. Beck's defense, I wasn't rejecting his book but rather the notion that I should put myself on a leash when I code and do "stupid" stuff like return static data to satisfy a test. I was feeling the discomfort a lot of developers feel when they first dive into this stuff. It's a whole different way of writing code. I would never fault anyone for having reactions like my own.

The idea of returning junk data to satisfy a test was probably the hardest thing for me to overcome. I spent many frustrating sessions trying to figure out when to stop returning dummy data or why I was even doing it. It was frustrating to know how to code things but have to hold myself back and continue with the TDD process. It won't let you wander off and start implementing all sorts of code.

Why the hell would I return dummy values in my production code? Why would I fake out my code at all? I could write code all day that would return dummy data. Hey look, my code returns zero! On the next unit test I can have it just return one and fake that out as well! What the hell is this proving? I'm not writing any meaningful code! I KNOW what the answer is. Why do I have to go about it this way? What a waste of time! *head explodes*

To speak to my frustrations: I didn't fully understand the process and I didn't (immediately) see the benefits. I was missing the important step of refactoring my code after the test passes. Sure, you may write code that returns a static value of zero just to satisfy a test. Then you write another test that expects the code to behave a new way and return another value. At this point, feel free to refactor your guts out. You don't need permission to write code. Don't let the dogma of "only write code when you have a failing test" fool you. It comes in two stages. You write code to get the test to pass and then you write code again to refactor! Your test passes but that doesn't mean all hands have to come off the keyboard!

For example, here's a bucket. You put apples in the bucket and then you want to know how many apples you have. The first test and implementation may look like this.


public class BucketFixture
{
    [Test]
    public void Should_have_five_apples_in_bucket()
    {
        Bucket bucket = new Bucket();
        bucket.AddApples(5);

        bucket.TotalAppleCount.ShouldEqual(5);
    }
}

public class Bucket
{
    public void AddApples(int appleCount)
    {
    }

    public int TotalAppleCount
    {
        get
        {
            return 5;
        }
    }
}

Then you want another test to hammer out more of the behavior of this awesome bucket.


public class BucketFixture
{
    [Test]
    public void Should_have_five_apples_in_bucket()
    {
        Bucket bucket = new Bucket();
        bucket.AddApples(5);

        bucket.TotalAppleCount.ShouldEqual(5);
    }

    [Test]
    public void Should_have_no_apples_in_bucket()
    {
        Bucket bucket = new Bucket();

        bucket.TotalAppleCount.ShouldEqual(0);
    }
}

The new test breaks as expected but now you can't pass back dummy data. What code can you write to satisfy both tests?


public class Bucket
{
    int _totalAppleCount;

    public void AddApples(int appleCount)
    {
        _totalAppleCount += appleCount;
    }

    public int TotalAppleCount
    {
        get
        {
            return _totalAppleCount;
        }
    }
}

Huzzah! But I don't necessarily have to stop there. I can still refactor if my heart so desires. I don't need a failing test. What if I wanted to be super cool and use auto properties? Don't bother to write a new test for it. Just do it.


public class Bucket
{
    public void AddApples(int appleCount)
    {
        TotalAppleCount += appleCount;
    }

    public int TotalAppleCount
    {
        get; private set;
    }
}

No failing test required! But bear in mind this is refactoring; we aren't changing behavior, only structure. The code must continue to behave the same. The tests will verify that.

Another thing to point out is that if the only behavior the bucket exhibited was that it had 5 apples, then the test and implementation stop right there. This is what people mean when they say you're doing the simplest thing that satisfies the requirements. Even when new behavior is added, you're taking the shortest route to functional without being a complete chimp about it. Add patterns where applicable and such.

Last point I'll make is that I now view my first test as probably the most important to my API. The first test is figuring out the names of my classes, interfaces, properties, methods and arguments. It's the first stab at the design of the class that you're making. This is where I'm extra sensitive to the needs of the client code (typically called dogfooding). It's a big deal to me that my code demonstrates its intent and usage without the need for lengthy comments.


So there you go, kids. Take advantage of the refactoring step! Do all the stuff that you want to do when you think that the TDD process has put you on a leash.

TDD Tidbits

I've been practicing TDD for two years and when I take a step back and look at what it's done for me I see I've learned a lot. I'm a far better coder because of it. Had I not picked it up, I doubt I'd be as well off. I began writing a post about it but it became monolithic. There's too much to be said. I decided to break it down into individual posts to make them easier to digest.

Just doing superficial searches on TDD in Google, I don't see much in the way of individual experiences with TDD practices. I'm sure it's sprinkled about in blogs but most of what I find are the how-tos, tutorials and videos surrounding it. I think what I plan to write is a mixture of tips, insight and experiences in TDD. I'm hoping it has value for the novice, the intermediate and the seasoned veteran. It's probably going to be in C# but I'll see if I can't do examples in other languages (feel free to comment/request one for all 3-4 of my possible readers).