Thursday, February 2, 2012

The one best way I know of to write software tests

Early in 2011 I had a prophetic conversation with fellow Baltimore hacker Nick Gauthier that radically changed the way I think about testing web applications. He described a methodology where you almost exclusively write high-level acceptance or integration tests that exercise several parts of your code at once (vs. the approach I had previously used, writing unit tests for every component backed up by a few integration tests). For a Ruby app this means using something like Capybara (depicted below), Cucumber, or Selenium to test the entire stack, the way a user would interact with your site.

These tests aren't meant to be exhaustive - you don't test every possible corner case of every codepath. Instead you use them to design and verify the overall behavior of the system. For example, you might write a test to make sure your system can cope with invalid user input:

describe "Adding cues" do
  let(:user) { Factory(:user) }
  before { login_as_user(user) }

  it "handles invalid data" do
    visit "/cues/new"
    select "Generic", from: "Type"
    # Submitting the incomplete form must not create a record
    expect { click_button "Create Cue" }.not_to change(Cue, :count)
    should_load "/cues"
  end
end

Usually with this technique you would not write a separate test for each type of invalid data, since tests like these are fairly expensive. Instead, you combine the test above with a series of unit tests that examine the components involved in that behavior in isolation. Typically these unit tests run much more quickly because they don't involve the overhead of setting up a request, hitting the database, etc.

In the above example we could cover all of the invalid cases with a model unit test that looks like this:

describe Cue do
  it { should validate_presence_of :name }
  it { should validate_presence_of :zip_code }
end

What you end up with is a small number of integration tests which thoroughly exercise the behavior of your code combined with a small number of extremely granular tests that run quickly and cover the edge cases.
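
To show why the granular layer is so cheap, here is a minimal sketch in plain Ruby with no Rails, RSpec, or database involved. It is an illustration under assumptions, not the code from my app: the Cue class below is a hypothetical stand-in for the real model, and missing_fields and the sample values are names I made up for this example. The only things taken from the tests above are the two required attributes, name and zip_code.

```ruby
# Hypothetical stand-in for the real Cue model (assumption: the actual
# model would be an ActiveRecord class with presence validations).
class Cue
  REQUIRED_FIELDS = [:name, :zip_code].freeze

  attr_reader :name, :zip_code

  def initialize(name: nil, zip_code: nil)
    @name = name
    @zip_code = zip_code
  end

  # Required fields that are nil, empty, or whitespace-only.
  def missing_fields
    REQUIRED_FIELDS.select { |field| public_send(field).to_s.strip.empty? }
  end

  def valid?
    missing_fields.empty?
  end
end

# Each invalid input is paired with the fields it should be flagged for.
# Enumerating cases here costs almost nothing: no request, no database.
INVALID_CASES = [
  [{ name: nil,     zip_code: "21201" }, [:name]],
  [{ name: "Intro", zip_code: nil },     [:zip_code]],
  [{ name: "  ",    zip_code: "" },      [:name, :zip_code]]
].freeze

INVALID_CASES.each do |attrs, expected|
  cue = Cue.new(**attrs)
  raise "expected #{attrs.inspect} to be invalid" if cue.valid?
  unless cue.missing_fields == expected
    raise "wrong missing fields for #{attrs.inspect}"
  end
end
```

Adding another corner case is one more row in the table, which is exactly the kind of thing I would not want to pay a full browser-level test for.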

One Criticism of This Approach

This idea has been working wonderfully for me. I feel like it gives me excellent code coverage without creating a massively long-running test suite. But I did notice Nick Evans critiquing this style of testing a while ago on Twitter:
lots of integration tests + very few unit tests => a system that mostly works, but probably has ill-defined domain models and internal APIs.
The fact that it got retweeted and favorited a number of times makes me think he's onto something, though I haven't run into this problem yet, and I'm rigorous about keeping domain models and APIs clean. I have no problem refactoring in order to keep my average pace of development high. In my experience, adhering to a strict behavior-driven development approach has kept me from running into the problem he describes, but that might not hold if I were part of a team. Time will tell.


Unknown said...

I do exactly the same thing.

The way I look at it is that I'm using the integration tests to follow both the positive and negative paths that a user could take. Then I'm using the unit tests for the models and controllers to enumerate each of the possible wrong actions.

I follow this approach to the point that if I see more than 2 paths in my integration tests I consider it a code smell.

To me this seems to result in a few integration tests with lots of unit tests.

Unknown said...

I couldn't resist coming back and sharing a link to an example.

I substituted user, foo and bar to hide the actual domain but I'd estimate nearly 90% of my cucumber features follow the form shown in this gist.

Mike Subelsky said...

That's funny that you end up with lots of unit tests. So far for me it's still the opposite but you may be working on larger/more complex projects. Glad you find this approach useful also!

johnbintz said...

I had a similar talk with Nick that had similar results on my testing style, but I'm also a dirty mockist, so I still have a lot of unit tests, but they test as much in isolation as possible. The integration tests cover the actual traversal through the real codebase, and sometimes I (gasp!) write integration tests for API-level things in apps, working with real objects where it seems to make sense (for things that, if they were to ever be re-used, would be a separate Gem).

I'm still experimenting with this, but it seems to be working for me so far, with nowhere near as much integration pain as I was having with just unit tests or just integration tests. And it's still pretty fast-running, since the only DB work is done in the integration tests. We'll see what song I'm singing in the next six months when I have to go back and work on something I write now. :)

Anonymous said...

I do the same thing. The only difference is I do controller tests for API related functionality and negative responses.

Knowing what type of tests to use for any given functionality is what makes the difference between a good developer and a bad one.

Bad tests actually slow down work and cause massive refactors to have to be made in my experience. Although, I might just be sensitive to that last statement because I'm in the middle of developing production code while refactoring bad tests and code from another developer.

Jo Liss said...

I more-or-less use the same system.

Regarding arriving at "ill-defined domain models and internal APIs": I'm not a fan of test-driven design, and I find some anecdotal support for that notion here:

In other words, I think you should think carefully about (and occasionally refactor) your data model, and not be forced to do so by the tests.