Test isolation in nose

7 12 2006

Python has a built-in testing framework called unittest. Over time, Python programmers looked for better-suited solutions. Some of them created something entirely new (like doctest), while others built on top of unittest. One of the frameworks that aims to replace unittest is nose. I use it constantly while developing Cheesecake, and I must admit it has been very helpful: it is easy to set up, integrate and extend to my needs. But it has to be said that, among its many features, it also has a slightly different philosophy than unittest. In this short article I’m going to describe one of the main issues that nose users may accidentally step into: weak test isolation.

unittest, to which nose is a great successor, was based on JUnit. One of the core principles of JUnit was that (quoting Martin Fowler) no test should ever do anything that would cause other tests to fail. Nose doesn’t give you such certainty because, for performance reasons, it runs all tests in a single interpreter. We’ll look at examples of the bugs this approach can cause.

This post is not meant to be a rant of any sort. Actually, by learning from these cases I’ve improved my understanding of testing in general. I’d be happy if at least one reader learns something new from my experience. For those of you seeking a quick fix: this issue has been reported and a solution is under way. Please remember that test isolation won’t be the default, though.

Just a note: last time I checked py.test had the same kind of flaw.

External state

Problem

The state of tested modules is preserved from one test to another.

Example

Imagine you have a “Hello world” module with the following contents:

message = "Hello world!"

def show():
    return message

And two test files which exercise two different possible uses. Test A uses a custom message:

import hello

class TestHelloMessage:
    def setup(self):
        hello.message = "Bye world!"

    def test_hello_message(self):
        assert hello.show() == 'Bye world!'

while test B uses the default:

import hello

class TestHelloDefault:
    def test_hello_default(self):
        assert hello.show() == 'Hello world!'

All files are in the same directory, so we’re ready to execute a test runner:

$ nosetests
.F
======================================================================
FAIL: test_hello_b.TestHelloDefault.test_hello_default
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/nose-0.9.1.dev_r101-py2.4.egg/nose/case.py", line 129, in runTest
    self.testCase(*self.arg)
  File "/home/ruby/nose/ext/test_hello_b.py", line 5, in test_hello_default
    assert hello.show() == 'Hello world!'
AssertionError

----------------------------------------------------------------------
Ran 2 tests in 0.007s

FAILED (failures=1)

Ouch, you didn’t expect that, right? Where’s the bug? Before you spend your evening staring at the code, try naming your test files differently:

$ mv test_hello_a.py test_hello_c.py

and both tests will pass:

$ nosetests
..
----------------------------------------------------------------------
Ran 2 tests in 0.005s

OK

Why is that? Nose reads tests from the directory in lexical order, so test_hello_a.py gets read and executed before test_hello_b.py. Nose runs both tests in the same environment, so the hello module is read and initialized only once. This means any changes made by test A to the hello module will still be present when test B runs.

Resolution

This particular example looks like an error on the test runner’s side, because two different scripts interfere with each other in a way that wouldn’t happen if you ran them separately. On the other hand, it may pinpoint a bug in your setup/teardown logic. If you change global state during setup to exercise the behaviour of objects in a certain environment, you should restore the original state during teardown. With this in mind, test A should look more like this:

import hello

class TestHelloMessage:
    def setup(self):
        self.original_message = hello.message
        hello.message = "Bye world!"

    def teardown(self):
        hello.message = self.original_message

    def test_hello_message(self):
        assert hello.show() == 'Bye world!'

Reverting changes may seem impossible in some cases, but you should try hard to do it. Remember what they say: your code sucks if it isn’t testable. And it isn’t really testable if you can’t isolate the state you’re interested in testing.
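If the save-and-restore pattern shows up in many tests, it can be wrapped in a small helper. The sketch below is only one possible shape, assuming Python 3; temporary_attr is a hypothetical name, and a SimpleNamespace stands in for the hello module.

```python
import types
from contextlib import contextmanager


@contextmanager
def temporary_attr(obj, name, value):
    """Set obj.name to value and restore the original on exit (hypothetical helper)."""
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield
    finally:
        # Runs even if the body raises, so the state is always restored.
        setattr(obj, name, original)


# Stand-in for the hello module from the example.
hello = types.SimpleNamespace(message="Hello world!")

with temporary_attr(hello, "message", "Bye world!"):
    inside = hello.message

after = hello.message
```

The finally clause is the important part: the original value comes back even when an assertion inside the with block fails.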

Mocking errors

Problem

The state of built-in modules is preserved from one test to another.

Example

This is similar to the first example, but approaches the problem from a different direction. Now we’re writing a simple locking mechanism that uses files as locks. Currently the code looks like this:

import os

class Locker:
    def __init__(self, lock_file):
        self.lock_file = lock_file

        if self.is_locked():
            self.unlock()

    def is_locked(self):
        return os.path.exists(self.lock_file)

    def lock(self):
        file(self.lock_file, 'w').close()

    def unlock(self):
        os.remove(self.lock_file)

We also have a test for an unlocked locker:

import locker

class TestUnlockedLocker:
    def setup(self):
        self.locker = locker.Locker('/path/to/lock/file')

    def test_that_is_locked_is_false(self):
        assert self.locker.is_locked() is False

It worked well until we added a new test case:

from mock import Mock

# class TestUnlockedLocker:
#    ...

class TestLockedLocker:
    def setup(self):
        # Mock file(), os.path.exists() and os.remove() for locker.
        locker.file = lambda p, *a: Mock({ 'close': None })
        locker.os.path.exists = lambda p: True
        locker.os.remove = lambda p: None

        self.locker = locker.Locker('/path/to/lock/file')
        self.locker.lock()

    def test_that_is_locked_is_true(self):
        assert self.locker.is_locked() is True

Now we’ll get a failing test for the unlocked locker:

$ nosetests
.F
======================================================================
FAIL: locker_test.TestUnlockedLocker.test_that_is_locked_is_false
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/nose-0.9.1.dev_r101-py2.4.egg/nose/case.py", line 129, in runTest
    self.testCase(*self.arg)
  File "/home/ruby/nose/mock/locker_test.py", line 9, in test_that_is_locked_is_false
    assert self.locker.is_locked() is False
AssertionError

----------------------------------------------------------------------
Ran 2 tests in 0.005s

FAILED (failures=1)

Does TestLockedLocker leave a lock file behind? No. But it leaves behind a mocked version of os.path.exists, which always returns True. Built-in modules don’t get reloaded for each test, so any changes you make to them will persist while other tests run. The same goes for builtins.
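The reason the mock leaks is that locker.os is not a private copy: it is the one and only os module. A small sketch, assuming Python 3 and a made-up stand-in module named locker_demo:

```python
import os
import tempfile
import types

# Hypothetical stand-in for locker.py: all it does is "import os".
locker_demo = types.ModuleType("locker_demo")
locker_demo.os = os

same_object = locker_demo.os is os

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "lock")  # a path that does not exist
    original_exists = os.path.exists
    try:
        # Patching "through" the locker module changes os.path for everybody.
        locker_demo.os.path.exists = lambda path: True
        leaked = os.path.exists(missing)
    finally:
        os.path.exists = original_exists
    restored = os.path.exists(missing)
```

Here same_object and leaked are both True, and restored is False again once the original function is put back.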

Resolution

The state of the interpreter should be considered external, just like the state of the filesystem or environment variables: if you make any changes to it, remember to revert them at the end. This especially affects __builtins__ and the standard library, where changing one function or variable can affect the majority of other tests.
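Applied to the locker test, that advice could look roughly like the sketch below. It assumes Python 3 and a condensed Locker that keeps only is_locked(); the _saved attribute is my own naming, and the last few lines drive the test by hand the way a runner would.

```python
import os


class Locker:
    """Condensed stand-in for the Locker class above (only is_locked)."""

    def __init__(self, lock_file):
        self.lock_file = lock_file

    def is_locked(self):
        return os.path.exists(self.lock_file)


class TestLockedLocker:
    def setup(self):
        # Remember the real functions before replacing them.
        self._saved = (os.path.exists, os.remove)
        os.path.exists = lambda path: True
        os.remove = lambda path: None
        self.locker = Locker('/path/to/lock/file')

    def teardown(self):
        # Put the interpreter back the way we found it.
        os.path.exists, os.remove = self._saved

    def test_that_is_locked_is_true(self):
        assert self.locker.is_locked() is True


# Drive the test case manually, roughly as a test runner would.
real_exists, real_remove = os.path.exists, os.remove
case = TestLockedLocker()
case.setup()
try:
    case.test_that_is_locked_is_true()
finally:
    case.teardown()
```

After teardown, os.path.exists and os.remove are the real functions again, so a test for the unlocked locker that runs next sees an unmodified os module.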

Import clashes

Problem

Subdirectories you keep your tests in cannot contain modules with the same names.

Example

As your set of test cases grows, you will probably start arranging them in a hierarchical structure. The first obvious level of partitioning is placing your unit tests and functional tests in separate directories. That is what Grig did for the Cheesecake project, and it is the convention I’ve followed. As I wrote more unit and functional tests, a bit of redundancy started to show up in the code. When the time to refactor came, I took the common functionality and placed it in a handy module, helper.py. That’s when nose got in my way. My directory structure looked like this:

tests/
    unit/
        helper.py
        #test cases#
    functional/
        helper.py
        #test cases#

It was the common name of the helper module that caused problems. When run separately, the tests passed without trouble:

$ nosetests -i unit
.
----------------------------------------------------------------------
Ran 1 test in 0.003s

OK
$ nosetests -i functional
.
----------------------------------------------------------------------
Ran 1 test in 0.003s

OK

On the other hand, when I tried to run them together, at least one of the tests would always fail because the unit and functional helpers have different contents.

$ nosetests -i unit -i functional
.E
======================================================================
ERROR: test_this.TestThis.test_this
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/nose-0.9.1.dev_r101-py2.4.egg/nose/case.py", line 129, in runTest
    self.testCase(*self.arg)
  File "/home/ruby/nose/import/unit/test_this.py", line 5, in test_this
    helper.unit_help()
AttributeError: 'module' object has no attribute 'unit_help'

----------------------------------------------------------------------
Ran 2 tests in 0.005s

FAILED (errors=1)

In the example above, the unit test test_this tried to access the unit_help function, which exists in unit/helper.py but not in functional/helper.py. Apparently the functional helper got imported first and was saved as ‘helper’ in sys.modules, so the “import helper” in the unit test simply returned the cached module instead of loading unit/helper.py (remember that all tests share the same interpreter).
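The clash can be reproduced without nose. The following sketch assumes Python 3, builds the two helper.py files in a temporary directory, and imports them one after the other; the file contents are invented for the demonstration.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Recreate the layout: two directories, each with its own helper.py.
    for directory, function in (("unit", "unit_help"),
                                ("functional", "functional_help")):
        os.mkdir(os.path.join(root, directory))
        with open(os.path.join(root, directory, "helper.py"), "w") as source:
            source.write("def %s():\n    return %r\n" % (function, directory))

    unit_dir = os.path.join(root, "unit")
    functional_dir = os.path.join(root, "functional")
    saved = sys.modules.pop("helper", None)  # set aside any real "helper"
    importlib.invalidate_caches()

    sys.path.insert(0, functional_dir)
    functional_helper = importlib.import_module("helper")

    sys.path.insert(0, unit_dir)  # unit/ now comes first on sys.path...
    unit_helper = importlib.import_module("helper")  # ...but the cache wins

    same_module = unit_helper is functional_helper
    has_unit_help = hasattr(unit_helper, "unit_help")

    # Undo our own changes to the interpreter state.
    sys.path.remove(unit_dir)
    sys.path.remove(functional_dir)
    del sys.modules["helper"]
    if saved is not None:
        sys.modules["helper"] = saved
```

same_module comes out True and has_unit_help False: the second import never looks at unit/helper.py, which is exactly what produces the AttributeError shown above.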

Resolution

Name your helper modules differently or use reload().
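A variation on the first option, offered only as a sketch: keep both files named helper.py but load each one by path under its own module name. This assumes Python 3’s importlib.util; load_as and the module names unit_helper and functional_helper are hypothetical.

```python
import importlib.util
import os
import tempfile


def load_as(name, path):
    """Load the file at path as a module called name, bypassing sys.modules."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as root:
    paths = {}
    # Two directories, each with a helper.py of its own (invented contents).
    for directory, function in (("unit", "unit_help"),
                                ("functional", "functional_help")):
        os.mkdir(os.path.join(root, directory))
        paths[directory] = os.path.join(root, directory, "helper.py")
        with open(paths[directory], "w") as source:
            source.write("def %s():\n    return %r\n" % (function, directory))

    unit_helper = load_as("unit_helper", paths["unit"])
    functional_helper = load_as("functional_helper", paths["functional"])

unit_result = unit_helper.unit_help()
functional_result = functional_helper.functional_help()
```

Because neither module is registered under the name helper, the two can coexist in one interpreter without shadowing each other.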

Conclusion

The next release of nose will contain an isolation plugin, whose responsibility will be to purge the sys.modules list between running different test files. This will fix some isolation problems, but not all of them. You will still be able to kill the runner or modify builtins in a way that breaks other tests. A complete solution would involve spawning a new Python process for each test, which is unacceptable from a performance point of view. The current state is good enough: with a bit of nose-specific knowledge and common sense you can get the best of both worlds, testing speed and stability.
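To make the idea of purging sys.modules concrete, here is a rough sketch of the technique. It is not the plugin’s actual code; it assumes Python 3, and run_isolated, fake_test_file and helper_demo are made-up names.

```python
import sys
import types


def run_isolated(test_file_body):
    """Run a callable, then forget every module it caused to be imported."""
    before = set(sys.modules)
    try:
        return test_file_body()
    finally:
        for name in set(sys.modules) - before:
            del sys.modules[name]


def fake_test_file():
    # Stands in for a test file that does "import helper".
    sys.modules["helper_demo"] = types.ModuleType("helper_demo")
    return "helper_demo" in sys.modules


seen_inside = run_isolated(fake_test_file)
seen_after = "helper_demo" in sys.modules
```

seen_inside is True and seen_after is False. Note what this does not cover: a module that was already imported before the snapshot (os, for instance) stays in place along with any changes made to it, which is why such a purge fixes only some of the isolation problems.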


2 responses

7 12 2006
Grig Gheorghiu

Very nice! Should go into the official nose documentation!

29 06 2007
Brian Kerr | links for 2007-06-29

[...] Mousebender | Test isolation in nose (tags: python nose unit testing unittest interpreter test isolation desolation) [...]
