

\section{Unit Tests in Twisted\label{doc/howto/policy/test-standard.xhtml}}


Each \begin{em}unit test\end{em} tests one bit of functionality in the     software.  Unit tests are entirely automated and complete quickly.     Unit tests for the entire system are gathered into one test suite,     and may all be run in a single batch.  The result of a unit test     is simple: either it passes, or it doesn't.  All this means you     can test the entire system at any time without inconvenience, and     quickly see what passes and what fails.

\subsection{Unit Tests in the Twisted Philosophy}


The Twisted development team adheres to the practice of Extreme Programming\footnote{http://c2.com/cgi/wiki?ExtremeProgramming} (XP), and the usage of unit tests is a cornerstone XP practice. Unit tests are a tool to give you increased confidence. You changed an algorithm -- did you break something? Run the unit tests. If a test fails, you know where to look, because each test covers only a small amount of code, and you know it has something to do with the changes you just made. If all the tests pass, you're good to go, and you don't need to second-guess yourself or worry that you just accidentally broke someone else's program.

\subsection{What to Test, What Not to Test}


\begin{quote}
You don't have to write a test for every single method you write, only production methods that could possibly break.

-- Kent Beck, \textit{Extreme Programming Explained}, p. 58.
\end{quote}

\subsection{Running the Tests}


\subsubsection{How}
\begin{verbatim}
$ Twisted/admin/runtests
\end{verbatim}


You'll find that having something like this in your emacs init     files is quite handy:\begin{verbatim}
(defun runtests () (interactive)
  (compile "python /somepath/Twisted/admin/runtests"))

(global-set-key [(alt t)] 'runtests)
\end{verbatim}


\subsubsection{When}


Always always \begin{em}always\end{em} be sure all the      tests pass\footnote{http://www.xprogramming.com/xpmag/expUnitTestsAt100.htm} before committing any code.  If someone else      checks out code at the start of a development session and finds      failing tests, they will not be happy and may decide to \begin{em}hunt      you down\end{em}.

Since this is a geographically dispersed team, the person who     can help you get your code working probably isn't in the room with     you.  You may want to share your work in progress over the     network, but you want to leave the main CVS tree in good working     order.  So use a branch\footnote{http://www.cvshome.org/docs/manual/cvs\_5.html},     and merge your changes back in only after your problem is solved     and all the unit tests pass again.

\subsection{Adding a Test}


Please don't add new modules to Twisted without adding tests for them too. Otherwise we could change something which breaks your module and not find out until much later, making it hard to pinpoint which change broke it. Worse, we might not find out until after a release, and nobody wants broken code in a release.

Tests go in Twisted/twisted/test/, and are named \texttt{test\_foo.\linebreak[1]py},     where \texttt{foo} is the name of the module or package being tested.     Extensive documentation on using the PyUnit framework for writing     unit tests can be found in the \textit{links     section}\loreref{doc/howto/policy/test-standard.xhtmlHASHlinks} below.

One deviation from the standard PyUnit documentation: to ensure that any variations in test results are due to variations in the code or environment and not the test process itself, Twisted ships with its own, compatible, testing framework. That just means that instead of the standard \texttt{import unittest}, you write \texttt{from twisted.\linebreak[1]trial import unittest}.
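
For example, a minimal test module might look like the following sketch (\texttt{twisted.\linebreak[1]foo} and its \texttt{add} function are hypothetical, invented here for illustration):\begin{verbatim}
# twisted/test/test_foo.py
from twisted.trial import unittest   # trial's unittest, not the stdlib's

from twisted import foo   # hypothetical module under test

class FooTestCase(unittest.TestCase):
    def testAdd(self):
        # each test method exercises one small piece of functionality
        self.assertEquals(foo.add(2, 2), 4)
\end{verbatim}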

As long as you have followed the module naming and placement     conventions, \texttt{runtests} will be smart     enough to pick up any new tests you write.

\subsection{Skipping tests, TODO items}


Trial, the Twisted unit test framework, has some extensions which are designed to encourage developers to add new tests. One common situation is that a test exercises some optional functionality: maybe it depends upon certain external libraries being available, maybe it only works on certain operating systems. The important common factor is that nobody considers these limitations to be a bug.

To make it easy to test as much as possible, some tests may be skipped in certain situations. Individual test cases can raise the \texttt{Skip\linebreak[1]Test} exception to indicate that they should be skipped, and the remainder of the test is not run. In the summary (the very last thing printed, at the bottom of the test output) the test is counted as a ``skip'' instead of a ``success'' or ``fail''. This should be used inside a conditional which looks for the necessary prerequisites:\begin{verbatim}
def testSSHClient(self):
    # ssh_path is assumed to have been determined elsewhere, e.g. by
    # searching the PATH at module import time
    if not ssh_path:
        raise unittest.SkipTest("cannot find ssh, nothing to test")
    foo() # do actual test after the SkipTest
\end{verbatim}


You can also set the \texttt{.skip} attribute on the method, with a string to indicate why the test is being skipped. This is convenient for temporarily turning off a test case, but it can also be set conditionally (by manipulating the class attributes after they've been defined):\begin{verbatim}
def testThing(self):
    dotest()
testThing.skip = "disabled locally"
\end{verbatim}
\begin{verbatim}
class MyTestCase(unittest.TestCase):
    def testOne(self):
        ...
    def testThing(self):
        dotest()

if not haveThing:
    MyTestCase.testThing.im_func.skip = "cannot test without Thing"
    # but testOne() will still run
\end{verbatim}


Finally, you can turn off an entire TestCase at once by setting the \texttt{.skip} attribute on the class. If you organize your tests by the functionality they depend upon, this is a convenient way to disable just the tests which cannot be run.\begin{verbatim}
class SSLTestCase(unittest.TestCase):
   ...
class TCPTestCase(unittest.TestCase):
   ...

if not haveSSL:
    SSLTestCase.skip = "cannot test without SSL support"
    # but TCPTestCase will still run
\end{verbatim}


\subsubsection{.todo and Testing New Functionality}


Two good practices which arise from the ``XP'' development process are sometimes at odds with each other:\begin{itemize}
\item Unit tests are a good thing. Good developers recoil in horror when   they see a failing unit test. They should drop everything until the test   has been fixed.
\item Good developers write the unit tests first. Once tests are done, they   write implementation code until the unit tests pass. Then they stop.
\end{itemize}


These two goals will sometimes conflict. The unit tests that are written first, before any implementation has been done, are certain to fail. We want developers to commit their code frequently, for reliability and to improve coordination between multiple people working on the same problem together. While the code is being written, other developers (those not involved in the new feature) should not have to pay attention to failures in the new code. We should not dilute our well-indoctrinated Failing Test Horror Syndrome by crying wolf when an incomplete module has not yet started passing its unit tests. To do so would either teach the module author to put off writing or committing their unit tests until \begin{em}after\end{em} all the functionality is working, or it would teach the other developers to ignore failing test cases. Both are bad things.

The \texttt{.todo} attribute is intended to solve this problem. When a developer first starts writing the unit tests for functionality that has not yet been implemented, they can set the \texttt{.todo} attribute on the test methods that are expected to fail. These methods will still be run, but their failure will not be counted the same as normal failures: they will go into an ``expected failures'' category. Developers should learn to treat this category as a second-priority queue, behind actual test failures.
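
Marking a test this way looks just like setting \texttt{.skip} (a sketch; the method name and message are invented):\begin{verbatim}
def testNewParser(self):
    runParserTest()   # exercises functionality that does not exist yet
testNewParser.todo = "parser not yet implemented"
\end{verbatim}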

As the developer implements the feature, the tests will eventually start passing. This is surprising: after all, those tests are marked as expected to fail. The \texttt{.todo} tests which nevertheless pass are put into an ``unexpected success'' category. The developer should remove the \texttt{.todo} tag from these tests. At that point, they become normal tests, and their failure is once again cause for immediate action by the entire development team.

The life cycle of a test is thus:\begin{enumerate}
\item Test is created, marked \texttt{.todo}. Test fails: ``expected   failure''.
\item Code is written, test starts to pass. ``unexpected success''.
\item \texttt{.todo} tag is removed. Test passes. ``success''.
\item Code is broken, test starts to fail. ``failure''. Developers spring   into action.
\item Code is fixed, test passes once more. ``success''.
\end{enumerate}


Any test which remains marked with \texttt{.todo} for too long should be examined. Either it represents functionality which nobody is working on, or the test is broken in some fashion and needs to be fixed.

\subsection{Associating Test Cases With Source Files}


Please add a \texttt{test-case-name} tag to the source file that is covered by your new test. This is a comment at the beginning of the file which looks like one of the following:\begin{verbatim}
# -*- test-case-name: twisted.test.test_defer -*-
\end{verbatim}


or\begin{verbatim}
#!/usr/bin/python
# -*- test-case-name: twisted.test.test_defer -*-
\end{verbatim}


This format is understood by emacs to mark ``File Variables''. The intention is to accept \texttt{test-case-name} anywhere emacs would: on the first or second line of the file (but not in the \texttt{File Variables:} block that emacs accepts at the end of the file). If you need to define other emacs file variables, you can either put them in the \texttt{File Variables:} block or use a semicolon-separated list of variable definitions:\begin{verbatim}
# -*- test-case-name: twisted.test.test_defer; fill-column: 75; -*-
\end{verbatim}


If the code is exercised by multiple test cases, those may be marked by using a comma-separated list of tests, as follows (note that not all tools can handle this yet, although \texttt{trial --testmodule} does):\begin{verbatim}
# -*- test-case-name: twisted.test.test_defer,twisted.test.test_tcp -*-
\end{verbatim}


The \texttt{test-case-name} tag will allow \texttt{trial --testmodule twisted/dir/myfile.\linebreak[1]py} to determine which test cases need to be run to exercise the code in \texttt{myfile.\linebreak[1]py}. Several tools (as well as \texttt{twisted-dev.\linebreak[1]el}'s F9 command) use this to automatically run the right tests.

\subsection{Twisted-specific quirks: reactor, Deferreds, callLater}


The standard Python \texttt{unittest} framework, from which Trial is derived, is ideal for testing code with a fairly linear flow of control. Twisted is an asynchronous networking framework which provides a clean, sensible way to establish functions that are run in response to events (like timers and incoming data), which creates a highly non-linear flow of control. Trial has a few extensions which help to test this kind of code. This section provides some hints on how to use these extensions and how to best structure your tests.

\subsubsection{Leave the Reactor as you found it}


Trial runs the entire test suite (over one thousand tests) in a single process, with a single reactor. Therefore it is important that your test leave the reactor in the same state as it found it. Leftover timers may expire during somebody else's unsuspecting test. Leftover connection attempts may complete (and fail) during a later test. These lead to intermittent failures that wander from test to test and are very time-consuming to track down.

Your test is responsible for cleaning up after itself. The \texttt{tear\linebreak[1]Down} method is an ideal place for this cleanup code: it is always run regardless of whether your test passes or fails (like a \texttt{finally} clause in a try-finally construct). Exceptions in \texttt{tear\linebreak[1]Down} are flagged as errors and flunk the test.
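
For instance, a \texttt{tear\linebreak[1]Down} might look like the following sketch; it assumes the test stored its listening port and fail-safe timer as \texttt{self.\linebreak[1]port} and \texttt{self.\linebreak[1]timer}, which are conventions invented here, not Trial features:\begin{verbatim}
def tearDown(self):
    # stop listening so a later test can reuse the port number
    self.port.stopListening()
    # cancel the fail-safe timer if it has not yet fired
    if self.timer.active():
        self.timer.cancel()
\end{verbatim}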

TODO: helper functions: TestCase.addPort, TestCase.addTimer

\texttt{reactor.\linebreak[1]stop} is considered very harmful, and should only be used by reactor-specific test cases which know how to restore the state that it kills. If you must use \texttt{reactor.\linebreak[1]run}, use \texttt{reactor.\linebreak[1]crash} to stop it instead of \texttt{reactor.\linebreak[1]stop}.

Trial tries to help ensure that the reactor is clean after each test, but the reactor does not yet support an interface that would make this work completely. It can catch leftover timers, but not lingering sockets.

\subsubsection{deferredResult}


If your test creates a \texttt{Deferred} and simply wants to verify something about its result, use \texttt{deferred\linebreak[1]Result}. It will wait for the Deferred to fire and give you the result. If the Deferred runs the errback handler instead, it will raise an exception so your test can fail. Note that the \begin{em}only\end{em} thing that will terminate a \texttt{deferred\linebreak[1]Result} call is the Deferred firing; in particular, timers which raise exceptions will not cause it to return.
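
For example (a sketch; \texttt{getUserRecord} is a hypothetical function that returns a Deferred):\begin{verbatim}
def testLookup(self):
    d = getUserRecord("alice")   # hypothetical Deferred-returning call
    # deferredResult iterates the reactor until the Deferred fires
    result = unittest.deferredResult(d)
    self.assertEquals(result.name, "alice")
\end{verbatim}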

\subsubsection{Waiting for Things}


The preferred way to run a test that waits for something to happen (always triggered by other things that you have done) is to use a \texttt{while not self.\linebreak[1]done} loop that does \texttt{reactor.\linebreak[1]iterate(0.\linebreak[1]1)} at the beginning of each pass. The ``0.1'' argument sets a limit on how long the reactor will wait to return if there is nothing to do. 100 milliseconds is long enough to avoid spamming the CPU while your timers wait to expire.
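
In skeleton form (the full example under ``Using Timers to Detect Failing Tests'' below shows this loop in context):\begin{verbatim}
while not self.done:
    # handle pending events, waiting up to 0.1 seconds on each pass
    reactor.iterate(0.1)
\end{verbatim}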

\subsubsection{Using Timers to Detect Failing Tests}


It is common for tests to establish some kind of fail-safe timeout that will terminate the test in case something unexpected has happened and none of the normal test-failure paths are followed. This timeout puts an upper bound on the time that a test can consume, and prevents the entire test suite from stalling because of a single test. This is especially important for the Twisted test suite, because it is run automatically by the buildbot whenever changes are committed to the CVS repository.

Trial tests indicate they have failed by raising a \texttt{Fail\linebreak[1]Test} exception (\texttt{self.\linebreak[1]fail} and friends are just wrappers around this \texttt{raise} statement). Exceptions that are raised inside a \texttt{call\linebreak[1]Later} timer are caught and logged but otherwise ignored. Trial uses a logging hook to notice when errors have been logged by the test that just completed (so such errors will flunk the test), but this happens after the fact: they will not be noticed by the main body of your test code. Therefore \texttt{call\linebreak[1]Later} timers cannot be used directly to establish timeouts which terminate and flunk the test.

The right way to implement this sort of timeout is to have a \texttt{self.\linebreak[1]done} flag, and a while loop which iterates the reactor until it becomes true. Anything that causes the test to be finished (success \begin{em}or\end{em} failure) can set self.done to cause the loop to exit.

Most of the code in Twisted is run by the reactor as a result of socket activity. This is almost always started by \texttt{Protocol.\linebreak[1]connectionMade} or \texttt{Protocol.\linebreak[1]dataReceived} (because the output side goes through a buffer which queues data for transmission). Exceptions that are raised by code called in this way (by the reactor, through \texttt{doRead} or \texttt{doWrite}) are caught, logged, handed to \texttt{connectionLost}, and otherwise ignored.

This means that your Protocol's \texttt{connectionLost} method, if invoked because of an exception, must also set this \texttt{self.\linebreak[1]done} flag. Otherwise the test will not terminate.
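
One way to arrange this is sketched below; the \texttt{testcase} attribute and the \texttt{failed} method are assumptions about how you wire your factories to your test case, not part of Trial. The full example that follows assumes wiring along these lines.\begin{verbatim}
from twisted.internet import protocol

class DoneOnCloseProtocol(protocol.Protocol):
    def connectionLost(self, reason):
        # self.factory is set by buildProtocol; tell the test case we
        # are finished so its main loop terminates, even on error
        self.factory.testcase.failed(reason)

class DoneOnCloseFactory(protocol.ClientFactory):
    protocol = DoneOnCloseProtocol

    def __init__(self, testcase):
        self.testcase = testcase
\end{verbatim}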

Exceptions that are raised in a Deferred callback are turned into a Failure and stashed inside the Deferred. When an errback handler is attached, the Failure is given to it. If the Deferred goes out of scope while an error is still pending, the error is logged just like exceptions that happen in timers or protocol handlers. This will cause the current test to flunk (eventually), but the logged error is not checked until after the test has finished, so it will not be noticed by the main body of your test code. So again, it is a good idea to add errbacks to your Deferreds that will terminate your test's main loop.

Here is a brief example that demonstrates a few of these techniques.\begin{verbatim}
from twisted.trial import unittest
from twisted.internet import reactor, error

class MyTest(unittest.TestCase):
    def setUp(self):
        self.done = False
        self.failure = None

    def tearDown(self):
        self.server.stopListening()
        # TODO: also shut down client
        try:
            self.timeout.cancel()
        except (error.AlreadyCancelled, error.AlreadyCalled):
            pass

    def succeeded(self):
        self.done = True

    def failed(self, why):
        self.done = True
        self.failure = why

    def testServer(self):
        # `port', `serverFactory', and `clientFactory' are assumed to
        # be defined elsewhere in the test module
        self.server = reactor.listenTCP(port, serverFactory)
        self.client = reactor.connectTCP("localhost", port, clientFactory)
        # you should give the factories a way to call our 'succeeded' or
        # 'failed' methods
        self.timeout = reactor.callLater(5, self.failed, "timeout")
        while not self.done:
            reactor.iterate(0.1)

        # we get here if the test is finished, for good or for bad
        if self.failure:
            self.fail(self.failure)
        # otherwise it probably passed. Cleanup will be done in tearDown()
\end{verbatim}


\subsection{Links}
\label{doc/howto/policy/test-standard.xhtmlHASHlinks}\begin{itemize}
\item A chapter on Unit Testing\footnote{http://diveintopython.org/roman\_divein.html}       in Mark Pilgrim's Dive Into       Python\footnote{http://diveintopython.org}.
\item \texttt{unittest}\footnote{http://www.python.org/doc/current/lib/module-unittest.html} module documentation, in the Python Library       Reference\footnote{http://www.python.org/doc/current/lib/}.
\item UnitTests\footnote{http://c2.com/cgi/wiki?UnitTests} on       the PortlandPatternRepository       Wiki\footnote{http://c2.com/cgi/wiki}, where all the cool ExtremeProgramming\footnote{http://c2.com/cgi/wiki?ExtremeProgramming} kids hang out.
\item Unit       Tests\footnote{http://www.extremeprogramming.org/rules/unittests.html} in Extreme Programming: A Gentle Introduction\footnote{http://www.extremeprogramming.org}.
\item Ron Jeffries expounds on the importance of Unit Tests at 100\%\footnote{http://www.xprogramming.com/xpmag/expUnitTestsAt100.htm}.
\item Ron Jeffries writes about the Unit       Test\footnote{http://www.xprogramming.com/Practices/PracUnitTest.html} in the Extreme       Programming practices of C3\footnote{http://www.xprogramming.com/Practices/xpractices.htm}.
\item PyUnit's homepage\footnote{http://pyunit.sourceforge.net}.
\item twisted.test\footnote{http://twistedmatrix.com/documents/TwistedDocs/current/api/public/toc-twisted.test-module.html}'s inline documentation.
\item The twisted/test directory\footnote{http://twistedmatrix.com/users/jh.twistd/viewcvs/cgi/viewcvs.cgi/twisted/test/?cvsroot=Twisted} in CVS.
\end{itemize}