<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head> <title>Managing the Release of a Large Python Project</title> </head>
<body>
<h1>Managing the Release of a Large Python Project</h1>
<ul> <li>Christopher Armstrong <a href="mailto:radix@twistedmatrix.com">radix@twistedmatrix.com</a></li> <li>Moshe Zadka <a href="mailto:moshez@twistedmatrix.com">moshez@twistedmatrix.com</a></li> </ul>
<h2>Abstract</h2>
<p> Twisted is a Python networking framework. At last count, the project contains nearly 60,000 lines of effective code (not comments or blank lines). When preparing a release, many details must be checked, and many steps must be followed. We describe here the technologies and tools we use, and explain how we built tools on top of them that help us make releasing as painless as possible. </p>
<h2>Introduction</h2>
<p> One of the virtues of Python is the ease of distributing code. Its module system and the absence of a compilation step are what make this possible. This means that for simple Python projects, nothing more complicated than tar is needed to prepare a distribution of a library. However, Twisted has auto-generated documentation in several formats, including docstring-generated documentation, HOWTOs written in HTML, and manpages written in nroff. As Twisted grew more complex and popular, a detailed procedure for putting out a release became necessary. Human fallibility being what it is, we decided that most of these steps should be automated. </p>
<h2>Overview of Steps</h2>
<p> Despite heavy automation, there are still a number of manual steps involved in the release process. We've reduced the number of manual steps quite a bit, and most of what's left is not fully automatable, although the process could be made easier (see <q>Future Directions</q> below).
</p>
<ul> <li>Test <ul> <li>Unit tests</li> <li>Acceptance tests</li> <li>Pre-release tests</li> </ul> </li> <li>Update the Changelog and README files</li> <li>Run the release script <ul> <li>Unix runs admin/release-twisted</li> <li>Win32 runs win32/bdist_wininst.bat</li> </ul> </li> <li>Deploy: update twisted deployment on twistedmatrix.com</li> <li>Upload to SourceForge mirror</li> <li>Update the website</li> </ul>
<h2>Testing</h2>
<p> Twisted has three categories of tests: unit, acceptance, and pre-release. Testing is an important part of releasing quality software, so each category is explained below. </p>
<p> Unit tests are run as often as possible by each of the developers as they write code, and must pass before they commit any changes to CVS. While the Twisted team tries to follow the XP practice of ensuring all code is releasable, the code in CVS is not always in that state. Thus, running the unit tests on several platforms before releasing is necessary. Our BuildBot runs the unit tests constantly on several hosts and multiple platforms, so the <a href="http://twistedmatrix.com/users/warner.twistd/">status page</a> is simply checked for green lights before a release. </p>
<p> Acceptance tests (which, unfortunately, are not quite the same as <a href="http://xprogramming.org/">Extreme Programming's</a> Acceptance Tests) are simply interactive tests of various Twisted services. A script executes several system commands that run the Twisted end-user executables and start several clients (web browsers, IRC clients, etc.) so that the user can interactively test the different services that Twisted offers. These are routinely run only before a release, but we also encourage developers to run them before they make major changes. </p>
<p> The pre-release tests ensure that the web server (one of the most popular parts of Twisted, and the one that serves the twistedmatrix.com web site) runs correctly in a semi-production environment.
The script starts up a web server on twistedmatrix.com, similar to the one on port 80, but on an out-of-the-way port. <q>lynx</q> is then run several times, with URLs strategically chosen to test different features of the web server. Afterwards, the log of the web server is displayed and the user checks it for errors. </p>
<h2>The release-twisted Script</h2>
<p> Like many other build/release systems, the automated parts of our release system started out as a number of small shell scripts. Eventually these became a single Python script, which was a large improvement but still had many problems, especially as our release process became more complex (documentation generation, different types of archive formats, etc.). Whenever a step in the middle of the process broke, the release manager had to restart the entire thing or enter the remaining commands manually. </p>
<p> The solution that we came up with was a simple framework for pseudo-transactions: every step of the process is implemented with a class that has <code class="python">doIt</code> and <code class="python">undoIt</code> methods. Each step also has a command-line argument associated with it, so a typical run of the script looks something like this: <pre class="shell">
$SOMEWHERE/admin/release-twisted -V $VERSION -o $LASTVERSION --checkout \
    --release=/twisted/Releases --upver --tag --exp --dist --docs --balls \
    --rel --deb --debi
</pre> </p>
<h3>Transactions</h3>
<p> As stated above, our transaction system is very simple. One of our rather simple transaction classes is <code class="python">Export</code>.
</p>
<pre class="python">
class Export(Transaction):
    def doIt(self, opts):
        print("Export")
        root = opts['cvsroot']
        ver = opts['release-version']
        # CVS tag names cannot contain dots, so 1.0.5 is tagged release-1_0_5.
        sh('cvs -d%s export -r release-%s Twisted' % (root, ver.replace('.', '_')))

    def undoIt(self, opts, fail):
        # Undo the export by removing the exported tree.
        sh('rm -rf Twisted')
</pre>
<p> One useful feature to note is the <code class="python">sensitiveUndo</code> attribute on Transaction classes. If a transaction has this set, the user will be prompted before the <code class="python">undoIt</code> method is run. This is useful for very long-running processes, like documentation generation, Debian package building, and uploading to SourceForge. If something goes wrong in the middle of one of these processes, we want to give the user a chance to fix the problem manually rather than redoing the entire transaction. They can then resume from the next step by leaving the already-completed steps out of the <code class="shell">release-twisted</code> arguments. </p>
<p> A list of all of the transactions defined in release-twisted follows. </p>
<dl>
<dt>CheckOut</dt> <dd> checks out the latest revision of Twisted from CVS and puts it in the <q>Twisted.CVS</q> directory. </dd>
<dt>UpdateVersion</dt> <dd> changes the version number of the current release -- updating twisted/copyright.py (the canonical location for the current version) and a few other text files where the current version is mentioned. </dd>
<dt>Tag</dt> <dd> tags the revisions in the current source tree with the version passed in on the command line. </dd>
<dt>Export</dt> <dd> runs the cvs <q>export</q> command, which is similar to <q>checkout</q>, but leaves out CVS support directories; this is what we package up in the archives. </dd>
<dt>PrepareDist</dt> <dd> simply copies the directory containing the version of Twisted to be released to a new directory specifically for the release process.
The reason that we have this extra copy is that sometimes one will want to create a release from a directory that wasn't created by the <q>Export</q> command; having the release script munge that directory in place would be impolite. </dd>
<dt>GenerateDocs</dt> <dd> generates the various documentation: HTML API documentation (via Epydoc); HTML, PostScript, and PDF howto documentation (via twisted.lore); and HTML man pages (via lore, converted from the nroff source). </dd>
<dt>CreateTarballs</dt> <dd> creates the various archives that each Twisted release involves: tarred and gzipped or bzip2ed versions of archives with code plus documentation, code without documentation, and only documentation. </dd>
<dt>Release</dt> <dd> copies all of the archives to a directory specified by the --release parameter. This is meant to be a publicly accessible directory, thus the name <q>Release</q>. </dd>
<dt>MakeDebs</dt> <dd> creates the .deb packages and support files for the Twisted Debian packages. </dd>
<dt>InstallDebs</dt> <dd> creates an apt-gettable Debian package repository in the (unfortunately hard-coded) <q>/twisted/Debian</q> directory. </dd>
<dt>Sourceforge</dt> <dd> uploads the archives and Debian packages to Twisted's SourceForge mirror at <a href="http://twisted.sourceforge.net">http://twisted.sourceforge.net/</a>. </dd>
<dt>UpgradeDebian</dt> <dd> installs the recently generated Debian packages via <q>dpkg</q> on the local machine. </dd>
</dl>
<h2>setup.py</h2>
<p> Twisted has an extensive and very customized setup.py script. We have a number of C extension modules and try to ensure that they all build, or at least fail gracefully, on Win32, Mac OS X, Linux and other popular Unix-style OSes. </p>
<p> We have overridden three of the distutils <q>command classes</q>: <code class="python">build_ext</code>, <code class="python">install_scripts</code>, and <code class="python">install_data</code>.
</p>
<h3>Building C extensions</h3>
<p> <code class="python">build_ext_twisted</code> detects, based on various features of the platform, which C extensions to build. It overrides the <code class="python">build_extensions</code> method to first check which C extensions are appropriate to build for the current platform before proceeding as normal (by calling the superclass's <code class="python">build_extensions</code>). The module detection consists of several simple tests for platform features and conditional additions to the <code class="python">extensions</code> attribute. One especially useful feature is the <code class="python">_check_header</code> method, which takes the name of an arbitrary header file and tries to compile (via the distutils C compiler interface) a simple C file that only #includes it. </p>
<h3>Installing scripts</h3>
<p> <code class="python">install_data_twisted</code> ensures that the data files are installed alongside the Python modules in the twisted package. This is accomplished with the incantation: </p>
<pre class="python">
from distutils.command.install_data import install_data

class install_data_twisted(install_data):
    def finalize_options(self):
        # Default our install_dir to the 'install' command's install_lib,
        # so data files land next to the Python modules.
        self.set_undefined_options('install',
            ('install_lib', 'install_dir')
        )
        install_data.finalize_options(self)
</pre>
<h3>Windows Releases</h3>
<!-- <p> This section will cover the problems with packaging Python projects for windows, especially ones which contain scripts. The problem of clickability is especially acute, as windows determines types by extensions and not by #! lines. </p> -->
<p> Packaging software for Windows involves a unique set of problems. The problem of clickability is especially acute, and several customizations to the distutils setup had to be made. </p>
<p> The first customization was to make the <q>scripts</q> end with a <q>.py</q> extension, since Windows relies on the file extension rather than a shebang line to determine which interpreter should execute a file.
This was accomplished by overriding the <code class="python">install_scripts</code> command, like so: </p>
<pre class="python">
import os
from distutils.command.install_scripts import install_scripts

class install_scripts_twisted(install_scripts):
    """Renames scripts so they end with '.py' on Windows."""

    def run(self):
        install_scripts.run(self)
        if os.name == "nt":
            for file in self.get_outputs():
                if not file.endswith(".py"):
                    os.rename(file, file + ".py")
</pre>
<p> We also wanted to have a Start-menu group with a number of icons for running different Twisted programs. This was accomplished with a post-install script specified with the <code class="shell">bdist_wininst</code> command-line parameter <code class="shell">--install-script=twisted_postinstall.py</code>. </p>
<h2>Future Directions</h2>
<p> The theme is, of course, automation, and there are still many manual steps involved in a Twisted release. At present the most annoying step is updating the documentation and downloads section of the twistedmatrix.com website. Automating this would considerably shorten the time between running the release script and a fully completed release. </p>
<p> Another major improvement will involve further integration with BuildBot. Currently we have BuildBot running unit tests, building C extensions, and generating documentation on several hosts. Eventually we would like to have it constantly generating full release archives, and have an additional web form for <q>finalizing</q> any particular build that we deem releasable. Finalizing a build would upload the release to the mirrors and update the website. </p>
<p> The tagging scheme used by the release-twisted script can sometimes be problematic. If we find serious problems in the code-base after the Tag command is executed (which is fairly early in the process), we are forced to fix the bug and increase the version number. This can be prevented by using the unofficial tag <q>releasing-$version</q> (as opposed to <q>release-$version</q>) at that early stage, instead of making the official tag.
Once most of the steps are complete, the official tag will be made. If something in between goes wrong, we can just re-use the unofficial <q>releasing-$version</q> tag and not worry about users trying to use that tag. </p> </body> </html>