proposal   [plain text]


Transluscent Inter-Process Service Migration

Jean-Paul Calderone
exarkun@twistedmatrix.com

Time Slot: Half Hour Presentation

Author Background:

  Jean-Paul Calderone completed coursework towards a computer science degree
at the State University of New York at Stony Brook before taking a position
at the National Center for Biotechnological Information at NIH.  He is
currently employed by Divmod.  He spoke previously at PyCon 1 about two
applications developed using Twisted.

Presentation Summary:

  Circumstances exist where it is desirable to change the behavior of a
running application without interrupting the service of those users whom it
is currently serving.  In high load environments, it is often the case where
there will always be a significant number of users relying on the service.
In these environments, the simplistic approach of waiting for all users to
sign off, then shutting down and restarting the server software are not
feasible.

  Several possible techniques, of varying degrees of sophistication, for
dealing with this problem will be discussed.  Among them are uses of
multiple processes (possibly on multiple physical computers) in a cluster
configuration, dynamic loading and unloading of code within a single
process, process re-execution (via execl()) to upgrade code without dropping
resources such as open files and sockets, and migration by passing resources
such as open files and sockets to a new process running new code.

  Examples of why this is a real problem, the specific issues related to
solving it in Python and the strengths and weaknesses of each covered
technique will be discussed.  If it is technically possible to do so,
working examples of some of the techniques will be shared and demonstrated.

Presentation Outline (Tentative slide titles):

    * Introduction
    * What is migrating?
    * Why do we need migrating?
    * Naive solutions
    * Problems with naive solutions
    * Load balancing
    * Upgrading code
    * Upgrading hardware
    * Fail-over
    * Problems with migrating
    * Different migration models
    * execl()
    * Problems with execl()
    * Multiple processes, same machine
    * Problems with multiple processes
    * Multiple machines
    * Problems with multiple machines
    * Example of simple (stateless) case
    * Example of complex (stateful) case
    * Demonstration of rapid migration [if possible]
    * Walk-through of rapid migration
    * Demonstration of gradual migration [if possible]
    * Walk-through of gradual migration
    * Integration with existing software
    * Designing for migratability
    * Conclusions
    * Future directions
    * Further reading