<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html>

<head>
<title>Translucent Inter-Process Service Migration</title>
</head>

<body>

<h1>Translucent Inter-Process Service Migration</h1>
<p>Jean-Paul Calderone<br />
exarkun@divmod.com</p>

<h2>Abstract</h2>

<h3>What Is Migration?</h3>

<p><i>Service migration</i> is the hand-off of application-level duties
from one process to another.  Historically, this has been achieved by
stopping one process and then starting another.  On certain platforms,
wider-reaching <i>restart</i> procedures (such as system reboots) have been
required, but ultimately the effect on the user has been the same:
application state must be saved or lost, a period of absence of service must
be tolerated, and finally application state must be reloaded or
recreated.</p>

<p>Examples of service migration are widespread and exist anywhere software
is in use: a desktop user loading a new version of a word processing tool, a
system administrator upgrading a company's web server, a phone company
rolling over the software that drives millions of telephone calls every
day.  The fundamental requirement of using new software to process old data
is the same in each case, even though the risk, difficulty, and consequences
differ greatly.</p>

<h3>Why Translucent Migration?</h3>

<p>Circumstances exist where it is desirable to change the behavior of a
running application <i>without</i> interrupting service to the clients
who are using it at the time.  In high-load environments, a significant
number of users are often relying on the service at any given moment.  In
these environments, the simplistic approach of waiting for the service to
become quiescent, then shutting down and restarting the server software, is
not feasible.</p>

<p>Translucent migration removes the most user-visible portion of the
migration process: absence of service.  Instead of starting a new process
after the old process has been shut down, the order is reversed, so that the
new and old processes exist at the same time.  During this overlap,
required state can be moved between the two processes <i>while one of the
processes continues to serve user requests</i>.  Expensive operations,
such as opening a database, establishing connections to remote hosts, or
building in-memory caches can all be performed by the new process while
the old is still responding to user requests, minimizing the time during
which the application is non-responsive.</p>

<h2>Application Requirements</h2>

<p>Software which is to participate in this migration scheme must meet
several requirements.  The requirements are minimally invasive to the
application itself, but necessary for services to be migrated to a new
process.  It is because the application must cooperate in these small ways
that the process of migration is referred to as <i>translucent</i> rather
than fully transparent.</p>

<h3>Migration Server</h3>

<p>The first requirement is that a migration server be made to listen on a
socket providing virtual circuit capabilities.  In the demonstration, a
stream-oriented UNIX socket (<code>AF_UNIX</code>, <code>SOCK_STREAM</code>)
is used, but a TCP/IP socket (<code>AF_INET</code>,
<code>SOCK_STREAM</code>) would work equally well.  When migration is to
occur, the process assuming responsibility for services (henceforth referred
to as the <i>service recipient</i>) will connect to this server to initiate
the migration procedure.  The server must be made aware of all groupings of
objects which represent services (henceforth referred to as <i>service object
graphs</i>) to be made available for migration.  Each of these service object
graphs will be transferred to the new process separately from the others,
though typically they will all be transferred in rapid succession.</p>
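
<p>The role of the migration server can be illustrated with a minimal sketch
which uses only the Python 3 standard library rather than Twisted.  The
<code>MigrationServer</code> class, its <code>register</code> and
<code>handle_one</code> methods, and the newline-separated listing it sends
are all assumptions made for illustration; they are not the interface of the
implementation described in this paper.</p>

<pre class="python">
```python
import os
import socket
import tempfile


class MigrationServer:
    """Hypothetical registry of service object graphs offered for migration."""

    def __init__(self, path):
        self.services = {}  # service name -> root object of its graph
        self.listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.listener.bind(path)
        self.listener.listen(1)

    def register(self, name, root):
        # Make one service object graph known to the migration server.
        self.services[name] = root

    def handle_one(self):
        # Accept one service recipient and tell it which services are on offer.
        conn, _ = self.listener.accept()
        with conn:
            listing = "\n".join(sorted(self.services)) + "\n"
            conn.sendall(listing.encode("ascii"))

    def close(self):
        self.listener.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "migrate.sock")
        server = MigrationServer(path)
        server.register("telnet", {"port": 8000})
        server.register("shell", {"namespace": {}})

        recipient = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        recipient.connect(path)   # queued in the listen backlog
        server.handle_one()       # originator answers, then hangs up
        with recipient, recipient.makefile("rb") as stream:
            data = stream.read()  # read until the originator closes
        server.close()
        return data.decode("ascii").split()
```
</pre>

<p>Both ends run in a single process here only to keep the sketch short; in
a real migration the recipient would be a separate process, and each listed
service would subsequently be requested and transferred on its own.</p>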

<h3>Serializability</h3>

<p>The second requirement is that every service object graph must be
serializable <a name="1_backref" href="#1">[1]</a>, so that it can be
migrated to another process.  Most objects will be serialized as a simple
byte stream and sent over a stream-oriented socket, but some may be
serialized in more exotic manners.  Serializers for instances of most
built-in types, including ints, strings, lists, and dicts, are provided by
framework code.  Instances of most user-defined classes can also be
serialized, using the fully qualified name of their class and the contents
of their instance <code>__dict__</code>.  Since the focus of this migration
technique is internet servers, a serializer for file-like objects which have
been wrapped in a <code>twisted.internet.abstract.FileDescriptor</code>
object is provided as well.  This allows sockets to be migrated without
special consideration from application-level code.</p>
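
<p>The idea of serializing an instance as its fully qualified class name plus
its <code>__dict__</code> can be sketched as follows.  This is not the
<code>twisted.spread.jelly</code> wire format: JSON is substituted for
s-expressions, <code>dump_instance</code> and <code>load_instance</code> are
hypothetical helpers, and <code>argparse.Namespace</code> merely stands in
for any importable user-defined class whose state is JSON-compatible.</p>

<pre class="python">
```python
import argparse
import importlib
import json


def qualified_name(cls):
    return cls.__module__ + "." + cls.__qualname__


def dump_instance(obj):
    # Record only the class's fully qualified name and the instance state.
    record = {"class": qualified_name(type(obj)), "state": vars(obj)}
    return json.dumps(record)


def load_instance(text):
    record = json.loads(text)
    module_name, _, class_name = record["class"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    # Bypass __init__: the state comes from the wire, not from a constructor.
    obj = cls.__new__(cls)
    obj.__dict__.update(record["state"])
    return obj


original = argparse.Namespace(host="localhost", port=8000, peers=["a", "b"])
wire = dump_instance(original)
copy = load_instance(wire)
```
</pre>

<p>Because the receiving side resolves the class by name, the recipient must
be able to import a compatible class, and it must trust the sender, since
the sender chooses which class is instantiated.</p>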

<h3>Migration Client</h3>

<p>The final requirement is placed upon the service recipient rather than
the service originator.  It must be capable of issuing requests to a service
originator to take responsibility for a service.  As previously mentioned,
responsibility for a service is assumed by connecting to a server running in
the process which is to give up the service in question.  At this point, a
sequence of events occurs which ultimately results in the state of one or
more services being transferred to the new process.</p>

<h4>Authentication</h4>

<p>First, if desired, the client can be required to authenticate with the
server, proving its identity.  While UNIX sockets already provide some level
of security with respect to who may connect to the migration server, at
times it may be desirable to pass services to a process running as a
different user, a process which would otherwise be unable to access a socket
restricted by filesystem permissions, or a process for which filesystem
permissions alone are too coarse-grained to prevent access.</p>

<h4>Secondary Transports</h4>

<p>Second, secondary transports are established, providing a path for objects
which cannot be serialized solely as a byte stream.  There is one example of
this in the current implementation: a second UNIX socket is opened and used
to pass file descriptors (via <code>sendmsg(2)</code> and
<code>recvmsg(2)</code>) to the process assuming responsibility.  Passing
such things as POSIX capabilities (via the same mechanism) is another
use-case for UNIX sockets as secondary transports.</p>
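
<p>Descriptor passing over a UNIX socket can be demonstrated with the
<code>sendmsg</code> and <code>recvmsg</code> methods which Python 3 provides
on socket objects.  The following is a sketch for POSIX systems, not the
reactor-based code of this implementation: <code>send_descriptors</code> and
<code>receive_descriptors</code> are hypothetical names, a connected socket
pair stands in for the secondary transport, and a pipe stands in for the
socket being migrated.</p>

<pre class="python">
```python
import array
import os
import socket


def send_descriptors(channel, fds, filler=b"F"):
    # At least one byte of ordinary data accompanies the ancillary data.
    payload = array.array("i", fds)
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, payload)]
    return channel.sendmsg([filler], ancillary)


def receive_descriptors(channel, max_fds=8):
    fds = array.array("i")
    space = socket.CMSG_LEN(max_fds * fds.itemsize)
    _data, ancillary, _flags, _addr = channel.recvmsg(1, space)
    for level, kind, payload in ancillary:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            usable = len(payload) - (len(payload) % fds.itemsize)
            fds.frombytes(payload[:usable])
    return list(fds)


def demo():
    originator, recipient = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    with originator, recipient:
        send_descriptors(originator, [write_end])
        (received,) = receive_descriptors(recipient)
    os.close(write_end)                # the originator gives up its copy
    os.write(received, b"still open")  # the passed descriptor remains usable
    os.close(received)
    data = os.read(read_end, 100)
    os.close(read_end)
    return data
```
</pre>

<p>The received descriptor is a new number referring to the same open file,
which is why the originator can close its own copy without disturbing the
recipient.</p>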

<h4>Service Object Graph Migration</h4>

<p>Next, the actual migration occurs.  The client selects a service from
among those the server publishes as available for migration.  The server
serializes the service's objects and sends them over whatever transports
are appropriate.  While the service is in transit, it is important that
parts of it not be allowed to alter state in such a way as to invalidate the
object graph received by the client.  In a network server, this typically
means that the server must cease socket related activities, such as
receiving new data from the kernel and attempting to send buffered output
data, as well as not invoking any methods on objects belonging to the
service.  The server also marks the service as in-transit so that it cannot
be requested by a second migration client while it is still being sent to the
first.  Simultaneously, the client is deserializing bytes it receives and
gradually building up an object graph representing the service it has
requested.  Deserialization is straightforward, with the added requirement
that information from more than one transport may need to be combined to
produce the correct objects.  If there is an error while migrating the
service, the server can mark the service as no longer in-transit and resume
normal processing on it, resulting in no user-visible consequences of the
failure.  If the migration is successful, then the client has assumed full
responsibility for the service and the server simply discards all objects
related to that service.</p>
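
<p>The in-transit bookkeeping described above can be sketched as a small
state machine.  The <code>Originator</code> and <code>ServiceEntry</code>
classes and the <code>send</code> callback are hypothetical; the log entries
stand in for pausing and resuming socket activity, which the real
implementation does through the reactor.</p>

<pre class="python">
```python
class ServiceEntry:
    """Hypothetical bookkeeping for one service held by the originator."""

    def __init__(self, name, root):
        self.name = name
        self.root = root
        self.in_transit = False


class Originator:
    def __init__(self):
        self.entries = {}
        self.log = []

    def publish(self, name, root):
        self.entries[name] = ServiceEntry(name, root)

    def available(self):
        # Services already in transit are hidden from other clients.
        return sorted(n for n, e in self.entries.items() if not e.in_transit)

    def migrate(self, name, send):
        entry = self.entries[name]
        if entry.in_transit:
            raise RuntimeError(name + " is already being migrated")
        entry.in_transit = True
        self.log.append("pause " + name)    # stand-in for pausing socket I/O
        try:
            send(entry.root)
        except Exception:
            # Transfer failed: resume normal processing on the originator.
            entry.in_transit = False
            self.log.append("resume " + name)
            return False
        # Transfer succeeded: discard everything related to the service.
        del self.entries[name]
        self.log.append("discard " + name)
        return True


def broken_send(root):
    raise OSError("recipient went away")


originator = Originator()
originator.publish("telnet", {"port": 8000})
first_attempt = originator.migrate("telnet", broken_send)
after_failure = originator.available()
received = []
second_attempt = originator.migrate("telnet", received.append)
```
</pre>

<p>A failed attempt leaves the service published and running on the
originator, so a later attempt can succeed without users noticing the
earlier failure.</p>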

<h4>Shutdown</h4>

<p>Once all services have been migrated out of a process, the process may
terminate normally.</p>

<h2>Serializers</h2>

<h3>Abstract File Descriptors</h3>

<p>As previously mentioned, <code>twisted.spread.jelly</code> is used to communicate
objects between processes.  This implementation uses interfaces, adapters,
and components to maintain a separation of concerns between application
logic and serialization logic.  Annotated versions of the serializer and
deserializer for <code>twisted.internet.abstract.FileDescriptor</code> are
given below:</p>

<pre class="python">
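import socket

from twisted.spread import interfaces as ispread
from twisted.python import components, reflect

# IJanitor, ISocketStorage, and _DummyClass are assumed to be provided
# elsewhere by the migration implementation; they are not shown here.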

# Serialization

# When the jellier invokes ispread.IJellyable(obj), and obj is an instance
# of FileDescriptor, an instance of this class will be returned.
class FileDescriptorJellier(components.Adapter):
    __implements__ = (ispread.IJellyable,)

    # This is the main external entry point into the file descriptor
    # serialization code.  The jellier will call .jellyFor(self) on the
    # object returned by ispread.IJellyable(obj).  The object returned is an
    # s-expression representing the state of this adapted FileDescriptor
    # object.
    def jellyFor(self, jellier):
        # This is where we indicate to Twisted's reactor that this socket
        # should no longer be polled for input nor have its output buffer
        # flushed.  This ensures that state remains consistent between the
        # times when serialization begins and ends.
        self.original.stopReading()
        self.original.stopWriting()

        # The next three statements set up failure handlers, so that if
        # there is an error transferring the service this socket is part of
        # to the client, normal processing can be resumed on the server.
        a = jellier.invoker.serializingPerspective
        j = IJanitor(a)
        j.track(a.sendingServerID,
                lambda: self.original.socket.close(),
                lambda: (self.original.startReading(), self.original.startWriting()))

        # Here is where the object state is actually turned into an
        # s-expression.
        sxp = jellier.prepare(self.original)
        sxp.extend([
            reflect.qual(self.original.__class__),
            jellier.jelly(self.getStateFor(jellier))])
        return jellier.preserve(self.original, sxp)

    # This function computes the actual instance state of the
    # FileDescriptor.
    def getStateFor(self, jellier):
        state = self.original.__dict__.copy()
        ISocketStorage(jellier).put(state.pop('socket'))

        # dChannel is the secondary transport used for transmitting file
        # descriptors.  The file descriptor is sent now, then the related
        # items of state are removed from the state dictionary.  They will
        # be re-associated with the FileDescriptor which is constructed in
        # the client process.
        jellier.invoker.serializingPerspective.dChannel.transport.sendFileDescriptors([state['fileno']()])
        del state['fileno']
        return state

# Deserialization

# Unjelliers are simpler.  They are nothing more than callables which are
# passed an unjellier object and an s-expression and which return an object.

# FileDescriptorUnjellier can work on both server ports and connected
# sockets.  The mode argument specifies which.  For server ports, the mode
# is READ.  For other sockets, it is READ | WRITE.
READ = 1
WRITE = 2

class FileDescriptorUnjellier:
    def __init__(self, mode):
        self.mode = mode

    def __call__(self, unjellier, jellyList):
        # Unjellier.invoker.fdproto is the client's handle on the secondary
        # transport used to transfer file descriptors.
        fdproto = unjellier.invoker.fdproto

        # This is a common deserialization technique.
        klass = reflect.namedAny(jellyList[0])
        inst = _DummyClass()
        inst.__class__ = klass
        state = unjellier.unjelly(jellyList[1])
        inst.__dict__ = state

        # Here we determine what kind of socket has been transferred.
        addressFamily = getattr(klass, 'addressFamily', socket.AF_INET)
        socketType = getattr(klass, 'socketType', socket.SOCK_STREAM)

        # Finally, we create a socket from the file descriptor and associate
        # it with the FileDescriptor that has been instantiated.

        # fdproto.fds.pop() gives the appropriate file descriptor because
        # unjellying happens in the reverse order of jellying.  For
        # serialization schemes where this does not hold, it is a simple
        # matter to include an identifier which allows the right file
        # descriptor to be found.
        skt = socket.fromfd(fdproto.fds.pop(), addressFamily, socketType)
        socketInMyPocket(skt, inst, 'socket', self.mode)
        return inst

def socketInMyPocket(skt, instance, attribute, mode):
    setattr(instance, attribute, skt)
    instance.fileno = skt.fileno
    if mode &amp; READ:
        instance.startReading()
    if mode &amp; WRITE:
        instance.startWriting()
</pre>

<h3>Real File Descriptors</h3>

<p>Migrating abstract file descriptors is only of use if the real file
descriptors they represent can be migrated as well.  A reactor-based API
based on <code>sendmsg(2)</code> and <code>recvmsg(2)</code> is used by
this migration implementation, and part of its implementation is presented
below:</p>

<pre class="python">

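import errno
import socket
import struct

from twisted.internet import unix
from twisted.python import log

# sendmsg is assumed here to be an extension module wrapping sendmsg(2)
# and recvmsg(2).
from sendmsg import sendmsg, recvmsg
from sendmsg import SCM_RIGHTS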

class Server(unix.Server):
    def sendFileDescriptors(self, fileno, data="Filler"):
        """
        @param fileno: An iterable of the file descriptors to pass.
        """
        payload = struct.pack("%di" % len(fileno), *fileno)
        return sendmsg(self.fileno(), data, 0, (socket.SOL_SOCKET, SCM_RIGHTS, payload))

class Client(unix.Client):
    def doRead(self):
        while self.connected:
            try:
                msg, flags, ancillary = recvmsg(self.fileno())
            except socket.error, e:
                if e[0] == errno.EAGAIN:
                    break
                else:
                    log.err()
            except:
                log.err()
            else:
                if ancillary:
                    buf = ancillary[0][2]
                    # Each descriptor is packed as a native int.
                    count = len(buf) // struct.calcsize('i')
                    fds = struct.unpack('%di' % count, buf)
                    try:
                        self.protocol.fileDescriptorsReceived(fds)
                    except:
                        log.err()
                else:
                    break
        return unix.Client.doRead(self)

</pre>

<p><code>FileDescriptorJellier</code> and
<code>FileDescriptorUnjellier</code> rely on these classes to ensure that
the correct real file descriptor is associated with each abstract file
descriptor which is migrated to a new process.</p>

<h2>Results</h2>

<h3>Interactive Python Shell / Telnet Server</h3>

<p>To demonstrate this technique in a complex and stateful environment,
a migration server was added to a telnet server connected to an
interactive Python shell.  The telnet portion of the server uses classes
which have been part of Twisted for nearly two years as of the writing of
this paper, and required only a few lines of modification to make migration
possible.</p>

<p>The first portion of a transcript of a brief telnet session
with this server is as follows:</p>

<pre class="shell">
exarkun@boson:~$ telnet localhost 8000
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

telnet.ShellFactory
Twisted 1.2.1alpha1
username: admin
password: *****
>>> import sys, os
>>> sys.version
'2.2.3+ (#1, Feb 25 2004, 23:29:31) \n[GCC 3.3.3 (Debian)]'
>>> os.getpid()
880
</pre>

<p>At this point, a migration client is started and instructed to
retrieve both the telnet server and all connected clients from
the service originator.  The remainder of the transcript is produced by
the migration client, which is now running the telnet server:</p>

<pre class="shell">
>>> os.getpid()
884
>>> try:
...     os.kill(880, 0)
... except OSError, e:
...     print e
... else:
...     print 'No exception occurred'
...
[Errno 3] No such process
>>> sys.version
'2.3.3 (#2, Jan 13 2004, 00:47:05) \n[GCC 3.3.3 20040110 (prerelease) (Debian)]'
>>> ^]
telnet> c
Connection closed.
</pre>

<p>The PID change indicates that a new process is handling the session
and the <code>No such process</code> error from <code>os.kill</code>
indicates that the originating process no longer exists.  Additionally,
the presence of the <code>os</code> and <code>sys</code> modules in the
interpreter namespace indicates that application state as well as socket
state has successfully migrated to the service recipient.  Finally,
the new value of <code>sys.version</code> indicates that an entirely
different version of the Python interpreter is now handling this telnet
session.</p>
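
<p>The liveness probe used in the transcript, sending signal 0 with
<code>os.kill</code>, can be wrapped in a small helper.  This is a Python 3
sketch for POSIX systems; <code>process_exists</code> is a hypothetical name,
and a short-lived child process stands in for the terminated originator.</p>

<pre class="python">
```python
import os
import subprocess
import sys


def process_exists(pid):
    try:
        os.kill(pid, 0)   # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False      # errno ESRCH, "No such process"
    except PermissionError:
        return True       # the process exists but belongs to another user
    return True


def demo():
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()          # reap the child so that its PID is released
    return process_exists(os.getpid()), process_exists(child.pid)
```
</pre>

<p>The check is only a snapshot: the operating system may reuse a PID once
it has been released, so a positive answer does not prove that the same
process is still running.</p>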

<h3>Flexibility and Security Concerns</h3>

<p>It is significant to note that services can be transferred not only
between two versions of the same application, but also between otherwise
unrelated applications, so long as all the necessary deserializers are
present in the receiving application.  In the above demonstration, all
of the application logic was actually carried along with the service
object graph.  The service recipient used was extremely generic and had
no code relating to telnet or interactive Python shells until it received
the service and deserialized the <code>ShellFactory</code> it contained.
This should further underscore the need for authentication before migrating
services: this process relies on the client deserializing an arbitrary
object graph, one which a malicious server could manipulate to gain
control over the client.</p>

<h2>Conclusions</h2>

<p>Upgrading software in a production environment can be a daunting prospect;
even in the best case, when software can be upgraded and restarted without
incident, customers are always unhappy with service outages.  Translucent
service migration can provide a way to test software updates on a live
system before enacting them as well as a way to minimize user-visible system
downtime.</p>

<p>Translucent migration can also support a very rapid test/edit/test
cycle.  While the author strongly believes that unit tests are a sounder
overall approach to writing robust, correct software, he recognizes that
one of Python's many strengths is the ability of a programmer to get rapid
feedback on code changes; the test/edit/test cycle is sometimes very useful
in this area.  Like the <code>reload</code> built-in, migration can
provide a way to move changes into a running application very easily.  Unlike
<code>reload</code>, though, migration provides very robust error
handling.  Changes which would cause an entire application to abort
abnormally if loaded with <code>reload</code> need not bring down a
migrating application, since a failed migration leaves the originating
process in service; this relieves the programmer of the need to recover from
a mistake by restarting the application.</p>

<h2>Footnotes</h2>

<ol>
<li><a name="1"></a>The initial implementation of this migration technique
uses <code>twisted.spread.jelly</code> as the serialization library;
however, it is not restricted to this library alone, as the
<code>pickle</code> module provides all the necessary hooks for implementing
the same functionality. <a href="#1_backref">&lt;-</a></li>
</ol>

</body>
</html>