ex_comm.html   [plain text]


<!--$Id: ex_comm.so,v 1.9 2006/08/24 17:59:56 bostic Exp $-->
<!--Copyright (c) 1997,2008 Oracle.  All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<html>
<head>
<title>Berkeley DB Reference Guide: Ex_rep_base: a TCP/IP based communication infrastructure</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++">
</head>
<body bgcolor=white>
<table width="100%"><tr valign=top>
<td><b><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Replication</dl></b></td>
<td align=right><a href="../rep/ex.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../rep/ex_rq.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<p align=center><b>Ex_rep_base: a TCP/IP based communication infrastructure</b></p>
<p>
Applications which use the Base replication API must implement a
communication infrastructure.  The communication infrastructure
consists of three parts: a way to map environment IDs to particular
sites, the functions to get and receive messages, and the application
architecture that supports the particular communication infrastructure
used (for example, individual threads per communicating site, a shared
message handler for all sites, a hybrid solution).  The communication
infrastructure for ex_rep_base is implemented in the file
<b>ex_rep/base/rep_net.c</b>, and each part of that infrastructure
is described as follows.</p>
<p>Ex_rep_base maintains a table of environment ID to TCP/IP port
mappings.  A pointer to this table is stored in a structure pointed to
by the app_private field of the <a href="../../api_c/env_class.html">DB_ENV</a> object so it can be
accessed by any function that has the database environment handle.
The table is represented by a machtab_t structure which contains a
reference to a linked list of member_t's, both of which are defined in
<b>ex_rep/base/rep_net.c</b>.  Each member_t contains the host and
port identification, the environment ID, and a file descriptor.</p>
<p>This design is particular to this application and communication
infrastructure, but provides an indication of the sort of functionality
that is needed to maintain the application-specific state for a
TCP/IP-based infrastructure.  The goal of the table and its interfaces
is threefold: First, it must guarantee that given an environment ID,
the send function can send a message to the appropriate place.  Second,
when given the special environment ID <a href="../../api_c/rep_transport.html#DB_EID_BROADCAST">DB_EID_BROADCAST</a>, the send
function can send messages to all the machines in the group.  Third,
upon receipt of an incoming message, the receive function can correctly
identify the sender and pass the appropriate environment ID to the
<a href="../../api_c/rep_message.html">DB_ENV-&gt;rep_process_message</a> method.</p>
<p>Mapping a particular environment ID to a specific port is accomplished
by looping through the linked list until the desired environment ID is
found.  Broadcast communication is implemented by looping through the
linked list and sending to each member found.  Since each port
communicates with only a single other environment, receipt of a message
on a particular port precisely identifies the sender.</p>
<p>This is implemented in the quote_send, quote_send_broadcast and
quote_send_one functions, which can be found in
<b>ex_rep/base/rep_net.c</b>.</p>
<p>The example provided is merely one way to satisfy these requirements,
and there are alternative implementations as well.  For instance,
instead of associating separate socket connections with each remote
environment, an application might instead label each message with a
sender identifier; instead of looping through a table and sending a
copy of a message to each member of the replication group, the
application could send a single message using a broadcast protocol.</p>
<p>The quote_send function is passed as the callback to
<a href="../../api_c/rep_transport.html">DB_ENV-&gt;rep_set_transport</a>; Berkeley DB automatically sends messages as needed
for replication.  The receive function is a mirror to the quote_send_one
function.  It is not a callback function (the application is responsible
for collecting messages and calling <a href="../../api_c/rep_message.html">DB_ENV-&gt;rep_process_message</a> on them as is
convenient).  In the sample application, all messages transmitted are
Berkeley DB messages that get handled by <a href="../../api_c/rep_message.html">DB_ENV-&gt;rep_process_message</a>, however, this
is not always going to be the case.  The application may want to pass
its own messages across the same channels, distinguish between its own
messages and those of Berkeley DB, and then pass only the Berkeley DB ones to
<a href="../../api_c/rep_message.html">DB_ENV-&gt;rep_process_message</a>.</p>
<p>The final component of the communication infrastructure is the process
model used to communicate with all the sites in the replication group.
Each site creates a thread of control that listens on its designated
socket (as specified by the <b>-m</b> command line argument) and
then creates a new channel for each site that contacts it.  In addition,
each site explicitly connects to the sites specified in the
<b>-o</b> command line argument.  This is a fairly standard TCP/IP
process architecture and is implemented by the following functions (all
in <b>ex_rep/base/rep_net.c</b>).</p>
<table width="100%"><tr><td><br></td><td align=right><a href="../rep/ex.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../rep/ex_rq.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<p><font size=1>Copyright (c) 1996,2008 Oracle.  All rights reserved.</font>
</body>
</html>