<!DOCTYPE article PUBLIC "-//Davenport//DTD DocBook V3.0//EN">
<article>
<artheader>
<title>The Cygnus Native Interface for C++/Java Integration</title>
<subtitle>Writing native Java methods in natural C++</subtitle>
<authorgroup>
<corpauthor>Cygnus Solutions</corpauthor>
</authorgroup>
<date>March, 2000</date>
</artheader>

<abstract><para>
This document describes CNI, the Cygnus Native Interface,
which is a convenient way to write Java native methods using C++.
It is a more efficient and more convenient, but less portable,
alternative to the standard JNI (Java Native Interface).</para>
</abstract>

<sect1><title>Basic Concepts</title>
<para>
In terms of language features, Java is mostly a subset
of C++.  Java has a few important extensions, plus a powerful standard
class library, but on the whole that does not change the basic similarity.
Java is a hybrid object-oriented language, with a few native types,
in addition to class types.  It is class-based, where a class may have
static as well as per-object fields, and static as well as instance methods.
Non-static methods may be virtual, and may be overloaded.  Overloading is
resolved at compile time by matching the actual argument types against
the parameter types.  Virtual methods are implemented using indirect calls
through a dispatch table (virtual function table).  Objects are
allocated on the heap, and initialized using a constructor method.
Classes are organized in a package hierarchy.
</para>
<para>
All of the listed attributes are also true of C++, though C++ has
extra features (for example in C++ objects may be allocated not just
on the heap, but also statically or in a local stack frame).  Because
<acronym>gcj</acronym> uses the same compiler technology as
<acronym>g++</acronym> (the GNU C++ compiler), it is possible
to make the intersection of the two languages use the same
<acronym>ABI</acronym> (object representation and calling conventions).
The key idea in <acronym>CNI</acronym> is that Java objects are C++ objects,
and all Java classes are C++ classes (but not the other way around).
So the most important task in integrating Java and C++ is to
remove gratuitous incompatibilities.
</para>
<para>
You write CNI code as a regular C++ source file.  (You do have to use
a Java/CNI-aware C++ compiler, specifically a recent version of G++.)</para>
<para>
You start with:
<programlisting>
#include &lt;gcj/cni.h&gt;
</programlisting></para>

<para>
You then include header files for the various Java classes you need
to use:
<programlisting>
#include &lt;java/lang/Character.h&gt;
#include &lt;java/util/Date.h&gt;
#include &lt;java/lang/IndexOutOfBoundsException.h&gt;
</programlisting></para>

<para>
In general, <acronym>CNI</acronym> functions and macros start with the
`<literal>Jv</literal>' prefix, for example the function
`<literal>JvNewObjectArray</literal>'.  This convention is used to
avoid conflicts with other libraries.
Internal functions in <acronym>CNI</acronym> start with the prefix
`<literal>_Jv_</literal>'.  You should not call these;
if you find a need to, let us know and we will try to come up with an
alternate solution.  (This manual lists <literal>_Jv_AllocBytes</literal>
as an example;  <acronym>CNI</acronym> should instead provide
a <literal>JvAllocBytes</literal> function.)</para>
<para>
The header files for Java classes are generated automatically by <command>gcjh</command>.
</para>
</sect1>

<sect1><title>Packages</title>
<para>
The only global names in Java are class names and package names.
A <firstterm>package</firstterm> can contain zero or more classes, and
also zero or more sub-packages.
Every class belongs to either an unnamed package or a package that
has a hierarchical and globally unique name.
</para>
<para>
A Java package is mapped to a C++ <firstterm>namespace</firstterm>.
The Java class <literal>java.lang.String</literal>
is in the package <literal>java.lang</literal>, which is a sub-package
of <literal>java</literal>.  The C++ equivalent is the
class <literal>java::lang::String</literal>,
which is in the namespace <literal>java::lang</literal>,
which is in the namespace <literal>java</literal>.
</para>
<para>
Here is how you could express this:
<programlisting>
// Declare the class(es), possibly in a header file:
namespace java {
  namespace lang {
    class Object;
    class String;
    ...
  }
}

class java::lang::String : public java::lang::Object
{
  ...
};
</programlisting>
</para>
<para>
The <literal>gcjh</literal> tool automatically generates the
necessary namespace declarations.</para>

<sect2><title>Nested classes as a substitute for namespaces</title>
<para>
Complete namespace support is a fairly recent addition to g++, and
<literal>libgcj</literal> was changed to use namespaces only at the
end of February 1999.  Earlier releases used
nested classes, which are the C++ equivalent of Java inner classes.
They provide similar (though less convenient) functionality.
The old syntax is:
<programlisting>
class java {
  class lang {
    class Object;
    class String;
  };
};
</programlisting>
The obvious difference is the use of <literal>class</literal> instead
of <literal>namespace</literal>.  The more important difference is
that all the members of a nested class have to be declared inside
the parent class definition, while a namespace can be reopened in
multiple places in the source.  Namespaces are therefore more convenient,
since they correspond more closely to how Java packages are defined.
These differences affect only the declarations; the syntax for
using a nested class is the same as with namespaces:
<programlisting>
class java::lang::String : public java::lang::Object
{ ... };
</programlisting>
Note that the generated code (including name mangling)
using nested classes is the same as that using namespaces.</para>
</sect2>

<sect2><title>Leaving out package names</title>
<para>
Always typing the fully-qualified class name is verbose,
and it makes it harder to change the package that contains a class.
The Java <literal>package</literal> declaration specifies that the
following class declarations are in the named package, without having
to explicitly name the full package qualifiers.
The <literal>package</literal> declaration can be followed by zero or
more <literal>import</literal> declarations, which allow either
a single class or all the classes in a package to be named by a simple
identifier.  C++ provides something similar
with the <literal>using</literal> declaration and directive.
</para>
<para>
A Java simple-type-import declaration:
<programlisting>
import <replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable>;
</programlisting>
allows using <replaceable>TypeName</replaceable> as a shorthand for
<literal><replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable></literal>.
The C++ (more-or-less) equivalent is a <literal>using</literal>-declaration:
<programlisting>
using <replaceable>PackageName</replaceable>::<replaceable>TypeName</replaceable>;
</programlisting>
</para>
<para>
A Java import-on-demand declaration:
<programlisting>
import <replaceable>PackageName</replaceable>.*;
</programlisting>
allows using any <replaceable>TypeName</replaceable> in the package as a shorthand for
<literal><replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable></literal>.
The C++ (more-or-less) equivalent is a <literal>using</literal>-directive:
<programlisting>
using namespace <replaceable>PackageName</replaceable>;
</programlisting>
</para>
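<para>
The difference in scope between the two forms can be seen in a small
stand-alone sketch.  The namespace <literal>demo::util</literal> and the
types <literal>Date</literal> and <literal>Timer</literal> are hypothetical
stand-ins for a Java package and its classes; they are not generated by
<literal>gcjh</literal> and are not part of <literal>libgcj</literal>.
</para>

```cpp
// Hypothetical stand-in for a Java package demo.util holding two classes.
namespace demo {
  namespace util {
    struct Date  { int day; };
    struct Timer { int ticks; };
  }
}

// Counterpart of "import demo.util.Date;": only Date gets a simple name.
int day_of(int d)
{
  using demo::util::Date;
  Date date = { d };
  return date.day;
}

// Counterpart of "import demo.util.*;": every name in demo::util is visible.
int ticks_of(int t)
{
  using namespace demo::util;
  Timer timer = { t };
  return timer.ticks;
}
```

<para>
Inside <literal>day_of</literal> the name <literal>Timer</literal> would
still have to be written as <literal>demo::util::Timer</literal>, because the
<literal>using</literal>-declaration introduces only the one name.
</para>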
</sect2>
</sect1>

<sect1><title>Primitive types</title>
<para>
Java provides eight <quote>primitive</quote> types:
<literal>byte</literal>, <literal>short</literal>, <literal>int</literal>,
<literal>long</literal>, <literal>float</literal>, <literal>double</literal>,
<literal>char</literal>, and <literal>boolean</literal>.
These are the same as the following C++ <literal>typedef</literal>s
(which are defined by <literal>gcj/cni.h</literal>):
<literal>jbyte</literal>, <literal>jshort</literal>, <literal>jint</literal>,
<literal>jlong</literal>, <literal>jfloat</literal>,
<literal>jdouble</literal>,
<literal>jchar</literal>, and <literal>jboolean</literal>.
You should use the C++ typenames
(<ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase> <literal>jint</literal>),
and not the Java type names
(<ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase> <literal>int</literal>),
even if they are <quote>the same</quote>.
This is because there is no guarantee that the C++ type
<literal>int</literal> is a 32-bit type, but <literal>jint</literal>
<emphasis>is</emphasis> guaranteed to be a 32-bit type.

<informaltable frame="all" colsep="1" rowsep="0">
<tgroup cols="3">
<thead>
<row>
<entry>Java type</entry>
<entry>C/C++ typename</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>byte</entry>
<entry>jbyte</entry>
<entry>8-bit signed integer</entry>
</row>
<row>
<entry>short</entry>
<entry>jshort</entry>
<entry>16-bit signed integer</entry>
</row>
<row>
<entry>int</entry>
<entry>jint</entry>
<entry>32-bit signed integer</entry>
</row>
<row>
<entry>long</entry>
<entry>jlong</entry>
<entry>64-bit signed integer</entry>
</row>
<row>
<entry>float</entry>
<entry>jfloat</entry>
<entry>32-bit IEEE floating-point number</entry>
</row>
<row>
<entry>double</entry>
<entry>jdouble</entry>
<entry>64-bit IEEE floating-point number</entry>
</row>
<row>
<entry>char</entry>
<entry>jchar</entry>
<entry>16-bit Unicode character</entry>
</row>
<row>
<entry>boolean</entry>
<entry>jboolean</entry>
<entry>logical (Boolean) values</entry>
</row>
<row>
<entry>void</entry>
<entry>void</entry>
<entry>no value</entry>
</row>
</tbody></tgroup>
</informaltable>
</para>
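<para>
The width guarantee can be illustrated without <literal>gcj/cni.h</literal>.
The following is a minimal sketch, assuming a C++11 compiler that provides
the exact-width types of <literal>&lt;cstdint&gt;</literal>.  The
<literal>sk_</literal> typedefs and the helper <literal>bits_of</literal>
are hypothetical names used only here; in CNI code you would use the real
<literal>jint</literal> and friends instead.
</para>

```cpp
#include <climits>
#include <cstdint>

// Hypothetical stand-ins for the typedefs that gcj/cni.h provides.
// The "sk_" prefix marks them as part of this sketch, not of CNI.
typedef std::int8_t   sk_jbyte;   // Java byte:  8-bit signed integer
typedef std::int16_t  sk_jshort;  // Java short: 16-bit signed integer
typedef std::int32_t  sk_jint;    // Java int:   32-bit signed integer
typedef std::int64_t  sk_jlong;   // Java long:  64-bit signed integer
typedef std::uint16_t sk_jchar;   // Java char:  16-bit unsigned code unit

// Checked at compile time; plain int and long carry no such guarantee.
static_assert(sizeof(sk_jint) * CHAR_BIT == 32, "jint must be 32 bits wide");
static_assert(sizeof(sk_jlong) * CHAR_BIT == 64, "jlong must be 64 bits wide");

// Returns the width of T in bits.
template <class T>
int bits_of()
{
  return static_cast<int>(sizeof(T) * CHAR_BIT);
}
```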

<para>
<funcsynopsis>
<funcdef><function>JvPrimClass</function></funcdef>
<paramdef><parameter>primtype</parameter></paramdef>
</funcsynopsis>
This is a macro whose argument should be the name of a primitive
type, <ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase>
<literal>byte</literal>.
The macro expands to a pointer to the <literal>Class</literal> object
corresponding to the primitive type.
<ForeignPhrase><Abbrev>E.g.</Abbrev></ForeignPhrase>,
<literal>JvPrimClass(void)</literal>
has the same value as the Java expression
<literal>Void.TYPE</literal> (or <literal>void.class</literal>).
</para>

</sect1>

<sect1><title>Objects and Classes</title>
<sect2><title>Classes</title>
<para>
All Java classes are derived from <literal>java.lang.Object</literal>.
C++ does not have a unique <quote>root</quote> class, but we use
a C++ <literal>java::lang::Object</literal> as the C++ version
of the <literal>java.lang.Object</literal> Java class.  All
other Java classes are mapped into corresponding C++ classes
derived from <literal>java::lang::Object</literal>.</para>
<para>
Interface inheritance (the <quote><literal>implements</literal></quote>
keyword) is currently not reflected in the C++ mapping.</para>
</sect2>
<sect2><title>Object references</title>
<para>
We implement a Java object reference as a pointer to the start
of the referenced object.  It maps to a C++ pointer.
(We cannot use C++ references for Java references, since
once a C++ reference has been initialized, you cannot change it to
point to another object.)
The <literal>null</literal> Java reference maps to the <literal>NULL</literal>
C++ pointer.
</para>
<para>
Note that in some Java implementations an object reference is implemented as
a pointer to a two-word <quote>handle</quote>.  One word of the handle
points to the fields of the object, while the other points
to a method table.  Gcj does not use this extra indirection.
</para>
</sect2>
<sect2><title>Object fields</title>
<para>
Each object contains an object header, followed by the instance
fields of the class, in order.  The object header consists of
a single pointer to a dispatch or virtual function table.
(There may be extra fields <quote>in front of</quote> the object,
for example for
memory management, but this is invisible to the application, and
the reference to the object points to the dispatch table pointer.)
</para>
<para>
The fields are laid out in the same order, alignment, and size
as in C++.  Specifically, 8-bit and 16-bit native types
(<literal>byte</literal>, <literal>short</literal>, <literal>char</literal>,
and <literal>boolean</literal>) are <emphasis>not</emphasis>
widened to 32 bits.
Note that the Java VM does extend 8-bit and 16-bit types to 32 bits
when on the VM stack or temporary registers.</para>
<para>
If you include the <literal>gcjh</literal>-generated header for a
class, you can access fields of Java classes in the <quote>natural</quote>
way.  Given the following Java class:
<programlisting>
public class Int
{
  public int i;
  public Int (int i) { this.i = i; }
  public static Int zero = new Int(0);
}
</programlisting>
you can write:
<programlisting>
#include &lt;gcj/cni.h&gt;
#include &lt;Int.h&gt;
Int*
mult (Int *p, jint k)
{
  if (k == 0)
    return Int::zero;  // static member access.
  return new Int(p->i * k);
}
</programlisting>
</para>
<para>
<acronym>CNI</acronym> does not strictly enforce the Java access
specifiers, because Java permissions cannot be directly mapped
into C++ permissions.  Private Java fields and methods are mapped
to private C++ fields and methods, but other fields and methods
are mapped to public fields and methods.
</para>
</sect2>
</sect1>

<sect1><title>Arrays</title>
<para>
While in many ways Java is similar to C and C++,
it is quite different in its treatment of arrays.
C arrays are based on the idea of pointer arithmetic,
which would be incompatible with Java's security requirements.
Java arrays are true objects (array types inherit from
<literal>java.lang.Object</literal>).  An array-valued variable
is one that contains a reference (pointer) to an array object.
</para>
<para>
Referencing a Java array in C++ code is done using the
<literal>JArray</literal> template, which is defined as follows:
<programlisting>
class __JArray : public java::lang::Object
{
public:
  int length;
};

template&lt;class T&gt;
class JArray : public __JArray
{
  T data[0];
public:
  T&amp; operator[](jint i) { return data[i]; }
};
</programlisting></para>
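<para>
As defined above, <literal>operator[]</literal> does no bounds checking,
whereas Java checks every array access against the array length.  The
following stand-alone sketch shows one way such a check could be written
by hand.  <literal>SketchArray</literal> and <literal>checked_at</literal>
are hypothetical names, and <literal>std::out_of_range</literal> stands in
for <literal>java::lang::IndexOutOfBoundsException</literal>; none of
this is part of CNI.
</para>

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for JArray<T>: a length plus element storage.
// The real template stores its elements inline and is allocated by libgcj.
template <class T>
class SketchArray
{
  std::vector<T> data;
public:
  const int length;

  // n must be non-negative.
  explicit SketchArray(int n)
    : data(static_cast<std::size_t>(n)), length(n) {}

  // Like the JArray<T>::operator[] shown above: no bounds checking.
  T& operator[](int i) { return data[static_cast<std::size_t>(i)]; }
};

// Checked element access, as Java performs on every array reference.
// std::out_of_range stands in for java::lang::IndexOutOfBoundsException.
template <class T>
T& checked_at(SketchArray<T>& a, int i)
{
  if (i < 0 || i >= a.length)
    throw std::out_of_range("array index out of bounds");
  return a[i];
}
```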
<para>
<funcsynopsis> 
   <funcdef>template&lt;class T&gt;  T *<function>elements</function></funcdef>
   <paramdef>JArray&lt;T&gt; &amp;<parameter>array</parameter></paramdef>
</funcsynopsis>
   This template function can be used to get a pointer to the
   elements of the <parameter>array</parameter>.
   For instance, you can fetch a pointer
   to the integers that make up an <literal>int[]</literal> like so:
<programlisting>
extern jintArray foo;
jint *intp = elements (foo);
</programlisting>
The name of this function may change in the future.</para>
<para>
There are a number of typedefs which correspond to typedefs from JNI.
Each is the type of an array holding objects of the appropriate type:
<programlisting>
typedef __JArray *jarray;
typedef JArray&lt;jobject&gt; *jobjectArray;
typedef JArray&lt;jboolean&gt; *jbooleanArray;
typedef JArray&lt;jbyte&gt; *jbyteArray;
typedef JArray&lt;jchar&gt; *jcharArray;
typedef JArray&lt;jshort&gt; *jshortArray;
typedef JArray&lt;jint&gt; *jintArray;
typedef JArray&lt;jlong&gt; *jlongArray;
typedef JArray&lt;jfloat&gt; *jfloatArray;
typedef JArray&lt;jdouble&gt; *jdoubleArray;
</programlisting>
</para>
<para>
 You can create an array of objects using this function:
<funcsynopsis> 
   <funcdef>jobjectArray <function>JvNewObjectArray</function></funcdef>
   <paramdef>jint <parameter>length</parameter></paramdef>
   <paramdef>jclass <parameter>klass</parameter></paramdef>
   <paramdef>jobject <parameter>init</parameter></paramdef>
   </funcsynopsis>
   Here <parameter>klass</parameter> is the type of elements of the array;
   <parameter>init</parameter> is the initial
   value to be put into every slot in the array.
</para>
<para>
For each primitive type there is a function which can be used
   to create a new array holding that type.  The name of the function
   is of the form
   `<literal>JvNew&lt;<replaceable>Type</replaceable>&gt;Array</literal>',
   where `&lt;<replaceable>Type</replaceable>&gt;' is the name of
   the primitive type, with its initial letter in upper-case.  For
   instance, `<literal>JvNewBooleanArray</literal>' can be used to create
   a new array of booleans.
   Each such function follows this example:
<funcsynopsis>  
   <funcdef>jbooleanArray <function>JvNewBooleanArray</function></funcdef> 
   <paramdef>jint <parameter>length</parameter></paramdef>
</funcsynopsis>
</para>
<para>
<funcsynopsis>
   <funcdef>jsize <function>JvGetArrayLength</function></funcdef>
   <paramdef>jarray <parameter>array</parameter></paramdef> 
   </funcsynopsis>
   Returns the length of <parameter>array</parameter>.</para>
</sect1>

<sect1><title>Methods</title>

<para>
Java methods are mapped directly into C++ methods.
The header files generated by <literal>gcjh</literal>
include the appropriate method definitions.
Basically, the generated methods have the same names and
<quote>corresponding</quote> types as the Java methods,
and are called in the natural manner.</para>

<sect2><title>Overloading</title>
<para>
Both Java and C++ provide method overloading, where multiple
methods in a class have the same name, and the correct one is chosen
(at compile time) depending on the argument types.
The rules for choosing the correct method are (as expected) more complicated
in C++ than in Java, but given a set of overloaded methods
generated by <literal>gcjh</literal> the C++ compiler will choose
the expected one.</para>
<para>
Common assemblers and linkers are not aware of C++ overloading,
so the standard implementation strategy is to encode the
parameter types of a method into its assembly-level name.
This encoding is called <firstterm>mangling</firstterm>,
and the encoded name is the <firstterm>mangled name</firstterm>.
The same mechanism is used to implement Java overloading.
For C++/Java interoperability, it is important that both the Java
and C++ compilers use the <emphasis>same</emphasis> encoding scheme.
</para>
</sect2>

<sect2><title>Static methods</title>
<para>
Static Java methods are invoked in <acronym>CNI</acronym> using the standard
C++ syntax, using the `<literal>::</literal>' operator rather
than the `<literal>.</literal>' operator.  For example:
</para>
<programlisting>
jint i = java::lang::Math::round((jfloat) 2.3);
</programlisting>
<para>
A static native method is defined using standard C++ method
definition syntax.  For example:
<programlisting>
#include &lt;java/lang/Integer.h&gt;
java::lang::Integer*
java::lang::Integer::getInteger(jstring str)
{
  ...
}
</programlisting>
</para>
</sect2>

<sect2><title>Object Constructors</title>
<para>
Constructors are called implicitly as part of object allocation
using the <literal>new</literal> operator.  For example:
<programlisting> 
java::lang::Integer *x = new java::lang::Integer(234);
</programlisting> 
</para>
<para>
Java does not allow a constructor to be a native method.
Instead, define a private native method and have the
constructor call it.
</para>
</sect2>

<sect2><title>Instance methods</title>
<para>
Virtual method dispatch is handled essentially the same way
in C++ and Java: by an indirect call through a function pointer
stored in a per-class virtual function table.  C++ is more
complicated because it has to support
multiple inheritance, but this does not affect Java classes.
However, G++ has historically used a different calling convention
that is not compatible with the one used by <acronym>gcj</acronym>.
During 1999, G++ will switch to a new ABI that is compatible with
<acronym>gcj</acronym>.  Some platforms (including Linux) have already
changed.  On other platforms, you will have to pass
the <literal>-fvtable-thunks</literal> flag to g++ when
compiling <acronym>CNI</acronym> code.  Note that you must also compile
your C++ source code with <literal>-fno-rtti</literal>.
</para>
<para>
Calling a Java instance method in <acronym>CNI</acronym> is done
using the standard C++ syntax.  For example:
<programlisting>
  java::lang::Number *x;
  if (x-&gt;doubleValue() &gt; 0.0) ...
</programlisting>
</para>
<para>
Defining a Java native instance method is also done the natural way:
<programlisting>
#include &lt;java/lang/Integer.h&gt;
jdouble
java::lang::Integer::doubleValue()
{
  return (jdouble) value;
}
</programlisting>
</para>
</sect2>

<sect2><title>Interface method calls</title>
<para>
In Java you can call a method using an interface reference.
This is not yet supported in <acronym>CNI</acronym>.</para>
</sect2>
</sect1>

<sect1><title>Object allocation</title>

<para>
New Java objects are allocated using a
<firstterm>class-instance-creation-expression</firstterm>:
<programlisting>
new <replaceable>Type</replaceable> ( <replaceable>arguments</replaceable> )
</programlisting>
The same syntax is used in C++.  The main difference is that
C++ objects have to be explicitly deleted; in Java they are
automatically deleted by the garbage collector.
Using <acronym>CNI</acronym>, you can allocate a new object
using standard C++ syntax.  The C++ compiler is smart enough to
realize the class is a Java class, and hence it needs to allocate
memory from the garbage collector.  If you have overloaded
constructors, the compiler will choose the correct one
using standard C++ overload resolution rules.  For example:
<programlisting>
java::util::Hashtable *ht = new java::util::Hashtable(120);
</programlisting>
</para>
<para>
<funcsynopsis>
  <funcdef>void *<function>_Jv_AllocBytes</function></funcdef>
  <paramdef>jsize <parameter>size</parameter></paramdef>
</funcsynopsis>
   Allocate <parameter>size</parameter> bytes.  This memory is not
   scanned by the garbage collector.  However, it will be freed by
the GC if no references to it are discovered.
</para>
</sect1>

<sect1><title>Interfaces</title>
<para>
A Java class can <firstterm>implement</firstterm> zero or more
<firstterm>interfaces</firstterm>, in addition to inheriting from
a single base class. 
An interface is a collection of constants and method specifications;
it is similar to the <firstterm>signatures</firstterm> available
as a G++ extension.  An interface provides a subset of the
functionality of C++ abstract virtual base classes, but they
are currently implemented differently.
CNI does not currently provide any support for interfaces,
or calling methods from an interface pointer.
This is partly because we are planning to re-do how
interfaces are implemented in <acronym>gcj</acronym>.
</para>
</sect1>

<sect1><title>Strings</title>
<para>
<acronym>CNI</acronym> provides a number of utility functions for
working with Java <literal>String</literal> objects.
The names and interfaces are analogous to those of <acronym>JNI</acronym>.
</para>

<para>
<funcsynopsis>
  <funcdef>jstring <function>JvNewString</function></funcdef>
  <paramdef>const jchar *<parameter>chars</parameter></paramdef>
  <paramdef>jsize <parameter>len</parameter></paramdef>
  </funcsynopsis>
  Creates a new Java String object, where
  <parameter>chars</parameter> are the contents, and
  <parameter>len</parameter> is the number of characters.
</para>

<para>
<funcsynopsis>
  <funcdef>jstring <function>JvNewStringLatin1</function></funcdef>
  <paramdef>const char *<parameter>bytes</parameter></paramdef>
  <paramdef>jsize <parameter>len</parameter></paramdef>
 </funcsynopsis>
  Creates a new Java String object, where <parameter>bytes</parameter>
  are the Latin-1 encoded
  characters, and <parameter>len</parameter> is the length of
  <parameter>bytes</parameter>, in bytes.
</para>

<para>
<funcsynopsis>
  <funcdef>jstring <function>JvNewStringLatin1</function></funcdef>
  <paramdef>const char *<parameter>bytes</parameter></paramdef>
  </funcsynopsis>
  Like the first JvNewStringLatin1, but computes <parameter>len</parameter>
  using <literal>strlen</literal>.
</para>

<para>
<funcsynopsis>
  <funcdef>jstring <function>JvNewStringUTF</function></funcdef>
  <paramdef>const char *<parameter>bytes</parameter></paramdef>
  </funcsynopsis>
   Creates a new Java String object, where <parameter>bytes</parameter> are
   the UTF-8 encoded characters of the string, terminated by a null byte.
</para>

<para>
<funcsynopsis>
   <funcdef>jchar *<function>JvGetStringChars</function></funcdef>
  <paramdef>jstring <parameter>str</parameter></paramdef>
  </funcsynopsis>
   Returns a pointer to the array of characters which make up a string.
</para>

<para>
<funcsynopsis>
   <funcdef> int <function>JvGetStringUTFLength</function></funcdef>
  <paramdef>jstring <parameter>str</parameter></paramdef>
  </funcsynopsis>
   Returns the number of bytes required to encode the contents
   of <parameter>str</parameter> as UTF-8.
</para>

<para>
<funcsynopsis>
  <funcdef> jsize <function>JvGetStringUTFRegion</function></funcdef>
  <paramdef>jstring <parameter>str</parameter></paramdef>
  <paramdef>jsize <parameter>start</parameter></paramdef>
  <paramdef>jsize <parameter>len</parameter></paramdef>
  <paramdef>char *<parameter>buf</parameter></paramdef>
  </funcsynopsis>
  This puts the UTF-8 encoding of a region of the
  string <parameter>str</parameter> into
  the buffer <parameter>buf</parameter>.
  The region of the string to fetch is specified by
  <parameter>start</parameter> and <parameter>len</parameter>.
   It is assumed that <parameter>buf</parameter> is big enough
   to hold the result.  Note
   that <parameter>buf</parameter> is <emphasis>not</emphasis> null-terminated.
</para>
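<para>
Because the buffer is not null-terminated, the caller has to reserve one
extra byte and write the terminator before treating the result as a C
string.  The following stand-alone sketch shows that pattern.
<literal>copy_region</literal> is a hypothetical stand-in for
<literal>JvGetStringUTFRegion</literal>: it works on the bytes of a
<literal>std::string</literal> rather than on a <literal>jstring</literal>,
and its return value (the number of bytes written) is an assumption of
this sketch, not a documented property of the CNI function.
</para>

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for JvGetStringUTFRegion: copies len bytes of src,
// starting at start, into buf and, like the CNI function, writes no '\0'.
// Requires start + len <= src.size().  Returns the number of bytes written
// (an assumption of this sketch).
std::size_t copy_region(const std::string& src, std::size_t start,
                        std::size_t len, char *buf)
{
  std::memcpy(buf, src.data() + start, len);
  return len;
}

// Caller's side: reserve one extra byte and terminate the buffer by hand.
std::string region_as_c_string(const std::string& src,
                               std::size_t start, std::size_t len)
{
  std::vector<char> buf(len + 1);
  std::size_t written = copy_region(src, start, len, &buf[0]);
  buf[written] = '\0';           // the copy itself does not do this
  return std::string(&buf[0]);   // safe only because of the terminator
}
```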
</sect1>

<sect1><title>Class Initialization</title>
<para>
Java requires that each class be automatically initialized at the time 
of the first active use.  Initializing a class involves 
initializing the static fields, running code in class initializer 
methods, and initializing base classes.  There may also be 
some implementation specific actions, such as allocating 
<classname>String</classname> objects corresponding to string literals in
the code.</para>
<para>
The Gcj compiler inserts calls to <literal>JvInitClass</literal> (actually
<literal>_Jv_InitClass</literal>) at appropriate places to ensure that a
class is initialized when required.  The C++ compiler does not
insert these calls automatically - it is the programmer's
responsibility to make sure classes are initialized.  However,
this is fairly painless because of the conventions assumed by the Java
system.</para>
<para>
First, <literal>libgcj</literal> will make sure a class is initialized
before an instance of that class is created.  This is one
of the responsibilities of the <literal>new</literal> operation.  This is
taken care of both in Java code, and in C++ code.  (When the G++
compiler sees a <literal>new</literal> of a Java class, it will call
a routine in <literal>libgcj</literal> to allocate the object, and that
routine will take care of initializing the class.)  It follows that you can
access an instance field, or call an instance (non-static)
method and be safe in the knowledge that the class and all
of its base classes have been initialized.</para>
<para>
Invoking a static method is also safe.  This is because the
Java compiler adds code to the start of a static method to make sure
the class is initialized.  However, the C++ compiler does not
add this extra code.  Hence, if you write a native static method
using CNI, you are responsible for calling <literal>JvInitClass</literal>
before doing anything else in the method (unless you are sure
it is safe to leave it out).</para>
<para>
Accessing a static field also requires the class of the
field to be initialized.  The Java compiler will generate code
to call <literal>_Jv_InitClass</literal> before getting or setting the field.
However, the C++ compiler will not generate this extra code,
so it is your responsibility to make sure the class is
initialized before you access a static field.</para>
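<para>
The rule for native static methods can be modelled in plain C++.  In the
following sketch, <literal>CounterClass</literal> and its members are
hypothetical, and <literal>init_class</literal> only plays the role of
<literal>JvInitClass</literal>; the real function takes a class object and
is provided by <literal>libgcj</literal>.  The sketch is single-threaded
and makes no attempt to model the locking that real class initialization
needs.
</para>

```cpp
// Hypothetical model of a Java class with a static field and an initializer.
struct CounterClass
{
  static bool initialized;  // set once the "class initializer" has run
  static int  count;        // static field given its value by the initializer

  // Stand-in for JvInitClass: runs the initializer on first use only.
  static void init_class()
  {
    if (!initialized)
      {
        count = 100;
        initialized = true;
      }
  }

  // Models a native static method: initialize first, then touch statics.
  static int next()
  {
    init_class();
    return ++count;
  }
};

bool CounterClass::initialized = false;
int  CounterClass::count = 0;
```

<para>
Without the call to <literal>init_class</literal> at the top of
<literal>next</literal>, the first call would see <literal>count</literal>
before the initializer had given it its starting value.
</para>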
</sect1>
<sect1><title>Exception Handling</title>
<para>
While C++ and Java share a common exception handling framework,
things are not yet perfectly integrated.  The main issue is that the
<quote>run-time type information</quote> facilities of the two
languages are not integrated.</para>
<para>
Still, things work fairly well.  You can throw a Java exception from
C++ using the ordinary <literal>throw</literal> construct, and this
exception can be caught by Java code.  Similarly, you can catch an
exception thrown from Java using the C++ <literal>catch</literal>
construct.
</para>
<para>
Note that currently you cannot mix C++ catches and Java catches in
a single C++ translation unit.  We do intend to fix this eventually.
</para>
<para>
Here is an example:
<programlisting>
if (i >= count)
   throw new java::lang::IndexOutOfBoundsException();
</programlisting>
</para>
<para>
Normally, GNU C++ will automatically detect when you are writing C++
code that uses Java exceptions, and handle them appropriately.
However, if C++ code only needs to execute destructors when Java
exceptions are thrown through it, GCC will guess incorrectly.  Sample
problematic code:
<programlisting>
  struct S { ~S(); };
  extern void bar();    // is implemented in Java and may throw exceptions
  void foo()
  {
    S s;
    bar();
  }
</programlisting>
The usual effect of an incorrect guess is a link failure, complaining of
a missing routine called <literal>__gxx_personality_v0</literal>.
</para>
<para>
You can inform the compiler that Java exceptions are to be used in a
translation unit, irrespective of what it might think, by writing
<literal>#pragma GCC java_exceptions</literal> at the head of the
file.  This <literal>#pragma</literal> must appear before any
functions that throw or catch exceptions, or run destructors when
exceptions are thrown through them.</para>
</sect1>

<sect1><title>Synchronization</title>
<para>
Each Java object has an implicit monitor.
The Java VM uses the instruction <literal>monitorenter</literal> to acquire
and lock a monitor, and <literal>monitorexit</literal> to release it.
The JNI has corresponding methods <literal>MonitorEnter</literal>
and <literal>MonitorExit</literal>.  The corresponding CNI macros
are <literal>JvMonitorEnter</literal> and <literal>JvMonitorExit</literal>.
</para>
<para>
The Java source language does not provide direct access to these primitives.
Instead, there is a <literal>synchronized</literal> statement that does an
implicit <literal>monitorenter</literal> before entry to the block,
and does a <literal>monitorexit</literal> on exit from the block.
Note that the lock has to be released even if the block is abnormally
terminated by an exception, which means there is an implicit
<literal>try</literal>-<literal>finally</literal>.
</para>
<para>
From C++, it makes sense to use a destructor to release a lock.
CNI defines the following utility class.
<programlisting>
class JvSynchronize {
  jobject obj;
public:
  JvSynchronize(jobject o) { obj = o; JvMonitorEnter(o); }
  ~JvSynchronize() { JvMonitorExit(obj); }
};
</programlisting>
The equivalent of Java's:
<programlisting>
synchronized (OBJ) { CODE; }
</programlisting>
can be simply expressed:
<programlisting>
{ JvSynchronize dummy(OBJ); CODE; }
</programlisting>
</para>
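<para>
The same idiom can be tried without <literal>libgcj</literal>.  The
following is a minimal sketch, assuming a C++11 compiler.
<literal>Monitor</literal>, <literal>ScopedMonitor</literal>, and
<literal>guarded_increment</literal> are hypothetical names;
<literal>Monitor</literal> wraps a <literal>std::recursive_mutex</literal>
as a stand-in for a Java object's monitor and is not how
<literal>JvMonitorEnter</literal> is implemented.
</para>

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for a Java object's monitor.  Java monitors are
// re-entrant, so a recursive mutex is used; depth counts nested entries
// so that the effect of the guard can be observed.
class Monitor
{
  std::recursive_mutex mutex_;
  int depth;
public:
  Monitor() : depth(0) {}
  void enter() { mutex_.lock(); ++depth; }    // role of JvMonitorEnter
  void leave() { --depth; mutex_.unlock(); }  // role of JvMonitorExit
  // Only meaningful when called by the owner or while no thread holds it.
  int held() const { return depth; }
};

// Same shape as JvSynchronize: enter in the constructor, leave in the
// destructor, so the monitor is released however the block is left.
class ScopedMonitor
{
  Monitor& mon;
public:
  explicit ScopedMonitor(Monitor& m) : mon(m) { mon.enter(); }
  ~ScopedMonitor() { mon.leave(); }
  ScopedMonitor(const ScopedMonitor&) = delete;
  ScopedMonitor& operator=(const ScopedMonitor&) = delete;
};

// Counterpart of "synchronized (mon) { ... }" around code that may throw.
void guarded_increment(Monitor& mon, int& counter, bool fail)
{
  ScopedMonitor dummy(mon);
  ++counter;
  if (fail)
    throw std::runtime_error("leaving the block abnormally");
}
```

<para>
When <literal>guarded_increment</literal> throws, stack unwinding runs the
destructor of <literal>dummy</literal>, which is what takes the place of
Java's implicit <literal>try</literal>-<literal>finally</literal>.
</para>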
<para>
Java also has methods with the <literal>synchronized</literal> attribute.
This is equivalent to wrapping the entire method body in a
<literal>synchronized</literal> statement.
(Alternatively, an implementation could require the caller to do
the synchronization.  This is not practical for a compiler, because
each virtual method call would have to test at run-time if
synchronization is needed.)  Since in <literal>gcj</literal>
the <literal>synchronized</literal> attribute is handled by the
method implementation, it is up to the programmer
of a synchronized native method to handle the synchronization
(in the C++ implementation of the method).
In other words, you need to manually add <literal>JvSynchronize</literal>
in a <literal>native synchronized</literal> method.</para>
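<para>
As an illustration, a synchronized native method might be written along
the following lines.  This is only a sketch: the
<literal>Counter</literal> class and its members are hypothetical, and
<literal>Object</literal>, <literal>jobject</literal>, and the monitor
functions are simplified stand-ins so that the fragment compiles without
gcj.  Real code would use the gcj headers and a header generated from
the Java class instead.</para>
<programlisting><![CDATA[
#include <cassert>

// ASSUMPTION: simplified stand-ins for the CNI runtime.
struct Object { int lock_depth = 0; };
typedef Object *jobject;

inline void JvMonitorEnter(jobject o) { ++o->lock_depth; }
inline void JvMonitorExit(jobject o) { --o->lock_depth; }

class JvSynchronize {
  jobject obj;
public:
  explicit JvSynchronize(jobject o) : obj(o) { JvMonitorEnter(obj); }
  ~JvSynchronize() { JvMonitorExit(obj); }
};

// Hypothetical Java declaration being implemented:
//   class Counter { int count; native synchronized void increment(); }
struct Counter : Object {
  int count = 0;
  int depth_seen = 0;  // illustration only: lock depth seen inside the body
  void increment();
};

void Counter::increment() {
  // The "synchronized" attribute is not applied for us in native code,
  // so the method body takes the monitor on "this" explicitly.
  JvSynchronize dummy(this);
  depth_seen = lock_depth;
  ++count;
}
]]></programlisting>
<para>
The guard is declared first in the method body so that the monitor is
held for the entire body, matching what a Java
<literal>synchronized</literal> method does.</para>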
</sect1>

<sect1><title>Reflection</title>
<para>The types <literal>jfieldID</literal> and <literal>jmethodID</literal>
are as in JNI.</para>
<para>
The functions <literal>JvFromReflectedField</literal>,
<literal>JvFromReflectedMethod</literal>,
<literal>JvToReflectedField</literal>, and
<literal>JvToReflectedMethod</literal> (as in Java 2 JNI)
will be added shortly, as will other functions corresponding to JNI.</para>
</sect1>

<sect1><title>Using gcjh</title>
<para>
      The <command>gcjh</command> program is used to generate C++ header files from
      Java class files.  By default, <command>gcjh</command> generates
      a relatively straightforward C++ header file.  However, there
      are a few caveats to its use, and a few options which can be
      used to change how it operates:
</para>
<variablelist>
<varlistentry>
<term><literal>--classpath</literal> <replaceable>path</replaceable></term>
<term><literal>--CLASSPATH</literal> <replaceable>path</replaceable></term>
<term><literal>-I</literal> <replaceable>dir</replaceable></term>
<listitem><para>
        These options can be used to set the class path for gcjh.
        Gcjh searches the class path the same way the compiler does;
	these options have their familiar meanings.</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>-d <replaceable>directory</replaceable></literal></term>
<listitem><para>
Puts the generated <literal>.h</literal> files
beneath <replaceable>directory</replaceable>.</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>-o <replaceable>file</replaceable></literal></term>
<listitem><para>
        Sets the name of the <literal>.h</literal> file to be generated.
        By default the <literal>.h</literal> file is named after the class.
        This option only really makes sense if just a single class file
        is specified.</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>--verbose</literal></term>
<listitem><para>
        gcjh will print information to stderr as it works.</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>-M</literal></term>
<term><literal>-MM</literal></term>
<term><literal>-MD</literal></term>
<term><literal>-MMD</literal></term>
<listitem><para>
        These options can be used to generate dependency information
        for the generated header file.  They work the same way as the
        corresponding compiler options.</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>-prepend <replaceable>text</replaceable></literal></term>
<listitem><para>
This causes the <replaceable>text</replaceable> to be put into the generated
        header just after class declarations (but before declaration
        of the current class).  This option should be used with caution.</para>
</listitem>
</varlistentry>

<varlistentry> 
<term><literal>-friend <replaceable>text</replaceable></literal></term>
<listitem><para>
This causes the <replaceable>text</replaceable> to be put into the class
declaration after a <literal>friend</literal> keyword.
This can be used to declare some
        other class or function to be a friend of this class.
        This option should be used with caution.</para>
</listitem>
</varlistentry>

<varlistentry>  
<term><literal>-add <replaceable>text</replaceable></literal></term>
<listitem><para>
The <replaceable>text</replaceable> is inserted into the class declaration.
This option should be used with caution.</para>
</listitem>
</varlistentry>

<varlistentry> 
<term><literal>-append <replaceable>text</replaceable></literal></term>
<listitem><para>
The <replaceable>text</replaceable> is inserted into the header file
after the class declaration.  One use for this is to generate
inline functions.  This option should be used with caution.</para>
</listitem>
</varlistentry>
</variablelist>
<para>
All remaining arguments that do not begin with a <literal>-</literal> are treated
as the names of classes for which headers should be generated.</para>
<para>
gcjh will generate all the required namespace declarations and
<literal>#include</literal>'s for the header file.
In some situations, gcjh will generate simple inline member
functions.  Note that, while gcjh puts <literal>#pragma
interface</literal> in the generated header file, you should
<emphasis>not</emphasis> put <literal>#pragma implementation</literal>
into your C++ source file.  If you do, duplicate definitions of
inline functions will sometimes be created, leading to link-time
errors.
</para>
<para>
There are a few cases where gcjh will fail to work properly:</para>
<para>
gcjh assumes that all the methods and fields of a class have ASCII
names.  The C++ compiler cannot correctly handle non-ASCII
identifiers.  gcjh does not currently diagnose this problem.</para>
<para>
gcjh also cannot fully handle classes where a field and a method have
the same name.  If the field is static, an error will result.
Otherwise, the field will be renamed in the generated header;
<literal>__</literal> will be appended to the field name.</para>
<para>
Eventually we hope to change the C++ compiler so that these
restrictions can be lifted.</para>
</sect1>

</article>