rld.3 [plain text]

.TH RLD 3 "August 3, 2000" "Apple Computer, Inc."
.SH NAME
rld_load, rld_load_from_memory, rld_unload, rld_lookup, rld_forget_symbol, rld_unload_all, rld_load_basefile, rld_address_func, rld_write_symfile \- programmatically link edit and load object files
.SH SYNOPSIS
.nf
.PP
#include <rld.h>
extern long rld_load(
	NXStream *stream,
	struct mach_header **header_addr,
	const char * const *object_filenames,
	const char *output_filename);
.sp .5
extern long rld_load_from_memory(
	NXStream *stream,
	struct mach_header **header_addr,
	const char *object_name,
	char *object_addr,
	long *object_size,
	const char *output_filename);
.sp .5
extern long rld_unload(
	NXStream *stream);
.sp .5
extern long rld_lookup(
	NXStream *stream,
	const char *symbol_name,
	unsigned long *value);
.sp .5
extern long rld_forget_symbol(
	NXStream *stream,
	const char *symbol_name);
.sp .5
extern long rld_unload_all(
	NXStream *stream,
	long deallocate_sets);
.sp .5
extern long rld_load_basefile(
	NXStream *stream,
	const char *base_filename);
.sp.5
extern void rld_address_func(
	unsigned long (*func)(unsigned long size, unsigned long headers_size));
.sp .5
extern long rld_write_symfile(
	NXStream *stream,
	const char *output_filename);
.fi
.SH DESCRIPTION
.I rld_load
link edits and loads the files specified by the NULL-terminated array
.I object_filenames
into the program that called it.  Alternatively, it can be used to load into another program.
An object name can be an archive, in which case only those members defining 
undefined symbols will be loaded.
.PP
If the program is to allow the loaded
object files to use symbols from itself, it must be built with the
.B \-seglinkedit
option of the link editor,
.IR ld (1),
in order to have its symbol table mapped into memory.
.PP
The symbol table may be trimmed to limit which symbols are allowed to be
referenced by loaded objects.  This can be accomplished with the
.B "\-s filename"
option to
.IR strip (1).
For the routines described here, only global symbols are used, so local
symbols can be removed with the
.B \-x
option to
.IR ld (1)
or
.IR strip (1).
Doing so saves space in the final program and vastly decreases the time
spent by the first call to
.IR rld_load .
(This is true of the first call in the program, as well as the first call after an invocation of
.IR rld_unload_all).
The first call to
.I rld_load
must go through all the symbols of the program, so if the program has been
compiled for debugging (for example), it can take orders of magnitude longer.
.PP
Since the objects loaded with
.I rld_load 
can only use symbols that appear in the executable program,
if the program uses a library and wants to make all the symbols in that
library available to the loaded objects, it must force all of the library
symbols into the executable.
This can be done for all libraries with the
.B \-all_load
option to
.IR ld (1)
when building the executable.  For libraries that aren't shared, however,
this will copy all the library code into the executable.
Thus this approach may be desirable only for the shared libraries, where the library code isn't copied into the executable.
This can be done easily for shared libraries by having the
executable program reference the shared library reference symbol for each of
the shared libraries it uses.  The shared library reference symbol is named
after the base name of the target library (its name up to the first '.').  
For example, the target
shared library 
.I /usr/shlib/libsys_s.B.shlib 
has the shared library reference symbol
.I libsys_s.
This name intentionally doesn't start with an underscore, '_', 
in order to be out of the normal name space for C external symbols.  
So to reference the symbol in this example, 
.IR ld (1)
would be invoked with the flag
.BI \-u " libsys_s"
when linking the program, and this flag would come before the library on
the link edit command line.
If new routines and data are added to the target shared library,
they will be unavailable to objects loaded with
.I rld_load
until the program is relinked against the host shared library that matches the
target shared library.  This approach is necessary to avoid link editing and
loading errors that would otherwise be very hard to detect.
.PP
The set of object files being
loaded will only be successful if there are no link edit errors (undefined
symbols, etc.).  If an error occurs, the set of object files is unloaded
automatically.  If errors occur, and the value specified for
.I stream
isn't NULL, error messages are printed on that stream.  
If the link editing and loading is successful,
the address of the header of what was loaded is returned
through the pointer
.I header_addr
(if it isn't NULL).
If
.I rld_load
is successful and the parameter
.I output_filename
isn't NULL, an object file is written to that filename.
This file can be used with the
.IR gdb (1)
.I add-file
command to debug the code in the dynamically loaded set of objects.
When the program being debugged is using rld to load into itself,
the debugger
knows how to get the symbols automatically in most cases, and the
.I output_filename
parameter isn't needed.  However, if some other program is
using rld to load into the program being
debugged, or if the specified objects
don't contain full paths, the debugger can't do this automatically.
The 
.I rld_load
function returns 1 for success and 0 for failure.  If a fatal system error 
(out of memory, etc.) occurs, all future calls to 
.I rld_load 
and the other routines described here will fail.
.PP
.I rld_load_from_memory()
is similar to
.IR rld_load() ,
but works on memory rather than a file.  The argument 
.I object_name 
is the name associated with the memory and is used for messages.
(It must not be NULL.) The
arguments 
.I object_addr 
and 
.I object_size 
are the memory address and size of the object file.  
.I rld_load_from_memory()
only allows one thin object file (not an archive or ``fat'' file) to be
loaded.
.PP
.I rld_unload()
unlinks and unloads the last set of objects that were loaded.
It returns 1 if it is successful and 0 otherwise.  If any errors occur
and
.I stream
isn't zero, the error messages are printed
on that stream.  It's the caller's responsibility not to have any pointers
into the data areas of the object set, as well as to deallocate
any memory which that set of objects may have allocated.
.PP
.I rld_lookup()
looks up the specified symbol name and returns its value indirectly through the pointer
.I value.
It returns 1 if it finds the symbol, and 0 otherwise.  If any errors occur and
.I stream
isn't zero, the error messages are printed on
that stream.   (For this routine, only internal errors can result.)
.PP
.I rld_forget_symbol()
causes this package to forget the existence of the specified symbol name.
This allows a new object to be loaded that defines this symbol.  All objects
loaded before this call will continue to use the value of the symbol in effect
at the time the object was loaded.
It returns 1 if it finds the symbol and 0 otherwise.  If any errors occur and
.I stream
isn't zero, the error messages are printed on
that stream.  (For this routine, only internal errors can result.)
.PP
.I rld_unload_all()
clears out all allocated data structures used by these routines.  If the
parameter
.I deallocate_sets
is non-zero, the function also unloads all object sets that were loaded.  
If
.I deallocate_sets
is zero the object sets aren't unloaded, and the program can continue to use
the code and data loaded.  However, further calls to the routines 
described here will no longer know
about the symbols in those sets.  If object sets aren't to be allowed access
to each other's symbols, an
.I rld_unload_all
call between calls to
.I rld_load
allows the objects sets to be loaded without fear of global symbol
names' clashing.
.I rld_unload_all
returns 1 if it is successful and 0 otherwise.  If any errors occur
and 
.I stream
isn't zero, the error messages are printed on that stream.
.PP
The second argument to
.IR rld_load_basefile
specifies a base file, whose symbol table is taken as the
basis for subsequent
.I rld_load's.
The base file may be a ``fat'' file, and
must contain an architecture that would execute on the host; 
otherwise, it is an error.  
If the file is a fat file, the ``best'' architecture (as defined by
what the kernel 
.IR exec (2) 
would select) is used as the base file.
.I rld_load_basefile
must be invoked before any call to 
.I rld_load.
Alternatively, it can be called after
.IR rld_unload_all ,
which unloads the base file.  This call is intended to be used when a program
is dynamically loading object sets into a program other than itself, where 
.I base_filename
contains the symbol table of the target program.  The routine
.IR rld_address_func ,
described next, would also be used.
.PP
.I rld_address_func
is passed a pointer to a function,
.IR func ,
that will be called from
.IR rld_load .
The parameter values that
.I rld_load
will supply to
.I func
are the size of the memory required for the object set being loaded,
and the size of the headers (which are also included in the
calculation of
.IR size ).
The function
specified by
.I func
should return the address where the output is to be link edited.  
.I rld_address_func
is
intended to be used when a program is dynamically loading object sets into a
program other than itself; the function allows it to pick the place in the
address space of the target program.
.PP
.I rld_write_symfile()
writes an object file containing only absolute symbols that mirrors that last
object set loaded.  This file can later be used to recreate the loaded state
after the sets have been unloaded.  It returns 1 for success and 0 for failure.

.SH "FAT FILE SUPPORT"
All functions that accept object files or archives also accept ``fat'' files,
except for the restrictions noted above for
.I rld_load_from_memory
and
.IR rld_load_basefile .

.SH "CPU SUBTYPE HANDLING"
For compatibility (and due to existing bugs in the the CPU subtype handling),
the
.IR rld (3)
package will function as if the
.IR ld (1)
.B \-force_cpusubtype_ALL 
option were specified.
As of the NEXTSTEP 3.0 release, the
.IR rld (3)
package doesn't do the same checking as
.IR exec (2)
does with regard to the handling of the CPU subtype.  In the case of the
m68k architecture,
if an object file with the 68040 CPU subtype 
is used with the
.IR rld (3)
package on a 68030 machine, no error is generated.  This ``bug'' has the same effect as the
.IR ld (1)
.B \-force_cpusubtype_ALL
option, but it lacks the error detection supported by
.IR exec (2).
Since adding the error detection could break existing uses of
.IR rld (3),
the
.IR rld (3)
package functions as though the
.B \-force_cpusubtype_ALL
option to 
.IR ld (1)
were specified.

.SH "SEE ALSO"
ld(1), strip(1), gdb(1)

.SH BUGS
There exists one semantic link edit problem with respect to common symbols.
If a set of object files are loaded that have common symbols left after the
symbols have been merged,
.I rld_load
has to allocate storage for these symbols
for the code to run without error.  The problem occurs if, on a later call to
.IR rld_load ,
one of the common symbols that 
.I rld_load
allocated appears in an object
file as a defining symbol (not a common or undefined symbol).  In this case,
.I rld_load
will report the symbol as being multiply defined.  However, if this combination
of object files were statically linked, no error would occur.