'\"
'\" Copyright (c) 1994 The Regents of the University of California.
'\" Copyright (c) 1994-1996 Sun Microsystems, Inc.
'\"
'\" See the file "license.terms" for information on usage and redistribution
'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
'\" 
'\" RCS: @(#) $Id: RegExp.3,v 1.2 2001/09/14 01:42:14 zlaski Exp $
'\" 
.so man.macros
.TH Tcl_RegExpMatch 3 7.4 Tcl "Tcl Library Procedures"
.BS
.SH NAME
Tcl_RegExpMatch, Tcl_RegExpCompile, Tcl_RegExpExec, Tcl_RegExpRange \- Pattern matching with regular expressions
.SH SYNOPSIS
.nf
\fB#include <tcl.h>\fR
.sp
int
\fBTcl_RegExpMatch\fR(\fIinterp\fR, \fIstring\fR, \fIpattern\fR)
.sp
Tcl_RegExp
\fBTcl_RegExpCompile\fR(\fIinterp\fR, \fIpattern\fR)
.sp
int
\fBTcl_RegExpExec\fR(\fIinterp\fR, \fIregexp\fR, \fIstring\fR, \fIstart\fR)
.sp
void
\fBTcl_RegExpRange\fR(\fIregexp\fR, \fIindex\fR, \fIstartPtr\fR, \fIendPtr\fR)
.SH ARGUMENTS
.AS Tcl_Interp *interp
.AP Tcl_Interp *interp in
Tcl interpreter to use for error reporting.
.AP char *string in
String to check for a match with a regular expression.
.AP char *pattern in
String in the form of a regular expression pattern.
.AP Tcl_RegExp regexp in
Compiled regular expression.  Must have been returned previously
by \fBTcl_RegExpCompile\fR.
.AP char *start in
If \fIstring\fR is just a portion of some other string, this argument
identifies the beginning of the larger string.
If it isn't the same as \fIstring\fR, then no \fB^\fR matches
will be allowed.
.AP int index in
Specifies which range is desired:  0 means the range of the entire
match, 1 or greater means the range that matched a parenthesized
sub-expression.
.AP char **startPtr out
The address of the first character in the range is stored here, or
NULL if there is no such range.
.AP char **endPtr out
The address of the character just after the last one in the range
is stored here, or NULL if there is no such range.
.BE

.SH DESCRIPTION
.PP
\fBTcl_RegExpMatch\fR determines whether its \fIstring\fR argument
matches \fIpattern\fR, where \fIpattern\fR is interpreted
as a regular expression using the same rules as for the
\fBregexp\fR Tcl command.
If there is a match then \fBTcl_RegExpMatch\fR returns 1.
If there is no match then \fBTcl_RegExpMatch\fR returns 0.
If an error occurs in the matching process (e.g. \fIpattern\fR
is not a valid regular expression) then \fBTcl_RegExpMatch\fR
returns \-1 and leaves an error message in \fIinterp->result\fR.
.PP
\fBTcl_RegExpCompile\fR, \fBTcl_RegExpExec\fR, and \fBTcl_RegExpRange\fR
provide lower-level access to the regular expression pattern matcher.
\fBTcl_RegExpCompile\fR compiles a regular expression string into
the internal form used for efficient pattern matching.
The return value is a token for this compiled form, which can be
used in subsequent calls to \fBTcl_RegExpExec\fR or \fBTcl_RegExpRange\fR.
If an error occurs while compiling the regular expression then
\fBTcl_RegExpCompile\fR returns NULL and leaves an error message
in \fIinterp->result\fR.
Note:  the return value from \fBTcl_RegExpCompile\fR is only valid
up to the next call to \fBTcl_RegExpCompile\fR;  it is not safe to
retain these values for long periods of time.
.PP
\fBTcl_RegExpExec\fR executes the regular expression pattern matcher.
It returns 1 if \fIstring\fR contains a range of characters that
match \fIregexp\fR, 0 if no match is found, and
\-1 if an error occurs.
In the case of an error, \fBTcl_RegExpExec\fR leaves an error
message in \fIinterp->result\fR.
When searching a string for multiple matches of a pattern,
it is important to distinguish between the start of the original
string and the start of the current search.
For example, when searching for the second occurrence of a
match, the \fIstring\fR argument might point to the character
just after the first match;  however, it is important for the
pattern matcher to know that this is not the start of the entire string,
so that it doesn't allow \fB^\fR atoms in the pattern to match.
The \fIstart\fR argument provides this information by pointing
to the start of the overall string containing \fIstring\fR.
\fIStart\fR will be less than or equal to \fIstring\fR;  if it
is less than \fIstring\fR then no \fB^\fR matches will be allowed.
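.PP
The same distinction arises in other C regular expression interfaces,
and a self-contained sketch may make it concrete.
The example below is an analogy only: it uses POSIX \fB<regex.h>\fR
rather than the Tcl library, and \fBcount_matches\fR is a hypothetical
helper name.
POSIX \fBregexec\fR has no \fIstart\fR argument; passing
\fBREG_NOTBOL\fR when a search is resumed plays roughly the role that a
\fIstart\fR less than \fIstring\fR plays for \fBTcl_RegExpExec\fR.

```c
#include <regex.h>
#include <stddef.h>

/*
 * count_matches is a hypothetical helper used only for illustration.
 * It counts non-overlapping matches of pattern in text using POSIX
 * regexec (not the Tcl API).  Returns -1 if the pattern fails to compile.
 */
static int count_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    const char *p = text;   /* current search position */
    int count = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0) {
        return -1;
    }
    while (*p != '\0') {
        /* Only the first search begins at the true start of the text;
         * later searches must not let ^ match at the resume point. */
        int flags = (p == text) ? 0 : REG_NOTBOL;

        if (regexec(&re, p, 1, &m, flags) != 0) {
            break;                      /* no further match */
        }
        count++;
        p += m.rm_eo;                   /* resume just after the match */
        if (m.rm_eo == m.rm_so) {       /* empty match: force progress */
            if (*p == '\0') {
                break;
            }
            p++;
        }
    }
    regfree(&re);
    return count;
}
```

With the pattern \fB^a\fR and the text \fBaaa\fR this sketch counts one
match; if the resumed searches were treated as starting a new string,
all three characters would match.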
.PP
\fBTcl_RegExpRange\fR may be invoked after \fBTcl_RegExpExec\fR
returns;  it provides detailed information about what ranges of
the string matched what parts of the pattern.
\fBTcl_RegExpRange\fR returns a pair of pointers in \fI*startPtr\fR
and \fI*endPtr\fR that identify a range of characters in
the source string for the most recent call to \fBTcl_RegExpExec\fR.
\fIIndex\fR indicates which of several ranges is desired:
if \fIindex\fR is 0, information is returned about the overall range
of characters that matched the entire pattern;  otherwise,
information is returned about the range of characters that matched the
\fIindex\fR'th parenthesized subexpression within the pattern.
If there is no range corresponding to \fIindex\fR then NULL
is stored in \fI*startPtr\fR and \fI*endPtr\fR.
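.PP
Because \fI*endPtr\fR addresses the character just after the range, the
length of a range is the difference between the two pointers, and the
range itself is not null-terminated.
The following is a minimal sketch of how a caller might copy a range out
of the source string; \fBcopy_range\fR is a hypothetical helper, not part
of the Tcl library, and its arguments are assumed to be the values that
\fBTcl_RegExpRange\fR stored in \fI*startPtr\fR and \fI*endPtr\fR.

```c
#include <stddef.h>
#include <string.h>

/*
 * copy_range is a hypothetical helper, not part of the Tcl library.
 * start and end are assumed to be the pointers stored by Tcl_RegExpRange:
 * start addresses the first character of the range and end addresses the
 * character just after the last one.  The range is copied into buf,
 * truncated to bufSize - 1 characters, and null-terminated.
 * Returns the full length of the range, or -1 if there is no such range
 * (both pointers NULL).
 */
static long copy_range(const char *start, const char *end,
                       char *buf, size_t bufSize)
{
    size_t len, n;

    if (start == NULL || end == NULL) {
        if (bufSize > 0) {
            buf[0] = '\0';
        }
        return -1;
    }
    len = (size_t) (end - start);
    if (bufSize > 0) {
        n = (len < bufSize - 1) ? len : bufSize - 1;
        memcpy(buf, start, n);
        buf[n] = '\0';
    }
    return (long) len;
}
```

Checking for NULL before subtracting matters because a parenthesized
subexpression that took no part in the match has no range to report.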

.SH KEYWORDS
match, pattern, regular expression, string, subexpression