Regular Expressions
The regular expressions we use are the extended kind found in egrep.
They are built from the following components:
c matches the non-metacharacter c.
\c matches the literal character c.
. matches any character except newline.
^ matches the beginning of a line or a string.
$ matches the end of a line or a string.
[abc...] character class: matches any of the characters abc....
[^abc...] negated character class: matches any character except abc... and newline.
r1|r2 alternation: matches either r1 or r2.
r1r2 concatenation: matches r1, and then r2.
r+ matches one or more r's.
r* matches zero or more r's.
r? matches zero or one r's.
(r) grouping: matches r.
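The operators above can be tried out with a short script. The following is only a sketch using Python's re module, which is not egrep but agrees with it on these basic operators. One known difference: in Python a negated character class such as [^abc] also matches a newline. The sample strings and the helper name matches are made up for illustration.

```python
import re

# Each entry: (pattern, sample string, whether a match is expected somewhere in it).
# The sample strings are illustrative assumptions, not taken from the original page.
CASES = [
    (r"a\.b", "a.b", True),        # \c matches the literal character c
    (r"a\.b", "axb", False),
    (r"a.b", "axb", True),         # . matches any character ...
    (r"a.b", "a\nb", False),       # ... except newline
    (r"^ab", "abc", True),         # ^ anchors at the beginning
    (r"^ab", "cab", False),
    (r"ab$", "cab", True),         # $ anchors at the end
    (r"[abc]x", "bx", True),       # character class
    (r"[^abc]x", "bx", False),     # negated character class
    (r"[^abc]x", "dx", True),
    (r"cat|dog", "hotdog", True),  # alternation
    (r"^ab+c$", "abbbc", True),    # + means one or more
    (r"^ab+c$", "ac", False),
    (r"^ab*c$", "ac", True),       # * means zero or more
    (r"^ab?c$", "abbc", False),    # ? means zero or one
    (r"^(ab)+$", "ababab", True),  # grouping lets + apply to "ab" as a unit
]

def matches(pattern, text):
    """Return True if pattern matches anywhere in text (like egrep on one line)."""
    return re.search(pattern, text) is not None

# One boolean per case: True when the observed result equals the expectation.
results = [matches(p, s) == expected for p, s, expected in CASES]
print(all(results))
```

re.search is used rather than re.match because egrep reports a line when the pattern matches anywhere in it, not only at the start.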
Some simple examples:
la
matches all strings containing the substring la, e.g. Klaus, Nikolai, Clarence, etc.
(Boris)|(Phil)
matches all strings containing the substring Boris or the substring Phil, e.g. Boris, Philippe, Phil, Philip, etc.
^12..$
matches all strings that begin with 12 followed by exactly two further characters, e.g. 1200, 1201, 12AZ, etc.
^....$
matches all strings with length 4, e.g. Ruth, John, Henk, Rick, etc.
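The four example patterns can be checked against a small list of strings. This is a minimal sketch in Python's re module, assumed here as a stand-in for egrep. The candidate list and the helper name select are hypothetical, with the value 123 added as a string that is too short for the last two patterns.

```python
import re

# Candidate strings drawn from the examples, plus "123" as a non-matching case.
candidates = ["Klaus", "Nikolai", "Clarence", "Boris", "Philippe",
              "Ruth", "1200", "12AZ", "123"]

def select(pattern, strings):
    """Return the strings in which the pattern matches somewhere."""
    return [s for s in strings if re.search(pattern, s)]

with_la = select(r"la", candidates)                   # substring la
boris_or_phil = select(r"(Boris)|(Phil)", candidates)  # either substring
twelve_plus_two = select(r"^12..$", candidates)       # 12 plus exactly two characters
length_four = select(r"^....$", candidates)           # exactly four characters

for label, found in [("la", with_la),
                     ("(Boris)|(Phil)", boris_or_phil),
                     ("^12..$", twelve_plus_two),
                     ("^....$", length_four)]:
    print(label, found)
```

Without the ^ and $ anchors, the last two patterns would also select longer strings, since an unanchored pattern only has to match some part of the string.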
Hypertext by Markus Spitzer, ms@kgw.TU-Berlin.DE
Last modification: Tue Jul 25 15:10:47 MDT 1995