B-17
User Guide for Cisco Security MARS Local Controller
78-17020-01
Appendix B Regular Expression Reference
Assertions
More complicated assertions are coded as subpatterns. There are two kinds: those that look ahead of the
current position in the subject string, and those that look behind it. An assertion subpattern is matched
in the normal way, except that it does not cause the current matching position to be changed.
Assertion subpatterns are not capturing subpatterns, and may not be repeated, because it makes no sense
to assert the same thing several times. If any kind of assertion contains capturing subpatterns within it,
these are counted for the purposes of numbering the capturing subpatterns in the whole pattern. However,
substring capturing is carried out only for positive assertions: a negative assertion succeeds only when
its subpattern fails to match, so there is no substring to capture.
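The rules above can be sketched as follows. This is only an illustration that assumes Python's re module as a stand-in for the PCRE-style engine described in this appendix; the variable names are hypothetical, and other engines may differ in detail.

```python
import re

# Assumption: Python's re module stands in for the PCRE-style engine.
subject = "hello world"

# A positive lookahead is matched in the normal way, but the current
# matching position does not move, so the overall match is empty.
m = re.match(r"(?=(\w+))", subject)
zero_width_span = m.span()   # start and end of the overall match
captured = m.group(1)        # the group inside the assertion is still set

# Groups inside a negative assertion count toward group numbering,
# but they never capture anything.
neg = re.compile(r"(?!(a))(b)")
group_count = neg.groups             # both groups are numbered
neg_groups = neg.match("b").groups() # group 1 stays unset

print(zero_width_span, captured, group_count, neg_groups)
```

Here the group inside the positive assertion holds "hello" even though the match itself has zero length, while the group inside the negative assertion is numbered 1 but remains unset.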
Lookahead Assertions
Lookahead assertions start with (?= for positive assertions and (?! for negative assertions. For example,
\w+(?=;)
matches a word followed by a semicolon, but does not include the semicolon in the match, and
foo(?!bar)
matches any occurrence of "foo" that is not followed by "bar". Note that the apparently similar pattern
(?!foo)bar
does not find an occurrence of "bar" that is preceded by something other than "foo"; it finds any
occurrence of "bar" whatsoever, because the assertion (?!foo) is always true when the next three
characters are "bar". A lookbehind assertion is needed to achieve the other effect.
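The lookahead patterns above can be tried out with a short sketch. It assumes Python's re module, which handles these particular patterns in the same way as described here; the sample strings are made up for illustration.

```python
import re

# Assumption: Python's re module stands in for the PCRE-style engine.

# \w+(?=;) matches the word but leaves the semicolon out of the match.
word = re.search(r"\w+(?=;)", "alpha; beta").group()

# foo(?!bar) skips the "foo" in "foobar" and matches the one in "foobaz".
foo_start = re.search(r"foo(?!bar)", "foobar foobaz").start()

# (?!foo)bar still finds the "bar" inside "foobar": at the position where
# "bar" begins, the next three characters are "bar", not "foo".
lookahead_hit = re.search(r"(?!foo)bar", "foobar")

# A lookbehind, (?<!foo)bar, gives the other effect: "bar" directly
# after "foo" is rejected.
lookbehind_hit = re.search(r"(?<!foo)bar", "foobar")

print(word, foo_start, lookahead_hit.start(), lookbehind_hit)
```

The third and fourth searches use the same subject string, which makes the difference between looking ahead from the start of "bar" and looking behind it easy to see.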
If you want to force a matching failure at some point in a pattern, the most convenient way to do it is
with (?!). An empty string always matches, so an assertion that requires an empty string not to match
must always fail.
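As a small illustration of the forced failure, the sketch below again assumes Python's re module as a stand-in engine; the patterns and subject strings are hypothetical examples.

```python
import re

# Assumption: Python's re module stands in for the PCRE-style engine.

# (?!) can never succeed, so a pattern consisting only of it never matches.
never = re.search(r"(?!)", "any subject string")

# Placed at the end of one alternative, it forces that branch to fail,
# so the engine falls through to the other branch.
fallback = re.search(r"a(?!)|b", "ab").group()

print(never, fallback)
```

The first search finds nothing at any position, and the second can only succeed through the "b" branch.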
Lookbehind Assertions
Lookbehind assertions start with (?<= for positive assertions and (?<! for negative assertions. For
example,
(?<!foo)bar
does find an occurrence of "bar" that is not preceded by "foo". The contents of a lookbehind assertion
are restricted such that all the strings it matches must have a fixed length. However, if there are several
alternatives, they do not all have to have the same fixed length. Thus
(?<=bullock|donkey)
is permitted, but
(?<!dogs?|cats?)
causes an error at compile time, because each of its branches (dogs? and cats?) can itself match strings
of two different lengths. Branches that match different-length strings are permitted only at the
top level of a lookbehind assertion. This is an extension compared with Perl (at least for 5.8), which
requires all branches to match the same length of string. An assertion such as