Revision 7.10
2/28/2012
[a-z&&[^bc]]     a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]    a through z, and not m through p: [a-lq-z] (subtraction)
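The subtraction rows above can be checked directly with java.util.regex. The following is a minimal sketch; the class name SubtractionDemo and the helper matchesClass are illustrative names, not part of the original text.

```java
import java.util.regex.Pattern;

class SubtractionDemo {
    // Hypothetical helper: true if the entire input matches the regex.
    static boolean matchesClass(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // [a-z&&[^bc]] behaves like [ad-z]
        System.out.println(matchesClass("[a-z&&[^bc]]", "a"));  // true
        System.out.println(matchesClass("[a-z&&[^bc]]", "b"));  // false
        // [a-z&&[^m-p]] behaves like [a-lq-z]
        System.out.println(matchesClass("[a-z&&[^m-p]]", "q")); // true
        System.out.println(matchesClass("[a-z&&[^m-p]]", "n")); // false
    }
}
```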
Predefined character classes
.     Any character (may or may not match line terminators)
\d    A digit: [0-9]
\D    A nondigit: [^0-9]
\s    A whitespace character: [ \t\n\x0B\f\r]
\S    A nonwhitespace character: [^\s]
\w    A word character: [a-zA-Z_0-9]
\W    A nonword character: [^\w]
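As a rough illustration of the predefined classes, the sketch below counts matches in a sample string. The class name PredefinedClassDemo, the helper countMatches, and the sample text are assumptions made for this example. Note that each backslash must be doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PredefinedClassDemo {
    // Hypothetical helper: counts non-overlapping matches of regex in input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "order 66 shipped";
        System.out.println(countMatches("\\d", text));   // 2 digits
        System.out.println(countMatches("\\s", text));   // 2 whitespace characters
        System.out.println(countMatches("\\w+", text));  // 3 runs of word characters
        System.out.println(countMatches("\\W", text));   // 2 nonword characters
    }
}
```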
POSIX character classes (US-ASCII only)
\p{Lower}     A lowercase alphabetic character: [a-z]
\p{Upper}     An uppercase alphabetic character: [A-Z]
\p{ASCII}     All ASCII: [\x00-\x7F]
\p{Alpha}     An alphabetic character: [\p{Lower}\p{Upper}]
\p{Digit}     A decimal digit: [0-9]
\p{Alnum}     An alphanumeric character: [\p{Alpha}\p{Digit}]
\p{Punct}     Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
\p{Graph}     A visible character: [\p{Alnum}\p{Punct}]
\p{Print}     A printable character: [\p{Graph}]
\p{Blank}     A space or a tab: [ \t]
\p{Cntrl}     A control character: [\x00-\x1F\x7F]
\p{XDigit}    A hexadecimal digit: [0-9a-fA-F]
\p{Space}     A whitespace character: [ \t\n\x0B\f\r]
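The POSIX classes can be tried out the same way. This is a small sketch under default flags (no UNICODE_CHARACTER_CLASS); the class name PosixClassDemo and the helpers isHex and isAsciiLower are hypothetical names introduced for illustration.

```java
import java.util.regex.Pattern;

class PosixClassDemo {
    // Compiled once and reused; matches one or more hexadecimal digits.
    private static final Pattern HEX = Pattern.compile("\\p{XDigit}+");
    // Matches a single lowercase letter in the US-ASCII range.
    private static final Pattern LOWER = Pattern.compile("\\p{Lower}");

    // Hypothetical helper: true if s consists only of hexadecimal digits.
    static boolean isHex(String s) {
        return HEX.matcher(s).matches();
    }

    // Hypothetical helper: true if s is exactly one US-ASCII lowercase letter.
    static boolean isAsciiLower(String s) {
        return LOWER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isHex("CAFE01"));        // true
        System.out.println(isHex("CAFEZ"));         // false, Z is not a hex digit
        System.out.println(isAsciiLower("x"));      // true
        // U+00E9 (e with acute accent) is lowercase but outside US-ASCII.
        System.out.println(isAsciiLower("\u00e9")); // false
    }
}
```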
Classes for Unicode blocks and categories
\p{InGreek}           A character in the Greek block (simple block)
\p{Lu}                An uppercase letter (simple category)
\p{Sc}                A currency symbol
\P{InGreek}           Any character except one in the Greek block (negation)
[\p{L}&&[^\p{Lu}]]    Any letter except an uppercase letter (subtraction)
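To see the Unicode block and category classes in action, the sketch below collects every character of a string that matches a given class. The class name UnicodeClassDemo, the helper extract, and the sample strings are assumptions for this example; non-ASCII characters are written as \u escapes so the source stays plain ASCII.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeClassDemo {
    // Hypothetical helper: concatenates every match of regex found in input.
    static String extract(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Uppercase letters only: "HW"
        System.out.println(extract("\\p{Lu}", "Hello World"));
        // Letters that are not uppercase: "elloorld"
        System.out.println(extract("[\\p{L}&&[^\\p{Lu}]]", "Hello World"));
        // Currency symbols: dollar sign and euro sign (U+20AC)
        System.out.println(extract("\\p{Sc}", "cost: $5 or \u20ac4"));
        // Greek block: alpha (U+03B1) and beta (U+03B2)
        System.out.println(extract("\\p{InGreek}", "a\u03b1b\u03b2"));
        // Negated block: everything outside Greek, here "ab"
        System.out.println(extract("\\P{InGreek}", "a\u03b1b\u03b2"));
    }
}
```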
Boundary matchers