Compile a regular expression pattern into a regular expression
object, which can be used for matching using its match() and
search() methods, described below.
The expression's behaviour can be modified by specifying a
flags value. Values can be any of the following variables,
combined using bitwise OR (the | operator).
The sequence

    prog = re.compile(pat)
    result = prog.match(str)

is equivalent to

    result = re.match(pat, str)

but the version using compile() is more efficient when the
expression will be used several times in a single program.
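
As a rough sketch of the reuse case described above, the following compiles a pattern once and applies it to several strings. The sample list and the choice of the re.IGNORECASE flag are assumptions made for illustration, not part of the text above.

```python
import re

# Hypothetical sample data, for illustration only.
lines = ["alpha 1", "Beta 22", "no digits here", "gamma 333"]

# Compile once with a flags value, then reuse the pattern object.
prog = re.compile(r"[a-z]+ (\d+)", re.IGNORECASE)
compiled_hits = [m.group(1) for m in map(prog.match, lines) if m]

# The module-level function gives the same results, one call at a time.
direct_hits = []
for line in lines:
    m = re.match(r"[a-z]+ (\d+)", line, re.IGNORECASE)
    if m:
        direct_hits.append(m.group(1))
```

Several flags can be combined in the same way, for example re.IGNORECASE | re.MULTILINE.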
When specified, the pattern character "^" matches at the
beginning of the string and at the beginning of each line
(immediately following each newline); and the pattern character
"$" matches at the end of the string and at the end of each
line (immediately preceding each newline). By default, "^"
matches only at the beginning of the string, and "$" only
at the end of the string and immediately before the newline (if any)
at the end of the string.
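
The difference can be seen in a small sketch, assuming the flag described here is the one exposed as re.MULTILINE; the sample text is made up.

```python
import re

text = "first line\nsecond line\n"

# Default: "^" only matches at the very start of the string, and "$"
# only at the end (or just before a trailing newline).
default_starts = re.findall(r"^\w+", text)
default_ends = re.findall(r"\w+$", text)

# With the flag, both anchors also match at every line boundary.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)
multiline_ends = re.findall(r"\w+$", text, re.MULTILINE)
```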
This flag allows you to write regular expressions that are easier to
read. Whitespace within the pattern is ignored,
except when in a character class or preceded by an unescaped
backslash. When a line contains a "#" that is neither in a
character class nor preceded by an unescaped backslash, all characters
from the leftmost such "#" through the end of the line are
ignored.
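
A minimal sketch of the same pattern written both ways, assuming the flag described here is the one exposed as re.VERBOSE; the pattern and sample string are illustrative only.

```python
import re

# Whitespace and "#" comments are ignored, except inside a character class.
verbose_number = re.compile(r"""
    \d+        # integer part
    \.         # literal decimal point
    \d*        # optional fractional digits
    [ ]?       # a space inside a character class is still significant
""", re.VERBOSE)

# The equivalent pattern without the flag.
compact_number = re.compile(r"\d+\.\d*[ ]?")

sample = "pi is 3.14 or so"
verbose_hit = verbose_number.search(sample).group()
compact_hit = compact_number.search(sample).group()
```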
Scan through string looking for a location where the regular
expression pattern produces a match, and return a
corresponding MatchObject instance.
Return None if no
position in the string matches the pattern; note that this is
different from finding a zero-length match at some point in the string.
If zero or more characters at the beginning of string match
the regular expression pattern, return a corresponding
MatchObject instance. Return None if the string does not
match the pattern; note that this is different from a zero-length
match.
Note:
If you want to locate a match anywhere in
string, use search() instead.
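
The distinction between match() and search(), and between None and a zero-length match, can be sketched as follows; the sample string is arbitrary.

```python
import re

text = "abc"

# match() only tries at the beginning of the string.
anchored = re.match("b", text)

# search() scans forward until some position matches.
anywhere = re.search("b", text)

# "x*" can match zero characters, so this succeeds with an empty match
# rather than returning None.
empty = re.match("x*", text)
```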
Split string by the occurrences of pattern. If
capturing parentheses are used in pattern, then the text of all
groups in the pattern is also returned as part of the resulting list.
If maxsplit is nonzero, at most maxsplit splits
occur, and the remainder of the string is returned as the final
element of the list. (Incompatibility note: in the original Python
1.5 release, maxsplit was ignored. This has been fixed in
later releases.)
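
A short sketch of the three behaviours described for split(); the sample sentence is chosen for illustration, and maxsplit is passed by keyword here as a precaution.

```python
import re

sentence = "Words, words, words."

# Plain split: separators are discarded.
plain = re.split(r"\W+", sentence)

# Capturing parentheses: the separator text is kept in the list.
with_separators = re.split(r"(\W+)", sentence)

# maxsplit=1: one split, remainder returned as the final element.
limited = re.split(r"\W+", sentence, maxsplit=1)
```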
Return a list of all non-overlapping matches of pattern in
string. If one or more groups are present in the pattern,
return a list of groups; this will be a list of tuples if the
pattern has more than one group. Empty matches are included in the
result.
New in version 1.5.2.
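
The shape of the findall() result depends on the number of groups in the pattern, which the following sketch tries to show; the input strings are made up.

```python
import re

# No groups: a list of the matched substrings.
no_groups = re.findall(r"\d+", "a1b22c333")

# One group: a list of that group's text.
one_group = re.findall(r"(\w)=\d", "a=1 b=2")

# More than one group: a list of tuples.
two_groups = re.findall(r"(\w)=(\d)", "a=1 b=2")

# Empty matches are included: "x*" matches the empty string at each position.
empties = re.findall(r"x*", "ab")
```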
Return the string obtained by replacing the leftmost non-overlapping
occurrences of pattern in string by the replacement
repl. If the pattern isn't found, string is returned
unchanged. repl can be a string or a function; if it is a
string, any backslash escapes in it are processed. That is,
"\n" is converted to a single newline character, "\r" is converted to a carriage return, and so forth. Unknown escapes such as
"\j" are left alone. Backreferences, such as "\6", are
replaced with the substring matched by group 6 in the pattern. For
example:
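
A minimal sketch of a string replacement using a backreference and an escape; the patterns and input strings are invented for illustration.

```python
import re

# "\1" and "\2" are replaced with the text matched by groups 1 and 2.
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@example")

# A backslash escape in the replacement is processed: "\n" becomes a newline.
broken = re.sub("-", r"\n", "a-b")
```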
If repl is a function, it is called for every non-overlapping
occurrence of pattern. The function takes a single match
object argument, and returns the replacement string. For example:
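
One possible sketch of a replacement function, here doubling every number found in a string; the function name double and the input are hypothetical.

```python
import re

def double(matchobj):
    # Called once per non-overlapping match; must return a string.
    return str(int(matchobj.group(0)) * 2)

doubled = re.sub(r"\d+", double, "3 apples and 10 pears")
```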
The pattern may be a string or an RE object; if you need to specify
regular expression flags, you must use an RE object, or use embedded
modifiers in a pattern; for example, sub("(?i)b+", "x", "bbbb BBBB")
returns 'x x'.
The optional argument count is the maximum number of pattern
occurrences to be replaced; count must be a non-negative
integer. If omitted or zero, all occurrences will be replaced.
Empty matches for the pattern are replaced only when not adjacent to
a previous match, so "sub('x*', '-', 'abc')" returns
'-a-b-c-'.
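
The effect of count, and the handling of empty matches, can be sketched like this; count is passed by keyword here as a precaution.

```python
import re

# Replace at most two occurrences.
first_two = re.sub("a", "x", "aaaa", count=2)

# count=0 behaves like omitting it: every occurrence is replaced.
all_of_them = re.sub("a", "x", "aaaa", count=0)

# "x*" matches the empty string before each character and at the end.
dashes = re.sub("x*", "-", "abc")
```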
In addition to character escapes and backreferences as described
above, "\g<name>" will use the substring matched by the group
named "name", as defined by the (?P<name>...) syntax.
"\g<number>" uses the corresponding group number;
"\g<2>" is therefore equivalent to "\2", but isn't
ambiguous in a replacement such as "\g<2>0". "\20" would be interpreted as a reference to group 20, not a reference to
group 2 followed by the literal character "0". The
backreference "\g<0>" substitutes in the entire substring
matched by the RE.
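
The named and numbered forms can be compared in a small sketch. The group names first and second are made up, and the last part assumes a current Python 3, where a reference to a nonexistent group raises re.error.

```python
import re

# Named groups referenced with \g<name>.
swapped = re.sub(r"(?P<first>\w+) (?P<second>\w+)",
                 r"\g<second> \g<first>", "hello world")

# \g<2>0 means group 2 followed by a literal "0".
group_then_zero = re.sub(r"(a)(b)", r"\g<2>0", "ab")

# \g<0> stands for the whole match.
bracketed = re.sub(r"\w+", r"[\g<0>]", "hi there")

# "\20" is read as group 20, which does not exist in this pattern.
try:
    re.sub(r"(a)(b)", r"\20", "ab")
    bad_reference_rejected = False
except re.error:
    bad_reference_rejected = True
```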
Return string with all non-alphanumerics backslashed; this is
useful if you want to match an arbitrary literal string that may have
regular expression metacharacters in it.
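
A sketch of why escaping matters when searching for a literal string; the strings are invented. The exact set of characters that gets backslashed differs between Python versions, so the example checks only the matching behaviour, not the escaped text itself.

```python
import re

literal = "1+1=2 (approx.)"
haystack = "is 1+1=2 (approx.) ok?"

# Used as a pattern unescaped, "+", "(", ")" and "." are metacharacters.
raw_hit = re.search(literal, haystack)

# Escaped, the pattern matches the literal text.
escaped_hit = re.search(re.escape(literal), haystack)
```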
Exception raised when a string passed to one of the functions here
is not a valid regular expression (for example, it might contain
unmatched parentheses) or when some other error occurs during
compilation or matching. It is never an error if a string contains
no match for a pattern.
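
A minimal sketch of catching the exception, assuming it is the one exposed as re.error; the helper name is_valid_regex is hypothetical.

```python
import re

def is_valid_regex(pattern):
    """Return True if pattern compiles, False if re.error is raised."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

unbalanced_ok = is_valid_regex("(")     # unmatched parenthesis
balanced_ok = is_valid_regex("(a)")

# A failed match is not an error; it simply returns None.
no_match = re.search("z", "abc")
```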