Python Regex Guide with Examples
re.match(pattern, string)
Tries to match a pattern at the beginning of the string. Returns a match object if found, else None.
Example: re.match(r'cat', 'catapult') → matches 'cat' at the start
re.search(pattern, string)
Scans the entire string and returns the first match object if the pattern is found, else None.
Example: re.search(r'cat', 'concatenate') → matches 'cat' in the middle
re.findall(pattern, string)
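Returns all non-overlapping matches of the pattern in the string as a list of strings. If the pattern contains capturing groups, the list holds the group contents instead of the whole matches (tuples when there is more than one group).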
Example: re.findall(r'\d+', 'There are 12 apples and 34 bananas') → ['12', '34']
re.finditer(pattern, string)
Returns an iterator yielding match objects for all non-overlapping matches of the pattern.
Example: [m.group() for m in re.finditer(r'\d+', '12 apples 34 bananas')] → ['12', '34']
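Because re.finditer yields match objects rather than plain strings, each hit also carries its position. A minimal sketch, reusing the sample text from the example above (the variable names are illustrative only):

```python
import re

text = "12 apples 34 bananas"

# Each match object exposes the matched text plus its start/end offsets.
found = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\d+", text)]

print(found)  # [('12', 0, 2), ('34', 10, 12)]
```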
re.fullmatch(pattern, string)
Checks if the entire string matches the pattern exactly. Returns a match object or None.
Example: re.fullmatch(r'\d+', '12345') → matches '12345'
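re.match, re.search, and re.fullmatch differ only in where the match is allowed to sit. The sketch below runs the same pattern through all three on one sample string; the variable names are illustrative only.

```python
import re

text = "concatenate"

# match succeeds only if the pattern matches starting at index 0.
at_start = re.match(r"cat", text)          # None: 'cat' is not at the start
# search scans forward and returns the first match anywhere.
anywhere = re.search(r"cat", text)         # match object spanning (3, 6)
# fullmatch requires the pattern to consume the entire string.
whole = re.fullmatch(r"cat", text)         # None: the string is longer than 'cat'
whole_ok = re.fullmatch(r"[a-z]+", text)   # matches all of 'concatenate'

print(at_start, anywhere.span(), whole, whole_ok.group())
```

Since a failed match returns None, check the result before calling .group() or .span() on it.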
re.sub(pattern, repl, string)
Replaces all non-overlapping occurrences of the pattern in the string with repl. Returns the new string; the original string is not modified.
Example: re.sub(r'cat', 'dog', 'concatenate cat') → 'condogenate dog'
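Beyond a fixed replacement string, re.sub also accepts a count limit, backreferences such as \1 in repl, and a function as repl. A short sketch of those three options, with sample strings chosen only for illustration:

```python
import re

# count limits how many occurrences are replaced (leftmost first).
first_only = re.sub(r"cat", "dog", "concatenate cat", count=1)

# \1 in the replacement refers to the text captured by group 1.
wrapped = re.sub(r"(\d+)", r"<\1>", "a1b22")

# A function receives each match object and returns the replacement text.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "12 apples 34 bananas")

print(first_only)  # condogenate cat
print(wrapped)     # a<1>b<22>
print(doubled)     # 24 apples 68 bananas
```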
re.split(pattern, string)
Splits the string by the occurrences of the pattern. Returns a list.
Example: re.split(r'\s+', 'split this sentence') → ['split', 'this', 'sentence']
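A few edge cases of re.split are easy to miss: maxsplit caps the number of splits, a separator at the edge of the string produces empty strings, and a capturing group in the pattern keeps the separators in the result. A minimal sketch with made-up inputs:

```python
import re

# maxsplit=1 splits once and leaves the rest of the string intact.
head_tail = re.split(r"\s+", "split this sentence", maxsplit=1)

# Leading/trailing separators yield empty strings at the ends.
padded = re.split(r"\s+", " padded text ")

# A capturing group in the pattern keeps the separators in the list.
with_seps = re.split(r"(,)", "a,b,c")

print(head_tail)  # ['split', 'this sentence']
print(padded)     # ['', 'padded', 'text', '']
print(with_seps)  # ['a', ',', 'b', ',', 'c']
```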
re.compile(pattern)
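Compiles a regex pattern into a reusable pattern object with the same methods as the module-level functions (match, search, findall, sub, and so on). Handy when one pattern is applied many times; the module-level functions cache recently compiled patterns, so the speed gain is usually modest.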
Example: pattern = re.compile(r'\d+'); pattern.findall('123 abc 456') → ['123', '456']
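A typical use of a compiled pattern is to define it once and apply it across many inputs. The sketch below assumes a hypothetical list of lines; NUMBER and lines are illustrative names.

```python
import re

# Compile once, reuse for every line.
NUMBER = re.compile(r"\d+")

lines = ["123 abc 456", "no digits here", "7 up"]
per_line = [NUMBER.findall(line) for line in lines]

print(per_line)  # [['123', '456'], [], ['7']]
```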
re.escape(string)
Escapes all special regex characters in the string so it can be used literally in a regex pattern.
Example: re.escape('1.5+2.5') → '1\.5\+2\.5'
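The point of re.escape shows up when a literal string is dropped into a pattern. In the sketch below (sample text invented for illustration), the unescaped version treats . and + as regex tokens and matches the wrong thing:

```python
import re

literal = "1.5+2.5"
text = "1x552y5 and 1.5+2.5"

# Unescaped: '.' means "any character" and '5+' means "one or more 5s".
unescaped = re.findall(literal, text)
# Escaped: every special character is matched literally.
escaped = re.findall(re.escape(literal), text)

print(unescaped)  # ['1x552y5']
print(escaped)    # ['1.5+2.5']
```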
Common Regex Tokens
.
Matches any single character except a newline (with the re.DOTALL flag, newlines too).
Example: re.findall(r'.', 'abc') → ['a', 'b', 'c']
^
Matches the start of the string (with the re.MULTILINE flag, also the start of each line).
Example: re.match(r'^a', 'abc') → matches 'a'
$
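Matches the end of the string, or just before a newline at the very end of the string (with the re.MULTILINE flag, also the end of each line).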
Example: re.search(r'c$', 'abc') → matches 'c'
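How ^ and $ behave depends on whether re.MULTILINE is set. A small sketch on a two-line sample string, shown with and without the flag:

```python
import re

text = "abc\nabd"

# Default: ^ and $ anchor to the start and end of the whole string.
starts_default = re.findall(r"^a", text)
ends_default = re.findall(r"\w$", text)

# re.MULTILINE: ^ and $ also anchor at every line start and line end.
starts_multi = re.findall(r"^a", text, flags=re.MULTILINE)
ends_multi = re.findall(r"\w$", text, flags=re.MULTILINE)

print(starts_default, ends_default)  # ['a'] ['d']
print(starts_multi, ends_multi)      # ['a', 'a'] ['c', 'd']
```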
*
Matches 0 or more repetitions of the preceding element, as many as possible.
Example: re.findall(r'ab*', 'abb ab a') → ['abb', 'ab', 'a']
+
Matches 1 or more repetitions of the preceding element, as many as possible.
Example: re.findall(r'ab+', 'abb ab a') → ['abb', 'ab']
?
Matches 0 or 1 repetition of the preceding element, making it optional.
Example: re.findall(r'ab?', 'abb ab a') → ['ab', 'ab', 'a']
{n}
Matches exactly n repetitions of the preceding element.
Example: re.findall(r'\d{3}', '123 4567') → ['123', '456']
{n,}
Matches n or more repetitions of the preceding element.
Example: re.findall(r'\d{2,}', '1 12 1234') → ['12', '1234']
{n,m}
Matches between n and m repetitions (inclusive) of the preceding element, as many as possible.
Example: re.findall(r'\d{2,3}', '1 12 1234') → ['12', '123'] (matches do not overlap, and the leftover '4' is too short to match)
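Running the three counted quantifiers over one input makes the differences easier to see. A quick sketch, reusing the sample string from the examples above; note that findall resumes scanning after each match, so results never overlap:

```python
import re

text = "1 12 1234"

exactly_two = re.findall(r"\d{2}", text)     # pairs of digits
two_or_more = re.findall(r"\d{2,}", text)    # runs of at least two digits
two_to_three = re.findall(r"\d{2,3}", text)  # runs of two or three digits

print(exactly_two)   # ['12', '12', '34']
print(two_or_more)   # ['12', '1234']
print(two_to_three)  # ['12', '123']
```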
[]
Defines a character class: matches any one of the characters listed between the brackets. Ranges such as a-z are allowed.
Example: re.findall(r'[aeiou]', 'hello world') → ['e', 'o', 'o']
[^]
Negated character class: matches any one character that is NOT listed between the brackets.
Example: re.findall(r'[^0-9]', 'abc123') → ['a','b','c']
\d
Matches any decimal digit. For str patterns this includes non-ASCII Unicode digits; pass re.ASCII to restrict it to 0-9.
Example: re.findall(r'\d', 'abc123') → ['1','2','3']
\D
Matches any non-digit.
Example: re.findall(r'\D', 'abc123') → ['a','b','c']
\w
Matches any word character (letters, digits, and the underscore).
Example: re.findall(r'\w', 'a_b!') → ['a','_','b']
\W
Matches any character that is not a letter, digit, or underscore.
Example: re.findall(r'\W', 'a_b!') → ['!']
\s
Matches any whitespace character (space, tab, newline, and similar).
Example: re.findall(r'\s', 'a b\t c') → [' ', '\t', ' ']
\S
Matches any non-whitespace character.
Example: re.findall(r'\S', 'a b') → ['a','b']
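The shorthand classes are most often combined with + to pull whole runs out of a string. A minimal sketch on an invented sample line (the \t in the non-raw text literal is a real tab character):

```python
import re

text = "id_7 = 42;\tok"

words = re.findall(r"\w+", text)   # runs of letters, digits, underscore
digits = re.findall(r"\d+", text)  # runs of digits
chunks = re.findall(r"\S+", text)  # runs of non-whitespace
gaps = re.findall(r"\s+", text)    # runs of whitespace

print(words)   # ['id_7', '42', 'ok']
print(digits)  # ['7', '42']
print(chunks)  # ['id_7', '=', '42;', 'ok']
print(gaps)    # [' ', ' ', '\t']
```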
\b
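Matches a word boundary: the empty position between a word character and a non-word character, or the start/end of the string next to a word character. Write it in a raw string, since '\b' in a normal string is a backspace.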
Example: re.sub(r'\b', '|', 'cat') → '|cat|'
\B
Matches a position that is NOT a word boundary.
Example: re.sub(r'\B', '|', 'cat') → 'c|a|t'
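A common use of \b is whole-word matching, with \B as its opposite for hits buried inside a longer word. A short sketch on a made-up sentence:

```python
import re

text = "cat concatenate cat."

# \b on both sides: 'cat' must stand alone as a word.
whole_words = re.findall(r"\bcat\b", text)
# \B on both sides: 'cat' must be surrounded by word characters.
embedded = re.findall(r"\Bcat\B", text)
# Replace only the standalone words.
replaced = re.sub(r"\bcat\b", "dog", text)

print(whole_words)  # ['cat', 'cat']
print(embedded)     # ['cat']
print(replaced)     # dog concatenate dog.
```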
|
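Acts as OR: matches the expression on its left or the one on its right. Alternatives are tried left to right, and the first one that succeeds wins.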
Example: re.findall(r'cat|dog', 'cat and dog') → ['cat', 'dog']
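Because alternatives are tried left to right, their order matters when one is a prefix of another, and a group is needed to limit how far the | reaches. A small sketch with illustrative inputs:

```python
import re

# The first alternative that matches wins, even if a later one is longer.
short_first = re.findall(r"cat|catalog", "catalog")
long_first = re.findall(r"catalog|cat", "catalog")

# A group confines the alternation to part of the pattern.
spellings = re.findall(r"gr(?:a|e)y", "gray grey groy")

print(short_first)  # ['cat']
print(long_first)   # ['catalog']
print(spellings)    # ['gray', 'grey']
```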
()
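Capturing group: groups a sub-pattern and saves the text it matched. A repeated group keeps only its last repetition.
Example: re.findall(r'(ab)+', 'ababab') → ['ab'] (findall returns the group's content, not the whole match 'ababab')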
(?:)
Non-capturing group: groups a sub-pattern without saving the text it matched.
Example: re.findall(r'(?:ab)+', 'ababab') → ['ababab']
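The difference between the two group types is easiest to see through what findall and the match object return. A minimal sketch; the key=value sample string is invented for illustration:

```python
import re

text = "ababab"

# With a capturing group, findall returns the group, not the whole match.
captured = re.findall(r"(ab)+", text)
# With a non-capturing group, findall returns the whole match.
uncaptured = re.findall(r"(?:ab)+", text)

# A match object exposes both: group(0) is the whole match, group(1) the capture.
m = re.search(r"(ab)+", text)
whole, last_rep = m.group(0), m.group(1)

# Several capturing groups produce one tuple per match.
pairs = re.findall(r"(\w+)=(\d+)", "a=1 b=2")

print(captured, uncaptured)  # ['ab'] ['ababab']
print(whole, last_rep)       # ababab ab
print(pairs)                 # [('a', '1'), ('b', '2')]
```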