We are on a mission to address the digital
skills gap for 10 Million+ young professionals,
train and empower them to forge a career
path into future tech
Regular Expression
Introduction
✓ A regular expression is a sequence of characters that defines a search pattern, mainly used to find
or replace text in a string.
✓ The most common uses of regular expressions are:
• Finding patterns in a string or file (e.g., finding all the numbers present in a string).
• Replace a part of the string with another string.
• Search substring in string or file.
• Split string into substrings.
• Validate email format.
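The uses listed above can be sketched in a few lines. This is a minimal illustration, not a complete recipe: the sample sentence, the variable names and the rough email pattern are assumptions made for this example.

```python
import re

# Hypothetical sample sentence, used only for illustration.
sample = "Order 66 shipped on 2024-05-17 to bob@example.com, alice@example.com"

# Find: every run of digits in the string.
numbers = re.findall(r"\d+", sample)

# Replace: mask each digit with '#'.
masked = re.sub(r"\d", "#", sample)

# Search: is the word 'shipped' present anywhere?
found = re.search(r"shipped", sample) is not None

# Split: cut the string at a comma followed by optional whitespace.
parts = re.split(r",\s*", sample)

# Validate: a very rough email format check (not a full validator).
looks_like_email = re.fullmatch(r"[\w.+-]+@[\w-]+\.[\w.]+", "bob@example.com") is not None

print(numbers)           # ['66', '2024', '05', '17']
print(masked)
print(found)             # True
print(parts)
print(looks_like_email)  # True
```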
3 Getting Start with Python | © SmartCliff | Internal | Version 1.0
RegEx Module
✓ In Python, the built-in module re is used to work with regular expressions.
Program
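import re
# The sentence is a single string without a line break, because "." does not match a newline.
text = ("Alan Turing was a pioneer of theoretical computer science and artificial intelligence. "
        "He was born on 23 June 1912 in Maida Vale, London")
# ^ anchors the pattern at the start of the string and $ at the end.
res = re.search("^Alan.*London$", text)
if res:
    print("We have a match!")
else:
    print("We don't have a match")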
Sample Input and Output
We have a match!
Regular Expression Functions
✓ The ‘re’ package in Python provides various functions to work with regular expressions.
✓ We will discuss some commonly used ones.
S No  Function                     Description
1     findall(pattern, string)     Returns a list of all the occurrences of the pattern present in the string.
2     search(pattern, string)      Finds the pattern at any position in the string and returns its first occurrence.
3     split(pattern, string)       Splits the string at every match of the given pattern.
4     sub(pattern, repl, string)   Replaces every match of the pattern in the string with the given substring.
Regular Expression Functions: findall(pattern,string)
✓ This function works like search, but it finds all the occurrences of the pattern in the given
string and returns them as a list. The list contains every matching substring, so its length is the number of times the pattern occurs in the string.
# Example
import re
text = ("Alan Turing was a pioneer of theoretical computer science and artificial intelligence. "
        "He was born on 23 June 1912 in Maida Vale, London")
res = re.findall('Turing',text)
print("Result = {}".format(res))
Output:
Result = ['Turing']
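A pattern that occurs more than once makes the list behaviour easier to see. The short string below is an assumption made for this sketch. It also shows that when the pattern contains one capturing group, findall returns the group contents instead of the whole match.

```python
import re

# Shortened sample string, chosen for this illustration.
text = "born on 23 June 1912"

numbers = re.findall(r"\d+", text)            # every run of digits
count = len(numbers)                          # number of occurrences
first_digits = re.findall(r"(\d)\d*", text)   # one group: only the group is returned
nothing = re.findall(r"x", text)              # no match gives an empty list, not None

print(numbers)       # ['23', '1912']
print(count)         # 2
print(first_digits)  # ['2', '1']
print(nothing)       # []
```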
Notes
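findall is convenient when every occurrence of a pattern is needed, because it scans the whole string as search does and is not limited to the start of the string as match is.
It returns plain strings rather than match objects, so use search when the position of a match is needed.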
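The difference between the functions can be sketched as follows. The sample sentence is taken from the later slides. re.finditer is a standard re function not covered in this deck, shown here only as one way to get positions for every match.

```python
import re

text = "Alan Turing was born on 23 June 1912 in London."

all_numbers = re.findall(r"\d+", text)   # list of strings, one per match
first_number = re.search(r"\d+", text)   # match object for the first match only
at_start = re.match(r"\d+", text)        # None: the string does not start with a digit

# finditer yields match objects, so the span of every match is available.
positions = [(m.group(), m.span()) for m in re.finditer(r"\d+", text)]

print(all_numbers)           # ['23', '1912']
print(first_number.group())  # 23
print(at_start)              # None
print(positions)             # [('23', (24, 26)), ('1912', (32, 36))]
```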
Regular Expression Functions: search(pattern,string)
✓ Unlike the match function, which only matches at the beginning of the string, this function finds
the pattern irrespective of the position at which it is present. The pattern can be anywhere in the string. This
function returns the first occurrence of the pattern.
import re
text = ("Alan Turing was a pioneer of theoretical computer science and artificial intelligence. "
        "He was born on 23 June 1912 in Maida Vale, London")
res = re.search('Turing',text)
print("Result = {} and start,end position = {}".format(res,res.span()))
#Output:
Result = <re.Match object; span=(5, 11), match='Turing'> and start,end position = (5, 11)
✓ The function returns a re.Match object if the pattern is present in the string; otherwise it returns None.
✓ We can also get the start and end positions of the matching pattern by calling the span() method on the
re.Match object.
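Because search can return None, calling span() directly on the result fails when nothing matches. A minimal sketch of a guarded lookup follows. The helper name find_span is hypothetical and not part of the re module.

```python
import re

text = "Alan Turing was born on 23 June 1912 in London."

def find_span(pattern, string):
    """Return the (start, end) span of the first match, or None if there is no match."""
    m = re.search(pattern, string)
    return m.span() if m else None

print(find_span("Turing", text))  # (5, 11)
print(find_span("Paris", text))   # None
```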
Regular Expression Functions: split(pattern,string)
✓ This function splits a string on the given pattern. This returns the result as a list after splitting.
import re
text = ("Alan Turing was a pioneer of theoretical computer science and artificial intelligence. "
        "He was born on 23 June 1912 in Maida Vale, London")
# Split the string at every occurrence of the letter "a".
res = re.split("a", text)
print("Result = {}".format(res))
#Output:
Result = ['Al', 'n Turing w', 's ', ' pioneer of theoretic', 'l computer science ', 'nd ', 'rtifici', 'l intelligence. He w', 's born on 23 June 1912 in M', 'id', ' V', 'le, London']
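Splitting on a pattern rather than a single character is where re.split differs from str.split. The sketch below uses an assumed sample record with mixed separators, and the optional maxsplit argument to limit the number of splits.

```python
import re

# Hypothetical record with mixed separators: comma, semicolon and spaces.
record = "alpha, beta;gamma   delta"

# Split on any run of commas, semicolons or whitespace.
fields = re.split(r"[,;\s]+", record)

# maxsplit=1 stops after the first split and keeps the rest intact.
first_two = re.split(r"[,;\s]+", record, maxsplit=1)

print(fields)     # ['alpha', 'beta', 'gamma', 'delta']
print(first_two)  # ['alpha', 'beta;gamma   delta']
```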
Regular Expression Functions: sub(pattern,repl,string)
✓ This function replaces a pattern with the given substring in a given string.
# Replace the word 'theoretical' with 'practical'.
import re
text = ("Alan Turing was a pioneer of theoretical computer science and artificial intelligence. "
        "He was born on 23 June 1912 in Maida Vale, London")
res = re.sub('theoretical', 'practical', text)
print("Result = {}".format(res))
#Output:
Result = Alan Turing was a pioneer of practical computer science and artificial intelligence. He was born on 23 June 1912 in Maida Vale, London
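Two further options of sub are worth a short sketch: the count argument, which limits the number of replacements, and backreferences such as \1, which reuse captured groups in the replacement. The shortened sample string is an assumption made for this example.

```python
import re

text = "He was born on 23 June 1912 in Maida Vale, London"

# count=1 replaces only the first match.
first_only = re.sub(r"\d+", "#", text, count=1)

# Groups captured with ( ) can be reused in the replacement as \1, \2, \3.
swapped = re.sub(r"(\d+) (\w+) (\d+)", r"\2 \1, \3", text)

print(first_only)  # He was born on # June 1912 in Maida Vale, London
print(swapped)     # He was born on June 23, 1912 in Maida Vale, London
```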
Meta Characters
✓ Meta characters are the characters which have special meaning. The following are some of the meta
characters with their uses.
S No  Meta character          Description
1     [ ] (Square brackets)   Matches any single character listed inside the brackets.
2     . (Period)              Matches any single character except the newline. If we pass this as a pattern to the findall() function, it matches every character present in the string except newline characters.
3     ^ (Caret)               Matches the given pattern at the start of the string. This is used to check whether the string starts with a particular pattern.
4     $ (Dollar)              Matches the given pattern at the end of the string. This is used to check whether the string ends with a particular pattern.
5     * (Star)                Matches 0 or more occurrences of the pattern to its left.
6     + (Plus)                Matches 1 or more occurrences of the pattern to its left.
7     ? (Question mark)       Matches 0 or 1 occurrence of the pattern to its left.
8     { } (Braces)            Matches the number of occurrences given in the braces of the pattern to its left, e.g. a{3} or a{2,4}.
9     | (Alternation)         Works like an 'or' condition between two or more patterns. If the string contains at least one of the given patterns, this gives a match.
10    ( ) (Group)             Groups part of a regular expression so that it is treated as a single unit and can be captured from the match.
11    \ (Backslash)           Introduces a special sequence, or escapes a meta character so that it is treated as a normal character.
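Each meta character in the table can be tried on a small string. The following is an illustrative sketch only. The sample text and the variable names are assumptions chosen so that every pattern has a visible effect.

```python
import re

# Hypothetical sample text for this illustration.
text = "colour color colr cat cot c9t"

bracket = re.findall(r"c[ao]t", text)          # [ ]: 'a' or 'o' in the middle
dot = re.findall(r"c.t", text)                 # . : any single character in the middle
optional = re.findall(r"colou?r", text)        # ? : 'u' is optional
star = re.findall(r"colo*r", text)             # * : zero or more 'o' before 'r'
plus = re.findall(r"colo+r", text)             # + : at least one 'o' before 'r'
braces = re.findall(r"\d{2,4}", "7 42 1912 123456")  # {m,n}: 2 to 4 digits
alternation = re.findall(r"cat|cot", text)     # | : either word
starts = bool(re.search(r"^colour", text))     # ^ : at the start of the string
ends = bool(re.search(r"c9t$", text))          # $ : at the end of the string
groups = re.findall(r"(col)(ou?r)", text)      # ( ): two groups give tuples
escaped = re.findall(r"1\.5", "1.5 145")       # \ : the dot is matched literally

print(bracket)      # ['cat', 'cot']
print(dot)          # ['cat', 'cot', 'c9t']
print(optional)     # ['colour', 'color']
print(star)         # ['color', 'colr']
print(plus)         # ['color']
print(braces)       # ['42', '1912', '1234', '56']
print(alternation)  # ['cat', 'cot']
print(starts, ends) # True True
print(groups)       # [('col', 'our'), ('col', 'or')]
print(escaped)      # ['1.5']
```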
Special Sequence in RegEx
S No  Sequence  Description
1     \A        Gives a match if the characters to the right of \A are at the beginning of the string.
2     \b        Matches at a word boundary: gives a match if the characters to the right are at the beginning of a word or the characters to the left are at the end of a word.
3     \B        Matches where there is no word boundary: gives a match if the characters to the right or left of \B are not at the beginning or end of a word.
4     \d        Matches any digit character.
5     \D        Matches any character that is not a digit.
6     \s        Matches any white space character (space, tab, newline).
7     \S        Matches any character that is not a white space character.
8     \w        Matches any word character: a-z, A-Z, 0-9 and underscore (_).
9     \W        Matches any character other than a-z, A-Z, 0-9 and underscore (_).
10    \Z        Gives a match if the characters to the left of \Z are at the end of the string.
Program
# Example for each sequence (raw strings keep the backslashes intact)
import re
text = '''Alan Turing was born on 23 June 1912 in London.'''
# Example for \A
res = re.findall(r'\AAlan', text)
print(r"Result for \A =", res)
print("-"*79)
# Example for \b at the beginning of a word
res = re.findall(r'\bLon', text)
print(r"Result for \b =", res)
print("-"*79)
# Example for \b at the end of a word
res = re.findall(r'ring\b', text)
print(r"Result for \b =", res)
print("-"*79)
# Example for \B
res = re.findall(r'\Bon', text)
print(r"Result for \B =", res)
print("-"*79)
# Example for \d
res = re.findall(r'\d', text)
print(r"Result for \d =", res)
print("-"*79)
# Example for \D
res = re.findall(r'\D', text)
print(r"Result for \D =", res)
print("-"*79)
# Example for \s
res = re.findall(r'\s', text)
print(r"Result for \s =", res)
print("-"*79)
# Example for \S
res = re.findall(r'\S', text)
print(r"Result for \S =", res)
print("-"*79)
# Example for \w
res = re.findall(r'\w', text)
print(r"Result for \w =", res)
print("-"*79)
# Example for \W
res = re.findall(r'\W', text)
print(r"Result for \W =", res)
print("-"*79)
# Example for \Z (the dot is escaped so that it matches a literal '.')
res = re.findall(r'London\.\Z', text)
print(r"Result for \Z =", res)
Sample Input and Output
Result for \A = ['Alan']
-------------------------------------------------------------------------------
Result for \b = ['Lon']
-------------------------------------------------------------------------------
Result for \b = ['ring']
-------------------------------------------------------------------------------
Result for \B = ['on', 'on']
-------------------------------------------------------------------------------
Result for \d = ['2', '3', '1', '9', '1', '2']
-------------------------------------------------------------------------------
Result for \D = ['A', 'l', 'a', 'n', ' ', 'T', 'u', 'r', 'i', 'n', 'g', ' ', 'w', 'a', 's', ' ', 'b', 'o', 'r', 'n', ' ', 'o', 'n', ' ', ' ', 'J', 'u', 'n', 'e', ' ', ' ', 'i', 'n', ' ', 'L', 'o', 'n', 'd', 'o', 'n', '.']
-------------------------------------------------------------------------------
Result for \s = [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
-------------------------------------------------------------------------------
Result for \S = ['A', 'l', 'a', 'n', 'T', 'u', 'r', 'i', 'n', 'g', 'w', 'a', 's', 'b', 'o', 'r', 'n', 'o', 'n', '2', '3', 'J', 'u', 'n', 'e', '1', '9', '1', '2', 'i', 'n', 'L', 'o', 'n', 'd', 'o', 'n', '.']
-------------------------------------------------------------------------------
Result for \w = ['A', 'l', 'a', 'n', 'T', 'u', 'r', 'i', 'n', 'g', 'w', 'a', 's', 'b', 'o', 'r', 'n', 'o', 'n', '2', '3', 'J', 'u', 'n', 'e', '1', '9', '1', '2', 'i', 'n', 'L', 'o', 'n', 'd', 'o', 'n']
-------------------------------------------------------------------------------
Result for \W = [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '.']
-------------------------------------------------------------------------------
Result for \Z = ['London.']
Sets
✓ A set is a group of characters inside square brackets which matches any one of those characters. Given below
are some examples of sets:
S No  Set           Description
1     [abcd]        Matches any one of the characters a, b, c or d.
2     [a-z]         Matches any one character from a to z.
3     [A-Z]         Matches any one character from A to Z.
4     [0-9]         Matches any one digit from 0 to 9.
5     [a-zA-Z0-9]   Matches any one character that satisfies any of the three ranges above.
6     [^a-zA-Z]     Matches any one character that is not a letter of the alphabet.
7     [%&$#@*]      Matches any one of these characters. When these characters are in square brackets they are treated as normal characters.
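The sets above can be checked against a small string. This is a sketch under the assumption of a made-up sample value. Each call returns the individual characters that the set matches.

```python
import re

# Hypothetical sample value for this illustration.
text = "Pin: A1b2#C3"

lower = re.findall(r"[a-z]", text)            # lowercase letters
upper = re.findall(r"[A-Z]", text)            # uppercase letters
digits = re.findall(r"[0-9]", text)           # digits
others = re.findall(r"[^a-zA-Z0-9]", text)    # ^ inside the brackets negates the set
symbols = re.findall(r"[%&$#@*]", text)       # symbols are literal inside the brackets

print(lower)    # ['i', 'n', 'b']
print(upper)    # ['P', 'A', 'C']
print(digits)   # ['1', '2', '3']
print(others)   # [':', ' ', '#']
print(symbols)  # ['#']
```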
Match Object
✓ Functions such as re.search() and re.match() look for the pattern in the string.
✓ If a match is found, they return a match object; otherwise they return None.
✓ We will see what the match object looks like and how to access the methods and attributes of that
object. Let’s search for a pattern in a string and print the match object.
import re
text = ("Alan Turing was a pioneer of theoretical computer science and artificial intelligence. "
        "He was born on 23 June 1912 in Maida Vale, London")
# Searches the pattern in the string.
res = re.search('computer', text)
print("Match object = {}".format(res))
#Output:
Match object = <re.Match object; span=(41, 49), match='computer'>
✓ In the example above, a re.Match object is returned because the pattern was found. If
there is no match, None is returned.
✓ Now we will see the methods and attributes of re.Match objects one by one. They are as follows:
S No  Method / Attribute            Description
1     match.group()                 Returns the part of the string that matched.
2     match.start()                 Returns the start position of the matching pattern in the string.
3     match.end()                   Returns the end position of the matching pattern in the string (one past the last matched character).
4     match.span()                  Returns a tuple which has the start and end positions of the matching pattern.
5     match.re                      Returns the pattern object used for matching.
6     match.string                  Returns the string given for matching.
7     Using r prefix before regex   Makes the pattern a raw string, so Python passes every backslash to the re module unchanged instead of treating it as an escape character. Ex: r'\b' is a backslash followed by b (a word boundary), whereas '\b' is a backspace character.
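The effect of the r prefix can be seen by comparing the same pattern with and without it. This is a minimal sketch. The sample sentence and variable names are assumptions made for this example.

```python
import re

text = "Alan Turing was born in London."

# Without r, Python turns "\b" into a single backspace character.
# With r, the pattern keeps both characters: a backslash and the letter b.
plain_length = len("\b")
raw_length = len(r"\b")

without_raw = re.findall("\bLon", text)  # looks for a backspace followed by 'Lon'
with_raw = re.findall(r"\bLon", text)    # looks for 'Lon' at a word boundary

print(plain_length, raw_length)  # 1 2
print(without_raw)               # []
print(with_raw)                  # ['Lon']
```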
Program
import re
text = ("Alan Turing was a pioneer of theoretical computer science and artificial intelligence. "
        "He was born on 23 June 1912 in Maida Vale, London")
# Searches the pattern in the string.
res = re.search('computer',text)
print("Match object = {}".format(res))
print("--"*30)
print("group method output = ",res.group())
print("--"*30)
print("start method output = ",res.start())
print("--"*30)
print("end method output = ",res.end())
print("--"*30)
print("span method output = ",res.span())
print("--"*30)
print("re attribute output = ",res.re)
print("--"*30)
print("string attribute output = ",res.string)
print("--"*30)
# Example of using r as prefix.
# Searching for \\ in the following string
text = r'search \\ in this string'
# searching using r as prefix
res = re.search(r"\\",text)
print("With r as prefix = ",res)
Sample Input and Output
Match object = <re.Match object; span=(41, 49), match='computer'>
------------------------------------------------------------
group method output = computer
------------------------------------------------------------
start method output = 41
------------------------------------------------------------
end method output = 49
------------------------------------------------------------
span method output = (41, 49)
------------------------------------------------------------
re attribute output = re.compile('computer')
------------------------------------------------------------
string attribute output = Alan Turing was a pioneer of theoretical computer science and artificial intelligence. He was born on 23 June 1912 in Maida Vale, London
------------------------------------------------------------
With r as prefix = <re.Match object; span=(7, 8), match='\\'>
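The group(), start() and span() methods also accept a group number when the pattern contains parentheses. The sketch below assumes a shortened sample string and a date pattern chosen for illustration.

```python
import re

text = "He was born on 23 June 1912 in Maida Vale, London"

# Three groups: day, month name and four-digit year.
m = re.search(r"(\d+) (\w+) (\d{4})", text)

whole = m.group()        # the full match (same as m.group(0))
day = m.group(1)         # first group
month = m.group(2)       # second group
parts = m.groups()       # all groups as a tuple
month_start = m.start(2) # start position of the second group
year_span = m.span(3)    # (start, end) of the third group

print(whole)        # 23 June 1912
print(day, month)   # 23 June
print(parts)        # ('23', 'June', '1912')
print(month_start)  # 18
print(year_span)    # (23, 27)
```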
Example #1
Program
# Example 1: Matching a simple pattern
import re
# A whole word that ends in "ing"
pattern = r'\b\w+ing\b'
text = "Walking and talking are important activities."
# search returns only the first matching word
match_result = re.search(pattern, text)
if match_result:
    print("Match found:", match_result.group())
else:
    print("No match found")
Sample Input and Output
Match found: Walking
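Since search stops at the first match, only "Walking" is reported. As a short sketch with the same pattern and text, findall collects every word ending in "ing".

```python
import re

pattern = r'\b\w+ing\b'
text = "Walking and talking are important activities."

# findall returns every non-overlapping match as a list of strings.
all_ing_words = re.findall(pattern, text)

print("All matches:", all_ing_words)  # All matches: ['Walking', 'talking']
```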
Example #2
Program
# Example 2: Extracting email addresses from a text
import re
# Inside [ ] the | is a literal character, so the final set is written as [A-Za-z]
email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
# Placeholder addresses on the reserved example.com domain
text_with_emails = "Contact us at info@example.com or support@example.com"
emails_found = re.findall(email_pattern, text_with_emails)
if emails_found:
    print("Email addresses found:", emails_found)
else:
    print("No email addresses found")
Sample Input and Output
Email addresses found: ['info@example.com', 'support@example.com']