Regular Expression

This document provides an overview of regular expressions (RegEx) in Python. It explains what RegEx is, how to import and use the re module in Python, common RegEx patterns like character sets, quantifiers, boundaries and escapes, and functions in the re module like search(), findall(), split(), and sub(). It also describes metacharacters and special sequences that have special meanings in RegEx like [], ., ^, $, *, +, etc. and provides examples of using each with re functions and matching strings.

Uploaded by

Vinston Raja

Python RegEx / Regular Expression

A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern.

RegEx can be used to check if a string contains the specified search pattern.

RegEx Module
Python has a built-in package called re, which can be used to work with Regular Expressions.
Import the re module:
import re

RegEx in Python
When you have imported the re module, you can start using regular expressions:

Search the string to see if it starts with "The" and ends with "Spain":
import re
txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)
if x:
    print("YES! We have a match!")
Output:
YES! We have a match!

RegEx Functions
The re module offers a set of functions that allow us to search a string for a match:

Function  Description
findall   Returns a list containing all matches
search    Returns a Match object if there is a match anywhere in the string
split     Returns a list where the string has been split at each match
sub       Replaces one or many matches with a string

Metacharacters are characters with a special meaning:

Character  Example        Description
[]         "[a-m]"        A set of characters
\          r"\d"          Signals a special sequence (can also be used to escape special characters)
.          "he..o"        Any character (except newline character)
^          "^hello"       Starts with
$          "planet$"      Ends with
*          "he.*o"        Zero or more occurrences
+          "he.+o"        One or more occurrences
?          "he.?o"        Zero or one occurrence
{}         "he.{2}o"      Exactly the specified number of occurrences
|          "falls|stays"  Either or
()                        Capture and group

Special Sequences
A special sequence is a \ followed by one of the characters in the list below, and has a special
meaning. The examples are written with an "r" prefix, which makes sure that the string is
treated as a "raw string" and the backslash reaches the re module unchanged:

Sequence  Example               Description
\A        r"\AThe"              Returns a match if the specified characters are at the beginning of the string
\b        r"\bain", r"ain\b"    Returns a match where the specified characters are at the beginning or at the end of a word
\B        r"\Bain", r"ain\B"    Returns a match where the specified characters are present, but NOT at the beginning (or at the end) of a word
\d        r"\d"                 Returns a match where the string contains digits (numbers from 0-9)
\D        r"\D"                 Returns a match where the string contains a character that is NOT a digit
\s        r"\s"                 Returns a match where the string contains a white space character
\S        r"\S"                 Returns a match where the string contains a character that is NOT a white space character
\w        r"\w"                 Returns a match where the string contains any word characters (letters from a to z and A to Z, digits from 0-9, and the underscore _ character)
\W        r"\W"                 Returns a match where the string contains a character that is NOT a word character
\Z        r"Spain\Z"            Returns a match if the specified characters are at the end of the string
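
The raw-string prefix is easy to overlook, so here is a minimal sketch of what it changes. The variable names plain and raw are only illustrative; the example assumes nothing beyond the standard re module.

```python
import re

txt = "The rain in Spain"

# Without the r prefix, Python converts "\b" to a backspace character
# before the re module ever sees the pattern.
plain = "\bS"
# With the r prefix, the backslash and the "b" both reach the re module,
# which reads them as a word boundary.
raw = r"\bS"

print(len(plain), len(raw))  # 2 3

plain_result = re.findall(plain, txt)
raw_result = re.findall(raw, txt)
print(plain_result)  # []
print(raw_result)    # ['S']
```

The first pattern looks for a literal backspace followed by "S", which the text does not contain, so only the raw-string version finds the word that starts with "S".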

Sets
A set is a set of characters inside a pair of square brackets [] with a special meaning:

Set         Description
[arn]       Returns a match where one of the specified characters (a, r, or n) is present
[a-n]       Returns a match for any lower case character, alphabetically between a and n
[^arn]      Returns a match for any character EXCEPT a, r, and n
[0123]      Returns a match where any of the specified digits (0, 1, 2, or 3) are present
[0-9]       Returns a match for any digit between 0 and 9
[0-5][0-9]  Returns a match for any two-digit number from 00 to 59
[a-zA-Z]    Returns a match for any character alphabetically between a and z, lower case OR upper case
[+]         In sets, +, *, ., |, (), $, {} have no special meaning, so [+] means: return a match for any + character in the string

The findall() Function

The findall() function returns a list containing all matches.
Example
Print a list of all matches:
import re
txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)
Output:
['ai', 'ai']

The list contains the matches in the order they are found.
If no matches are found, an empty list is returned:
Example
Return an empty list if no match was found:
import re
txt = "The rain in Spain"
x = re.findall("Portugal", txt)
print(x)
Output:
[]

The search() Function

The search() function searches the string for a match, and returns a Match object if there is a
match.
If there is more than one match, only the first occurrence of the match will be returned:
Example
Search for the first white-space character in the string:
import re
txt = "The rain in Spain"
x = re.search(r"\s", txt)
print("The first white-space character is located in position:", x.start())
Output:
The first white-space character is located in position: 3

If no matches are found, the value None is returned:

Example
Make a search that returns no match:
import re
txt = "The rain in Spain"
x = re.search("Portugal", txt)
print(x)
Output:
None
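
Because search() can return None, calling a method such as .start() on the result without checking it first raises an AttributeError. The following is a small sketch of one way to guard against that; first_position is a hypothetical helper written for this illustration, not part of the re module.

```python
import re

txt = "The rain in Spain"

def first_position(pattern, text):
    """Hypothetical helper: start index of the first match, or -1 if none."""
    m = re.search(pattern, text)
    if m is None:
        # No match: search() returned None, so there is no Match object to query.
        return -1
    return m.start()

found = first_position(r"\s", txt)
missing = first_position("Portugal", txt)
print(found)    # 3
print(missing)  # -1
```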

The split() Function

The split() function returns a list where the string has been split at each match:
Example
Split at each white-space character:
import re
txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)
Output:
['The', 'rain', 'in', 'Spain']

You can control the number of splits by specifying the maxsplit parameter:
Example
Split the string only at the first occurrence:
import re
txt = "The rain in Spain"
x = re.split(r"\s", txt, maxsplit=1)
print(x)
Output:
['The', 'rain in Spain']
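
Splitting at each single white-space character produces empty strings when the text contains several spaces in a row. As a rough illustration (the sample text with repeated spaces is made up for this sketch), adding the + quantifier treats each run of white space as one separator:

```python
import re

txt = "The  rain   in Spain"  # note the repeated spaces

# One split per white-space character: empty strings appear between neighbours.
single = re.split(r"\s", txt)
# One split per run of white-space characters.
runs = re.split(r"\s+", txt)

print(single)  # ['The', '', 'rain', '', '', 'in', 'Spain']
print(runs)    # ['The', 'rain', 'in', 'Spain']
```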

The sub() Function

The sub() function replaces the matches with the text of your choice:
Example
Replace every white-space character with the number 9:
import re
txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)
print(x)
Output:
The9rain9in9Spain

You can control the number of replacements by specifying the count parameter:
Example
Replace the first 2 occurrences:
import re
txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt, count=2)
print(x)
Output:
The9rain9in Spain
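
The replacement does not have to be a fixed string. As a sketch, sub() also accepts a function that receives each Match object and returns the replacement text, and the related subn() function reports how many replacements were made. The helper name shout is hypothetical and only used for this example.

```python
import re

txt = "The rain in Spain"

def shout(match):
    # Hypothetical helper: return the matched text in upper case.
    return match.group().upper()

replaced = re.sub("ai", shout, txt)
print(replaced)  # The rAIn in SpAIn

# subn() returns a tuple: (new string, number of replacements made).
new_text, n = re.subn(r"\s", "9", txt)
print(new_text, n)  # The9rain9in9Spain 3
```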

Match Object
A Match Object is an object containing information about the search and the result.
Note: If there is no match, the value None will be returned, instead of the Match Object.
Example
Do a search that will return a Match Object:
import re
txt = "The rain in Spain"
x = re.search("ai", txt)
print(x)  #this will print a Match object
Output:
<re.Match object; span=(5, 7), match='ai'>

The Match object has properties and methods used to retrieve information about the search
and the result:
.span() returns a tuple containing the start and end positions of the match
.string returns the string passed into the function
.group() returns the part of the string where there was a match

Example
Print the position (start and end position) of the first match.
The regular expression looks for the first word that starts with an upper case "S":
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())
Output:
(12, 17)

Example
Print the string passed into the function:
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)
Output:
The rain in Spain

Example
Print the part of the string where there was a match.
The regular expression looks for the first word that starts with an upper case "S":
import re

txt = "The rain in Spain"

x = re.search(r"\bS\w+", txt)
print(x.group())
Output:
Spain
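
The three members above are related: slicing the original string with the positions from .span() gives the same text as .group(). A short sketch that checks this, also using the .start() and .end() methods of the Match object:

```python
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)

# span() is the pair (start, end); end is one past the last matched character.
start, end = x.span()
print(x.start(), x.end())  # 12 17

sliced = x.string[start:end]
print(sliced)  # Spain
same = sliced == x.group()
print(same)    # True
```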

Metacharacters
Metacharacters are characters with a special meaning:
[] A set of characters "[a-m]"

import re
txt = "The rain in Spain"
#Find all lower case characters alphabetically between "a" and "m":
x = re.findall("[a-m]", txt)
print(x)
Output:
['h', 'e', 'a', 'i', 'i', 'a', 'i']

\ Signals a special sequence (can also be used to escape special characters) r"\d"

import re
txt = "That will be 59 dollars"
#Find all digit characters:
x = re.findall(r"\d", txt)
print(x)
Output:
['5', '9']

. Any character (except newline character) "he..o"

import re
txt = "hello planet"
#Search for a sequence that starts with "he", followed by two (any) characters, and an "o":
x = re.findall("he..o", txt)
print(x)
Output:
['hello']

^ Starts with "^hello"

import re
txt = "hello planet"
#Check if the string starts with 'hello':
x = re.findall("^hello", txt)
if x:
    print("Yes, the string starts with 'hello'")
else:
    print("No match")
Output:
Yes, the string starts with 'hello'

$ Ends with "planet$"

import re
txt = "hello planet"
#Check if the string ends with 'planet':
x = re.findall("planet$", txt)
if x:
    print("Yes, the string ends with 'planet'")
else:
    print("No match")
Output:
Yes, the string ends with 'planet'

* Zero or more occurrences "he.*o"

import re
txt = "hello planet"
#Search for a sequence that starts with "he", followed by 0 or more (any) characters, and an "o":
x = re.findall("he.*o", txt)
print(x)
Output:
['hello']

+ One or more occurrences "he.+o"

import re
txt = "hello planet"
#Search for a sequence that starts with "he", followed by 1 or more (any) characters, and an "o":
x = re.findall("he.+o", txt)
print(x)
Output:
['hello']

? Zero or one occurrences "he.?o"

import re
txt = "hello planet"
#Search for a sequence that starts with "he", followed by 0 or 1 (any) character, and an "o":
x = re.findall("he.?o", txt)
print(x)
#This time we got no match, because there were not zero, not one, but two characters between "he" and the "o"
Output:
[]

{} Exactly the specified number of occurrences "he.{2}o"

import re
txt = "hello planet"
#Search for a sequence that starts with "he", followed by exactly 2 (any) characters, and an "o":
x = re.findall("he.{2}o", txt)
print(x)
Output:
['hello']
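
Besides an exact count, curly braces also accept a range: {m,n} means between m and n occurrences, and {m,} means m or more. A brief sketch using the \d special sequence on a sample string (the variable names are only illustrative):

```python
import re

txt = "8 times before 11:45 AM"

exactly_two = re.findall(r"\d{2}", txt)    # exactly two digits
one_or_two = re.findall(r"\d{1,2}", txt)   # one or two digits
two_or_more = re.findall(r"\d{2,}", txt)   # at least two digits

print(exactly_two)  # ['11', '45']
print(one_or_two)   # ['8', '11', '45']
print(two_or_more)  # ['11', '45']
```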

| Either or "falls|stays"

import re
txt = "The rain in Spain falls mainly in the plain!"
#Check if the string contains either "falls" or "stays":
x = re.findall("falls|stays", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['falls']
Yes, there is at least one match!
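
() Capture and group

The metacharacter list also names (), which has no example of its own here. The following is a minimal sketch of how capturing groups behave, assuming a made-up pattern for an hours:minutes time; it is an illustration rather than part of the original examples.

```python
import re

txt = "8 times before 11:45 AM"

# Each pair of parentheses captures the part of the match it encloses.
m = re.search(r"(\d+):(\d+)", txt)
whole = m.group()      # the full match
hours = m.group(1)     # first group
minutes = m.group(2)   # second group
print(whole, hours, minutes)  # 11:45 11 45
print(m.groups())             # ('11', '45')

# With groups in the pattern, findall() returns the groups, not the full match.
pairs = re.findall(r"(\d+):(\d+)", txt)
print(pairs)  # [('11', '45')]
```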

Special Sequences
A special sequence is a \ followed by one of the characters in the list below, and has a special
meaning:
\A Returns a match if the specified characters are at the beginning of the string r"\AThe"

import re
txt = "The rain in Spain"
#Check if the string starts with "The":
x = re.findall(r"\AThe", txt)
print(x)
if x:
    print("Yes, there is a match!")
else:
    print("No match")
Output:
['The']
Yes, there is a match!

\b Returns a match where the specified characters are at the beginning or at the end of a word r"\bain" r"ain\b"
(the "r" in the beginning is making sure that the string is being treated as a "raw string")

import re
txt = "The rain in Spain"
#Check if "ain" is present at the beginning of a WORD:
x = re.findall(r"\bain", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
[]
No match

import re
txt = "The rain in Spain"
#Check if "ain" is present at the end of a WORD:
x = re.findall(r"ain\b", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['ain', 'ain']
Yes, there is at least one match!

\B Returns a match where the specified characters are present, but NOT at the beginning (or at the end) of a word r"\Bain" r"ain\B"
(the "r" in the beginning is making sure that the string is being treated as a "raw string")

import re
txt = "The rain in Spain"
#Check if "ain" is present, but NOT at the beginning of a word:
x = re.findall(r"\Bain", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")
Output:
['ain', 'ain']
Yes, there is at least one match!
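
The second form, r"ain\B", checks the other side: "ain" must be present but NOT at the end of a word. A short sketch; the second sample sentence is made up so that there is something to find.

```python
import re

# Both occurrences of "ain" sit at the end of a word, so nothing matches.
at_end = re.findall(r"ain\B", "The rain in Spain")
print(at_end)  # []

# In "mainly", "ain" is followed by another word character, so it matches.
inside = re.findall(r"ain\B", "The rain falls mainly")
print(inside)  # ['ain']
```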

\d Returns a match where the string contains digits (numbers from 0-9) r"\d"

import re
txt = "The rain in Spain"
#Check if the string contains any digits (numbers from 0-9):
x = re.findall(r"\d", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
[]
No match

\D Returns a match where the string contains a character that is NOT a digit r"\D"

import re
txt = "The rain in Spain"
#Return a match at every non-digit character:
x = re.findall(r"\D", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['T', 'h', 'e', ' ', 'r', 'a', 'i', 'n', ' ', 'i', 'n', ' ', 'S', 'p', 'a', 'i', 'n']
Yes, there is at least one match!

\s Returns a match where the string contains a white space character r"\s"

import re
txt = "The rain in Spain"
#Return a match at every white-space character:
x = re.findall(r"\s", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
[' ', ' ', ' ']
Yes, there is at least one match!

\S Returns a match where the string contains a character that is NOT a white space character r"\S"

import re
txt = "The rain in Spain"
#Return a match at every NON white-space character:
x = re.findall(r"\S", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['T', 'h', 'e', 'r', 'a', 'i', 'n', 'i', 'n', 'S', 'p', 'a', 'i', 'n']
Yes, there is at least one match!

\w Returns a match where the string contains any word characters (letters from a to z and A to Z, digits from 0-9, and the underscore _ character) r"\w"

import re
txt = "The rain in Spain"
#Return a match at every word character (letters from a to z and A to Z, digits from 0-9, and the underscore _ character):
x = re.findall(r"\w", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['T', 'h', 'e', 'r', 'a', 'i', 'n', 'i', 'n', 'S', 'p', 'a', 'i', 'n']
Yes, there is at least one match!

\W Returns a match where the string contains a character that is NOT a word character r"\W"

import re
txt = "The rain in Spain"
#Return a match at every NON word character (characters that are not letters, digits, or the underscore, such as "!", "?", white-space etc.):
x = re.findall(r"\W", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
[' ', ' ', ' ']
Yes, there is at least one match!

\Z Returns a match if the specified characters are at the end of the string r"Spain\Z"

import re
txt = "The rain in Spain"
#Check if the string ends with "Spain":
x = re.findall(r"Spain\Z", txt)
print(x)
if x:
    print("Yes, there is a match!")
else:
    print("No match")

Output:
['Spain']
Yes, there is a match!
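
\Z and $ look interchangeable in the examples so far, but they differ when the string ends with a newline: $ also matches just before a final newline, while \Z matches only at the very end. A small sketch with a made-up string that ends in "\n":

```python
import re

txt = "The rain in Spain\n"  # note the trailing newline

# $ matches at the end of the string or just before a final newline.
with_dollar = re.findall(r"Spain$", txt)
# \Z matches only at the very end of the string.
with_z = re.findall(r"Spain\Z", txt)

print(with_dollar)  # ['Spain']
print(with_z)       # []
```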

Sets
A set is a set of characters inside a pair of square brackets [] with a special meaning:
[arn] Returns a match where one of the specified characters (a, r, or n) is present
import re
txt = "The rain in Spain"
#Check if the string has any a, r, or n characters:
x = re.findall("[arn]", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['r', 'a', 'n', 'n', 'a', 'n']
Yes, there is at least one match!

[a-n] Returns a match for any lower case character, alphabetically between a and n
import re
txt = "The rain in Spain"
#Check if the string has any characters between a and n:
x = re.findall("[a-n]", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['h', 'e', 'a', 'i', 'n', 'i', 'n', 'a', 'i', 'n']
Yes, there is at least one match!

[^arn] Returns a match for any character EXCEPT a, r, and n

import re
txt = "The rain in Spain"
#Check if the string has other characters than a, r, or n:
x = re.findall("[^arn]", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['T', 'h', 'e', ' ', 'i', ' ', 'i', ' ', 'S', 'p', 'i']
Yes, there is at least one match!

[0123] Returns a match where any of the specified digits (0, 1, 2, or 3) are present
import re
txt = "The rain in Spain"
#Check if the string has any 0, 1, 2, or 3 digits:
x = re.findall("[0123]", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
[]
No match

[0-9] Returns a match for any digit between 0 and 9

import re
txt = "8 times before 11:45 AM"
#Check if the string has any digits:
x = re.findall("[0-9]", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['8', '1', '1', '4', '5']
Yes, there is at least one match!

[0-5][0-9] Returns a match for any two-digit number from 00 to 59

import re
txt = "8 times before 11:45 AM"
#Check if the string has any two-digit numbers, from 00 to 59:
x = re.findall("[0-5][0-9]", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['11', '45']
Yes, there is at least one match!

[a-zA-Z] Returns a match for any character alphabetically between a and z, lower case OR upper case

import re
txt = "8 times before 11:45 AM"
#Check if the string has any characters from a to z lower case, and A to Z upper case:
x = re.findall("[a-zA-Z]", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
['t', 'i', 'm', 'e', 's', 'b', 'e', 'f', 'o', 'r', 'e', 'A', 'M']
Yes, there is at least one match!

[+] In sets, +, *, ., |, (), $, {} have no special meaning, so [+] means: return a match for any + character in the string

import re
txt = "8 times before 11:45 AM"
#Check if the string has any + characters:
x = re.findall("[+]", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")

Output:
[]
No match
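
The sample text above contains no + character, so the result is empty. As a rough sketch with a made-up string that does contain operators, the same characters match literally inside a set, whereas outside a set they must be escaped with a backslash:

```python
import re

txt = "2 + 2 * 3 = 8"

# Inside a set, + and * are ordinary characters.
operators = re.findall("[+*]", txt)
print(operators)  # ['+', '*']

# Outside a set, + is a quantifier, so a literal plus sign needs a backslash.
plus_only = re.findall(r"\+", txt)
print(plus_only)  # ['+']
```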

Pattern matching in Python with Regex

What is a Regular Expression?
In practice, string parsing in many programming languages is often handled with regular
expressions. In Python, a regular expression is a way of describing a text pattern to
match.

The "re" module, which comes with every Python installation, provides regular expression
support.

In Python, a regular expression search is typically written as:

match = re.search(pattern, string)

The re.search() method takes two arguments, a regular expression pattern and a string, and
searches for that pattern within the string. If the pattern is found within the string, search()
returns a Match object; otherwise it returns None. In other words, given a string, a regular
expression lets you determine whether that string matches a given pattern and, optionally,
collect the substrings that contain relevant information. A regular expression can be used to
answer questions like:

• Is this string a valid URL?
• Which users in /etc/passwd are in a given group?
• What is the date and time of all warning messages in a log file?
• What username and document were requested by the URL a visitor typed?

Matching patterns
Regular expressions are a complicated mini-language. They rely on special characters to match
unknown strings, but let's start with literal characters, such as letters, numbers, and the space
character, which always match themselves. Let's see a basic example:

#Need module 're' for regular expression
import re

search_string = "King of Kings"
pattern = "King"
#re.match() only looks for the pattern at the beginning of the string
match = re.match(pattern, search_string)
#If-statement after match() tests if it succeeded
if match:
    print("regex matches:", match.group())
else:
    print('pattern not found')

Output:
regex matches: King
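
This example calls re.match() while the surrounding text describes re.search(). The two differ in where they look: match() only tries the pattern at the beginning of the string, whereas search() scans the whole string. A minimal sketch of the difference, reusing the same search string with a different pattern chosen for illustration:

```python
import re

search_string = "King of Kings"

# "Kings" does not occur at the start of the string, so match() finds nothing.
from_start = re.match("Kings", search_string)
# search() scans forward and finds it later in the string.
anywhere = re.search("Kings", search_string)

print(from_start)       # None
print(anywhere.span())  # (8, 13)
```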
