Python
File Handling:
Syntax:
f = open(filename, mode)
Modes of Operation:
1. r: open an existing file for reading.
2. w: open a file for writing. If the file already contains data, it is overwritten;
if the file does not exist, it is created.
3. a: open a file for appending. Existing data is preserved, and the file is created if it does not exist.
4. r+: open an existing file for reading and writing. The file is not truncated, but writes start
at the beginning and overwrite the data already there.
5. w+: open a file for writing and reading. Existing data is overwritten.
6. a+: open a file for appending and reading. Existing data is preserved.
file.read().
# Python code to illustrate read() mode
file = open("file.txt", "r")
print(file.read(5))  # the first 5 characters
print(file.read())   # everything that remains after them
file.close()
read() : Returns the data read as a string. Reads n characters (n bytes in binary mode);
if n is not specified, reads the entire file.
File_object.read([n])
readline() : Reads one line of the file and returns it as a string. If n is specified, reads at most
n characters. However, it never reads more than one line, even if n exceeds the length of the line.
File_object.readline([n])
readlines() : Reads all the lines and returns them as a list in which each line is a string element.
File_object.readlines()
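The three reading methods can be compared side by side. The following is a minimal sketch; the temporary file and the names sample_path, first_five, rest_of_line and remaining are assumptions made for this illustration only.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\nthird line\n")
    sample_path = tmp.name

with open(sample_path, "r") as f:
    first_five = f.read(5)       # reads exactly 5 characters
    rest_of_line = f.readline()  # reads up to and including the newline
    remaining = f.readlines()    # list of all lines that are left

print(repr(first_five))    # 'first'
print(repr(rest_of_line))  # ' line\n'
print(remaining)           # ['second line\n', 'third line\n']

os.remove(sample_path)     # clean up the throwaway file
```

Each call continues from where the previous one stopped, which is why readline() returns only the rest of the first line.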
word = line.split()
print (word)
File Handling:
Python provides inbuilt functions for creating, writing, and reading files. There are two types
of files that can be handled in Python, normal text files and binary files (written in binary
language, 0s, and 1s).
Text files: In this type of file, each line of text is terminated with a special character called EOL
(End of Line), which is the new line character (‘\n’) in Python by default. In the case of
CSV (Comma Separated Values) files, the fields within each line are separated by a comma by default.
Binary files: In this type of file, there is no terminator for a line, and the data is stored after
converting it into machine-understandable binary language, i.e., 0 and 1 format.
1. Read Only (‘r’): Open text file for reading. The handle is positioned at the beginning of
the file. If the file does not exist, raises an I/O error. This is also the default mode in
which the file is opened.
2. Read and Write (‘r+’): Open the file for reading and writing. The handle is positioned
at the beginning of the file. Raises I/O error if the file does not exist.
3. Write Only (‘w’): Open the file for writing. For the existing files, the data is truncated
and over-written. The handle is positioned at the beginning of the file. Creates the file
if the file does not exist.
4. Write and Read (‘w+’): Open the file for reading and writing. For existing files, data is
truncated and over-written. The handle is positioned at the beginning of the file.
5. Append Only (‘a’): Open the file for writing. The file is created if it does not exist. The
handle is positioned at the end of the file. The data being written will be inserted at the
end, after the existing data.
6. Append and Read (‘a+’): Open the file for reading and writing. The file is created if it
does not exist. The handle is positioned at the end of the file. The data being written
will be inserted at the end, after the existing data.
7. Read Only in Binary format(‘rb’): It lets the user open the file for reading in binary
format.
8. Read and Write in Binary Format(‘rb+’): It lets the user open the file for reading and
writing in binary format.
9. Write Only in Binary Format(‘wb’): It lets the user open the file for writing in binary
format. A new file is created if the file does not exist. If the file exists and contains
data, its content is overwritten.
10. Write and Read in Binary Format(‘wb+’): It lets the user open the file for reading as
well as writing in binary format. A new file is created for writing and reading if the file
does not exist. If the file exists and contains data, its content is overwritten.
11. Append only in Binary Format(‘ab’): It lets the user open the file for appending in
binary format. A new file gets created if there is no file. The data will be inserted at the
end if the file exists and has some data stored in it.
12. Append and Read in Binary Format(‘ab+’): It lets the user open the file for appending
and reading in binary format. A new file will be created for reading and appending if
the file does not exist. We can read and append if the file exists and has some data stored
in it.
By default, the open() function opens the file in read mode if no mode parameter is provided.
file1 = open("myfile.txt")
# Reading from file
print(file1.read())
file1.close()
file1 = open("test.txt", "r")
try:
    read_content = file1.read()
    print(read_content)
finally:
    # close the file
    file1.close()
Here, we have closed the file in the finally block as finally always executes, and the file
will be closed even if an exception occurs.
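The same guarantee can be obtained more compactly with the with statement, which closes the file automatically when the block ends. The sketch below assumes a throwaway file; demo_path is a hypothetical name used only here.

```python
import os
import tempfile

# Hypothetical file created only for this demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(demo_path, "w") as f:
    f.write("hello\n")

with open(demo_path, "r") as f:
    read_content = f.read()

print(read_content, end="")  # hello
print(f.closed)              # True: the file was closed when the block ended
```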
Method                          Description
close()                         Closes an opened file. It has no effect if the file is already closed.
detach()                        Separates the underlying binary buffer from the TextIOBase and returns it.
fileno()                        Returns an integer number (file descriptor) of the file.
flush()                         Flushes the write buffer of the file stream.
isatty()                        Returns True if the file stream is interactive.
read(n)                         Reads at most n characters from the file. Reads till end of file if n is negative or None.
readable()                      Returns True if the file stream can be read from.
readline(n=-1)                  Reads and returns one line from the file. Reads in at most n characters (bytes in binary mode) if specified.
readlines(n=-1)                 Reads and returns a list of lines from the file. Reads in at most n bytes/characters if specified.
seek(offset, whence=SEEK_SET)   Changes the file position to offset bytes, in reference to whence (start, current, end).
seekable()                      Returns True if the file stream supports random access.
tell()                          Returns an integer that represents the current position in the file.
truncate(size=None)             Resizes the file stream to size bytes. If size is not specified, resizes to current location.
writable()                      Returns True if the file stream can be written to.
write(s)                        Writes the string s to the file and returns the number of characters written.
writelines(lines)               Writes a list of lines to the file.
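The positioning methods in the table are easiest to follow on a small example. This is a sketch that uses an in-memory io.BytesIO stream as a stand-in for a file opened in a binary mode; the variable names are assumptions for illustration.

```python
import io

# An in-memory binary stream stands in for a file opened in "rb+" mode.
stream = io.BytesIO(b"0123456789")

stream.seek(4)                 # move to byte offset 4, counted from the start
middle = stream.read(3)        # reads the bytes at offsets 4, 5 and 6
position = stream.tell()       # current position after the read
stream.seek(-2, io.SEEK_END)   # two bytes before the end of the stream
tail = stream.read()           # reads the last two bytes
stream.truncate(5)             # keep only the first five bytes
remaining = stream.getvalue()

print(middle, position, tail, remaining)
```

Binary mode is used here because in text mode tell() returns an opaque number that is only meaningful when passed back to seek().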
Binary File
# Writing binary data to a file
with open("binary_data.bin", "wb") as file:
    data = b'\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64'  # binary data to write
    file.write(data)
# Reading binary data from a file
with open("binary_data.bin", "rb") as file:
    data = file.read()
    print(data)  # prints b'Hello World' (binary data)
Directory
A directory is a collection of files and subdirectories. A directory inside a directory is known
as a subdirectory.
Python has the os module that provides us with many useful methods to work with
directories (and files as well).
The getcwd() method of the os module returns the current working directory in the form of a string. For example,
import os
print(os.getcwd())
# Output: C:\Program Files\PyScripter
Here, getcwd() returns the current directory in the form of a string.
Changing Directory in Python
import os
# change directory
os.chdir('C:\\Python33')
print(os.getcwd())
Output: C:\Python33
List Directories and Files in Python
import os
print(os.getcwd())
C:\Python33
# list all files and sub-directories
os.listdir()
['DLLs',
'Doc',
'include',
'Lib',
'libs',
'LICENSE.txt',
'NEWS.txt',
'python.exe',
'pythonw.exe',
'README.txt',
'Scripts',
'tcl',
'Tools']
os.listdir('G:\\')
['$RECYCLE.BIN',
'Movies',
'Music',
'Photos',
'Series',
'System Volume Information']
Making a New Directory in Python
In Python, we can make a new directory using the mkdir() method.
This method takes in the path of the new directory. If the full path is not specified, the new
directory is created in the current working directory.
os.mkdir('test')
os.listdir()
['test']
Renaming a Directory or a File
import os
os.listdir()
['test']
# rename a directory
os.rename('test','new_one')
os.listdir()
['new_one']
The os module (and sys, and path):
The os and sys modules provide numerous tools to deal with filenames, paths, and directories.
The os module contains two sub-modules, os.sys (same as sys) and os.path, that are
dedicated to the system and to directories, respectively.
Whenever possible, you should use the functions provided by these modules for file,
directory, and path manipulations. These modules are wrappers for platform-specific
modules, so functions like os.path.split work on UNIX, Windows, Mac OS, and any other
platform supported by Python.
See also the shutil, tempfile, and glob modules in the Python documentation.
Quick start
You can build a multi-platform path using the proper separator symbol:
>>> import os
>>> import os.path
>>> os.path.join(os.sep, 'home', 'user', 'work')
'/home/user/work'
>>> os.path.split('/usr/bin/python')
('/usr/bin', 'python')
7.2. Functions
The os module has lots of functions. We will not cover all of them thoroughly but this could
be a good start to use the module.
Manipulating Directories
The getcwd() function returns the current directory as a string (getcwdb() returns it as bytes; the Python 2 function getcwdu() no longer exists in Python 3).
The current directory can be changed using chdir():
os.chdir(path)
The listdir() function returns the content of a directory. Note, however, that it mixes
directories and files.
The mkdir() function creates a directory. It raises an error if the parent directory does not
exist. If you want to create the parent directory as well, you should rather use makedirs():
>>> os.mkdir('temp') # creates temp directory inside the current directory
>>> os.makedirs("/tmp/temp/temp")
Once created, you can delete an empty directory with rmdir():
>>> import os
>>> os.mkdir('/tmp/temp')
>>> os.rmdir('/tmp/temp')
os.removedirs() removes a leaf directory and then each parent directory named in the path,
for as long as those directories are empty.
If you want to delete a non-empty directory, use shutil.rmtree() (with caution).
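The difference between rmdir() and shutil.rmtree() can be tried safely inside a scratch directory. This is a minimal sketch; the directory names a, b, c and the file note.txt are hypothetical.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                    # scratch directory for the demo
nested = os.path.join(base, "a", "b", "c")

os.makedirs(nested)                          # creates a, a/b and a/b/c in one call
with open(os.path.join(nested, "note.txt"), "w") as f:
    f.write("data")

try:
    os.rmdir(nested)                         # fails: the directory is not empty
    rmdir_failed = False
except OSError:
    rmdir_failed = True

shutil.rmtree(base)                          # removes the whole tree, files included
tree_gone = not os.path.exists(base)

print(rmdir_failed, tree_gone)               # True True
```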
Removing a file
To remove a file, use os.remove(). It raises the OSError exception if the file cannot be
removed. os.unlink() is an equivalent function under its traditional Unix name.
Renaming files or directories
You can rename a file from an old name to a new one by using os.rename(). See also
os.renames().
Permission
You can change the mode of a file using chmod(). See also chown, chroot, fchmod, fchown.
The os.access() function verifies the access permission specified in the mode argument. It returns True if
the access is granted, False otherwise. The mode can be:
os.F_OK: Value to pass as the mode parameter of access() to test the existence of path.
os.R_OK: Value to include in the mode parameter of access() to test the readability of path.
os.W_OK: Value to include in the mode parameter of access() to test the writability of path.
os.X_OK: Value to include in the mode parameter of access() to determine if path can be executed.
>>> os.access("validFile", os.F_OK)
True
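The os.umask() function sets the file mode creation mask of the current process and returns
the previous mask. The mask is a number, normally written in octal, whose bits are removed from the
permissions of newly created files (with a mask of 022, a file requested with mode 666 is created with mode 644):
os.umask(0o022)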
Using more than one process
On Unix systems, os.fork() tells the computer to copy everything about the currently
running program into a newly created program that is separated, but almost entirely
identical. The newly created process is the child process and gets the data and code of the
parent process. The child process gets a process number known as pid. The parent and child
processes are independent.
The following code works on Unix and Unix-like systems only:
import os
pid = os.fork()
if pid == 0:  # the child
    print("this is the child")
elif pid > 0:  # the parent
    print("the child is pid %d" % pid)
else:
    print("An error occurred")
Here, the fork is within the executed script, but most of the time you would require the
One of the most common things to do after an os.fork call is to call os.execl immediately
afterward to run another program. os.execl is an instruction to replace the running program
with a new program, so the calling program goes away, and a new program appears in its
place:
import os
pid = os.fork()
# fork and exec together
print("second test")
if pid == 0:  # This is the child
    print("this is the child")
    print("I'm going to exec another program now")
    os.execl('/bin/cat', 'cat', '/etc/motd')
else:
    print("the child is pid %d" % pid)
    os.wait()
The os.wait function tells Python that the parent should not do anything until the
child process returns. It is useful to know how this works, but keep in mind that it works only
under Unix and Unix-like platforms such as Linux. Windows also has a mechanism for
starting up new processes. To make the common task of starting a new program easier,
Python offers a single family of functions that combines os.fork and os.exec on Unix-like
systems, and enables you to do something similar on Windows platforms. When you want
to just start up a new program, you can use the os.spawn family of functions.
The differences between the spawn variants:
v requires a list/vector of parameters. This allows a command to be run with very different
parameters from one instance to the next without needing to alter the program at all.
l requires the parameters to be listed as separate arguments of the call.
e requires a dictionary containing names and values to replace the current environment.
p uses the value of the PATH key in the environment dictionary to find the program.
The p variants are available only on Unix-like platforms. The least of what this means is that on
Windows your programs must have a completely qualified path to be usable by the
os.spawn calls, or you have to search the path yourself:
import os, sys
if sys.platform == 'win32':
    print("Running on a windows platform")
    command = "C:\\winnt\\system32\\cmd.exe"
    params = ["cmd.exe", "/c", "ver"]  # spawnv() rejects an empty argument list
elif sys.platform.startswith('linux'):
    print("Running on a Linux system, identified by %s" % sys.platform)
    command = '/bin/uname'
    params = ['uname', '-a']
else:
    sys.exit("Unsupported platform: %s" % sys.platform)
print("Running %s" % command)
os.spawnv(os.P_WAIT, command, params)
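As a portable alternative to the os.spawn family, the standard library's subprocess module can start a program on both Unix-like systems and Windows. This is a different technique from the one described above, shown here only as a hedged sketch; the child command (the current interpreter printing one line) is an assumption chosen so the example runs anywhere.

```python
import subprocess
import sys

# Run the current Python interpreter as the child program so that the
# example does not depend on any platform-specific executable.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from the child')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the output to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
child_output = result.stdout.strip()

print(child_output)        # hello from the child
print(result.returncode)   # 0
```

subprocess.run() waits for the child to finish, which corresponds to the os.P_WAIT mode used above.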
The exec function comes in different flavours:
sys module
When starting a Python shell, Python provides 3 file objects called standard input, standard
output and standard error. They are accessible via the sys module:
sys.stderr
sys.stdin
sys.stdout
sys.argv is used to retrieve the user's command-line arguments when your module is executed as a script.
Another useful attribute is sys.path, which tells you where Python searches for modules
on your system. See Module for more details.
Information
sys.platform returns the platform identifier (e.g., linux; linux2 on Python 2)
sys.version returns the python version
sys.version_info returns a named tuple
sys.builtin_module_names   sys.byteorder               sys.call_tracing
sys.callstats              sys.copyright               sys.displayhook
sys.dont_write_bytecode    sys.exc_clear               sys.exc_info
sys.exc_type               sys.excepthook              sys.exec_prefix
sys.executable             sys.exitfunc                sys.flags
sys.float_info             sys.float_repr_style        sys.getcheckinterval
sys.getdlopenflags         sys.getfilesystemencoding   sys.getprofile
sys.getrefcount            sys.getsizeof               sys.gettrace
sys.last_traceback         sys.last_type               sys.last_value
sys.long_info              sys.maxsize                 sys.maxunicode
sys.meta_path              sys.modules                 sys.path
sys.path_hooks             sys.path_importer_cache     sys.prefix
sys.ps1                    sys.ps2                     sys.ps3
sys.py3kwarning            sys.pydebug                 sys.real_prefix
sys.setcheckinterval       sys.setdlopenflags          sys.setprofile
sys.settrace               sys.warnoptions
The sys.modules attribute is a dictionary that maps the names of all the modules imported so far in
your environment to the corresponding module objects.
recursion
See the Functions section to know more about recursions. You can limit the number of
recursions and know about the number itself using the sys.getrecursionlimit() and
sys.setrecursionlimit() functions.
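A short sketch of both functions follows. The helper depth() is hypothetical and exists only to recurse a chosen number of levels; the example assumes the limit in effect is an ordinary one (the default is commonly 1000).

```python
import sys

def depth(n):
    """Recurse n levels deep and return n."""
    return 0 if n == 0 else 1 + depth(n - 1)

limit = sys.getrecursionlimit()        # current limit

try:
    depth(limit + 100)                 # deeper than the limit allows
    hit_limit = False
except RecursionError:
    hit_limit = True

sys.setrecursionlimit(limit + 500)     # raise the limit temporarily
deeper_ok = depth(limit + 100) == limit + 100
sys.setrecursionlimit(limit)           # restore the original value

print(hit_limit, deeper_ok)            # True True
```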
A CSV (Comma Separated Values) format is one of the simplest and most common ways
to store tabular data. By convention, a CSV file is saved with the .csv file extension.
Let's take an example:
import csv
# Writing data to a CSV file
with open("data.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Name", "Age", "Country"])
    writer.writerow(["John", 28, "USA"])
    writer.writerow(["Jane", 32, "Canada"])
    writer.writerow(["Bob", 45, "Australia"])
# Reading data from a CSV file
with open("data.csv", "r") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
Several modules, from both the standard library and third-party packages, can be used to
handle CSV (Comma-Separated Values) files, including:
csv module: This is a standard library module that provides functionality to read
from and write to CSV files. It supports various dialects and options to customize
the format of the CSV file. You can use this module to read and write CSV files
row by row, with each row as a list or, with DictReader and DictWriter, as a dictionary keyed by column name.
pandas module: This is a third-party module that provides high-performance data
analysis and manipulation tools. It includes a read_csv() function that can read a
CSV file into a DataFrame object, which is a two-dimensional table-like data
structure with labeled axes (rows and columns). It also provides several methods to
manipulate and analyze the data in the DataFrame.
numpy module: This is another third-party module that provides a powerful array
processing and mathematical operations. It includes a genfromtxt() function that
can read a CSV file into a numpy.ndarray object, which is a multidimensional array-
like data structure that can perform mathematical operations efficiently.
sqlite3 module: This is a standard library module that provides a lightweight and
self-contained relational database engine. You can use this module to import CSV
data into an SQLite database or export data from an SQLite database to a CSV file.
Tab Separated File Example:
import csv
# Writing data to a tab-separated file
with open("data.tsv", "w", newline="") as file:
    writer = csv.writer(file, delimiter="\t")
    writer.writerow(["Name", "Age", "Country"])
    writer.writerow(["John", 28, "USA"])
    writer.writerow(["Jane", 32, "Canada"])
    writer.writerow(["Bob", 45, "Australia"])
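Reading tab-separated data back works the same way, with the same delimiter argument. The sketch below uses csv.DictReader so that each row is keyed by the header names; an in-memory io.StringIO buffer stands in for the file, which is an assumption made to keep the example self-contained.

```python
import csv
import io

# An in-memory text buffer stands in for a file opened with newline="".
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, delimiter="\t")
writer.writerow(["Name", "Age", "Country"])
writer.writerow(["John", 28, "USA"])

buffer.seek(0)  # rewind before reading
rows = list(csv.DictReader(buffer, delimiter="\t"))

print(rows[0]["Name"], rows[0]["Age"], rows[0]["Country"])  # John 28 USA
```

Note that the csv module returns every field as a string, so the age comes back as "28" rather than the integer 28.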
Decimal Number System (base or radix = 10)
Hexadecimal Number System (base or radix = 16)
Binary Number System
A number system with base or radix 2 is known as a binary number system. Only 0 and 1
are used to represent numbers in this system.
1) Binary to Decimal
For Binary to Decimal conversion, each bit of the binary number is multiplied by the weight
assigned to its position. For example,
a = (1001)_2
a = 1*2^3 + 0*2^2 + 0*2^1 + 1*2^0
a = (8+0+0+1) = 9
b = "1001"
print("Binary to Decimal", b, ":", int(b, 2))
Output:
Binary to Decimal 1001 : 9
2) Binary to Octal
First convert the binary number to a decimal number by assigning a weight to each binary bit.
a = (1001)_2
a = 1*2^3 + 0*2^2 + 0*2^1 + 1*2^0
a = (8+0+0+1) = 9
Now, 9 can be converted into octal by dividing it by 8 repeatedly until the quotient is 0; the
remainders (each between 0 and 7), read in reverse order, give the octal digits.
(1001)_2 = (9)_10 = (11)_8
o = 0b1001
print("Binary to Octal", o, ":", oct(o))
Output:
Binary to Octal 9 : 0o11
3) Binary to Hexadecimal
a = (100101)_2
a = 1*2^5 + 0*2^4 + 0*2^3 + 1*2^2 + 0*2^1 + 1*2^0
a = (32+0+0+4+0+1) = 37
As 100101 in binary is represented as 37 in decimal, we can convert 37 into hexadecimal
by dividing it by 16.
a = (100101)_2 = (37)_10 = (25)_16
h = 0b100101
print("Binary to Hexadecimal", h, ":", hex(h))
Output:
Binary to Hexadecimal 37 : 0x25
Octal Number System
Octal Number System is one in which the base value is 8. It uses 8 digits i.e. 0-7 for the
creation of Octal Numbers. It is also a positional system i.e weight is assigned to each
position.
1) Octal to Binary:
Octal numbers are converted to binary numbers by replacing each octal digit with a three-
bit binary number. In python bin( ) function is used to convert octal number to binary
number. The value is written with ‘0o’ as a prefix which indicates that the value is in the
octal form.
e.g. (123)_8 = (001 010 011)_2 = (1010011)_2
O = 0o123
print("Octal to Binary",O,":",bin(O))
Output:
Octal to Binary 83 : 0b1010011
2) Octal to Decimal:
An octal number can be converted into a decimal number by assigning weight to each
position. In python int( ) function is used to convert octal to decimal numbers. Two
arguments are passed: the first is a string of octal digits, and the second is the base
of the number system specified in the string.
(342)_8 = 3*8^2 + 4*8^1 + 2*8^0
= 3*64 + 4*8 + 2*1
= 226
(342)_8 = (226)_10
b = "342"
print("Octal to Decimal",b,":",int(b,8))
Output:
Octal to Decimal 342 : 226
3) Octal to Hexadecimal:
An Octal number can be converted into a hexadecimal number by converting the number
into a decimal and then a decimal number to hexadecimal. In python hex( ) function is used
to convert octal to hexadecimal numbers.
Let’s first convert b into a decimal number.
b = (456)_8
(456)_8 = 4*8^2 + 5*8^1 + 6*8^0 = 256 + 40 + 6 = 302
(456)_8 = (302)_10 = (12E)_16
h = 0o456
print("Octal to Hexadecimal", h,":", hex(h))
Output:
Octal to Hexadecimal 302 : 0x12e
Decimal Number System
A number system with a base value of 10 is termed a Decimal number system, and it is
represented using the digits 0 to 9. Here, the place values are named from right to left:
the first place is called Units, the second Tens, then Hundreds, Thousands, etc.
1) Decimal to Binary.
For decimal to binary conversion, the number is divided by 2 repeatedly until the quotient is 0; the
remainders (each 0 or 1), read in reverse order, give the binary digits. In python bin( ) function is used to convert decimal to binary numbers.
Here, a = 10
(10)_10 = (1010)_2
# Decimal to Binary
a = 10
print("Decimal to Binary", a, ":", bin(a))
Output:
Decimal to Binary 10 : 0b1010
2) Decimal to Octal
For Decimal to Octal conversion, the number is divided by 8 repeatedly until the quotient is 0; the
remainders (each between 0 and 7), read in reverse order, give the octal digits. In python oct( ) function is used to convert decimal to octal
numbers.
so, (10)_10 = (12)_8
# Decimal to Octal
a = 10
print("Decimal to Octal",a,":",oct(a))
Output:
Decimal to Octal 10 : 0o12
3) Decimal to Hexadecimal
For decimal to hexadecimal conversion, the number is divided by 16 repeatedly until the quotient is 0; the
remainders (each a digit 0 to 9 or a letter A to F), read in reverse order, give the hexadecimal digits. In python hex( ) function is used to convert
decimal to hexadecimal numbers.
so, (1254)_10 = (4E6)_16
a = 1254
print("Decimal to Hexadecimal",a,":",hex(a))
Output:
Decimal to Hexadecimal 1254 : 0x4e6
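The repeated-division method described for the three decimal conversions can be written out by hand. The function to_base below is a hypothetical helper, not part of Python; it is a sketch of the manual procedure, whereas bin(), oct() and hex() remain the built-in way to do this.

```python
DIGITS = "0123456789ABCDEF"

def to_base(number, base):
    """Convert a non-negative integer to a digit string in the given base (2 to 16)."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # The remainder is the next digit, starting from the least significant one.
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    # Remainders were collected in reverse order, so flip them.
    return "".join(reversed(digits))

print(to_base(10, 2))     # 1010
print(to_base(10, 8))     # 12
print(to_base(1254, 16))  # 4E6
```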
Hexadecimal Number System
A number system with a base of 16 is called a hexadecimal number system. It is represented
using the digits 0 to 9 and the letters A to F. As both numeric digits and
letters are used in this system, it is also called an alphanumeric system.
1) Hexadecimal to Binary:
The hexadecimal number is converted into a Binary number by replacing each hex digit
with its binary equivalent number.
a = (FACE)_16
F = (15)_10 = (1111)_2
A = (10)_10 = (1010)_2
C = (12)_10 = (1100)_2
E = (14)_10 = (1110)_2
(FACE)_16 = (1111101011001110)_2
a = 0xFACE
print("Hexadecimal to Binary", a, ":", bin(a))
Output:
Hexadecimal to Binary 64206 : 0b1111101011001110
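The digit-by-digit replacement can also be reproduced in code, which makes the four-bit groups visible. hex_to_binary is a hypothetical helper written for this sketch.

```python
def hex_to_binary(hex_string):
    """Replace each hex digit with its 4-bit binary equivalent, separated by spaces."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_string)

groups = hex_to_binary("FACE")
print(groups)  # 1111 1010 1100 1110

# Joining the groups gives the same value that bin(0xFACE) reports.
joined = groups.replace(" ", "")
print(int(joined, 2) == 0xFACE)  # True
```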
2) Hexadecimal to Octal:
The hexadecimal number is converted to an octal number by first converting that number
to its equivalent binary number and then regrouping the bits in threes, from the right, to read off the octal digits.
E.g.
(94AB)_16 = (1001 0100 1010 1011)_2
= (001 001 010 010 101 011)_2
= (112253)_8
a = 0x94AB
print("Hexadecimal to Octal", a, ":", oct(a))
Output:
Hexadecimal to Octal 38059 : 0o112253
3) Hexadecimal to Decimal:
A hexadecimal number is converted into a decimal number by multiplying each hex digit by the
weight of its position (a power of 16).
(94AB)_16 = 9*16^3 + 4*16^2 + A*16^1 + B*16^0
= 9*4096 + 4*256 + 10*16 + 11*1
= (38059)_10
b = "0x94AB"
print("Hexadecimal to Decimal",b,":",int(b,16))
Output:
Hexadecimal to Decimal 0x94AB : 38059
Python Decorators:
Decorators dynamically alter the functionality of a function, method, or class without
having to directly use subclasses or change the source code of the function being decorated.
Using decorators in Python also ensures that your code is DRY(Don't Repeat Yourself).
Decorators have several use cases such as:
Authorization in Python frameworks such as Flask and Django
Logging
Measuring execution time
Synchronization
def my_decorator(func):
    def wrapper():
        print("Before the function is called.")
        func()
        print("After the function is called.")
    return wrapper

@my_decorator
def say_hello():
    print("Hello!")

say_hello()
In this example, we define a decorator function called my_decorator that takes a function
as its argument. The wrapper function is defined inside my_decorator and contains code
that will be executed before and after the decorated function is called.
We then define a function called say_hello and use the @my_decorator syntax to apply the
my_decorator decorator to it. When say_hello is defined, the my_decorator function is called
with say_hello as its argument, and the name say_hello is bound to the wrapper function that it returns.
When we call say_hello, it actually calls the wrapper function that was returned by
my_decorator. The wrapper function prints "Before the function is called.", then calls the
say_hello function, which prints "Hello!", and finally prints "After the function is called.".
So, the output of this program will be:
Before the function is called.
Hello!
After the function is called.
This is a very simple example, but it shows how decorators can be used to modify the
behavior of functions without changing their source code.
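One of the use cases listed earlier, measuring execution time, can be sketched in the same way. The decorator name timed and the attribute last_elapsed are assumptions made for this illustration; functools.wraps is the standard-library helper that keeps the decorated function's name and docstring.

```python
import functools
import time

def timed(func):
    """Decorator that records how long the last call to func took."""
    @functools.wraps(func)  # preserve func.__name__ and func.__doc__
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    return wrapper

@timed
def add(a, b):
    """Return the sum of a and b."""
    return a + b

total = add(2, 3)
print(total)          # 5
print(add.__name__)   # add
```

Accepting *args and **kwargs lets the wrapper decorate functions with any signature, unlike the argument-free wrapper in the simpler example.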
Regular Expression:
A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern.
RegEx can be used to check if a string contains the specified search pattern.
Search the string to see if it starts with "The" and ends with "Spain":
import re
txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)
The re module offers a set of functions that allows us to search a string for a match:
Re Functions:
Function    Description
findall     Returns a list containing all matches
search      Returns a Match object if there is a match anywhere in the string
split       Returns a list where the string has been split at each match
sub         Replaces one or many matches with a string
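The four functions can be tried on the same sample string. This is a minimal sketch; the variable names are chosen for illustration.

```python
import re

txt = "The rain in Spain"

all_ai = re.findall(r"ai", txt)            # every non-overlapping match
first_space = re.search(r"\s", txt)        # Match object for the first whitespace
parts = re.split(r"\s", txt, maxsplit=1)   # split at the first whitespace only
replaced = re.sub(r"\s", "-", txt)         # replace every whitespace character

print(all_ai)               # ['ai', 'ai']
print(first_space.start())  # 3
print(parts)                # ['The', 'rain in Spain']
print(replaced)             # The-rain-in-Spain
```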
Metacharacters
Metacharacters are characters with a special meaning:
Character   Description                                                                   Example
[]          A set of characters                                                           "[a-m]"
\           Signals a special sequence (can also be used to escape special characters)   "\d"
.           Any character (except newline character)                                      "he..o"
^           Starts with                                                                   "^hello"
$           Ends with                                                                     "planet$"
*           Zero or more occurrences                                                      "he.*o"
+           One or more occurrences                                                       "he.+o"
?           Zero or one occurrences                                                       "he.?o"
{}          Exactly the specified number of occurrences                                   "he.{2}o"
|           Either or                                                                     "falls|stays"
()          Capture and group
Special Sequences
A special sequence is a \ followed by one of the characters in the list below, and has a
special meaning:
\A Returns a match if the specified characters are at the beginning of the string
"\AThe"
\b Returns a match where the specified characters are at the beginning or at the end of a
word
(the "r" in the beginning is making sure that the string is being treated as a "raw string")
r"\bain"
r"ain\b"
\B Returns a match where the specified characters are present, but NOT at the beginning
(or at the end) of a word
(the "r" in the beginning is making sure that the string is being treated as a "raw string")
r"\Bain"
r"ain\B"
\d Returns a match where the string contains digits (numbers from 0-9) "\d"
\D Returns a match where the string DOES NOT contain digits "\D"
\s Returns a match where the string contains a white space character "\s"
\S Returns a match where the string DOES NOT contain a white space character
"\S"
\w Returns a match where the string contains any word characters (characters from a to Z,
digits from 0-9, and the underscore _ character) "\w"
\W Returns a match where the string DOES NOT contain any word characters "\W"
\Z Returns a match if the specified characters are at the end of the string "Spain\Z"
Sets
A set is a set of characters inside a pair of square brackets [] with a special meaning:
[arn] Returns a match where one of the specified characters (a, r, or n) is present
[a-n] Returns a match for any lower case character, alphabetically between a and n
import re
txt = "The rain in Spain"
x = re.search(r"\s", txt)
print("The first white-space character is located in position:", x.start())
Split()
import re
txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)
Sub()
import re
txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)
print(x)
The Match object has properties and methods used to retrieve information about the
search, and the result:
.span() returns a tuple containing the start-, and end positions of the match.
.string returns the string passed into the function
.group() returns the part of the string where there was a match
Print the position (start- and end-position) of the first match occurrence.
The regular expression looks for any word that starts with an upper case "S":
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())
Print the string passed into the function:
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)
Print the part of the string where there was a match.
The regular expression looks for any word that starts with an upper case "S":
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())
Note: If there is no match, the value None will be returned, instead of the Match Object.
Modifiers and Patterns
In Python, a modifier (also called a flag) is an option that changes the behavior of a
regular expression pattern. It is passed to the re functions as an extra argument. Here are some commonly used modifiers in Python:
i (re.I): Ignore case. Matches both uppercase and lowercase letters.
m (re.M): Multiline. Changes the behavior of ^ and $ to match the beginning and end of lines, not
just the beginning and end of the entire string.
s (re.S): Dot all. Makes the . character match any character, including newline characters.
x (re.X): Verbose. Allows you to use whitespace and comments in the pattern for readability.
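The effect of each modifier can be sketched on a short multi-line string. The sample text and variable names below are assumptions for illustration; re.IGNORECASE, re.MULTILINE, re.DOTALL and re.VERBOSE are the long names of the four flags.

```python
import re

text = "First line\nsecond LINE\nThird line"

# i: matches "line" regardless of case.
ignore_case = re.findall(r"line", text, re.IGNORECASE)

# m: ^ matches at the start of every line, not only at the start of the string.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# s: the dot also matches newlines, so the pattern can span several lines.
dot_all = re.search(r"First.*Third", text, re.DOTALL)

# x: whitespace and comments inside the pattern are ignored.
verbose = re.search(r"""
    (\w+)   # a word, captured as group 1
    \s      # one whitespace character
    line    # the literal text "line"
""", text, re.VERBOSE)

print(ignore_case)        # ['line', 'LINE', 'line']
print(line_starts)        # ['First', 'second', 'Third']
print(verbose.group(1))   # First
```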
Here's an example of how to use modifiers in Python regular expressions:
import re
# Define a string
string = 'The quick brown fox jumps over the lazy dog'
print(match.group()) # prints 'First line\nSecond line\nThird line'
with open('file_hash.txt', 'w') as f:
    f.write(hash_value)
In this example, the hashlib.sha256() function is used to calculate the SHA-256 hash value
of the file. The hexdigest() method is then used to convert the hash value to a string of
hexadecimal digits. Finally, the hash value is saved to a separate file called file_hash.txt.
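The hashing step described above can be sketched as follows. The beginning of the original example is not shown here, so this is an independent illustration under stated assumptions: the input file is a throwaway temporary file, and reading in 8192-byte chunks is one common choice, not necessarily what the original code did.

```python
import hashlib
import os
import tempfile

# Throwaway input file so the example is self-contained.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"hello world")
    sample_path = tmp.name

sha256 = hashlib.sha256()
with open(sample_path, "rb") as f:
    # Read in chunks so that large files do not have to fit in memory.
    for chunk in iter(lambda: f.read(8192), b""):
        sha256.update(chunk)

hash_value = sha256.hexdigest()   # 64 hexadecimal digits
print(hash_value)

os.remove(sample_path)            # clean up the throwaway file
```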
Pattern Recognition:
Pattern recognition is the process of identifying patterns in data. In Python, there are several
libraries that can be used for pattern recognition, such as scikit-learn, TensorFlow, and
Keras.
Here is a simple example of pattern recognition in Python without using any libraries:
# Define a list of patterns
patterns = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
]

# Define a function to compare two patterns and return their similarity score
def pattern_similarity(pattern1, pattern2):
    score = 0
    for i in range(len(pattern1)):
        if pattern1[i] == pattern2[i]:
            score += 1
    return score

# Define a function to recognize a pattern from a list of patterns
def pattern_recognition(pattern, patterns):
    max_score = 0
    max_pattern = None
    for p in patterns:
        score = pattern_similarity(pattern, p)
        if score > max_score:
            max_score = score
            max_pattern = p
    return max_pattern

# Test the pattern recognition function
test_pattern = [0, 1, 0, 1, 0]
recognized_pattern = pattern_recognition(test_pattern, patterns)
print('Test Pattern:', test_pattern)
print('Recognized Pattern:', recognized_pattern)
In this example, we define a list of 10 patterns, where each pattern is represented as a list
of binary values (0 or 1). We then define a pattern_similarity() function that takes two
patterns as input and returns their similarity score (i.e., the number of matching values in
the two patterns). Next, we define a pattern_recognition() function that takes a test pattern
and a list of patterns as input and returns the pattern from the list that is most similar to the
test pattern (i.e., has the highest similarity score). Finally, we test the pattern_recognition()
function with a test pattern and print the recognized pattern.
Example Programs :
1.
import re
# Search for a pattern in a string
text = "The quick brown fox jumps over the lazy dog"
pattern = r"fox"
match = re.search(pattern, text)
if match:
    print("Found", match.group())
# Replace a pattern in a string
new_text = re.sub(pattern, "cat", text)
print(new_text)
# Split a string using a pattern
words = re.split(r"\W+", text)
print(words)
count = 0
for line in file:
    count += 1
print("Number of lines:", count)
5. Python program to count the number of capital letters in a given string.
string = input("Enter a string: ")
count = 0
for char in string:
    if char.isupper():
        count += 1
print("Number of capital letters:", count)
count = 0
with open(filename, "r") as file:
    for line in file:
        words = line.split()
        for w in words:
            if w == word:
                count += 1
8. To remove the duplicates in a given string.
string = input("Enter a string: ")
new_string = ""
for char in string:
    if char not in new_string:
        new_string += char
print(new_string)
print(sorted_dict)