File Handling
Learning Objectives
• So far we have seen only transient programs: programs that run for a short
time, produce some output, and then lose their data. When we run them again,
we have to enter the data again.
• This is because the data is held in primary memory, which is temporary and
volatile.
• Persistent programs, which run all the time or for a long time, keep their
data in permanent storage (e.g. a hard disk). If such a program is closed or
restarted, the data it used can be retrieved.
• For this purpose the program must be able to read and write text files or
data files. These files can be saved in permanent storage.
• File I/O (input-output) means transferring data from primary memory to
secondary memory and vice versa.
FILE HANDLING is a mechanism by which a Python program can read data from disk
files or write data back to disk files.
So far in our Python programs the standard input has come from the keyboard and
the output has gone to the monitor, i.e. the data is not stored permanently
anywhere and exists only as long as the program is running. File handling
allows us to store the data entered through a Python program permanently in a
disk file and read it back later.
A data file contains data pertaining to a specific application.
Data files can be stored in two ways:
i) Text File
ii) Binary File
A text file stores information as ASCII or Unicode characters.
In a text file everything is stored as characters. For example, the data
“computer” takes 8 bytes, and a floating-point value like 11237.9876 takes 10
bytes (one byte per character).
Each line is terminated by a special character called EOL (End of Line).
In a text file some translation takes place when this EOL character is read or
written. In Python the EOL is ‘\n’; when a file is opened in text mode, the
platform's own line ending (for example ‘\r\n’ on Windows) is translated to
and from ‘\n’ by default.
A binary file stores information in the same format as it is held in memory,
i.e. data is stored according to its data type, so no translation occurs.
There is no delimiter for a new line.
Binary files are faster and easier for a program to read and write than text
files.
Data in binary files is not human-readable in a text editor; it can be read
properly only by a program that understands its format.
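The size difference between the two kinds of file can be sketched as follows. The file names value.txt and value.dat are hypothetical, and the standard struct module is used here only as an assumed way of producing the in-memory byte layout of a float; the notes themselves do not use it.

```python
import os
import struct
import tempfile

value = 11237.9876

with tempfile.TemporaryDirectory() as folder:
    text_path = os.path.join(folder, "value.txt")    # hypothetical file name
    binary_path = os.path.join(folder, "value.dat")  # hypothetical file name

    # Text file: the number is stored as the characters "11237.9876".
    with open(text_path, "w") as text_file:
        text_file.write(str(value))

    # Binary file: the number is stored as an 8-byte double, as in memory.
    with open(binary_path, "wb") as binary_file:
        binary_file.write(struct.pack("d", value))

    text_size = os.path.getsize(text_path)
    binary_size = os.path.getsize(binary_path)

print("Text file size  :", text_size, "bytes")
print("Binary file size:", binary_size, "bytes")
```

The text file grows with the number of digits written, while the binary file always uses the fixed size of the data type.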
The following main operations can be done on files:
1. Opening a file
2. Performing operations
   1. READ
   2. WRITE etc.
3. Closing the file
Besides the above operations, some more operations can be done on files:
• Creating of Files.
• Traversing of Data.
• Appending Data into file.
• Inserting Data into File.
• Deleting Data from File.
• Copying of File.
• Updating Data into File.
i) OPENING FILE
We should first open the file for reading or writing by specifying the name of
the file and the mode.
ii) PERFORMING READ/WRITE
Once the file is opened, we can read from it or write to it (depending on the
mode in which it was opened) using the various functions available.
iii) CLOSING FILE
After performing the operations we must close the file and release it for
other applications to use.
A file can be opened for reading, writing, or appending.
SYNTAX:
file_object = open(filename)
Or
file_object = open(filename, mode)
** the default mode is "r" (read)
myfile = open("story.txt")
Here the disk file “story.txt” is opened and a reference to it is linked to
the “myfile” object; the Python program will now access “story.txt” through
the “myfile” object.
Here “story.txt” is expected to be in the same folder where the .py file is
stored; if the disk file is in another folder we have to give the full path.
Opening File
myfile = open("article.txt", "r")
Here “r” is for read (it is also the default); other options are “w” for
write and “a” for append.
myfile = open("d:\\mydata\\poem.txt", "r")
Here we are accessing the “poem.txt” file stored in a separate location, i.e.
the d:\mydata folder.
When giving the path of a file in an ordinary string we must use a double
backslash (\\) in place of a single backslash, because in Python the backslash
is the escape character. For example, if the folder name is “nitin” and we
give the path as d:\nitin\poem.txt, then the “\n” in \nitin is treated as the
escape sequence for a new line.
SO ALWAYS USE DOUBLE BACKSLASH IN PATH
Opening File
myfile = open("d:\\mydata\\poem.txt", "r")     Read Mode
Another solution to the double backslash is to put “r” before the path string
(a raw string); you can then use single backslashes in the path name.
myfile = open(r"d:\mydata\poem.txt", "r")
(When you open a file in read mode, the given file must exist in the folder,
otherwise Python will raise FileNotFoundError.)
myfile = open("d:\\mydata\\poem.txt", "w")     Write Mode
In these statements myfile is the file object, the first argument is the path
to the file, and the second argument is the mode.
A file object is also known as a file handle.
A file object is a reference to a file on disk.
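The effect of the escape character on a path, and the error raised for a missing file, can be checked with a short sketch. The folder “nitin” and the file “notes.txt” are hypothetical and are assumed not to exist on the machine running the code.

```python
# The folder "nitin" and the file "notes.txt" are hypothetical.
wrong_path = "d:\nitin\notes.txt"      # each "\n" is read as a newline character
escaped_path = "d:\\nitin\\notes.txt"  # double backslashes give real backslashes
raw_path = r"d:\nitin\notes.txt"       # raw string: backslashes are kept as typed

print(repr(wrong_path))           # the repr shows the newline characters
print(escaped_path == raw_path)   # both spellings give the same path

# Opening a file that does not exist in read mode raises FileNotFoundError.
try:
    myfile = open(raw_path, "r")
    myfile.close()
    opened = True
except FileNotFoundError:
    opened = False
    print("File not found:", raw_path)
```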
File Access Modes
Mode   Description
r      Read a text file that already exists.
rb     Read only, in binary format.
r+     Read and write a text file; the file pointer is at the beginning of
       the file.
rb+    Read and write a binary file; the file pointer is at the beginning of
       the file.
w      Write only, text file; if the file exists the old file is overwritten,
       otherwise a new file is created.
wb     Write only, binary file; if the file exists the old file is
       overwritten, otherwise a new file is created.
wb+    Read and write, binary file; if the file exists the old file is
       overwritten, otherwise a new file is created.
a      Append, text file; the file pointer is at the end of the file.
ab     Append, binary file; the file pointer is at the end of the file.
a+     Append and read, text file; if the file exists the file pointer is at
       the end of the file, otherwise a new file is created for reading and
       writing.
ab+    Append and read, binary file; if the file exists the file pointer is
       at the end of the file, otherwise a new file is created for reading
       and writing.
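A minimal sketch of how the “w”, “a” and “r+” modes from the table behave on the same file. The file name modes.txt is hypothetical, and a temporary folder is used only so that the sketch leaves nothing behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "modes.txt")  # hypothetical file name

    # "w" creates the file; opening with "w" again wipes the old content.
    f = open(path, "w")
    f.write("first\n")
    f.close()

    f = open(path, "w")
    f.write("second\n")
    f.close()

    # "a" keeps the old content and writes at the end of the file.
    f = open(path, "a")
    f.write("third\n")
    f.close()

    # "r+" starts at the beginning, so writing replaces characters in place.
    f = open(path, "r+")
    f.write("SECOND")
    f.close()

    f = open(path, "r")
    content = f.read()
    f.close()

print(content)
```

The line “first” is gone because of the second “w”, “third” was added by “a”, and “r+” replaced the first six characters without truncating the file.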
Closing file
As the reference to the disk file is stored in the file handle, to close the
file we must call the close() function through the file handle, which
releases the file.
myfile.close()
Note: open() is a built-in function that is used standalone, while close()
must be called through the file handle.
Opening & Closing Files
A program demonstrating some attributes and methods of a file object.
f = open("Hello.txt", "r")                  # Hello.txt must already exist
print("File Name:", f.name)
print("File Mode:", f.mode)
print("Is File readable?:", f.readable())
print("Is File closed?:", f.closed)         # False, the file is still open
f.close()
print("Is File closed?:", f.closed)         # True after close()
Reading from File
To read from a file, Python provides several functions:
Filehandle.read([n]) :
reads and returns n bytes (n characters in text mode); if “n” is not
specified it reads the entire file.
Filehandle.readline([n]) :
reads one line at a time from the file. If “n” is specified it reads at most
“n” bytes of that line.
Filehandle.readlines() :
reads all the remaining lines and returns them in a list.
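The three read functions can be compared on one small file. This is an illustrative sketch: the file name sample.txt and its three lines are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("line one\nline two\nline three\n")

    f = open(path, "r")
    first_five = f.read(5)        # first 5 characters only
    rest_of_line = f.readline()   # rest of the current line, including "\n"
    remaining = f.readlines()     # all remaining lines as a list
    f.close()

    f = open(path, "r")
    everything = f.read()         # no argument: the entire file as one string
    f.close()

print(repr(first_five))
print(repr(rest_of_line))
print(remaining)
print(repr(everything))
```

Each call continues from where the previous one stopped, which is why readline() returns only the rest of the first line.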
Example-1: read()
Example-2: read()
Example-3: readline()
HAVE YOU NOTICED THE DIFFERENCE IN OUTPUT FROM THE PREVIOUS OUTPUT?
Example-4: reading line by line using readline()
Example-5: reading line by line using for loop
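Reading line by line with readline() and with a for loop can be sketched as below. The file name and its contents are assumptions for illustration, not the sample file from the original slides.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("alpha\nbeta\ngamma\n")

    # Using readline(): it returns an empty string at the end of the file.
    lines_with_readline = []
    f = open(path, "r")
    line = f.readline()
    while line != "":
        lines_with_readline.append(line)
        line = f.readline()
    f.close()

    # Using a for loop: the file object gives one line per iteration.
    lines_with_for = []
    f = open(path, "r")
    for line in f:
        lines_with_for.append(line)
        print(line, end="")   # end="" avoids a blank line after each line
    f.close()
```

Every line read still ends with "\n", so a plain print(line) would show an extra blank line between lines.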
SACHIN BHARDWAJ, PGT(CS), KV NO.1 TEZPUR
Example-6: Calculating size of file with and without EOL and blank lines
Example-7: readlines()
Example-8 & 9: counting size of file in bytes and number of lines
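One possible way to count the size of a file with and without EOL characters, and its number of lines, is sketched here. The file contents are made up, and sizes are counted in characters as read in text mode.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("one\n\ntwo\nthree\n")   # the second line is blank

    total_chars = 0         # size including EOL characters
    chars_without_eol = 0   # size with EOL characters removed
    line_count = 0
    blank_lines = 0

    with open(path, "r") as f:
        for line in f:
            line_count += 1
            total_chars += len(line)
            text = line.rstrip("\n")
            chars_without_eol += len(text)
            if text == "":
                blank_lines += 1

print("Size with EOL    :", total_chars)
print("Size without EOL :", chars_without_eol)
print("Lines            :", line_count)
print("Blank lines      :", blank_lines)
```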
Writing onto files
After the read operation, let us see how to write data to disk files. Python
provides the functions:
write(string)
writelines(sequence of lines)
The above functions are called through the file handle to write the desired
content.
Name          Syntax                     Description
write()       Filehandle.write(str1)     Writes the string str1 to the file
                                         referenced by the file handle.
writelines()  Filehandle.writelines(L)   Writes all the strings in list L to
                                         the file referenced by the file
                                         handle. No newline is added
                                         automatically.
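A short sketch of write() and writelines() used together. The file name and the student names are hypothetical sample data.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "students.txt")  # hypothetical file name

    f = open(path, "w")
    count = f.write("Roll, Name\n")             # returns the number of characters written
    f.writelines(["1, Amit\n", "2, Sumit\n"])   # we must supply "\n" ourselves
    f.writelines(["3, ", "Nitin", "\n"])        # strings are written one after another
    f.close()

    with open(path, "r") as f:
        saved = f.read()

print(count)
print(saved)
```

The last call shows that writelines() does not put each string on its own line; it simply writes the strings in order.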
Writing to a File
A program to use the writelines() function
Output: the “Hello.txt” file is created using the above program.
Writing to a File
The Hello.txt file is opened using “with”.
Output: the “Hello.txt” file is created using the above program.
Example-1: write() using “w” mode
Now we can observe that when writing data to a file using “w” mode, the
previous content of an existing file is overwritten and the new content is
saved.
If we want to add new data without overwriting the previous content then we
should write using “a” mode, i.e. append mode.
Example-2: write() using “a” mode
New content is added after the previous content.
Example-3: using writelines()
Example-4: Writing String as a record to file
Example-5: To copy the content of one file to another file
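Copying the content of one file to another can be sketched as below. The file names source.txt and copy.txt are hypothetical, and reading line by line is just one simple way to do it.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    source_path = os.path.join(folder, "source.txt")  # hypothetical file name
    target_path = os.path.join(folder, "copy.txt")    # hypothetical file name

    with open(source_path, "w") as f:
        f.write("line 1\nline 2\n")

    # Open the source for reading and the target for writing,
    # then write every line that is read.
    with open(source_path, "r") as source, open(target_path, "w") as target:
        for line in source:
            target.write(line)

    with open(target_path, "r") as f:
        copied = f.read()

print(copied)
```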
Appending in a File
• Append means adding something new to an existing file.
• ‘a’ mode is used to accomplish this task. It opens the file for writing
and, if the file exists, adds the data to the end of the file; if the file
does not exist, a new file is created.
A program to append into the file “Hello.txt”
Output: new data is appended to Hello.txt by the above program.
Writing User Input to the File.
Taking data from the user and writing this data to the file “Student.txt”.
Output: the Student file is created by using the above program.
flush() function
When we write data to a file, Python holds it in a buffer (temporary memory)
and pushes it to the actual file later. If you want to force Python to write
the contents of the buffer to storage, you can use the flush() function.
Python automatically flushes a file when closing it, i.e. flush() is called
implicitly by close(), but if you want to flush before closing a file you can
call flush() yourself.
Example: working of flush()
Without flush()
When you run the above code, the program will stop at “Press any key”. For
the time being don’t press any key; go to the folder where the file
“temp.txt” is created and open it to see what is in the file so far.
Nothing is in the file temp.txt.
NOW PRESS ANY KEY….
Now the content is stored: because of the close() function the contents are
flushed and pushed into the file.
Example: working of flush()
With flush()
When you run the above code, the program will stop at “Press any key”. For
the time being don’t press any key; go to the folder where the file
“temp.txt” is created and open it to see what is in the file so far.
All contents written before flush() are present in the file.
NOW PRESS ANY KEY….
The rest of the content is written because of close(); the contents are
flushed and pushed into the file.
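The behaviour described above can be observed without pressing any key by checking the size of the file on disk. This is a sketch under assumptions: the text written is made up, and the size before flush() is usually 0 for such a small write but depends on the buffer, so it is only printed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "temp.txt")

    f = open(path, "w")
    f.write("Welcome")                         # goes into the buffer first
    size_before_flush = os.path.getsize(path)  # usually 0: nothing on disk yet

    f.flush()                                  # force the buffer onto the disk
    size_after_flush = os.path.getsize(path)   # the 7 characters are now stored

    f.write(" to Python")                      # buffered again
    f.close()                                  # close() flushes the rest
    final_size = os.path.getsize(path)

print("Before flush():", size_before_flush)
print("After flush() :", size_after_flush)
print("After close() :", final_size)
```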
Removing whitespaces after reading from file
read() and readline() read data from the file and return it as a string, and
readlines() returns the data as a list of strings.
All these read functions also return leading and trailing whitespace and
newline characters. If you want to remove these characters you can use the
string functions:
strip() : removes the given characters (whitespace by default) from both
ends.
lstrip() : removes the given characters from the left end.
rstrip() : removes the given characters from the right end.
Example: strip(), lstrip(), rstrip()
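A small sketch of the three functions on a string of the kind readline() returns. The sample strings are made up for illustration.

```python
line = "  Hello World \n"   # as it might come back from readline()

both = line.strip()                # whitespace removed from both ends
left = line.lstrip()               # whitespace removed from the left end only
right = line.rstrip()              # whitespace removed from the right end only
only_newline = line.rstrip("\n")   # only the newline character removed

stars = "**Python**".strip("*")    # a given character can be removed instead

print(repr(both))
print(repr(left))
print(repr(right))
print(repr(only_newline))
print(repr(stars))
```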
File Pointer
Every open file maintains a file pointer, which tells the current position in
the file where the next read or write operation will take place.
When we perform any read/write operation two things happen:
• The operation takes place at the current position of the file pointer.
• The file pointer advances by the specified number of bytes.
Example
myfile = open("ipl.txt", "r")
The file pointer is by default at the first position, i.e. the first
character.
ch = myfile.read(1)
ch will store the first character, i.e. the first character is consumed, and
the file pointer moves to the next character.
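The movement of the file pointer can be followed with repeated reads. In this sketch the contents of ipl.txt are assumed to be the single word CRICKET, which is not taken from the original notes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ipl.txt")
    with open(path, "w") as f:
        f.write("CRICKET")        # assumed contents, for illustration only

    myfile = open(path, "r")
    first = myfile.read(1)        # pointer moves past the 1st character
    second = myfile.read(1)       # continues from the 2nd character
    rest = myfile.read()          # everything from the pointer to the end
    nothing = myfile.read()       # pointer is at the end: empty string
    myfile.close()

print(first, second, rest, repr(nothing))
```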
File Modes and Opening position of file pointer
FILE MODE                 OPENING POSITION
r, r+, rb, rb+, r+b       Beginning of file
w, w+, wb, wb+, w+b       Beginning of file (overwrites the file if the file
                          already exists)
a, ab, a+, ab+, a+b       At the end of file if the file exists, otherwise
                          creates a new file
Standard INPUT, OUTPUT and ERROR STREAM
Standard Input : Keyboard
Standard Output : Monitor
Standard Error : Monitor
The standard input device (stdin) reads from the keyboard.
The standard output device (stdout) displays output on the monitor.
The standard error device (stderr) is the same as stdout but is normally used
for errors only.
The standard devices are implemented as files called standard streams in
Python, and we can use them through the sys module.
After importing the sys module we can use the standard streams stdin, stdout
and stderr.
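Since the standard streams behave like files, the same write() function works on them. This is a small sketch with made-up messages; reading from sys.stdin is left commented out so that the code does not wait for keyboard input.

```python
import sys

# sys.stdout is where print() sends its output by default;
# write() adds no newline of its own.
written = sys.stdout.write("Normal output\n")

# Error messages go to sys.stderr so they can be kept apart from normal output.
sys.stderr.write("Error: something went wrong\n")

# print() can also be pointed at a particular stream.
print("Another error message", file=sys.stderr)

# sys.stdin.readline() would read one line typed at the keyboard.
# line = sys.stdin.readline()
```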
“with” statement
Python’s “with” statement is very handy for file handling when you have two
related operations that you would like to execute as a pair (here, opening
and closing the file), with a block of code in between:
with open(filename[, mode]) as filehandle:
    file_manipulation_statement
The advantage of “with” is that it automatically closes the file after the
nested block of code. It guarantees that the file is closed however the
nested block exits, even if a run-time error occurs.
Example
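A minimal sketch of the “with” statement, including the case where a run-time error occurs inside the block. The file name and the deliberately failing int() conversion are assumptions chosen only to trigger an error.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "Hello.txt")

    # The file is closed automatically when the block ends.
    with open(path, "w") as f:
        f.write("Hello\n")
    closed_after_block = f.closed

    # The file is closed even if the block ends with an error.
    try:
        with open(path, "r") as g:
            number = int(g.read())    # "Hello" is not a number: ValueError
    except ValueError:
        closed_after_error = g.closed

print("Closed after normal block:", closed_after_block)
print("Closed after error       :", closed_after_error)
```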
Binary file operations
If we want to write a structure such as a list or dictionary to a file and
read it back, we can use the Python module pickle.
Pickling is the process of converting a structure to a byte stream before
writing it to a file; while reading the contents of the file, the reverse
process, called unpickling, converts the byte stream back to the original
structure.
Steps to perform binary file operations
First we need to import the module called pickle.
This module provides 2 main functions:
dump() : writes an object to a file that has been opened in binary mode
◾ Syntax : pickle.dump(object_to_write, filehandle)
load() : reads dumped data back from the file, i.e. it is used to read an
object from a pickle file.
◾ Syntax : object = pickle.load(filehandle)
Operations in Binary File.
• pickle.dump():
This function is used to store the object data in the file. It is commonly
called with up to 3 arguments.
• The first argument is the object that we want to store. The second
argument is the file object we get by opening the desired file in
write-binary (wb) mode. The third argument is optional and can be given as a
keyword argument, protocol, which selects the pickle protocol.
• Two predefined constants are available for the protocol:
pickle.HIGHEST_PROTOCOL and pickle.DEFAULT_PROTOCOL.
• pickle.load():
• This function is used to retrieve pickled data.
• The primary argument of the pickle.load() function is the file object that
you get by opening the file in read-binary (rb) mode.
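A minimal sketch of dump() with and without the protocol argument, followed by load(). The record and the file name are hypothetical sample data.

```python
import os
import pickle
import tempfile

record = {"Rollno": 1, "Name": "ABC", "Marks": 90}   # hypothetical record
marks = [78, 85, 90]                                 # hypothetical list

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "student.dat")       # hypothetical file name

    with open(path, "wb") as f:
        pickle.dump(record, f)                       # default protocol
        pickle.dump(marks, f, protocol=pickle.HIGHEST_PROTOCOL)

    # Objects come back in the same order in which they were dumped.
    with open(path, "rb") as f:
        first = pickle.load(f)
        second = pickle.load(f)

print(first)
print(second)
```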
Example: dump()
Note that the content of the file is in a binary (serialized) format, not
encrypted, and it is not completely human-readable.
Example: load()
Operations in Binary File
Iteration over Binary file - pickle module
import pickle

# Write four different objects, one after another, to the same binary file
output_file = open("d:\\a.bin", "wb")
myint = 42
mystring = "Python.mykvs.in!"
mylist = ["python", "sql", "mysql"]
mydict = {"name": "ABC", "job": "XYZ"}
pickle.dump(myint, output_file)
pickle.dump(mystring, output_file)
pickle.dump(mylist, output_file)
pickle.dump(mydict, output_file)
output_file.close()

# Read objects one by one until the end of the file is reached
with open("d:\\a.bin", "rb") as f:
    while True:
        try:
            r = pickle.load(f)
            print(r)
            print("Next data")
        except EOFError:
            break
# No f.close() is needed here: the with statement closes the file
Operations in Binary File
Insert/append record in a Binary file - pickle module
import pickle
rollno = int(input('Enter roll number:'))
name = input('Enter Name:')
marks = int(input('Enter Marks:'))
# Creating the dictionary rec to dump it in the student.dat file
rec = {'Rollno': rollno, 'Name': name, 'Marks': marks}
# Writing the dictionary; 'ab' appends to the binary file
f = open('d:/student.dat', 'ab')
pickle.dump(rec, f)
f.close()
Operations in Binary File
Read records from a Binary file - pickle module
import pickle
f = open('d:/student.dat', 'rb')
# Read records one by one until the end of the file is reached
while True:
    try:
        rec = pickle.load(f)
        print('Roll Num:', rec['Rollno'])
        print('Name:', rec['Name'])
        print('Marks:', rec['Marks'])
    except EOFError:
        break
f.close()
Relative and Absolute Paths
• We all know that files are kept in directories, which are also known as
folders.
• Every running program has a current directory, which is generally the
default directory, and Python always looks in this directory first.
• The os module provides many functions that can be used to work with files
and directories. OS stands for Operating System.
• getcwd() is a very useful function that can be used to identify the
current working directory.
Absolute Path
An absolute path is the full address of a file or folder starting from the
drive, i.e. from the ROOT FOLDER. It looks like:
Drive_Name:\Folder\Folder…\filename
For example, the absolute path of the file REVENUE.TXT will be
C:\SALES\2018\REVENUE.TXT
and the absolute path of SEC_12.PPT is
C:\PROD\NOIDA\Sec_12.ppt
Relative Path
A relative path is the location of a file/folder from the current folder. To
use a relative path the special symbols are:
Single Dot ( . ) : refers to the current folder.
Double Dot ( .. ) : refers to the parent folder.
Backslash ( \ ) : a backslash at the very start of a path, before the single
dot ( . ) or double dot ( .. ), refers to the ROOT folder.
Getting name of current working directory
import os
pwd = os.getcwd()
print("Current Directory :",pwd)
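The relation between relative and absolute paths can be explored with functions from os.path. This is a sketch; the folder “data” and the file “poem.txt” are hypothetical and do not need to exist, because the paths are only computed, not opened.

```python
import os

cwd = os.getcwd()                   # current working directory
here = os.path.abspath(".")         # single dot: the current folder
parent = os.path.abspath("..")      # double dot: the parent folder

relative_name = os.path.join("data", "poem.txt")   # hypothetical relative path
absolute_name = os.path.abspath(relative_name)     # same file, full address

print("Current folder :", here)
print("Parent folder  :", parent)
print("Relative path  :", relative_name)
print("Absolute path  :", absolute_name)
print("Is absolute?   :", os.path.isabs(relative_name), os.path.isabs(absolute_name))
```

os.path.join() is used so that the correct separator for the operating system is inserted automatically.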
Getting and Resetting file positions
• The tell() method tells us the current position within the file.
• The seek(offset[, from]) method changes the current file position.
• If from is 0 (the default), the offset is counted from the beginning of
the file.
• If it is set to 1, the offset is counted from the current position.
• If it is set to 2, the offset is counted from the end of the file.
• The offset argument indicates the number of bytes to be moved.
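e.g. program
f = open("a.txt", 'rb+')  # binary mode, so seek() from the current position or the end is allowed
print(f.tell())           # position at the start of the file
print(f.read(7))          # read seven bytes
print(f.tell())           # position after the read
print(f.read())           # read the rest of the file
print(f.tell())           # position at the end of the file
f.seek(9, 0)              # move to byte 9 from the beginning
print(f.read(5))
f.seek(4, 1)              # move 4 bytes forward from the current position
print(f.read(5))
f.seek(-5, 2)             # go to the 5th byte before the end
print(f.read(5))
f.close()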
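The same steps can be traced on a file whose contents are known. In this sketch a.txt is assumed to contain the 26 capital letters, which makes every position easy to check by hand.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "a.txt")
    with open(path, "wb") as f:
        f.write(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ")   # assumed contents: 26 bytes

    f = open(path, "rb+")
    start = f.tell()             # position before any read
    first_seven = f.read(7)      # bytes 0 to 6
    after_read = f.tell()        # position after the read

    f.seek(9, 0)                 # byte 9 counted from the beginning
    from_start = f.read(5)

    f.seek(4, 1)                 # 4 bytes forward from the current position
    from_current = f.read(5)

    f.seek(-5, 2)                # 5 bytes before the end
    from_end = f.read(5)
    end_position = f.tell()
    f.close()

print(start, first_seven, after_read)
print(from_start, from_current, from_end, end_position)
```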
Standard File Streams
• We use standard I/O streams to get better performance from different I/O
devices.
• Some standard streams in Python are as follows -
– Standard input stream    sys.stdin
– Standard output stream   sys.stdout
– Standard error stream    sys.stderr