Python Programming
Using Problem Solving Approach
Reema Thareja
© OXFORD UNIVERSITY PRESS 2017. ALL RIGHTS
RESERVED.
CHAPTER 7
File Handling
File
A file is a collection of data stored on a secondary storage device such as a hard disk.
Files are used because real-life applications involve large amounts of data, and in
such situations console-oriented I/O operations pose two major problems:
• First, it becomes cumbersome and time-consuming to handle huge amounts of data
through terminals.
• Second, when doing I/O using a terminal, all the data is lost when either the program
is terminated or the computer is turned off. Therefore, it becomes necessary to store data on
permanent storage (the disks) and read it whenever necessary, without destroying the
data.
File Path
Files that we use are stored on a storage medium like the hard disk
in such a way that they can be easily retrieved as and when required.
Every file is identified by its path that begins from the root node or
the root folder. In Windows, C:\ (also known as C drive) is the root
folder but you can also have a path that starts from other drives like
D:\, E:\, etc. The file path is also known as pathname.
Relative Path and Absolute Path
A file path can be either relative or absolute. While an absolute path
always contains the root and the complete directory list to specify
the exact location of the file, a relative path needs to be combined with
another path in order to access a file. It starts with respect to the
current working directory and therefore lacks the leading slashes.
For example, C:\Students\Under Graduate\BTech_CS.docx but Under
ASCII Text Files
A text file is a stream of characters that can be sequentially processed by a computer in the forward
direction. For this reason, a text file is usually opened for only one kind of operation (reading,
writing, or appending) at any given time. Because text files are processed as characters, data is
conceptually read or written one character at a time. In Python, a text stream is treated as a special
kind of file.
Depending on the requirements of the operating system and on the operation being
performed (read or write) on the file, newline characters may be converted to or
from carriage-return/linefeed combinations. Other character conversions may also
be done to satisfy the storage requirements of the operating system. However, these
conversions occur transparently when a text file is processed. In a text file, each line contains zero or
more characters and ends with one or more characters that mark the end of the line.
Another important point is that when a text file is used, there are actually two representations
of data: internal and external. For example, an integer value will be represented as a number that
Binary Files
A binary file is a file which may contain any type of data, encoded in binary form for
computer storage and processing purposes. It includes files such as word processing
documents, PDFs, images, spreadsheets, videos, zip files and other executable
programs. Like a text file, a binary file is a collection of bytes. A binary file is also
referred to as a character stream, with the following two essential differences:
• A binary file does not require any special processing of the data and each byte of data
is transferred to or from the disk unprocessed.
• Python places no constructs on the file, and it may be read from, or written to, in any
manner the programmer wants.
While text files are processed sequentially, binary files can be
processed either sequentially or randomly, depending on the needs of the application.
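The two points about binary files can be sketched as follows. This is only an illustration: the file name sample.bin and the use of a temporary directory are assumptions, not part of the original slides. The bytes are written in 'wb' mode and read back in 'rb' mode, first sequentially and then from an arbitrary position.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.bin")

# Includes the byte values of '\n' (10) and '\r' (13) on purpose.
payload = bytes([0, 10, 13, 65, 255])

file_obj = open(path, "wb")       # 'wb' = write, binary
file_obj.write(payload)           # bytes go to disk unprocessed
file_obj.close()

file_obj = open(path, "rb")       # 'rb' = read, binary
sequential = file_obj.read()      # sequential access: read everything in order
file_obj.seek(3)                  # random access: jump straight to the fourth byte
random_access = file_obj.read(1)
file_obj.close()

print(sequential == payload)      # no newline conversion took place
print(random_access)
```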
The open() Function
Before reading from or writing to a file, you must first open it using Python’s built-in
open() function. This function creates a file object, which will be used to invoke methods
associated with it. The syntax of open() is:
fileObj = open(file_name [, access_mode])
Here,
file_name is a string value that specifies the name of the file that you want to access.
access_mode indicates the mode in which the file has to be opened, i.e., read, write,
append, etc.
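A minimal sketch of calling open() is given below. The file names File1.txt and missing.txt and the temporary directory are assumptions made for illustration. It shows that omitting access_mode opens the file for reading in text mode, and that opening a non-existent file for reading fails.

```python
import os
import tempfile

# Hypothetical file name; the temporary directory keeps the demo self-contained.
folder = tempfile.mkdtemp()
file_name = os.path.join(folder, "File1.txt")

file_obj = open(file_name, "w")   # 'w' creates the file (or empties an existing one)
file_obj.close()

file_obj = open(file_name)        # access_mode omitted, so the default 'r' is used
default_mode = file_obj.mode
file_obj.close()
print(default_mode)

missing_reported = False
try:
    open(os.path.join(folder, "missing.txt"), "r")   # file does not exist
except FileNotFoundError as err:
    missing_reported = True
    print("Cannot open for reading:", err.filename)
```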
The open() Function – Access Modes
The File Object Attributes
Once a file is successfully opened, a file object is returned. Using this file object, you can
easily access different types of information related to that file. This information can be
obtained by reading values of specific attributes of the file.
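As a hedged illustration, the sketch below reads three commonly used attributes of a file object: name, mode, and closed. The choice of these three attributes and the file name File1.txt are assumptions, since the original example did not survive extraction.

```python
import os
import tempfile

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "File1.txt")

file_obj = open(path, "w")
print("Name of the file:", file_obj.name)    # the path that was passed to open()
print("Opening mode:", file_obj.mode)        # the access mode
closed_before = file_obj.closed              # False while the file is open
print("File is closed:", closed_before)

file_obj.close()
closed_after = file_obj.closed               # True once close() has been called
print("File is closed:", closed_after)
```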
The close() Method
The close() method is used to close the file object. Once a file object is closed, you
cannot read from or write to the file associated with it. While
closing the file object, close() flushes any unwritten information. Although Python
automatically closes a file when the reference object of a file is reassigned to another
file, as a good programming habit you should always explicitly use the close()
method to close a file. The syntax of close() is: fileObj.close()
The close() method frees up any system resources, such as file descriptors and file locks,
that are associated with the file. Moreover, there is an upper limit to the number
of files a program can open. If that limit is exceeded, the program may crash
or behave unexpectedly. Thus, you can waste a lot of memory if you keep many
files open unnecessarily; also remember that open files always stand a chance of
corruption and data loss.
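The effect of close() can be sketched as follows, assuming a hypothetical file File1.txt in a temporary directory. After the file object is closed, a further write is rejected with a ValueError, while the data written earlier has been flushed to disk.

```python
import os
import tempfile

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "File1.txt")

file_obj = open(path, "w")
file_obj.write("Hello")
file_obj.close()                     # flushes "Hello" and releases the file

error_message = ""
try:
    file_obj.write(" again")         # not allowed: the file object is closed
except ValueError as err:
    error_message = str(err)
    print("Error:", error_message)

file_obj = open(path)                # reopen to confirm what reached the disk
saved = file_obj.read()
file_obj.close()
print(saved)
```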
The write() and writelines() Methods
The write() method is used to write a string to an already opened file. Of course, this
string may include numbers, special characters, or other symbols. While writing data to a
file, you must remember that the write() method does not add a newline character ('\n')
to the end of the string. The syntax of the write() method is: fileObj.write(string)
The writelines() method is used to write a list of strings.
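A short sketch of both methods is given below. The file name and the sample strings are assumptions made for illustration. Note that neither write() nor writelines() adds a newline, so every '\n' has to be supplied explicitly.

```python
import os
import tempfile

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "File1.txt")

file_obj = open(path, "w")
file_obj.write("Hello World")        # no newline is added automatically
file_obj.write("!\n")                # so this continues on the same line
file_obj.writelines(["Second line\n", "Third line\n"])   # a list of strings
file_obj.close()

file_obj = open(path)
content = file_obj.read()
file_obj.close()
print(content)
```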
Appending Data to a File
Once you have stored some data in a file, you can always open that file again to write
more data or append data to it. To append to a file, you must open it using 'a' or 'ab' mode
depending on whether it is a text file or a binary file. Note that if you open a file in 'w'
or 'wb' mode and then start writing data into it, its existing contents will be
overwritten. So always open the file in 'a' or 'ab' mode to add more data to the existing data
stored in the file.
Appending data is especially useful when creating a log of events or combining a
large set of data into one file.
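The difference between 'a' and 'w' modes can be sketched as follows. The log file name events.log, the entries, and the helper read_all() are hypothetical and exist only for this illustration.

```python
import os
import tempfile

# Hypothetical log file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "events.log")

def read_all(file_name):
    """Return the whole content of a text file (helper for this demo only)."""
    file_obj = open(file_name)
    data = file_obj.read()
    file_obj.close()
    return data

file_obj = open(path, "w")           # create the file with one entry
file_obj.write("First entry\n")
file_obj.close()

file_obj = open(path, "a")           # 'a' keeps the existing data
file_obj.write("Second entry\n")
file_obj.close()
after_append = read_all(path)

file_obj = open(path, "w")           # 'w' discards the existing data
file_obj.write("Overwritten\n")
file_obj.close()
after_write = read_all(path)

print(after_append)
print(after_write)
```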
The read() and readline() Methods
The read() method is used to read a string from an already opened file. As said before,
the string can include letters, numbers, special characters, or other symbols. The syntax of
the read() method is: fileObj.read([count])
In the above syntax, count is an optional parameter which, if passed to the read() method,
specifies the number of bytes (characters, for a file opened in text mode) to be read from
the opened file. The read() method starts reading from the current position in the file, which is
the beginning for a newly opened file, and if count is missing or has a negative value,
it reads the entire remaining contents of the file (i.e., till the end of file).
The readline() method reads a single line from the file, and the readlines() method is used
to read all the lines in the file.
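The three reading methods can be sketched together as shown below. The file name and its three lines of sample text are assumptions. Each call continues from where the previous one stopped.

```python
import os
import tempfile

# Hypothetical file with three lines of sample text.
path = os.path.join(tempfile.mkdtemp(), "File1.txt")
file_obj = open(path, "w")
file_obj.write("Hello World\nWelcome to Python\nFile handling\n")
file_obj.close()

file_obj = open(path)
first_five = file_obj.read(5)        # count = 5: reads 'Hello'
rest_of_line = file_obj.readline()   # rest of the current line: ' World\n'
remaining = file_obj.readlines()     # list with the two remaining lines
at_end = file_obj.read()             # nothing left, so an empty string
file_obj.close()

print(first_five)
print(repr(rest_of_line))
print(remaining)
print(repr(at_end))
```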
Opening Files using “with” Keyword
It is a good programming habit to use the with keyword when working with file objects.
This has the advantage that the file is properly closed after it is used, even if an error
occurs during a read or write operation or when you forget to explicitly close the file.
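A minimal sketch of the with keyword follows. The file name and the simulated error are assumptions for illustration. In both cases the file is closed as soon as the with block is left, without an explicit call to close().

```python
import os
import tempfile

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "File1.txt")

with open(path, "w") as file_obj:
    file_obj.write("Hello World\n")
closed_after_block = file_obj.closed      # closed automatically

try:
    with open(path) as reader:
        first_line = reader.readline()
        # Simulated failure in the middle of processing the file.
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
closed_after_error = reader.closed        # still closed, despite the error

print(closed_after_block, closed_after_error)
```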
Splitting Words
Python allows you to read line(s) from a file and split each line (treated as a string) based
on a character. By default, the string's split() method breaks the line at whitespace, but you can
specify any other character to split words in the string.
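Splitting a line read from a file can be sketched as follows. The file name and the sample sentence are assumptions. The first call uses the default separator, and the second splits on a comma instead.

```python
import os
import tempfile

# Hypothetical file containing a single line of sample text.
path = os.path.join(tempfile.mkdtemp(), "File1.txt")
with open(path, "w") as file_obj:
    file_obj.write("Hello World, Welcome to Python\n")

with open(path) as file_obj:
    line = file_obj.readline()

words = line.split()                 # default: split at whitespace
parts = line.strip().split(",")      # split at a comma instead

print(words)
print(parts)
```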
Some Other Useful File Methods
File Positions
With every file, the file management system associates a pointer, often known as the file
pointer, that facilitates movement across the file for reading and/or writing data. The
file pointer specifies the location from where the current read or write operation is
initiated. Once the read/write operation is completed, the pointer is automatically
updated.
Python has various methods that tell or set the position of the file pointer. For
example, the tell() method tells the current position within the file at which the next
read or write operation will occur. It is specified as the number of bytes from the beginning
of the file. When you just open a file for reading, the file pointer is positioned at location
0, which is the beginning of the file.
The seek(offset[, from]) method is used to set the position of the file pointer or, in
simpler terms, move the file pointer to a new location. The offset argument indicates the
number of bytes to move, and the optional from argument specifies the reference position
from which the offset is counted (0 for the beginning of the file, 1 for the current position,
and 2 for the end of the file).
File Positions - Example
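The following is a sketch of tell() and seek(), not the original slide example, which did not survive extraction. The file name and its content are assumptions. The file is opened in binary mode so that positions are plain byte counts. The second argument of seek() is the reference position: 0 for the beginning of the file (the default), 1 for the current position, and 2 for the end.

```python
import os
import tempfile

# Hypothetical file holding 11 bytes of sample data.
path = os.path.join(tempfile.mkdtemp(), "File1.txt")
with open(path, "wb") as file_obj:
    file_obj.write(b"Hello World")

with open(path, "rb") as file_obj:
    start = file_obj.tell()          # 0: a freshly opened file starts at the beginning
    first = file_obj.read(5)         # reads b'Hello'
    after_read = file_obj.tell()     # 5: the pointer moved past the bytes just read

    file_obj.seek(6)                 # 6 bytes from the beginning (reference 0)
    word = file_obj.read()           # b'World'

    file_obj.seek(-5, 2)             # 5 bytes before the end (reference 2)
    tail = file_obj.read()           # b'World' again

    file_obj.seek(0)                 # back to the beginning
    file_obj.seek(2, 1)              # 2 bytes forward from the current position
    position = file_obj.tell()       # 2

print(start, first, after_read, word, tail, position)
```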
Renaming and Deleting Files
The os module in Python has various methods that can be used to perform file-processing
operations like renaming and deleting files. To use the methods defined in the os module,
you should first import it in your program and then call the required functions.
The rename() Method: The rename() method takes two arguments, the current filename
and the new filename. Its syntax is: os.rename(old_file_name, new_file_name)
The remove() Method: This method can be used to delete file(s). The method takes a
filename (name of the file to be deleted) as an argument and deletes that file. Its syntax
is: os.remove(file_name)
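Renaming and deleting a file can be sketched as follows. The file names File1.txt and Students.txt and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical file names inside a temporary directory.
folder = tempfile.mkdtemp()
old_file_name = os.path.join(folder, "File1.txt")
new_file_name = os.path.join(folder, "Students.txt")

with open(old_file_name, "w") as file_obj:
    file_obj.write("Some data\n")

os.rename(old_file_name, new_file_name)      # the file now has the new name
renamed = os.path.exists(new_file_name) and not os.path.exists(old_file_name)
print("File renamed:", renamed)

os.remove(new_file_name)                     # delete the file
removed = not os.path.exists(new_file_name)
print("File removed:", removed)
```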
Directory Methods
The mkdir() Method: The mkdir() method of the os module is used to create directories in
the current directory. The method takes the name of the directory (the one to be created)
as an argument. The syntax of mkdir() is: os.mkdir("new_dir_name")
The getcwd() Method: The getcwd() method is used to display the current working
directory (cwd). Its syntax is: os.getcwd()
The chdir() Method: The chdir() method is used to change the current directory. The
method takes the name of the directory which you want to make the current directory as
an argument. Its syntax is: os.chdir("dir_name")
The rmdir() Method: The rmdir() method is used to remove or delete a directory. For this,
it accepts the name of the directory to be deleted as an argument. However, before removing
a directory, all its contents should be removed, since rmdir() deletes only empty directories.
Directory Methods - Examples
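The four directory methods can be sketched together as shown below. This is not the original slide example. The directory name new_dir and the temporary base directory are assumptions, and the original working directory is restored at the end so that the demo leaves nothing behind.

```python
import os
import tempfile

original_cwd = os.getcwd()           # remember where we started
base = tempfile.mkdtemp()            # hypothetical scratch location

try:
    os.chdir(base)                   # make the scratch directory current
    os.mkdir("new_dir")              # create a directory inside it
    created = os.path.isdir("new_dir")

    os.chdir("new_dir")              # move into the new directory
    inside = os.path.basename(os.getcwd())

    os.chdir("..")                   # step back out before deleting it
    os.rmdir("new_dir")              # works because the directory is empty
    removed = not os.path.exists("new_dir")
finally:
    os.chdir(original_cwd)           # always restore the original directory

print(created, inside, removed)
```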
Methods from the os Module
The os.path.abspath() method uses the string value passed to it to form an absolute
path. Thus, it is another way to convert a relative path to an absolute path.
The os.path.isabs(path) method accepts a file path as an argument and returns True if
the path is an absolute path and False otherwise.
The os.path.relpath(path, start) method accepts a file path and a start string as
arguments and returns a relative path that begins from start. If start is not given, the
current directory is taken as start.
The os.path.dirname(path) method returns a string that includes everything specified in
the path (passed as argument to the method) that comes before the last slash.
The os.path.basename(path) method returns a string that includes everything specified
in the path (passed as argument to the method) that comes after the last slash.
Methods from the os Module
The os.path.split(path) Method: This method accepts a file path and returns its directory
name as well as the base name. So it is equivalent to using two separate methods,
os.path.dirname() and os.path.basename().
The os.path.getsize(path) Method: This method returns the size, in bytes, of the file specified in
the path argument.
The os.listdir(path) Method: The method returns a list of filenames in the specified path.
The os.path.exists(path) Method: The method, as the name suggests, accepts a path as an
argument and returns True if the file or folder specified in the path exists and False
otherwise.
The os.path.isfile(path) Method: The method, as the name suggests, accepts a path as an
argument and returns True if the path specifies a file and False otherwise.
The [Link](path) Method: The method as the name suggests accepts a path as an
Methods from the os Module — Examples
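The sketch below exercises several of these methods on one file. It is an illustration only: the file name File1.txt, its content, and the temporary directory are assumptions, and the original slide examples did not survive extraction.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory.
folder = os.path.abspath(tempfile.mkdtemp())
path = os.path.join(folder, "File1.txt")
with open(path, "w") as file_obj:
    file_obj.write("Hello")

is_absolute = os.path.isabs(path)            # True: the path starts at the root
relative = os.path.relpath(path, folder)     # path as seen from folder
directory = os.path.dirname(path)            # everything before the last separator
base = os.path.basename(path)                # everything after the last separator
pair = os.path.split(path)                   # (directory, base) in one call
size = os.path.getsize(path)                 # size in bytes
names = os.listdir(folder)                   # names of the entries in folder
exists = os.path.exists(path)
is_file = os.path.isfile(path)
folder_is_file = os.path.isfile(folder)      # False: it is a directory

print(is_absolute, relative, base, size, names, exists, is_file, folder_is_file)
```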