Module 1 Part2

This document discusses fundamental concepts of file structures and managing files of records. It covers stream files, field structures, record structures, and using classes to manipulate buffers. Some key points include: - Stream files store data as a continuous stream without structure, making it hard to retrieve organized records. Field and record structures add organization. - Common field structure methods include fixed-length fields, length indicators, and delimiters. Record structures group related fields and use similar methods like length indicators. - Classes can represent buffers to read, write and unpack variable-length records that use length indicators or delimiters. Fixed-length buffers use simpler methods. Proper file structures are important for organizing, reading and writing data in

Uploaded by

Chirag Srinivas


Module-1: Chapters 4 & 5

Fundamental File Structure Concepts & Managing Files of Records

1
Outline I: Fundamental File
Structure Concepts
• Stream Files
• Field Structures
• Reading a Stream of Fields
• Record Structures
• Record Structures that use a length
indicator

2
Outline II: Managing Files of
Records
• Record Access
• More About Record Structures
• File Access and File Organization
• More Complex File Organization and
Access
• Portability and Standardization

3
Field and Record Organization:
Overview
• When we deal with file structures:
– Data must be persistent
– i.e. data written to a file by one program must read back
unchanged when another program reads the file.
• The basic logical unit of data is the field
which contains a single data value.
• Fields are organized into aggregates, either as
many copies of a single field (an array) or as
a list of different fields (a record).
4
Field and Record Organization:
Overview
• When a record is stored in memory, we
refer to it as an object and refer to its
fields as members.
• Here we will study the ways that objects
can be represented as records in files.

5
Stream Files
• Here we deal with how data is handled
in streams.
• For example:

6
Stream Files

• If our input is as follows

Input 1                  Input 2
Mary Ames                Alan Mason
123 Maple                90 Eastgate
Stillwater, OK 74075     Ada, OK 74820

7
Stream Files
• In Stream Files, the information is written as a
stream of bytes containing no added
information as follows:

AmesMary123 MapleStillwaterOK74075MasonAlan90 EastgateAdaOK74820

• Problem: There is no way to get the information
back in the organized record format.
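A minimal sketch of how such an unstructured stream comes about, assuming a hypothetical `Person` type with the six fields of the input above (the names `Person`, `writeUnstructured` and `streamOf` are illustrative, not from the slides):

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory person holding the six fields from the input.
struct Person {
    std::string last, first, address, city, state, zip;
};

// Write every field back to back: no lengths, no delimiters are added.
void writeUnstructured(std::ostream& out, const Person& p) {
    out << p.last << p.first << p.address << p.city << p.state << p.zip;
}

// Concatenate all people into one byte stream.
std::string streamOf(const std::vector<Person>& people) {
    assert(!people.empty());  // nothing to demonstrate with an empty list
    std::ostringstream out;
    for (const Person& p : people) writeUnstructured(out, p);
    return out.str();
}
```

Once the bytes are written this way, nothing in the stream says where one field ends and the next begins, which is the problem the field structures address.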
8
Field Structures
• To solve the above problem we need to add
structure to the file.
• There are many ways of adding structure to
files to maintain the identity of fields:
– Force the field into a predictable length
– Begin each field with a length indicator
– Place a delimiter at the end of each field to
separate it from the next field.
– Use a “keyword = value” expression to identify
each field and its content.
9
Field Structures
• Method 1: Force the field into a predictable length
• Each field has a fixed length, specified in a
class/structure; the last byte of each field is used for ‘\0’.
• In that class one record => 10+10+15+15+2+9 => 61
bytes
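The class itself is not reproduced in these notes. The sketch below is one layout consistent with the byte counts above, assuming each array holds the usable width plus one terminator byte; `FixedPerson`, `kDataBytes` and `setField` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed layout: usable width + 1 byte for '\0' in every field.
struct FixedPerson {
    char last[11];
    char first[11];
    char address[16];
    char city[16];
    char state[3];
    char zip[10];
};

// Usable data bytes per record (terminators excluded).
constexpr std::size_t kDataBytes = 10 + 10 + 15 + 15 + 2 + 9;

// Copy a value into a fixed-width field, truncating if it does not fit.
template <std::size_t N>
void setField(char (&field)[N], const char* value) {
    assert(value != nullptr);
    std::strncpy(field, value, N - 1);  // pads the rest with '\0'
    field[N - 1] = '\0';                // last byte reserved for the terminator
}
```

With the terminators included, `sizeof(FixedPerson)` is 67 in this sketch, while the data itself occupies the 61 bytes counted above.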
10
Field Structures
Method 1 Contd…
• Result looks as follows:

• Problems:
– Wastage of space
• Ames requires 4 bytes but we use 10 bytes.
– A value may require more space than allotted.
• Fixing the lengths to a larger size avoids this, but wastes more space.
11
Field Structures
Method 2: Begin each field with a length indicator
• Begin each field with the length of that field
value.
• If a field can be very long, more space is required to
store the length (one byte can only count up to 255).
• Looks as follows:
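One possible rendering of this method, as a sketch: it assumes the length is stored as two ASCII digits in front of each field, which limits a field to 99 bytes. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Prefix each field with its length as two ASCII digits (assumed format).
std::string packWithLengths(const std::vector<std::string>& fields) {
    std::string out;
    for (const std::string& f : fields) {
        assert(f.size() <= 99);  // two digits cannot describe more
        out += static_cast<char>('0' + f.size() / 10);
        out += static_cast<char>('0' + f.size() % 10);
        out += f;
    }
    return out;
}

// Recover the fields: read a length, then take that many bytes.
std::vector<std::string> unpackWithLengths(const std::string& data) {
    std::vector<std::string> fields;
    std::size_t pos = 0;
    while (pos + 2 <= data.size()) {
        std::size_t len = static_cast<std::size_t>(
            (data[pos] - '0') * 10 + (data[pos + 1] - '0'));
        fields.push_back(data.substr(pos + 2, len));
        pos += 2 + len;
    }
    return fields;
}
```

Packing the Ames fields this way gives `04Ames04Mary09123 Maple10Stillwater02OK0574075`.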

12
Field Structures
Method 3: Place a delimiter at the end of each field to
separate it from the next field.
• Each field is separated by a delimiter.
• The delimiter can be a white space character like blank,
new line, tab.
• But these characters can occur within the values, e.g. a blank
can be used in an address.
• Hence we use the vertical bar character (|).

13
Field Structures
Method 4: Use a “keyword = value” expression to
identify each field and its content.

This type of method is self-describing.

A person unfamiliar with the file can also understand the contents.
Useful for identifying missing values.
It is an overhead for applications which don’t demand
this much information.
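A short sketch of reading such self-describing fields, assuming the pairs are written as `keyword=value` and separated by a vertical bar (the exact layout and the name `parseKeywordFields` are assumptions for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Parse "key=value|key=value|" into a keyword -> value map.
std::map<std::string, std::string> parseKeywordFields(const std::string& record,
                                                      char delim = '|') {
    assert(delim != '=');  // '=' is already used inside each pair
    std::map<std::string, std::string> fields;
    std::istringstream in(record);
    std::string item;
    while (std::getline(in, item, delim)) {
        std::size_t eq = item.find('=');
        if (eq == std::string::npos) continue;  // skip malformed entries
        fields[item.substr(0, eq)] = item.substr(eq + 1);
    }
    return fields;
}
```

A missing field simply has no entry in the map, which is what makes this format convenient for detecting missing values.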
14
Reading a Stream of Fields
• A program can easily read a stream of
delimited fields and output each field separately.
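As a sketch of such a program, assuming the fields are separated by a vertical bar: `std::getline` with a delimiter argument reads one field per call. The name `readFields` is hypothetical.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read delimiter-terminated fields from a stream until it is exhausted.
std::vector<std::string> readFields(std::istream& in, char delim = '|') {
    assert(delim != '\0');
    std::vector<std::string> fields;
    std::string field;
    while (std::getline(in, field, delim)) fields.push_back(field);
    return fields;
}
```

Reading the two sample people this way yields twelve fields in a row, with nothing marking where the first person ends and the second begins.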

15
Reading a Stream of Fields
• This time, we do preserve the notion of
fields, but something is missing:
– We get a stream of fields rather than records
– These fields should form two records

16
Record Structure I
• A record can be defined as a set of fields that
belong together when the file is viewed in
terms of a higher level of organization.
• Like the notion of a field, a record is another
conceptual tool which need not exist in the
file in any physical sense.
• Yet, they are an important logical notion
included in the file’s structure.

17
Record Structures II
• Methods for organizing the records of a file
include:
– Requiring that the records be a predictable number of
bytes in length.
– Requiring that the records be a predictable number of
fields in length.
– Beginning each record with a length indicator
consisting of a count of the number of bytes that the
record contains.
– Using a second file to keep track of the beginning byte
address for each record.
– Placing a delimiter at the end of each record to
separate it from the next record.
18
Record Structures II
Method 1: Requiring that the records be a predictable number of
bytes in length (the length is fixed for the record, not for each field).

Method 2: Requiring that the records be a predictable number of
fields in length.

19
Record Structures II
Method 3: Beginning each record with a length indicator
consisting of a count of the number of bytes that the
record contains.

Method 4: Using a second file to keep track of the
beginning byte address for each record.

20
Record Structures II
Method 5:Placing a delimiter at the end of each
record to separate it from the next record.

21
Record Structures that Use a
Length Indicator
• To see how record structures are handled
we will consider the length indicator method.
• Implementation:
– Writing the variable-length records to the
file
– Representing the record length
– Reading the variable-length record from the
file.

22
Record Structures that Use a
Length Indicator
Writing the variable-length records to the file:
– We want to write the length of a record at its initial position.
– So we need to know the length of the record before writing it.
– Hence we first collect the data in a buffer, then find its length
using the strlen function.
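The steps above can be sketched as follows. This is an illustration under stated assumptions, not the slide's own code: fields are delimited by a vertical bar, the length is a 2-byte binary integer in host byte order, and `std::string::size()` stands in for `strlen`. The names `buildBuffer` and `writeRecord` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Collect the fields in a buffer first, so the record length is known.
std::string buildBuffer(const std::vector<std::string>& fields, char delim = '|') {
    std::string buffer;
    for (const std::string& f : fields) {
        buffer += f;
        buffer += delim;
    }
    return buffer;
}

// Write a 2-byte binary length (host byte order, an assumption), then the bytes.
bool writeRecord(std::ostream& out, const std::string& buffer) {
    assert(buffer.size() <= 65535u);  // must fit in two bytes
    std::uint16_t length = static_cast<std::uint16_t>(buffer.size());
    out.write(reinterpret_cast<const char*>(&length), sizeof length);
    out.write(buffer.data(), length);
    return out.good();
}
```

For the Ames record the buffer is 40 bytes long, so 42 bytes reach the file.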

23
Record Structures that Use a
Length Indicator
Representing the record length:
• As a 2-byte binary integer, or
• Converted into a character string:
fprintf(file, "%d ", length); // C stream
stream << length << ' '; // C++ stream
Each of the above 2 statements inserts the length and places a
space as delimiter.

24
Record Structures that Use a
Length Indicator
Reading the variable-length record from the file:
– Read the records from a file
– Each record is read into a buffer
– Then unpacked into object p.
– The value from the buffer is read into the character string
strbuff.
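A sketch of the reading side, assuming the record starts with a 2-byte binary length in host byte order and that fields inside the buffer are delimited by a vertical bar. `readRecord` and `unpackFields` are hypothetical names; the slide's own code is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read one length-prefixed record into buffer; false at end of file or on a short read.
bool readRecord(std::istream& in, std::string& buffer) {
    std::uint16_t length = 0;
    if (!in.read(reinterpret_cast<char*>(&length), sizeof length)) return false;
    buffer.resize(length);
    if (length > 0 && !in.read(&buffer[0], length)) return false;
    return true;
}

// Split the buffer into its delimiter-terminated fields.
std::vector<std::string> unpackFields(const std::string& buffer, char delim = '|') {
    assert(delim != '\0');
    std::vector<std::string> fields;
    std::istringstream in(buffer);
    std::string field;
    while (std::getline(in, field, delim)) fields.push_back(field);
    return fields;
}
```

The length tells the reader exactly how many bytes belong to the record, so the fields can then be unpacked from the buffer without touching the file again.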

25
Mixing numbers & characters:
Use a ﬁle dump Contd..
• The actual length represented in a file
as a character string is as follows:

• If the data needs to be represented as a
2 byte integer:

26
Mixing numbers & characters:
Use a ﬁle dump Contd…

• Finally the data will be viewed in a file
as follows:
• When it is a 2 byte representation.

27
Mixing numbers & characters:
Use a ﬁle dump
• On a UNIX platform the data is dumped as
shown (od is the UNIX dump command).
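To see the difference between the two representations without the dump figure, the sketch below renders both forms of a length as hex bytes. It assumes the binary form is stored high-order byte first; a machine with the opposite byte order would show the two bytes swapped. All three function names are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Render bytes as two-digit hex values separated by blanks.
std::string hexDump(const std::string& bytes) {
    std::string out;
    char cell[4];
    for (unsigned char c : bytes) {
        std::snprintf(cell, sizeof cell, "%02x", static_cast<unsigned>(c));
        if (!out.empty()) out += ' ';
        out += cell;
    }
    return out;
}

// Length stored as decimal text.
std::string lengthAsText(unsigned length) { return std::to_string(length); }

// Length stored as a 2-byte binary integer, high-order byte first (assumed order).
std::string lengthAsBinary(unsigned length) {
    assert(length <= 0xFFFFu);
    std::string out;
    out += static_cast<char>((length >> 8) & 0xFF);
    out += static_cast<char>(length & 0xFF);
    return out;
}
```

For a length of 40, the text form dumps as `34 30` (the characters '4' and '0') while the binary form dumps as `00 28`.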

28
Using Classes to Manipulate
Buffers
• The buffer classes mainly depend upon whether
the buffers are:
– Fixed length
– Variable length
• They also depend on whether fields are separated by a:
– Delimiter

29
Using Classes to Manipulate
Buffers-I
• Class with delimiter:

30
Using Classes to Manipulate
Buffers-I
• Pack function for a delimited buffer:

• In practice the data is packed as
follows:

31
Using Classes to Manipulate
Buffers-I
• Unpack function (fields):

// NextByte is initialized to mark where the next field is to be read
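The class listings for the delimited buffer are not reproduced in these notes. The following is a simplified, hypothetical sketch of the same idea (`DelimBufferSketch` is not the slide's class): `Pack` appends a field plus delimiter, and `Unpack` uses a next-byte position to find the following field.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical simplified buffer for delimited fields.
class DelimBufferSketch {
public:
    explicit DelimBufferSketch(char delim = '|', std::size_t maxBytes = 1000)
        : delim_(delim), maxBytes_(maxBytes) {
        assert(maxBytes > 0);
    }

    // Append one field followed by the delimiter; fail if it would overflow.
    bool Pack(const std::string& field) {
        if (buffer_.size() + field.size() + 1 > maxBytes_) return false;
        buffer_ += field;
        buffer_ += delim_;
        return true;
    }

    // Extract the next field starting at nextByte_; fail when none is left.
    bool Unpack(std::string& field) {
        std::size_t end = buffer_.find(delim_, nextByte_);
        if (end == std::string::npos) return false;
        field = buffer_.substr(nextByte_, end - nextByte_);
        nextByte_ = end + 1;  // the next Unpack starts after the delimiter
        return true;
    }

    const std::string& Data() const { return buffer_; }

private:
    char delim_;
    std::size_t maxBytes_;
    std::string buffer_;
    std::size_t nextByte_ = 0;
};
```

Packing "Ames" and "Mary" leaves `Ames|Mary|` in the buffer, and successive `Unpack` calls return the fields in the order they were packed.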

32
Using Classes to Manipulate
Buffers-II
• For Fixed length buffers:

33
Using Classes to Manipulate
Buffers-II
• There is an initialize function which
initializes the fields of the file.
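A hypothetical sketch of a fixed-length buffer, assuming the field widths are declared once up front and short values are padded with blanks (`FixedBufferSketch`, `AddField` and the padding character are assumptions for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical buffer whose fields all have predeclared widths.
class FixedBufferSketch {
public:
    // Declare the width of the next field; done once, before packing.
    void AddField(std::size_t width) {
        assert(width > 0);
        widths_.push_back(width);
    }

    // Pack a value into the next field, padded with blanks.
    // Fails if the value is too long or no declared field is left.
    bool Pack(const std::string& value) {
        if (next_ >= widths_.size() || value.size() > widths_[next_]) return false;
        buffer_ += value;
        buffer_.append(widths_[next_] - value.size(), ' ');
        ++next_;
        return true;
    }

    const std::string& Data() const { return buffer_; }

private:
    std::vector<std::size_t> widths_;
    std::size_t next_ = 0;
    std::string buffer_;
};
```

Because every field has a known width, no delimiter or length indicator is needed: the position of each field follows from the widths alone.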

34
Using Inheritance for Record
Buffer Classes
• Here we use inheritance to remove
duplication of code when the same procedures
are used by several classes.
• We have seen the classes
– fstream, istream, ostream
– fstream inherits input/output operations
from its parent class iostream,
– which in turn inherits from istream and
ostream.
35
Using Inheritance for Record
Buffer Classes

• They have used multiple inheritance:
more than one base class.
• Virtual: ensures that the class ios is
included only once in the hierarchy.
36
Using Inheritance for Record
Buffer Classes
• 2 main classes
– iostream (basic stream operations)
– fstreambase (to access the OS file
operations)

37
Using Inheritance for Record
Buffer Classes

• Class hierarchy for record buffer
objects

38
Using Inheritance for Record
Buffer Classes
• IOBuffer is the base class
• Protected members: to be used only by
the class itself and its derived classes

39
Using Inheritance for Record
Buffer Classes
• All methods are declared virtual: this allows
subclasses to provide their own implementation.
• = 0 (pure virtual function):
– IOBuffer doesn’t include an implementation of
such a method.
– No IOBuffer objects can be created.
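A minimal sketch of these language features, using hypothetical class names (`IOBufferSketch`, `DelimitedSketch`) rather than the classes shown on the slides: the base class declares pure virtual methods and protected members, and a derived class supplies the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical abstract base: declares the interface, implements nothing.
class IOBufferSketch {
public:
    virtual ~IOBufferSketch() = default;
    virtual bool Pack(const std::string& field) = 0;  // "= 0": pure virtual
    virtual bool Unpack(std::string& field) = 0;

protected:
    std::string buffer_;        // reachable from derived classes only
    std::size_t nextByte_ = 0;
};

// The compiler refuses to create objects of an abstract class.
static_assert(std::is_abstract<IOBufferSketch>::value,
              "IOBufferSketch cannot be instantiated");

// Derived class providing its own implementation of both methods.
class DelimitedSketch : public IOBufferSketch {
public:
    bool Pack(const std::string& field) override {
        buffer_ += field;
        buffer_ += '|';
        return true;
    }
    bool Unpack(std::string& field) override {
        std::size_t end = buffer_.find('|', nextByte_);
        if (end == std::string::npos) return false;
        field = buffer_.substr(nextByte_, end - nextByte_);
        nextByte_ = end + 1;
        return true;
    }
};
```

Code written against a reference to the base class calls whichever derived implementation the object actually has.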

40
Using Inheritance for Record
Buffer Classes
• Write function of the variable length buffer class.
• tellp(): returns the current position in the output
sequence.
• Write returns the address where it has written the record.
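A sketch of a write function that reports the record's address, assuming a 2-byte binary length prefix in host byte order; `writeAt` is a hypothetical free function, not the slide's member function.

```cpp
#include <cassert>
#include <cstdint>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Write a length-prefixed record and return the byte address where it
// starts, or -1 if the write failed.
std::streamoff writeAt(std::ostream& out, const std::string& buffer) {
    assert(buffer.size() <= 65535u);
    std::streampos address = out.tellp();  // position before anything is written
    std::uint16_t length = static_cast<std::uint16_t>(buffer.size());
    out.write(reinterpret_cast<const char*>(&length), sizeof length);
    out.write(buffer.data(), length);
    if (!out) return -1;
    return static_cast<std::streamoff>(address);
}
```

The returned address can be stored, for example in an index, and used later to seek straight back to the record.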

41
Using Inheritance for Record
Buffer Classes

Here we are checking which function is called.

We are calling DelimFieldBuffer function

42
Assignment-1
• Explain with a program how data is
packed and unpacked with fixed length
records.
• Explain with a program how data is
packed and unpacked with variable length
records.

43
Record Access: Keys

• When looking for an individual record, it is
convenient to identify the record with a key
based on the record’s content (e.g., the Ames
record).
• When we retrieve a record using a
key, the key should satisfy the following
constraints:
– Canonical form (rules that define a single representation of a key)
– Uniquely identify a record
44
Record Access: Keys
• Rules:
– E.g. if key = AMES
• Then the data can be written as Ames / AMES /
ames
• We should design a rule so that whatever
the input is:
– It converts the input to all caps.
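The rule above can be sketched as a small function; `canonicalKey` is a hypothetical name, and upper-casing is the only rule applied here.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Convert a key to canonical form: every letter in upper case.
std::string canonicalKey(const std::string& key) {
    std::string out;
    out.reserve(key.size());
    for (unsigned char c : key) out += static_cast<char>(std::toupper(c));
    return out;
}
```

Applying the same function to the stored key and to the search key means `Ames`, `AMES` and `ames` all compare equal.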

45
Record Access: Keys
• Unique key:
– i.e. there may be many records with the same
• key: AMES
• To prevent the above:
– Define a primary key
– Which is unique to a record
• We can also create a secondary key in
support of the primary key.
46
Record Access: Keys
• When we choose a primary key we
should be careful if it contains real
data:
• A key should be unchangeable.
• To avoid the above problem we should
not choose data of a record as the key
(discussed later).

47
Record Access:
Using Sequential Search
• Evaluating Performance of Sequential
Search.
• Improving Sequential Search
Performance with Record Blocking.
• When is Sequential Search Useful?

48
Record Access:
Using Sequential Search
Evaluating Performance of Sequential
Search (number of records read, for n records):
– Best case: 1
– Average case: n/2
– Worst case: n
Sequential search steps:
– One read call for each record
– Each read may require a seek to reach the record.
– E.g. 10 records => 10 read calls => 10 seeks
– Seeking takes more time than reading.
49
Record Access:
Using Sequential Search
Improving Sequential Search
Performance with Record Blocking:
•If we have 100 records => 100 read calls
•Hence make blocks of records
– E.g. 1 block => 10 records
– Then 10 blocks => 10 read calls
– Block size is usually chosen to match the sector size.
– If 1 sector => 512 bytes => 10 records
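The arithmetic above can be checked with a short sketch. Both functions are hypothetical helpers: one counts the read calls needed to scan a whole file, the other counts how many blocks a sequential search reads before it reaches a given key.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Read calls needed to scan every record when records are grouped in blocks.
std::size_t readCalls(std::size_t records, std::size_t recordsPerBlock) {
    assert(recordsPerBlock > 0);
    return (records + recordsPerBlock - 1) / recordsPerBlock;  // round up
}

// Blocks read by a sequential search before the key is found;
// returns 0 if the key is absent (a convention chosen for this sketch).
std::size_t blockReadsToFind(const std::vector<std::string>& keys,
                             const std::string& key,
                             std::size_t recordsPerBlock) {
    assert(recordsPerBlock > 0);
    for (std::size_t i = 0; i < keys.size(); ++i)
        if (keys[i] == key) return i / recordsPerBlock + 1;
    return 0;
}
```

With one record per block, 100 records need 100 read calls; with 10 records per block, the same file needs 10.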
50
Record Access:
Using Sequential Search
Points of record blocking:
– Searching is still O(n) as the number of records is the
same.
– Seek time is reduced
– The amount of data transferred is more,
• even if we need to access only the first record,
– which can be too expensive.

51
Record Access:
Using Sequential Search
When is Sequential Search Good?
– It is extremely easy to program
– It needs only simple file structures
Mainly depends on:
• Processor speed
Mainly used for:
• Tapes
• Files with a small number of records

52
Record Access:
UNIX tools for sequential
processing

File structure in UNIX:

•ASCII file: new line character => record delimiter
white space => field delimiter
•Provides a rich set of tools, which are sequential
•cat myfile: displays the contents of myfile

•wc (word count): no. of lines, words, characters

– 2 12 76
53
Record Access:
UNIX tools for sequential
processing
• grep (global regular expression print):
searches for a pattern
– grep Ada myfile: displays the matching lines

– grep Ada myfile | wc

• 1 6 36

54
Direct Access
• How do we know where the beginning of the
required record is?
– It may be in an index (discussed in a different
unit)
– We know the relative record number (RRN)
– Position of a record relative to the beginning of the file
– E.g. first record => RRN 0, next record => RRN 1
and so on

55
Direct Access
• RRNs are not useful when working with variable-
length records: the access is still sequential.
• In order to work with RRNs we need to work with
fixed-length records.
– If records are of fixed length:
• Using the RRN we can calculate the byte offset
• ByteOffset = n * r, where n => record length in bytes,
r => RRN of the record.
– If the fixed length is 512 bytes & RRN = 500 then
ByteOffset = 512 * 500 = 256000.
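A sketch of direct access by RRN, assuming fixed-length records stored back to back from the start of the stream with no header; `byteOffset` and `readByRRN` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Byte offset of a record = record length * relative record number.
std::streamoff byteOffset(std::streamoff recordLength, std::streamoff rrn) {
    assert(recordLength > 0 && rrn >= 0);
    return recordLength * rrn;
}

// Seek directly to the record and read it; false if it lies past the end.
bool readByRRN(std::istream& in, std::streamoff recordLength,
               std::streamoff rrn, std::string& record) {
    in.clear();  // forget an earlier end-of-file condition
    in.seekg(byteOffset(recordLength, rrn), std::ios::beg);
    record.resize(static_cast<std::size_t>(recordLength));
    return static_cast<bool>(in.read(&record[0], recordLength));
}
```

For 512-byte records, RRN 500 starts at byte 512 * 500 = 256000; no earlier record has to be read to get there.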
56
Record Structure
• Choosing a Record Structure and Record
Length
• Header Records
• Adding Headers to C++ Buffer classes

57
Record Structure
Choosing a Record Structure and Record Length:
•To use the RRN for direct access:
– First we should fix the record length.
– Record length means: the size of the record is to be fixed
•Two ways to do this:
– Fixed length fields

– Fixed record length

58
Record Structure
1. Fixed length field approach:
• Simplicity
2. Fixed record length approach:
• More efficient; unused space is kept at the end of the record.
In the above 2 methods one distinction must be made:
– Differentiate between real data and unused space in
the record.
– This can be done using:
• Record length indicator
• Delimiter
• Count of fields
59
Record Structure
Header records:
•Hold general information about a file.
•A header record is placed at the beginning of the file to
hold this information.
•Information in the header record:
– Count of the no. of records
– Length of the data records
– Date and time the file was last updated.
– Name of the file
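A minimal sketch of a header record, under stated assumptions: it holds only the record count and the record length, stored in 6 bytes with the high-order byte first. The struct, the layout and the function names are hypothetical; a real header could also carry the date, time and file name listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical header: record count (4 bytes) + record length (2 bytes).
struct FileHeader {
    std::uint32_t recordCount;
    std::uint16_t recordLength;
};

constexpr std::size_t kHeaderBytes = 6;

// Encode the header into 6 bytes, high-order byte first.
std::string encodeHeader(const FileHeader& h) {
    std::string out(kHeaderBytes, '\0');
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<char>((h.recordCount >> (8 * (3 - i))) & 0xFF);
    out[4] = static_cast<char>((h.recordLength >> 8) & 0xFF);
    out[5] = static_cast<char>(h.recordLength & 0xFF);
    return out;
}

// Decode the first 6 bytes of a file back into a header.
FileHeader decodeHeader(const std::string& bytes) {
    assert(bytes.size() >= kHeaderBytes);
    FileHeader h{0, 0};
    for (int i = 0; i < 4; ++i)
        h.recordCount = (h.recordCount << 8) | static_cast<unsigned char>(bytes[i]);
    h.recordLength = static_cast<std::uint16_t>(
        (static_cast<unsigned char>(bytes[4]) << 8) |
        static_cast<unsigned char>(bytes[5]));
    return h;
}

// Data records start after the header, so the header size is added to the offset.
std::uint64_t dataOffset(const FileHeader& h, std::uint32_t rrn) {
    return kHeaderBytes + static_cast<std::uint64_t>(rrn) * h.recordLength;
}
```

A program can read the header first, learn the record length from it, and only then compute where each data record begins.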
60
Record Structure
• A header record makes the file a self-describing
object
• Any program that accesses the file will know about:
– File structures used in the file
– Helps in access of a record
– E.g. header record:
61
Record Structure
• Header record an example:

62
Encapsulating Record I/O
Operation in a single class
• Till now we have done a read/write
operation in
– Two steps:
• Read/write to a buffer
• Then buffer to a file
• Here we will use a class that hides the
buffer.
• It looks as though we have read/written
directly with a file.
63
Encapsulating Record I/O
Operation in a single class
• RecordFile is a class that inherits from BufferFile
• BufferFile contains functions to read/write from a
buffer.
• We will use only these functions.

64
Encapsulating Record I/O
Operation in a single class
• Shows how the read/write functions of
BufferFile are used to perform our task of
reading/writing.

65
File Access and File
Organization: A Summary

• File organization depends on:
– What use do you want to make of the
file?
• Since using a file implies:
– File access
– File organization
– Both are linked.
66
File Access and File
Organization: A Summary
• Example:
– Fixed-length records make direct access easier.
– If the documents have variable lengths, fixed-
length records are not a good solution.
– The application determines our choice of both
access and organization.
– Hence we need to determine both the access and
organization of a file.
