
ss2 DPR Second Term

File organization refers to the logical structure and physical placement of records in a file. There are several methods of file organization including serial, sequential, indexed, random, heap, hash, and clustered. Each method has advantages and disadvantages for storage efficiency and speed of different file operations like search, insert, delete, etc. Common computer file types are master files, transaction files, and reference files which are classified based on attributes like content, organization method, and storage medium.

Uploaded by

Taze Utoro
Copyright
© All Rights Reserved

TOPIC 1: FILE ORGANIZATION

File organization is the logical relationship among the various records that constitute
a file, particularly with respect to the means of identifying and accessing any
specific record. In simple terms, storing the records of a file in a certain order is called
file organization. File structure refers to the format of the label and data blocks and
of any logical control records.

Types of file organization methods


Serial:

A serial file is one in which the records have been stored in the order in which they have arisen.
They have not been sorted into any particular order. An example of a serial file is an unsorted
transaction file. A shopping list is an example of a non-computerized serial file. Serial files can
be stored on tape, disc or in memory.

Sequential:

A sequential file contains records held one after another in a fixed order, normally the
sequence of a key field such as a customer number. The records are stored in physically
contiguous blocks, and within each block the records are in sequence. Records in these files
can only be read or written sequentially.

Indexed:

An indexed file organization contains reference numbers, like employee numbers, that identify
a record in relation to other records. These references are called primary keys, and each one is
unique to a particular record. Alternate keys can also be defined to allow other methods of
accessing the record. For example, instead of accessing an employee's record using the employee
number, you can use an alternate key that references employees by department. This allows
greater flexibility for users to randomly search through thousands of records in a file. However,
it requires more complex programming to implement.

Random file:

A random file is a file organized via an index. Also called a "direct file" or "direct access file," it
enables quick access to specific records or other elements within the file rather than having to
read the file sequentially. The index points to a specific location within the file, and the file is
read from that point.

Heap File Organization


An unordered file, sometimes called a heap file, is the simplest type of file organization.
Records are placed in the file in the same order as they are inserted. A new record is inserted in
the last page of the file; if there is insufficient space in the last page, a new page is added to the
file. This makes insertion very efficient. However, as a heap file has no particular ordering with
respect to field values, a linear search must be performed to access a record. A linear search
involves reading pages from the file until the required record is found.
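
The insertion and linear search described above can be sketched in Python. This is only an in-memory illustration, not a real disk file: the class name HeapFile, the "id" field and the page capacity of 3 records are all assumptions made for the example.

```python
# Hypothetical sketch of a heap (unordered) file held in memory.
PAGE_CAPACITY = 3  # assumed number of records that fit on one page


class HeapFile:
    def __init__(self):
        self.pages = [[]]  # the file starts with a single empty page

    def insert(self, record):
        """Place the record in the last page; add a page if that one is full."""
        if len(self.pages[-1]) >= PAGE_CAPACITY:
            self.pages.append([])
        self.pages[-1].append(record)

    def search(self, key):
        """Linear search: read page after page until the record is found."""
        pages_read = 0
        for page in self.pages:
            pages_read += 1
            for record in page:
                if record["id"] == key:
                    return record, pages_read
        return None, pages_read


heap = HeapFile()
for student_id in [40, 12, 77, 5, 63]:
    heap.insert({"id": student_id})

record, pages_read = heap.search(63)
print(record, "found after reading", pages_read, "page(s)")
```

Insertion never looks at the existing records, which is why it is cheap, while the search may have to read every page when the record is near the end or missing.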

Sequential File Organization:

In a sequential file organization, records are stored one after another in a fixed sequence.
You cannot insert a new record between existing records, but only after the last
record. It is a simple file organization that suits processing batches of records in the file
without adding or deleting anything. However, to access a particular record, processing must
run through all the records that come before it, because there is no index or computed address
that identifies the location of the record.

Hash File Organization


In a hash file, records are not stored sequentially. Instead, a hash function is used to
calculate the address of the page in which each record is to be stored.
The field on which the hash function is calculated is called the hash field, and if that field is
also the key of the relation it is called the hash key. Records appear to be randomly distributed
in the file, so hash files are also called random or direct files. Commonly an arithmetic function
is applied to the hash field so that records are evenly distributed throughout the file.
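
As a rough illustration of the idea, the sketch below uses the remainder after division as the arithmetic hash function. The number of pages (5), the "id" hash key and the function names are assumptions chosen for the example; real systems may use other functions.

```python
# Hypothetical sketch of a hash file: the page address is computed from the key.
NUM_PAGES = 5  # assumed number of pages (buckets) in the file

pages = [[] for _ in range(NUM_PAGES)]


def page_address(hash_key):
    """Arithmetic hash function: remainder after dividing by the page count."""
    return hash_key % NUM_PAGES


def insert(record):
    """Store the record in the page that its hash key maps to."""
    pages[page_address(record["id"])].append(record)


def find(hash_key):
    """Go straight to one page and look only at the records held there."""
    for record in pages[page_address(hash_key)]:
        if record["id"] == hash_key:
            return record
    return None


for student_id, name in [(101, "Ada"), (102, "Bola"), (205, "Chidi"), (310, "Dayo")]:
    insert({"id": student_id, "name": name})

print(page_address(205))  # the page that holds record 205
print(find(205))
```

Two different keys can map to the same page (205 and 310 here), so the page is still searched for the exact key, but only that one page is read.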

Cluster File Organization


A traditional file system is a hierarchical tree of directories and files implemented on a raw
device partition. Clustered file organization is not considered good for
large databases. In this mechanism, related records from one or more relations are kept in the
same disk block; that is, the ordering of records is not based on the primary key or search key.

Indexed Sequential Access Method (ISAM)


In an ISAM system, data is organized into records which are composed of fixed-length fields.
Records are stored sequentially. A secondary set of tables known as indexes contains
"pointers" to the records, allowing individual records to be retrieved without having to search
the entire data set.
An index is a data structure that allows the DBMS to locate particular records in a file more
quickly and thereby speed up responses to user queries. An index in a database is similar to an
index in a book. It is an auxiliary structure associated with a file that can be referred to when
searching for an item of information, just as in the index of a book we look up a keyword to
get a list of one or more pages the keyword appears on.
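
A minimal sketch of this kind of lookup is shown below, assuming records are stored in key order in blocks of two and that the index holds the first key of each block. The sample records, the block size and the function name lookup are assumptions for illustration only.

```python
from bisect import bisect_right

# Records stored sequentially in key order, grouped into blocks (assumed size 2).
blocks = [
    [(10, "Ada"), (20, "Bola")],
    [(30, "Chidi"), (40, "Dayo")],
    [(50, "Emeka"), (60, "Funmi")],
]

# The index: the first key of each block, in the same order as the blocks.
index_keys = [block[0][0] for block in blocks]  # [10, 30, 50]


def lookup(key):
    """Use the index to pick one block, then search only that block."""
    position = bisect_right(index_keys, key) - 1
    if position < 0:
        return None  # key is smaller than every key in the file
    for record_key, name in blocks[position]:
        if record_key == key:
            return name
    return None


print(lookup(40))
```

Like the index of a book, index_keys is much smaller than the data itself, and consulting it first means only one block has to be read.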

Methods of accessing files

Serial files: The only way to access a serially organized file is serially, that is, one record
after another in the order in which the records are stored.

Sequential files: The method of access used is still serial, but the records are now in
sequence, and for this reason the term sequential is often used to describe serial access of a
sequential tape file. It is important to note that to process (e.g. update) a sequential master
tape file, the transaction file must also be in the sequence of the master file. Access is achieved
by first reading the transaction file and then reading the master file until the matching record
(using the record keys) is found. Note therefore that if the record required is the twentieth
record on the file, in order to get it into storage to process it the computer will first have to
read in all nineteen preceding records.
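
The matching of a sorted transaction file against a sorted master file can be sketched as follows. The record layout (key, balance), the sample figures and the function name update_master are assumptions; both lists are assumed to be in ascending key order already.

```python
def update_master(master, transactions):
    """Apply sorted transactions to a sorted master file in a single pass.

    Both arguments are lists of (key, amount) tuples in ascending key order.
    Returns the updated master records and any transactions with no match.
    """
    updated, unmatched = [], []
    t = 0  # position of the next unread transaction
    for key, balance in master:
        # Transactions with a smaller key have no master record.
        while t < len(transactions) and transactions[t][0] < key:
            unmatched.append(transactions[t])
            t += 1
        # Apply every transaction that matches this master record.
        while t < len(transactions) and transactions[t][0] == key:
            balance += transactions[t][1]
            t += 1
        updated.append((key, balance))
    unmatched.extend(transactions[t:])  # keys beyond the last master record
    return updated, unmatched


master = [(1, 100), (2, 50), (3, 0), (4, 75)]
transactions = [(2, 25), (2, 10), (4, -5), (9, 30)]

new_master, errors = update_master(master, transactions)
print(new_master)
print(errors)
```

Because both files are in the same sequence, neither file is ever read backwards, which is what makes this approach workable on tape.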

Random files: Generally speaking the method of accessing random files is RANDOM. The
transaction record keys will be put through the same mathematical formula as were the keys of
the master records, thus creating the appropriate bucket address. The transactions in random
order are then processed against the master file, the bucket address providing the address of
the record required.

Computer files classification:

1. Master file: These are files of a fairly permanent nature, e.g. customer ledger, payroll,
inventory, and so on. A key feature is the regular updating of these files to show the current
position. For example, when a customer's order is processed, the "balance owing"
figure on the customer ledger record increases. Master records therefore contain both
data of a static nature, e.g. a customer's name and address, and data that by its nature changes
each time a transaction occurs, e.g. the "balance owing" figure already mentioned.

2. Transaction file: This is also known as a movement file. It is made up of the various
transactions created from the source documents. In a sales ledger application, the file will
contain all the orders received at a particular time. This file will be used to update the master
file. As soon as it has been used for this purpose it is no longer required. It will therefore have
a very short life, because it will be replaced by a file containing the next batch of orders.

3. Reference files: A file with a reasonable amount of permanency. Examples of data used for
reference purposes are price lists, tables of rates of pay, names and addresses.

Criteria for classifying computer files

By nature of the content: it refers to the kind of data the file holds, e.g. master, transaction
or reference data.

By organization method: it refers to the way the records in a file are arranged, e.g. serial,
sequential, random and so on.

By storage medium: it refers to the storage device on which a file is stored, such as
magnetic disk, optical disk, magnetic tape and so on.

SUBTOPIC 2
Basic Operations

Scan: Fetch all records in the file. The pages in the file must be fetched from the disk into the
buffer pool. There is also a CPU overhead per record for locating the record on the page.

Search with equality selection: Fetch all records that satisfy an equality selection, for
example, find the student record for the student with score 23. Pages that contain qualifying
records must be fetched from the disk, and qualifying records must be located within the
retrieved pages.

Search with range selection: Fetch all records that satisfy a range selection. For example,
find all student records with a name alphabetically after Smith.

Insert: Insert a given record into the file. We must identify the page in the file into which the
new record must be inserted, fetch that page from the disk, modify it to include the new record
and then write back the modified page.

Delete: Delete a record that is specified using its record id. We must identify the page in the
file that contains the record, fetch that page from the disk, modify it to remove the record and
then write it back.

Locate: Every open file has a file pointer, which gives the current position at which data is to be
read or written.

Write: When a user opens a file in write mode, they can edit its
contents. The edit can be a deletion, insertion or modification.

Read: By default, when a file is opened in read mode, the file pointer points to the beginning
of the file.
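
The file pointer can be observed directly in Python, as in the short sketch below. The file name students.dat and its contents are made up for the example, and binary mode is used so that positions are plain byte offsets. Note that in Python, opening with mode "wb" creates the file or overwrites its previous contents.

```python
import os
import tempfile

# Hypothetical file created in a temporary folder for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "students.dat")

# Write: open in write mode and store some data.
with open(path, "wb") as f:
    f.write(b"ADA BOLA CHIDI")

# Read and Locate: the pointer starts at the beginning and moves as we read.
with open(path, "rb") as f:
    start = f.tell()       # position immediately after opening
    first = f.read(3)      # read 3 bytes; the pointer moves forward
    after_read = f.tell()
    f.seek(4)              # move the pointer to byte 4
    second = f.read(4)

print(start, first, after_read, second)
```

Here tell() reports where the pointer is and seek() moves it, so a program can jump to a known position instead of reading everything before it.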

Comparison between the Three File Organizations

- A hashed file does not utilize space quite as well as a sorted file, but
insertions and deletions are fast, and equality selections are very fast.
- A heap file has good storage efficiency and supports fast scan, insertion and
deletion of records. However, it is slow for searching.
- A sorted file also offers good storage efficiency, but insertion and deletion
of records are slow. It is quite fast for searching, and it is the best structure
for range selections.
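
To see why a sorted file suits range selections, the sketch below answers the query "names alphabetically after Smith" on a sorted list and on an unsorted (heap) list. The names and variable names are made up for the example.

```python
from bisect import bisect_right

sorted_names = ["Adams", "Brown", "Smith", "Taylor", "Young"]  # sorted file
heap_names = ["Young", "Adams", "Taylor", "Smith", "Brown"]    # heap file

# Sorted file: a binary search finds where the range starts, and every
# record from that point onwards qualifies.
start = bisect_right(sorted_names, "Smith")
from_sorted = sorted_names[start:]

# Heap file: every record has to be examined.
from_heap = [name for name in heap_names if name > "Smith"]

print(from_sorted)
print(sorted(from_heap))
```

Both approaches return the same records, but the sorted file locates the start of the range in a few comparisons, while the heap file must scan the whole file.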
