0% found this document useful (0 votes)
281 views8 pages

Database Indexing Techniques Guide

There are three main types of indexing in databases: 1. Primary indexing is done on the primary key field and orders records by this field. 2. Secondary indexing is done on other fields and may index unique candidate keys or duplicate non-key fields. It requires pointers to all records with the same search key value. 3. Clustering indexing orders records by a non-key field, rather than the primary key. Indexing structures can be dense, with an index record for every search key value, or sparse, with index records only for some keys. Multilevel indexing breaks large indexes into smaller nested indexes for more efficient searching.

Uploaded by

Vijay Kumar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
281 views8 pages

Database Indexing Techniques Guide

There are three main types of indexing in databases: 1. Primary indexing is done on the primary key field and orders records by this field. 2. Secondary indexing is done on other fields and may index unique candidate keys or duplicate non-key fields. It requires pointers to all records with the same search key value. 3. Clustering indexing orders records by a non-key field, rather than the primary key. Indexing structures can be dense, with an index record for every search key value, or sparse, with index records only for some keys. Multilevel indexing breaks large indexes into smaller nested indexes for more efficient searching.

Uploaded by

Vijay Kumar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 8

We know that data is stored in the form of records.

Every record has a key field, which helps it to


be recognized uniquely.
Indexing is a data structure technique to efficiently retrieve records from the database files based
on some attributes on which the indexing has been done. Indexing in database systems is similar
to what we see in books.
Indexing is defined based on its indexing attributes. Indexing can be of the following types

Primary Index Primary index is defined on an ordered data file. The data file is
ordered on a key field. The key field is generally the primary key of the relation.

Secondary Index Secondary index may be generated from a field which is a candidate
key and has a unique value in every record, or a non-key with duplicate values.

Clustering Index Clustering index is defined on an ordered data file. The data file is
ordered on a non-key field.

Ordered Indexing is of two types

Dense Index

Sparse Index

Dense Index
In dense index, there is an index record for every search key value in the database. This makes
searching faster but requires more space to store index records itself. Index records contain
search key value and a pointer to the actual record on the disk.

Sparse Index
In sparse index, index records are not created for every search key. An index record here contains
a search key and an actual pointer to the data on the disk. To search a record, we first proceed by
index record and reach at the actual location of the data. If the data we are looking for is not

where we directly reach by following the index, then the system starts sequential search until the
desired data is found.

Multilevel Index
Index records comprise search-key values and data pointers. Multilevel index is stored on the
disk along with the actual database files. As the size of the database grows, so does the size of
the indices. There is an immense need to keep the index records in the main memory so as to
speed up the search operations. If single-level index is used, then a large size index cannot be
kept in memory which leads to multiple disk accesses.

Multi-level Index helps in breaking down the index into several smaller indices in order to make
the outermost level so small that it can be saved in a single disk block, which can easily be
accommodated anywhere in the main memory.
Dense and Sparse Indices
1. There are Two types of ordered indices:

Dense Index:
o

An index record appears for every search key value in file.

This record contains search key value and a pointer to the actual
record.

Sparse Index:

Index records are created only for some of the records.

To locate a record, we find the index record with the largest search key
value less than or equal to the search key value we are looking for.

We start at that record pointed to by the index record, and proceed


along the pointers in the file (that is, sequentially) until we find the
desired record.

2. Figures 11.2 and 11.3 show dense and sparse indices for the deposit file.

Figure 11.2: Dense index.


3. Notice how we would find records for Perryridge branch using both methods.
(Do it!)

Figure 11.3: Sparse index.


4. Dense indices are faster in general, but sparse indices require less space and
impose less maintenance for insertions and deletions. (Why?)

5. A good compromise: to have a sparse index with one entry per block.

Why is this good?


o

Biggest cost is in bringing a block into main memory.

We are guaranteed to have the correct block with this method, unless
record is on an overflow block (actually could be several blocks).

Index size still small.

Multi-Level Indices
1. Even with a sparse index, index size may still grow too large. For 100,000
records, 10 per block, at one index record per block, that's 10,000 index
records! Even if we can fit 100 index records per block, this is 100 blocks.
2. If index is too large to be kept in main memory, a search results in several
disk reads.
o

If there are no overflow blocks in the index, we can use binary search.

This will read as many as


blocks).

If index has overflow blocks, then sequential search typically used,


reading all b index blocks.

blocks (as many as 7 for our 100

3. Solution: Construct a sparse index on the index (Figure 11.4).

Figure 11.4: Two-level sparse index.


4. Use binary search on outer index. Scan index block found until correct index
record found. Use index record as before - scan block pointed to for desired
record.
5. For very large files, additional levels of indexing may be required.
6. Indices must be updated at all levels when insertions or deletions require it.
7. Frequently, each level of index corresponds to a unit of physical storage (e.g.
indices at the level of track, cylinder and disk).
Index Update

Regardless of what form of index is used, every index must be updated whenever a record is
either inserted into or deleted from the file.
1. Deletion:
o

Find (look up) the record

If the last record with a particular search key value, delete that search
key value from index.

For dense indices, this is like deleting a record in a file.

For sparse indices, delete a key value by replacing key value's entry in
index by next search key value. If that value already has an index
entry, delete the entry.

2. Insertion:
o

Find place to insert.

Dense index: insert search key value if not present.

Sparse index: no change unless new block is created. (In this case, the
first search key value appearing in the new block is inserted into the
index).

Secondary Indices
1. If the search key of a secondary index is not a candidate key, it is not enough to point to
just the first record with each search-key value because the remaining records with the
same search-key value could be anywhere in the file. Therefore, a secondary index must
contain pointers to all the records.

Figure 11.5: Sparse secondary index on cname.


2. We can use an extra-level of indirection to implement secondary indices on search keys
that are not candidate keys. A pointer does not point directly to the file but to a bucket
that contains pointers to the file.
o See Figure 11.5 on secondary key cname.

o To perform a lookup on Peterson, we must read all three records pointed to by


entries in bucket 2.
o Only one entry points to a Peterson record, but three records need to be read.
o As file is not ordered physically by cname, this may take 3 block accesses.
3. Secondary indices must be dense, with an index entry for every search-key value, and a
pointer to every record in the file.
4. Secondary indices improve the performance of queries on non-primary keys.
5. They also impose serious overhead on database modification: whenever a file is updated,
every index must be updated.
6. Designer must decide whether to use secondary indices or not.

You might also like