Indexing is used to optimize the performance of a database by minimizing the number of disk accesses required when a query is processed
The index is a type of data structure, It is used to locate and access the data in a database table quickly
Speeds up operation with read operations like SELECT queries, WHERE clause etc
Search Key
Contains copy of primary key or candidate key of the table or something else
Data Reference => Pointer holding the address of disk block where the value of the corresponding key is stored
Indexing is optional, but increases access speed
It is not the primary mean to access the tuple, it is the secondary mean
Index file is always sorted
Indexing Methods
Primary Index (Clustering Index)
A file may have several indices, on different search keys. If the data file containing the records is sequentially ordered, a Primary index is an index whose search key also defines the sequential order of the file
All files are ordered sequentially on some search key. It could be Primary Key or non-primary key
Ideas
Dense Index
The dense index contains an index record for every search key value in the data file
The index record contains the search-key value and a pointer to the first data record with that search-key value
The rest of the records with the same search-key value would be stored sequentially after the first record
It needs more space to store index record itself. The index records have the search key and a pointer to the actual record on the disk
Ordering based on Non-Key attribute
Data file is sorted w.r.t non-key attribute
No. Of entries in the index = unique non-key attribute value in the data file
This is dense index as, all the unique values have an entry in the index file
Sparse Index
An index record appears for only some of the search-key values
Sparse Index helps you to resolve the issues of dense Indexing in DBMS. In this method of indexing technique, a range of index columns stores the same data block address, and when data needs to be retrieved, the block address will be fetched
Ordering based on Key attribute
Data file is sorted w.r.t primary key attribute
PK will be used as search-key in Index
Sparse Index will be formed i.e., no. of entries in the index file = no. of blocks in datafile
Primary Indexing can be based on Data file is sorted w.r.t Primary Key attribute or non-key attributes
Multi-level Index
Index with two or more levels
If the single level index become enough large that the binary search it self would take much time, we can break down indexing into multiple levels
Secondary Index (Non-Clustering Index)
Datafile is unsorted, Primary Indexing is not possible
Can be done on key or non-key attribute
Called secondary indexing because normally one indexing is already applied
No. Of entries in the index file = no. of records in the data file
It's an example of Dense index
Advantages of Indexing
Faster access and retrieval of data
IO is less
Limitations of Indexing
Additional space to store index table
Indexing Decrease performance in INSERT, DELETE, and UPDATE query