Indexing - D B M S

  • Indexing is used to optimize the performance of a database by minimizing the number of disk accesses required when a query is processed
  • The index is a type of data structure, It is used to locate and access the data in a database table quickly
  • Speeds up operation with read operations like SELECT queries, WHERE clause etc
  • Search Key
    • Contains copy of primary key or candidate key of the table or something else
  • Data Reference => Pointer holding the address of disk block where the value of the corresponding key is stored
  • Indexing is optional, but increases access speed
    • It is not the primary mean to access the tuple, it is the secondary mean
  • Index file is always sorted
  • Indexing Methods
    • Primary Index (Clustering Index)
      • A file may have several indices, on different search keys. If the data file containing the records is sequentially ordered, a Primary index is an index whose search key also defines the sequential order of the file
      • All files are ordered sequentially on some search key. It could be Primary Key or non-primary key
      • Ideas
        • Dense Index
          • The dense index contains an index record for every search key value in the data file
          • The index record contains the search-key value and a pointer to the first data record with that search-key value
          • The rest of the records with the same search-key value would be stored sequentially after the first record
          • It needs more space to store index record itself. The index records have the search key and a pointer to the actual record on the disk
          • Ordering based on Non-Key attribute
            • Data file is sorted w.r.t non-key attribute
            • No. Of entries in the index = unique non-key attribute value in the data file
            • This is dense index as, all the unique values have an entry in the index file
        • Sparse Index
          • An index record appears for only some of the search-key values
          • Sparse Index helps you to resolve the issues of dense Indexing in DBMS. In this method of indexing technique, a range of index columns stores the same data block address, and when data needs to be retrieved, the block address will be fetched
          • Ordering based on Key attribute
            • Data file is sorted w.r.t primary key attribute
            • PK will be used as search-key in Index
            • Sparse Index will be formed i.e., no. of entries in the index file = no. of blocks in datafile
      • Primary Indexing can be based on Data file is sorted w.r.t Primary Key attribute or non-key attributes
      • Multi-level Index
        • Index with two or more levels
        • If the single level index become enough large that the binary search it self would take much time, we can break down indexing into multiple levels
    • Secondary Index (Non-Clustering Index)
      • Datafile is unsorted, Primary Indexing is not possible
      • Can be done on key or non-key attribute
      • Called secondary indexing because normally one indexing is already applied
      • No. Of entries in the index file = no. of records in the data file
      • It's an example of Dense index
  • Advantages of Indexing
    • Faster access and retrieval of data
    • IO is less
  • Limitations of Indexing
    • Additional space to store index table
    • Indexing Decrease performance in INSERT, DELETE, and UPDATE query
Share: