What Is a Secondary Index in DBMS: Explanation and Examples
A secondary index in a database is an additional data structure that speeds up queries on columns other than the primary key. Unlike the primary index, which in many systems determines the physical order of the data, a secondary index leaves the stored rows where they are and only provides a quick lookup for non-primary-key fields.
How It Works
Imagine a library where books are arranged by their unique ID numbers (primary key). If you want to find a book by its author or title, you would have to look through all the books one by one. A secondary index is like a separate catalog that lists books by author or title, pointing you to the exact shelf where the book is located.
In many databases, the primary index organizes the data physically by the primary key. A secondary index is a separate structure built on other columns: each entry stores a column value together with a reference to the matching row's location (or, in some systems, the row's primary key). When a query filters on the indexed column, the database finds the matching entries in the index and then follows the references to fetch the full rows.
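The lookup described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `rows` list, the sample employees, and the helper names `build_secondary_index` and `lookup` are hypothetical, and a plain dictionary stands in for the tree-based structures (commonly B-trees) that real database engines tend to use.

```python
from collections import defaultdict

# Hypothetical in-memory "table". The list position stands in for the
# physical row location a real database would record.
rows = [
    {"EmployeeID": 1, "Name": "Ana", "Department": "Sales"},
    {"EmployeeID": 2, "Name": "Ben", "Department": "HR"},
    {"EmployeeID": 3, "Name": "Cy", "Department": "Sales"},
]

def build_secondary_index(rows, column):
    """Map each value of `column` to the positions of the rows holding it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return index

def lookup(rows, index, value):
    """Find the row positions in the index, then fetch the full rows."""
    return [rows[position] for position in index.get(value, [])]

department_index = build_secondary_index(rows, "Department")
sales = lookup(rows, department_index, "Sales")
print([row["Name"] for row in sales])  # prints ['Ana', 'Cy']
```

Without `department_index`, finding the Sales employees would mean checking every row, which is the "look through all the books one by one" case from the library analogy.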
Example
This example shows how a secondary index can be created and used in SQL to speed up queries on a non-primary key column.
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    Name VARCHAR(100),
    Department VARCHAR(50)
);

-- Create a secondary index on Department
CREATE INDEX idx_department ON Employees(Department);

-- Query using the secondary index
SELECT * FROM Employees WHERE Department = 'Sales';
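To check that a query like this actually uses the index, most databases offer some form of query plan output. The sketch below runs the same statements in SQLite through Python's standard `sqlite3` module and asks for the plan with `EXPLAIN QUERY PLAN`. The sample rows are made up for illustration, and the exact wording of the plan text differs between SQLite versions and between database systems, so treat this as one way to verify index usage rather than a universal recipe.

```python
import sqlite3

# In-memory SQLite database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INT PRIMARY KEY,
        Name VARCHAR(100),
        Department VARCHAR(50)
    );
    CREATE INDEX idx_department ON Employees(Department);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO Employees (EmployeeID, Name, Department) VALUES (?, ?, ?)",
    [(1, "Ana", "Sales"), (2, "Ben", "HR"), (3, "Cy", "Sales")],
)

# Ask SQLite how it plans to run the query. The fourth column of each
# plan row holds the human-readable description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Employees WHERE Department = ?",
    ("Sales",),
).fetchall()
plan_text = " ".join(row[3] for row in plan)
print(plan_text)  # should mention idx_department if the index is used

# The query itself, returning the matching rows.
names = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM Employees WHERE Department = ? ORDER BY EmployeeID",
        ("Sales",),
    )
]
print(names)  # prints ['Ana', 'Cy']
```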
When to Use
Use a secondary index when you often query a database table by columns other than the primary key. For example, if you frequently search employees by department or customers by city, a secondary index speeds up these lookups.
However, secondary indexes add overhead when inserting, updating, or deleting data, because every index on the table must be kept in sync with the rows, and each index also takes extra storage. So, use them when read performance on non-primary columns is more important than write speed.
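The write overhead can be made concrete with a toy model. The `IndexedTable` class below is hypothetical and far simpler than a real storage engine; its `index_writes` counter is just an illustrative way to show the extra work the index causes on inserts and on updates to the indexed column.

```python
from collections import defaultdict

class IndexedTable:
    """Toy table with one secondary index, for illustration only."""

    def __init__(self, indexed_column):
        self.rows = {}                   # primary key -> row
        self.indexed_column = indexed_column
        self.index = defaultdict(set)    # column value -> primary keys
        self.index_writes = 0            # extra writes caused by the index

    def insert(self, key, row):
        self.rows[key] = row
        # Every insert also has to add an entry to the index.
        self.index[row[self.indexed_column]].add(key)
        self.index_writes += 1

    def update(self, key, column, value):
        row = self.rows[key]
        if column == self.indexed_column and row[column] != value:
            # Changing the indexed column means removing the old entry
            # and adding a new one.
            self.index[row[column]].discard(key)
            self.index[value].add(key)
            self.index_writes += 2
        row[column] = value

table = IndexedTable("Department")
table.insert(1, {"Name": "Ana", "Department": "Sales"})
table.insert(2, {"Name": "Ben", "Department": "HR"})
table.update(2, "Department", "Sales")  # touches the index
table.update(1, "Name", "Anna")         # does not touch the index
print(sorted(table.index["Sales"]), table.index_writes)  # prints [1, 2] 4
```

In this model, updates to columns that are not indexed cost nothing extra, which is one reason to index only the columns that queries actually filter or sort on.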
Key Points
- A secondary index is an extra data structure for fast lookups on non-primary key columns.
- It does not affect the physical order of data in the table.
- Secondary indexes improve read query speed but add overhead on writes.
- They are useful when queries often filter or sort by columns other than the primary key.