Indexes Explained: Boosting Query Performance Database

Welcome to week 2 Day 4 of our Database Decoded series! After mastering table relationships and JOIN operations, it’s time to tackle a critical tool for optimizing your database: indexes. Today, you’ll learn how indexes work, when to use them, and how to measure their impact on query speed. Let’s unlock the secrets to faster databases!

Why Indexes Matter

Imagine searching for a word in a 1,000-page book without an index. You’d have to scan every page—slow and inefficient. Databases face the same problem when querying unindexed tables.

Indexes solve this by:

Reducing the number of rows a query needs to scan.
Speeding up WHERE, JOIN, and ORDER BY operations.
Improving performance for large tables (10,000+ rows).

How Indexes Work

An index is a separate data structure (like a B-tree) that stores a sorted copy of selected columns. Think of it as a roadmap to your data.

Analogy:

Table: The book.
Index: The book’s index page, telling you exactly where to find specific topics.

When you search for data, the database checks the index first, then jumps directly to the relevant rows.

Types of Indexes

1. Clustered Index

Determines the physical order of data in a table.
Only one clustered index per table (usually the primary key).

Example:

-- Created automatically for PRIMARY KEY  
CREATE TABLE Employees (  
    EmployeeID INT PRIMARY KEY,  -- Clustered index  
    FirstName VARCHAR(50)  
);

2. Non-Clustered Index

A separate structure that points to the actual data.
Multiple non-clustered indexes allowed.

Example:

CREATE INDEX IX_DepartmentID ON Employees (DepartmentID);

When to Create Indexes

Create indexes for:

Columns frequently used in WHERE clauses (e.g., DepartmentID, Salary).
Columns used in JOIN conditions.
Columns in ORDER BY or GROUP BY statements.

Avoid over-indexing:

Indexes slow down INSERT, UPDATE, and DELETE operations.
Use them strategically on large, read-heavy tables.

Creating Indexes in SQL Server

Step 1: Create a Non-Clustered Index

Let’s speed up searches on the DepartmentID column in the Employees table:

CREATE INDEX IX_Employees_DepartmentID  
ON Employees (DepartmentID);

Step 2: Create a Composite Index

Index multiple columns used together in queries:

CREATE INDEX IX_Employees_Name  
ON Employees (FirstName, LastName);

Measuring Index Impact

Use SQL Server Management Studio (SSMS) to analyze performance:

Enable Execution Plan:
- In SSMS, click Include Actual Execution Plan (Ctrl + M).
- Run a query before and after creating an index.
Compare Results:
- Look for Index Seek (efficient) vs. Table Scan (inefficient).

Example:

-- Before indexing  
SELECT * FROM Employees WHERE DepartmentID = 2;  -- Table Scan  

-- After indexing  
SELECT * FROM Employees WHERE DepartmentID = 2;  -- Index Seek

Practice Tasks

Task 1: Create and Test Indexes

Add a non-clustered index to the Salary column in Employees.
Run a query to find employees earning over $60,000. Compare execution times with/without the index.

Task 2: Experiment with JOINs

Create a Projects table:

CREATE TABLE Projects (  
    ProjectID INT PRIMARY KEY,  
    ProjectName VARCHAR(100),  
    EmployeeID INT FOREIGN KEY REFERENCES Employees(EmployeeID)  
);

Insert sample data and write a LEFT JOIN query to list all employees and their projects (even if they have none).

Task 3: Analyze Query Plans

Run a SELECT query with an ORDER BY Salary clause.
Create an index on Salary and check if the execution plan changes from a Sort operator to an Index Scan.

Pro Tips

Use LEFT JOIN and RIGHT JOIN:
- LEFT JOIN: Returns all rows from the left table, even if there’s no match in the right table.
- RIGHT JOIN: Returns all rows from the right table.

Example:

-- List all employees and their projects (including unassigned employees)  
SELECT E.FirstName, P.ProjectName  
FROM Employees E  
LEFT JOIN Projects P ON E.EmployeeID = P.EmployeeID;

Drop Unused Indexes:
- Use DROP INDEX IX_Name ON TableName; to remove redundant indexes.

Common Pitfalls

Over-Indexing: Too many indexes slow down writes.
Ignoring Statistics: Use UPDATE STATISTICS TableName; to keep index performance optimal.
Forgetting Fragmentation: Rebuild indexes periodically for large tables.