Understanding Index 1

To get a clear idea on indexing and its importance, we have to understand the following questions.

1. What is an index ?

Index is an alternative access path to data. In the context of RDBMS, an Index provides alternative access path to a table row. In general data processors, an index may point to any structured/unstructured data set.

Assume we have a table Employee with definition {FirstName, LastName, EmployeeID, Designation, Department, …}. Every database uses a primary key when storing the row, so that a particular row can be identified and retrieved by the primary key value.

Assume the primary key is EmployeeID, then we can read the required row directly for above query. But what happens when we qualify the row based on Name.

To execute this query, the DBMS has to read all the rows and compare the FirstName value with string ‘Tyler’. This is very slow if the table has more rows, say – 1 Million rows. This problem can be avoided by creating an index on column “FirstName”. This way we can reach the row directly and read the designation.

Similarly if more than one column has been specified in the search query, we can create a multi-column index on those columns. This improves the query performance drastically.

In the context of unstrurtured data, an index may be just a direct link/reference to the actual data set.


2. When should you create an index?

If the table size is small there is no meaning of creating indexes. When the table size grows, if the select and update queries frequently use a non primary key column for identifying out the desired row, their performance goes bad.

The column(s) which are used in search criteria frequently should be identified and index should be created on these columns. If multiple columns are frequently used in search condition together, we may consider creating a composite index.

We should avoid creating index when the search condition typically has range conditions and selects more number of rows.

3. What are the benefits of creating index?

Indexes are created to improve performance of search queries. Index provides faster access to data record. If an index exists on a column, instead of searching the table row, we can search the index and from the index we can reach the table row.

When a table has large number of rows or many columns, the overall table size becomes huge. Accessing the table should be avoided to prevent costly IO operations. If the select list of query is present in the index, Indexes can eliminate the need for reading the base table. This saves I/O.

Indexes are internally helpful for enforcing column level constraints like uniqueness on the column.

                                                                                                                                                    Continued …