Back to articles list August 14, 2018 - 4 minutes read SQL Indexing 101 Aldo Zelen Aldo is a data architect with a passion for the cloud. From leading a team of data professionals to coding a data warehouse in the cloud, Aldo has experience with the whole lifecycle of data-intensive projects. Aldo spends his free time forecasting geopolitical events in forecasting tournaments. Tags: data analysis get started with SQL how to in sql SQL basics Indexes are one of the most misused and misunderstood entities in physical database design. A good understanding of indexes and how they solve database performance problems is necessary for any database novice. In this article, we'll look at basic database indexes and their role in database development. To picture what an index is, consider a textbook. At the end of most textbooks is an index listing all the terms one can find in the text and the pages on which they appear. The same is true of a database index—its purpose is to store the locations of indexed data, with the indexed data stored in a certain order, so we can access the data faster. An index is stored as a separate structure inside a database; it does not alter the structure of a database table in any way when it's created. The most important index in modern RDBMSes is known as a B-tree index. There are other types of indexes, too, like bitmap indexes and bloom indexes, but we're not going to worry about those in this article. Let's get started! B-tree indexes Why are the most common database indexes known as B-tree indexes? Well, quite simply, the "B" stands for balanced. B-tree indexes are balanced in the sense that the depth of any given node in such a "balanced" tree is equal for every indexed data location. In other words, with B-tree indexing, it takes the same amount of time to access the data stored in every location in the table, regardless of that data's depth in the table. To see how B-trees are constructed, let's look at a simple customers table for a pet store: Let's explore a simple B-tree structure for an index on the id column, which holds unique integer values from 1 through 34. In our case, the root of the B-tree consists of two values: 32 and 17. This root branches out to two nodes that consist of four values each (32, 28, 24, 21 on the left and 17, 13, 9, 4 on the right). In turn, each of these two nodes branches out to four leaf nodes, for a total of eight leaves in the tree. How does SQL indexing work? Let's say you need to retrieve every column of the row with an id of 10. If the database uses the index to conduct this search, it first starts with the root node and asks if the value 10 is less than or equal to or greater than 17. The database finds that 10 is less than 17, so it branches to the right. In the branch node, we see that 10 is between 13 and 9, so we're routed to the leaf node that starts with 13. In this leaf node, we find our id and the corresponding address of the row. We see that to get to our desired row, the database had to use only three jumps. Primary key indexes Creating primary keys for a table will assign a unique index to the column(s) on which the primary key was constructed. Primary keys are tied to basic CREATE TABLE commands. If you're not familiar with primary keys, don't panic. They're easier than they seem. To create a UNIQUE index for the name column of the customers table, we would write: CREATE UNIQUE INDEX name_unique_index ON customers (name); You can probably guess that this index doesn't make sense, since you can have customers who share the same name. A better fit for this table would be the creation of a NON-UNIQUE index. The syntax is even simpler; you just omit the UNIQUE keyword. CREATE INDEX name_index ON customers (name); Final thoughts Indexing a database model is a primary database developer task but in some organizations, indexing is also done by database administrators, as they know the inner workings of the database best. The developers need to know how to index properly, as indexes speed up data retrieval. But bad indexing can break applications. The best way to learn SQL indexing is to open a database terminal and start executing queries. Try querying a table with and without indexes; the difference in speed will be quite noticeable, especially for large datasets. Tags: data analysis get started with SQL how to in sql SQL basics You may also like You Want to Learn SQL? You've Come to the Right Place! If you want to learn SQL basics or enhance your SQL skills, check out LearnSQL.com for a wide range of SQL courses and tracks. Read more 5 Advanced SQL Constructions Every Data Analyst Should Learn Learn five advanced SQL concepts every data analyst should know. Improve your SQL knowledge with real-life examples and detailed explanations. Read more Advanced SQL Interview Questions (and Answers) Here are ten common questions you might hear during an advanced SQL interview. We also have the answers for you. Read more Advanced SQL Courses for Data Analysts – A Complete Review Data analysts, do you want to improve your skills and move to a higher level of SQL? Here’s an overview of 8 online SQL courses for advancing analysts. Read more What SQL Practice Do You Need to Prepare for a Technical Job Interview? Need some SQL practice before a technical job interview? Here are six ways to prepare for various types of database and SQL jobs! Read more High-Performance Statistical Queries: Dependencies Between Discrete Variables Contingency tables and the chi-squared test—which method is better to test dependencies that do not rely on numeric values? Read more Digging Into Data: Explore and Analyze Survey Results With SQL You've carried out a customer survey and collected all the results. What's next? It's time to dig into your data and here comes SQL to help you out! Read more Introduction to Reporting with SQL — the Ultimate Tutorial for Business Professionals The right business decision-making is not based on intuition but data analysis. Learn how to perform simple SQL queries for data reporting. Read more High Performance Statistical Queries –Skewness and Kurtosis In this article, I am explaining the third and fourth population moments, the skewness and the kurtosis, and how to calculate them. Read more SQL Statistical Analysis Part 3: Measuring Spread of Distribution In this article, we’ll explain how to find the spread of a distribution in SQL to take your SQL statistical analysis skills to the next level! Read more SQL Statistical Analysis Part 2: Calculating Centers of Distribution In this article, we’ll focus on calculating centers of distribution. We’ll learn how to calculate the SQL median, SQL mode, and various types of mean. Read more SQL Statistical Analysis Part 1: Calculating Frequencies and Histograms If you wonder whether you can perform statistical analysis in SQL, the answer is ‘yes’. Read my article to learn how to do this! Read more Subscribe to our newsletter Join our weekly newsletter to be notified about the latest posts.