I Just Felt The Awesome Power Of SQL Table Indexing (Thanks Nate Nelson!)

Posted July 12, 2007 at 7:28 PM by Ben Nadel

Tags: SQL

I have to admit, I was never formally trained as a database administrator. Other than an introductory database course that I took about 7 years ago, most of what I know about databases I have learned on my own or by talking to other people. As such, I am not the best database schema developer nor do I know much about SQL database optimization. As such, it shouldn't be that surprising that yesterday I found myself trying to optimize a page that was taking 4-5 seconds to process each record (for an estimated total of 45-minutes... if it didn't keep timing out).

Normally, I might look at this page and its fairly long and complex SQL algorithms and think to myself, 45 minutes is long, but maybe that's as good as its gonna get.

Well, Thank God I attended Nate Nelson's CFUNITED sessions on Database Performance Tuning and Beyond Basic SQL. Nate spent about a quarter of the time driving home just how important indexes are when it comes to database performance. He also repeatedly stressing the fact that databases are really amazing and that they should be super fast; that you should rarely expect a well built database to perform slowly. It was this message that I kept hearing in my head as I kept pouring over my huge ass SQL queries.

Instead of settling for less than even mediocre database performance, I took one of his [Nate's] optimization tips and I went through the various tables involved and I indexed all of the foreign keys. This is the first time I have ever indexed a foreign key. Heck, it's the first time that I've ever indexed anything but the primary key fields.

Re-running the page, I actually got giddy with excitement and drew many sideways glances with my uncontrollable school-girl giggling. The page, which had been taking an estimated 45 minutes to process had just finished processing in 16 seconds.

Let me make that as clear as possible: Taking Nate Nelson's SQL optimization advice and adding foreign key indexes to my tables, the page went from taking 45-minutes down to 16 seconds.

What?!?!?

And so it was that on July 11th in the year of 2007, Ben Nadel saw the power of indexes and it was good.


 
 
 

 
Gettin' It Done In 16 Seconds - Now That's A Performance That Puts A Smile On My Face  
 
 
 


Reader Comments

Jul 12, 2007 at 7:50 PM // reply »
76 Comments

That is insane, I cannot believe just how much the increase is. I also have never bothered to index foreign keys either so obviously should do so :D

BTW nice pic :P


Jul 12, 2007 at 7:53 PM // reply »
11,243 Comments

Yeah, its pretty crazy! It went from 2700 (estimated) seconds down to 16 seconds. That's .006% of the original time. It's funny, when I was listening to Nate speak, he really was pushing the indexing which got a little bit repetitive, but in hindsight, it was exactly the right move.

Now, I need to learn more about indexes. Apparently they come with costs (both for upkeep time and for storage I assume). I gotta figure out where the good balance is.


Jul 12, 2007 at 7:54 PM // reply »
76 Comments

Umm also do what exactly do you mean by indexing the foreign keys? Does that mean that you are adding a new "foreign key" or a new "index" that sorts the foreign key's column?


Jul 12, 2007 at 7:56 PM // reply »
76 Comments

Yeah I have used secondary indexes before and the upkeep time I think is each new record will mean that the index will have to be recalculated.


Jul 12, 2007 at 8:00 PM // reply »
11,243 Comments

I have a primary key, which is usually an auto-incrementer. Then I have other row-related columns. Then, I usually have one or two (or more) foreign keys, which are just IDs that correspond to other primary keys of other tables.

So, for example, if I had an phone record:

ID (primary key)
Number (the phone number)
Type (type - cell, home, work, etc)
ContactID (the primary key of the related contact)

The ID, which is a primary key, is automatically indexed. What I did was then put an index on the ContactID column. That way, if / when I join the Phone table to the Contact table (on the Phone.ContactID = Contact.ID columns), the Database does super fast look ups on both columns.... rather than just a super fast look up on Contact.ID and a full table scan on the Phone.ContactID table.

Does that clear it up at all? I had more complicated joins and several tables, but same idea.


Jul 12, 2007 at 8:10 PM // reply »
76 Comments

Yeah I get it, I have done that before on occasions. I got confused I thought that you were refering to creating a foreign key constraint son foreign keys which I haven't done

Good info mate :)


Jul 12, 2007 at 8:13 PM // reply »
11,243 Comments

I have never set up constraints... I always do that in my business logic / code, which I am told is a no-no.


Jul 12, 2007 at 8:15 PM // reply »
76 Comments

Yeah I agree :)


Jul 12, 2007 at 8:24 PM // reply »
2 Comments

This book really blew my mind as far as SQL goes, I don't know how I ever got a long without it!

http://www.amazon.com/Art-SQL-Stephane-Faroult/dp/0596008945/ref=pd_bbs_sr_1/102-6682395-3552161?ie=UTF8&s=books&qid=1184286210&sr=8-1


Jul 12, 2007 at 8:49 PM // reply »
45 Comments

Holy crap! :)

First, it's amazing that you had the patience to let something run for 45 minutes, so congrats on that, heheh. But cutting the execution time down to 16 seconds just by putting indexes on your FK's is amazing... I never thought a bunch of unindexed queries would run that badly, let alone optimize so well.

Nice one.


Jul 12, 2007 at 9:06 PM // reply »
11 Comments

Hi Ben,

First congratulations. Indexing is very important.

In SQL Server 2005, the functionality of non-clustered indexes is extended by adding non-key columns to the leaf level of the non-clustered index. Non-key columns, can help to create cover indexes.By including non-key columns, you can create non-clustered indexes that cover more queries. The Database Engine does not consider non-key columns when calculating the number of index key columns or index key size. Non-key columns can be included in non-clustered index to avoid exceeding the current index size limitations of a maximum of 16 key columns and a maximum index key size of 900 bytes. Another advantage is that using non-key column in index we can have index data types not allowed as index key columns generally.

Kind Regards,
Pinal Dave


Jul 12, 2007 at 9:12 PM // reply »
32 Comments

That's cool! I love it when things like that happen. I was wondering how you got that much improvement on a foreign key index, but then in your comments section it made sense. The more unique the values are in a column, the more an index will be beneficial. There is a cost associated with maintaining indexes though, so don't get carried away. But, joining or frequent searches on a column make them good candidates to consider for an index.

CoolJJ


Jul 12, 2007 at 9:15 PM // reply »
32 Comments

Oh yeah, I was gonna also say I used to have abs like that a couple of years ago. Working on gettin em back. Maybe another month. Sweet pic!


Jul 12, 2007 at 9:28 PM // reply »
56 Comments

Welcome grasshopper, we've been waiting for you.

Now you need to learn the value of clustered indexes and indexed views and you can squeeze even more out of your database.

Just be careful when using indexes that you don't go ape and start indexing even single column in your table.

A good and very basic rule of thumb is this:

If a column is used in a WHERE or and ON clause it should be indexed.


Jul 12, 2007 at 9:32 PM // reply »
56 Comments

BTW

To learn more about databases and performance, I would suggest you read the SQL Authority blog:

http://blog.sqlauthority.com/

and who could forget the SQL Server Performance website:

http://www.sql-server-performance.com/


Jul 13, 2007 at 2:57 AM // reply »
11 Comments

I recommend http://blog.sqlauthority.com/2007/07/05/sql-server-2005-best-practices-analyzer-tutorial-sample-example

Thanks Tony.

Pinal Dave


Jul 13, 2007 at 4:47 AM // reply »
8 Comments

This is a strange coincidence, that on my birthday you also become enlightened with the power of SQL... maybe something more to add to my Synchronicity post? http://www.markdrew.co.uk/blog/index.cfm/2007/7/9/Eclipse-ColdFusion-and-Mark-Drew-Synchronised


Ed
Jul 13, 2007 at 6:24 AM // reply »
12 Comments

There are ways also to see what indexes may be needed or what indexed are not really needed. Because often too many indexes is same bad as no indexes at all.

I Think you may be interested in this:
http://www.1smartsolution.com/blog/index.cfm?mode=entry&entry=7C39E62B-B03E-D116-E0541B860E7F4D7B

and this:
http://www.1smartsolution.com/blog/index.cfm?mode=entry&entry=700D2E01-D4E9-EDC5-40ED8D24D88E8F92


Jul 13, 2007 at 9:03 AM // reply »
11,243 Comments

@Joe,

Thanks for the book recommendation (and one that's not too expensive ;)). I will try to check that out.

@Justin,

I didn't have a chance to wait 45 minutes. The page was set to timeout after 10 minutes. However, it was taking 4-5 seconds to process each record from an imported text file. Taking that, and extrapolating out, I estimate that, based on a known number of records, it would take 45 minutes at 4-5 seconds a record to finish processing.

@Pinal Dave,

I hardly understand what you just said :) But, I just found your site through Ben Forta's blog, so I am sure I will be reading up on all of your SQL insight.

@CoolJJ,

Yeah, I know there are costs, but I know nothing about the trade-offs at this point. I can tell you that this page was run many times a day so it needs to be super fast.

@Tony,

Glad to finally have arrived ;) Yeah, I am starting to see the whole ON clause optimization. I always just assumed that since these were numeric fields, they would be really fast anyway... live and learn.

@Mark,

Happy (belated) Birthday!

@Others,

I will check out those links. This stuff looks very exciting (in a really geeky, not exciting to most people sort of way) :)


Sep 13, 2007 at 2:14 PM // reply »
3 Comments

Agreed on indexes I remember 3 years back one of the application was taking atleast 5 minutes to show the results from one of the query and once my senior db admin added couple of indexes it took seconds to run that query ....always use them now but as mentioned too many of them can do more harm than good so you need to be sure what to index :).



Post A Comment

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

Please review the following issues:

Author Name:


Author Email:

Author Website:

Comment:

Supported HTML tags for formatting: <strong>bold</strong>   <em>italic</em>   <code>code</code>







  • Help Wanted - Find Your Next ColdFusion Job
Ben Nadel's Company - Epicenter Consulting Recent Blog Comments
May 23, 2013 at 4:26 PM
ColdFusion QueryAppend( qOne, qTwo )
@Heather, Glad people are still getting value out of this! ... read »
May 23, 2013 at 3:49 PM
Strange Interaction Between DeserializeJson(), ArrayContains(), And Database Values In ColdFusion
@WebManWalking, I meant the code at the bottom (not the video). I did try to experiment with an intermediary variable, like: value = users.id[ i ]; arrayContains( userIDs, value ); ... but t ... read »
May 23, 2013 at 11:06 AM
Strange Interaction Between DeserializeJson(), ArrayContains(), And Database Values In ColdFusion
@Ben, Are you talking about As Number: YES As String: YES As Java: YES? If so, that's with 3 different ways of referencing the constant 1, not users.id[1]. Query object references(*) are what seem ... read »
May 23, 2013 at 9:55 AM
Strange Interaction Between DeserializeJson(), ArrayContains(), And Database Values In ColdFusion
@Dan, According to the CF Admin, I'm running Java "1.6.0_45". As far as the DB column, in the database it's an INT. I'll see if I can dig into what CF sees it as. @WebManWalking, But h ... read »
May 23, 2013 at 9:49 AM
Strange Interaction Between DeserializeJson(), ArrayContains(), And Database Values In ColdFusion
@Ben, I think the problem is that we're used to loose typing in ColdFusion, like JavaScript. If a value is a number but it's needed in an expression to be a string, noooo problem. I've encountered ... read »
May 23, 2013 at 9:47 AM
ColdFusion QueryAppend( qOne, qTwo )
You rock! Thank you, thank you, thank you!!! ... read »
May 23, 2013 at 5:19 AM
Ask Ben: Print Part Of A Web Page With jQuery
How to print also the background color of table cells and table lines ... read »
May 23, 2013 at 3:55 AM
Javascript Array Methods: Unshift(), Shift(), Push(), And Pop()
very interesting and helpful too. ... read »
InVision App - Prototyping Made Beautiful With Prototyping Tools