SQL Pivot Tables Are Rockin' Awesome (Thanks Rick Osborne)

Posted January 29, 2007 at 3:47 PM

Tags: SQL

A few months ago, Rick Osborne introduced me to the idea of a SQL database Pivot Table. These pivot tables consist of a table with a single column that has an index that counts up from one (I'm sure there are variations on this) to a set number. For instance the pivot table [pivot100] has an ID column with values from 1 to 100. Rick told me that he has tables that go to 100, 1000, and 10000, but so far I have only rocked the 100 pivot table.

So anyway, this pivot table just cut a HUGE, NASTY, UGLY SQL query from about 90ms down to about 20ms. That is an outstanding performance increase. This query was huge and nasty and for reporting so it did a tone of things, but I just wanted to show you a small part of it. It is for a recycling program that has a pickups table [pickup] and join table from users to pickups [pickup_useraccount_jn]. As PART of this massive query, I had to find the number of users whose net weight of recycling over a certain date period fell into a given 5 pound bucket (ie. 0-5 lbs, 5-10 lbs, etc.) all the way up to 80 (which was 80+ NOT 80-85 lbs).

First, let's look at the original SQL query. You can assume that all the SQL variables have already been defined and that lstAvailableUsers is list of User IDs to be included in the report (I know that is lame... I am overhauling everything):

 Launch code in new window » Download code as text file »

  • <!---
  • Now, we have to loop over the 5 lb increments and find
  • the number of users that recycled a net pickup weight
  • in that sum category.
  • --->
  • <cfloop
  • index="i"
  • from="0"
  • to="80"
  • step="5">
  •  
  • SET @number_#i#_#(i + 5)# = ISNULL(
  • (
  • SELECT
  • COUNT(*)
  • FROM
  • (
  •  
  • SELECT
  • SUM(p.net_weight) AS net_weight
  • FROM
  • pickup p
  • WHERE
  • p.date_pickup >= @date_from
  • AND
  • p.date_pickup < @date_to
  • AND
  • p.user_id IN ( #lstAvailableUsers# )
  •  
  • GROUP BY
  • p.user_id
  •  
  • ) AS TempWeightTable#i#
  • WHERE
  • net_weight >= #i#
  •  
  • <!--- Check for the catch-all weight bucket. --->
  • <cfif (i LT 80)>
  • AND
  • net_weight < #(i + 5)#
  • </cfif>
  • ),
  •  
  • <!--- If above is null, return zero. --->
  • 0
  • );
  •  
  • </cfloop>

As you can see, I am basically running that inner query 16 times stepping from 0 to 80. That is a SLOW query to be running so many times. That's where the pivot table comes into play. Well, not just the pivot table, but a pivot table and temporary table variable. To clean up the query, I start out be creating a table variable for the steps (0-5, 5-10, etc):

 Launch code in new window » Download code as text file »

  • DECLARE @number_step_5 TABLE (
  • step INT,
  • user_count INT
  • );

Then, I go about populating that table using the help of the SQL pivot table:

 Launch code in new window » Download code as text file »

  • <!---
  • Set up the step-5 table. This will step from 0 to 80
  • getting counts of users who recycled a sum weight in
  • that 5 lb weight bucket.
  • --->
  • INSERT INTO @number_step_5
  • (
  • step,
  • user_count
  • )(
  • SELECT
  • tpiv.id AS step,
  • COUNT( * )
  • FROM
  • <!---
  • Take the pivot100 talbe and trim it so that we
  • are only returning pivot IDs that are a 5lb
  • increment. Also, we need to substract one from our
  • 1-based ID column as our span weibht buckets
  • start at zero.
  • --->
  • (
  •  
  • SELECT
  • (piv.id - 1) AS id
  • FROM
  • pivot100 piv
  • WHERE
  • piv.id < 82
  • AND
  • ((piv.id - 1) % 5) = 0
  •  
  • ) AS tpiv
  • INNER JOIN
  • <!---
  • Create an in-memory temp table that has all the
  • different net recycling weight sums by user. We
  • don't care about the user at this point, just the
  • total list of sums.
  • --->
  • (
  •  
  • SELECT
  • SUM( p.net ) AS net_sum
  • FROM
  • pickup p
  • INNER JOIN
  • pickup_useraccount_jn pujn
  • ON
  • (
  • p.id = pujn.pickup_id
  • AND
  • p.date_picked_up BETWEEN @date_from AND @date_to
  • AND
  • pujn.useraccount_id IN ( #lstAvailableUsers# )
  • )
  • GROUP BY
  • pujn.useraccount_id
  •  
  • ) AS tsum
  • ON
  • <!---
  • When joining the sum table to the pivot table,
  • we want the sum to be between the intial pivot
  • value (our start value) and the increment value
  • (5 lbs). However, the last bucket is a catch all
  • for 80lbs - (infinisty).... which we will get by
  • using 999999 (good enough).
  • --->
  • (
  • tsum.net_sum >= tpiv.id
  • AND
  • tsum.net_sum <
  • (
  • CASE
  • WHEN
  • tpiv.id < 80
  • THEN
  • (tpiv.id + 5)
  • ELSE
  • 999999
  • END
  • )
  • )
  • GROUP BY
  • tpiv.id
  • );

Now, this might not be the easiest query to follow, but let me tell you that it works, and it is quite a bit faster than the original query. Notice that the BIG chunk of a query (the INNER JOIN temp table) is only being run once (and then joined to) as opposed to the 16 or so times it was being run in the original query. But, we are not quite done yet. Now that I have this temp table populated, I still have to set the variables that I set in the original query:

 Launch code in new window » Download code as text file »

  • <!---
  • Now, we have to loop over the increments and find the
  • number of users that recyecled a net weight in the
  • given bucket.
  • --->
  • <cfloop
  • index="i"
  • from="0"
  • to="80"
  • step="5">
  •  
  • SET @number_#i#_#(i + 5)# = ISNULL(
  • (
  • SELECT
  • user_count
  • FROM
  • @number_step_5
  • WHERE
  • step = #i#
  • ),
  • 0
  • );
  •  
  • </cfloop>

This queries the temp table to find the value for that weight group. Since there might not be any pickups in that weight group, I must use an ISNULL() method call. The genius here though, is that the temp table [@number_step_5] should never have more than 16 or 17 rows in it and most likely less than that. That means that each of these subsequent queries that sets a variable will execute almost immediately with such a minimal amount of data to search.

Pivot tables are totally awesome! Thanks Rick, you helped me rock this SQL query hard core!

Download Code Snippet ZIP File

Comments (3)  |  Post Comment  |  Ask Ben  |  Permalink  |  Other Searches  |  Print Page




ColdFusion Jobs - Find or Post A ColdFusion Job Through DeveloperCircuit.com

Reader Comments

Hey great explanation my old lead developer explained them like this

http://labs.redbd.net/blog/index.cfm/2006/9/4/Build-a-Pivot-Table-in-ColdFusion-in-four-easy-steps

Posted by Rob on Jan 30, 2007 at 12:26 AM


I just learned of pivots today, and found this post when looking for the MSDN reference. I don't know how I missed it the first time around, but I thought it was cool that you were in the top 10 for my search.

Posted by Sammy Larbi on Apr 24, 2008 at 11:06 AM


@Sammy,

Yeah, Rick saved me a lot of time / trouble with this idea.

Posted by Ben Nadel on Apr 24, 2008 at 11:12 AM


Post Comment  |  Ask Ben


Home   |   Web Log   |   ColdFusion   |   Projects   |   Resume   |   Job Form   |   Search   |   Contact
Epicenter Consulting - Custom Software Solutions for Business Evolution HostMySite.com - The Leader In ColdFusion Hosting