Creating In-Memory Tables In SQL For Joining And Sumarizing

Posted September 11, 2006 at 2:49 PM

Tags: ColdFusion, SQL

I use this method all the time and it works great so I thought I would share it in case any one else might fancy it. Often times, I will need to get a summary of data form a SQL data table based on dates (either by day, month, year, or what have you - but this could be applied to non-date situation quite the same). The problem is, that I don't want to make too many calls to the database and I don't want to make some really complicated GROUP BY statement.

To overcome this, I create a temporary in-memory table that has, as its records, the dates that I need to summarize. I can't just create this table with initial data, so I have to declare the data table first:

 Launch code in new window » Download code as text file »

  • DECLARE @dates TABLE (
  • today DATETIME
  • );

In this case, the data table only has only column of type DATETIME which will be the date that I am sumarizing. Then, I use ColdFusion to insert dates into the temp table. In this case, I am inserting two weeks worth of days:

 Launch code in new window » Download code as text file »

  • <!--- Loop over the dates to create the table values. --->
  • <cfloop index="intOffset" from="14" to="0" step="-1">
  •  
  • INSERT INTO @dates (
  • today
  • )(
  • SELECT
  • <cfqueryparam
  • value="#Fix( Now() - intOffset )#"
  • cfsqltype="CF_SQL_TIMESTAMP"
  • />
  • );
  •  
  • </cfloop>

Notice that for each iteration, I am inserting one record into the @dates table. Also notice how I am creating the date; I am just subratracting the offset from Now() and Fix()'ing it to get the day-only value. REMEMBER the wicked sweet date-math features of ColdFusion.

Once I have this table, I select right out of it, and for each date, I create a summary of information from another table.

 Launch code in new window » Download code as text file »

  • SELECT
  • d.today,
  •  
  • <!--- Get the number of logins for this day. --->
  • (
  • SELECT
  • COUNT( * )
  • FROM
  • useraccount_login l
  • WHERE
  • l.date_logged_in BETWEEN d.today AND (d.today + 1)
  • ) AS login_count,
  •  
  • <!--- Get me the number of unique logins for this day. --->
  • (
  • SELECT
  • COUNT( DISTINCT l.useraccount_id )
  • FROM
  • useraccount_login l
  • WHERE
  • l.date_logged_in BETWEEN d.today AND (d.today + 1)
  • ) AS unique_login_count
  • FROM
  • @dates d
  • ;

In this case, we are not performaing a Join, but this methodology could easily be used for joining on temporary tables. What I really like about this is that I only go to the database once.

One other excellent feature of this is that I never have to worry about NULL values or missing values. If I was to try and do this with some sort of GROUP BY clause, I might not get all days. Imagine a situation where no user logged into the system on a given day. Even if I did a GROUP BY, I would not get that day returned in the result set. When you manually create the temporary table with all the required dates, you KNOW you are coming back with the right values.

Putting it all together we get:

 Launch code in new window » Download code as text file »

  • <!--- Declare temporary table. --->
  • DECLARE @dates TABLE (
  • today DATETIME
  • );
  •  
  • <!--- Loop over the dates to create the table values. --->
  • <cfloop index="intOffset" from="14" to="0" step="-1">
  •  
  • INSERT INTO @dates (
  • today
  • )(
  • SELECT
  • <cfqueryparam
  • value="#Fix( Now() - intOffset )#"
  • cfsqltype="CF_SQL_TIMESTAMP"
  • />
  • );
  •  
  • </cfloop>
  •  
  • <!--- Get data summaries for log-ins. --->
  • SELECT
  • d.today,
  •  
  • <!--- Get the number of logins for this day. --->
  • (
  • SELECT
  • COUNT( * )
  • FROM
  • useraccount_login l
  • WHERE
  • l.date_logged_in BETWEEN d.today AND (d.today + 1)
  • ) AS login_count,
  •  
  • <!--- Get me the number of unique logins for this day. --->
  • (
  • SELECT
  • COUNT( DISTINCT l.useraccount_id )
  • FROM
  • useraccount_login l
  • WHERE
  • l.date_logged_in BETWEEN d.today AND (d.today + 1)
  • ) AS unique_login_count
  • FROM
  • @dates d
  • ;

Download Code Snippet ZIP File

Post Comment  |  Ask Ben  |  Other Searches  |  Print Page





Reader Comments

Sep 11, 2006 at 4:45 PM // reply »
153 Comments

You might be better suited to have a permanent Pivot table in your database, that has 1 column (i) and however many rows you want (I actually have 3 tables with 100, 1000, and 10000 rows). Then, instead of having to worry about a table with days, your query becomes:

SELECT
DATEADD(DAY,0-p.i,dates.today) AS TheDate,
COUNT( * ) AS LoginCount,
COUNT(DISTINCT l.useraccountid) AS UniqueLoginCount
FROM
( SELECT DATEADD(DAY,DAY(GETDATE())-1,DATEADD(YEAR,YEAR(GETDATE())-2000,{d '2000-01-01'})) AS Today ) AS dates
INNER JOIN Pivot100 AS p ON (p.i BETWEEN 0 AND 13)
LEFT OUTER JOIN useraccount
login AS l ON (l.dateloggedin = DATEADD(DAY,1-p.i,dates.today))
GROUP BY p.i
ORDER BY 1 DESC

I know it looks a bit screwy, but bear with me.

The dates table is now made without CF's help by creating a subquery that returns one row with today's date, minus the hours, minutes, and seconds.

As I mentioned, your Pivot100 table was 1 column (i) with 101 rows of values 0-100 (yeah, yeah, I know, but it's useful to have a 0 sometimes).

The INNER JOIN looks weird because the ON clause doesn't actually relate the two tables. But it doesn't have to! The ON clause limits the Pivot100 table to just 14 rows, and the INNER JOIN then becomes a cartesian product. (You could do the same thing with a CROSS JOIN and then put the restriction in the WHERE clause, but I like the locality of having it in the ON clause. And not all DMBSs support CROSS JOINs.) You thus get a table that looks like:

today / i
2006-09-11 / 0
2006-09-11 / 1
2006-09-11 / 2
...
2006-09-11 / 13

You can then join to the login table based on (Today - i) days matching up with the login date. Since you now have all of your dates, you don't have to worry about it eliminating any of them, because it's a LEFT OUTER JOIN.

Voila! Same result, but in pure SQL.

Every database should have pivot tables. They make life sooo* much easier.


Sep 11, 2006 at 6:15 PM // reply »
7,572 Comments

Rick, you magnificent bastard. I never thought of having pivot tables just sitting there to use. That is a great idea! The SQL looks a bit more crazy, but I totally understand what you are doing. And, what's nice about have the in-line "view" for the pivot table is that you can create as many columns as you like. Very cool - a lot of times I do create several columns.

Pure dynamite!


Don
Feb 4, 2009 at 2:27 PM // reply »
42 Comments

This looks hopeful for solving my problem. Old stuff but maybe we can revisit it. I know about the pivot table trick, but the thing I see is that the date stays the same.

What I want is to be able to show (using this example) logins for a month for each person.

What I'm doing now is creating a temp table with 1 field (theDate) and filling it from startDate to endDate for the report. Then doing the joins to get the data out. I'm thinking there must be an easier way using a pivot table to maybe add the pivot data to the startDate and pulling the data that way.

But one other question that your code left me wondering about. Are you able to join a table in memory and a table on a server? Or did you pull all the data from the server and then do the join to the table in memory?


Feb 6, 2009 at 9:08 AM // reply »
7,572 Comments

@Don,

You can definitely join an in-memory table to a table on the server - that is one of the biggest benefits to using the in-memory tables.

As far as the dates and pivot tables, what I will often do for that is use a pivot table to create a set of dates based on month offsets. For example, I might use the first 12 rows of a pivot table to get the last 12 months worth of data where each pivot ID is a month of dates:

DECLARE @ThisMonth DATETIME;

SET @ThisMonth = '2009/02/01';

SELECT
. . . . ( 1 - p.id ) AS month_offset,
. . . . (
. . . . . . . . DATEADD( M, (1 - p.id), @ThisMonth )
. . . . ) AS month_start
FROM
. . . . pivot1000 p
WHERE
. . . . p.id <= 12

This will get the last 12 months. This resultant table can then be joined to another table to get grouped data.


Don
Feb 6, 2009 at 12:32 PM // reply »
42 Comments

Silly me. I was thinking of something else. I use ColdFusion and was thinking of a query of query going joining to a db, which can't be done as far as I know.

Anyway, what I was looking for was coming up with a query result showing the next 2 weeks and both free and taken days. This is for a small hotel. They want potential guests to pick a date and see the number and types of rooms available for 2 weeks on either side of that date.

This put me on the right track. One thing for the dates, I now use the day of the year vice actual dates. That way I just need 2 pivot tables, 1 for normal years and 1 for leapyears. Numbered from 1 to 365/366. When I get my date numbers I just add that many days to Dec 31 of the previous year and get the date.


Post Comment  |  Ask Ben

Recent Blog Comments
Mar 19, 2010 at 7:26 PM
MySQL 3/4 - com.mysql.jdbc.Driver And allowMultiQueries=true
Thank you very much for this post. Adding allowMultiQueries="true" in context.xml didn't help until I added it to url as allowMultiQueries=true Good idea is to use prepared statements and it will he ... read »
Jim
Mar 19, 2010 at 4:49 PM
Nobody Puts Baby In The Corner!
Wow. This is like suddenly finding a support group for your secret shame. I'm not alone! I always liked this movie, even though it is extremely cheesy. I just wish Jennifer Grey hadn't gotten the ... read »
Mar 19, 2010 at 4:47 PM
Application.cfc OnRequest() Method Affects OnError() Arguments
@Jason and @Ben, I've been doing some CF9 refactoring on our systems and noticed an odd occurrence with onError as well. Found a way to work around my problem, but what I saw was... Background: Our ... read »
Jim
Mar 19, 2010 at 4:44 PM
Shoot 'Em Up Starring Clive Owen And Paul Giamatti
I actually enjoyed this movie quite a lot. It was different, certainly, but I think they were going for more of a Quentin Tarentino-"wow, that was weird"-vibe than an actual spoof. Once I realize ... read »
Mar 19, 2010 at 4:34 PM
An Intensive Exploration Of jQuery With Ben Nadel (Video Presentation)
Hey I guess the video is down. Is there anyway you can upload to youtube or vimeo or some other service? Greatly appreciated. ... read »
Mar 19, 2010 at 4:24 PM
ColdFusion CFPOP - My First Look
@Ben Thanks for the follow up! The root of the problem had to do with being able to trace bounced emails to specific records in a DB table. Let's say you run an email campaign and you get 1,000 bou ... read »
Mar 19, 2010 at 4:15 PM
SQL COUNT( NULLIF( .. ) ) Is Totally Awesome
Thank you Ben and Tony! Either of these work for the summary report I am working on and the info is much appreciated! I think I like Tony's a little better because I won't have to educate every ... read »
Mar 19, 2010 at 3:35 PM
ColdFusion Path Usage And Manipulation Overview
@Ben, Sorry. Clarification. expandpath worked for me in application.cfc, but not in other templates. ... read »