Caching ColdFusion Queries Inside Other ColdFusion Queries
Posted November 8, 2010 at 9:27 AM by Ben Nadel
Over the weekend, I was playing around with some query caching for the blog when I found myself in a situation where I had two primary tables and a join table. Up until then, I had only had to deal with the caching of a single table, so persisting multiple tables in a relationship wasn't immediately obvious. I figured I could have cached all three tables and then just performed a query-of-queries in my output, joining the three tables at render-time; but, this felt somewhat sloppy. Then it occurred to me - ColdFusion query objects can hold any type of data. Why not perform the JOIN at cache-time and then simply store the resultant query inside of the primary query recordset.
Since I don't have a true caching framework set up for this demo, we're just going to use the ColdFusion Request scope as our makeshift "cache." After all, this post isn't really about caching best practices - it's about the awesome flexibility of the ColdFusion query object and how it can be leveraged within a caching mechanism.
For the demo, we need to build two primary tables - girlTable, traitTable - and our join table - girlTraitJNTable. Without having to read the build-code below, the girlTraitJNTable contains a girlID and a traitID which allows us to define a many-to-many relationship between girls and traits.
- <!---
- The first thing we are going to do is build three data tables
- to describe our girl data. We need the girls, the traits, and
- then the table that joins them:
-
- girl: id, name
- trait: id, name
- girl_trait_jn: girlID, traitID
- --->
-
- <!--- Build our girls query. --->
- <cfset girlTable = queryNew( "" ) />
-
- <!--- Add ID column. --->
- <cfset queryAddColumn(
- girlTable,
- "id",
- "cf_sql_integer",
- listToArray( "1,2,3" )
- ) />
-
- <!--- Add NAME column. --->
- <cfset queryAddColumn(
- girlTable,
- "name",
- "cf_sql_varchar",
- listToArray( "Sarah,Joanna,Tricia" )
- ) />
-
-
- <!--- ----------------------------------------------------- --->
- <!--- ----------------------------------------------------- --->
-
-
- <!--- Build our traits query. --->
- <cfset traitTable = queryNew( "" ) />
-
- <!--- Add ID column. --->
- <cfset queryAddColumn(
- traitTable,
- "id",
- "cf_sql_integer",
- listToArray( "1,2,3" )
- ) />
-
- <!--- Add NAME column. --->
- <cfset queryAddColumn(
- traitTable,
- "name",
- "cf_sql_varchar",
- listToArray( "Spunky,Seductive,Silly" )
- ) />
-
-
- <!--- ----------------------------------------------------- --->
- <!--- ----------------------------------------------------- --->
-
-
- <!--- Build our join query. --->
- <cfset girlTraitJNTable = queryNew( "" ) />
-
- <!--- Add girlID column. --->
- <cfset queryAddColumn(
- girlTraitJNTable,
- "girlID",
- "cf_sql_integer",
- listToArray( "1,1,2,3,3,3" )
- ) />
-
- <!--- Add traitID column. --->
- <cfset queryAddColumn(
- girlTraitJNTable,
- "traitID",
- "cf_sql_integer",
- listToArray( "1,3,2,1,2,3" )
- ) />
Now that we have our fake database in place, let's take a look at a page request that needs to perform a double-check lock in order cache our girls query in memory:
- <!---
- NOTE: I am using the REQUEST scope here as my cache scope. This,
- of course, makes no sense as the request scope is short lived.
- For demo purposes, however, this will work.
- --->
-
- <!---
- Param the query as a simple value. Normally, we would just get
- our cached item back from the cache; but, since we don't have a
- true caching mechanism for this demo, we can fake our experiment
- with CFParam.
- --->
- <cfparam
- name="request.girls"
- type="any"
- default=""
- />
-
- <!--- Check to see if the cached query is a query. --->
- <cfif !isQuery( request.girls )>
-
- <!---
- The cached query does not exist in the proper format; now,
- we have to create it. Let's implement a lock (a double-
- check lock) to make sure we only create this cached query
- as needed.
- --->
- <cflock
- name="girls_cache_lock"
- type="exclusive"
- timeout="5">
-
- <!---
- Perform the double-check to make sure that another
- thread didn't create the cached query while we were
- waiting to obtain the lock.
- --->
- <cfif !isQuery( request.girls )>
-
-
- <!--- Query for the girls. --->
- <cfquery name="girls" dbtype="query">
- SELECT
- id,
- name,
-
- <!---
- When we query for the girls, we are going
- to create an empty column to hold our JOIN
- table. This way, we can cache all of the
- related tables.
- --->
- ( '' ) AS traitsQuery
- FROM
- girlTable
- ORDER BY
- name ASC
- </cfquery>
-
-
- <!---
- Now that we have the girls query, let's populate
- the traitQuery column with the traits that apply to
- this girl.
- --->
- <cfloop query="girls">
-
- <!--- Query for the linked-traits for this girl. --->
- <cfquery name="traits" dbtype="query">
- SELECT
- id,
- name
- FROM
- traitTable,
- girlTraitJNTable
- WHERE
- traitTable.id = girlTraitJNTable.traitID
- AND
- girlTraitJNTable.girlID = #girls.id#
- </cfquery>
-
- <!---
- Cache the associated traits directly in the girls
- query as the traitsQuery proprety.
- --->
- <cfset girls[ "traitsQuery" ][ girls.currentRow ] = traits />
-
- </cfloop>
-
-
- <!---
- Now that we have created the girls, cache the
- query in the request scope (our make-shift
- caching mechanism).
- --->
- <cfset request.girls = girls />
-
-
- </cfif>
-
- </cflock>
-
- </cfif>
-
-
- <!--- ----------------------------------------------------- --->
- <!--- ----------------------------------------------------- --->
-
-
- <!---
- Now that we know that our cached query exists, let's out
- the girls and their traits.
- --->
- <cfoutput>
-
- <!--- Loop over the cached girls. --->
- <cfloop query="request.girls">
-
- #request.girls.name#<br />
-
- <!---
- Get teh traits from the girls record.
-
- NOTE: We purposely made the query column a non-practical
- name so that it wouldn't conflict with our local
- variable. An unfortunate side-effect of the fact that
- query objects allow non-scoped columns.
- --->
- <cfset traits = request.girls.traitsQuery />
-
- <!--- Output traits for this girl. --->
- <cfloop query="traits">
-
- --- #traits.name#<br />
-
- </cfloop>
-
- <br />
-
- </cfloop>
-
- </cfoutput>
When the request comes in, we check to see if the girls cache item is a query object. If it is not, we then single-thread the cache creation and before gathering the girls data. As we do this, notice that we are performing a sub-query for each girl record in order to find the traits associated with the given girl. Once we have this sub-query, we store it as a column in our girls query. If we were to CFDump the result query out, it would look like this:
| | | | | |
| | ![]() | | ||
| | | |
As you can see, each related traits query is stored as a property of a particular girl's record. In this way, when we cache the girls query, we are caching all of the necessary derived traits queries with it. This allows us to render the future output without having to do any kind of render-time query-of-queries; all of our relationships have been found and cached ahead of time and can be used with a few simple CFLoop tags.
When we run the above code, we get the following output:
Joanna
--- Seductive
Sarah
--- Spunky
--- Silly
Tricia
--- Spunky
--- Seductive
--- Silly
As you can see, this worked extremely well.
When storing related queries within another query, it is best to use a column name that is not practical. For example, I used "traitsQuery" as my sub-query column name rather than something more convenient like "traits." The reason for doing this is that ColdFusion, unfortunately, allows for non-scoped query column references. By using a less practical column name, traitsQuery, it allows us to create reference to the sub-query, traits, without having to worry about naming collisions between our sub-query reference and our primary query column.
If you are using ColdFusion 9 with ORM integration, I am sure that this post is moot as you can immediately load an entity with all of its composed properties. If you're not dealing with an ORM-enabled system, however, being able to cache queries inside of other queries is simply beautiful. Doing this over the weekend made me fall in love with the ColdFusion query object all over again.
Reader Comments
I'd be a bit cautious with storing complex data in a query. Have you tried to QoQ against your final query? CF scans the first few rows to make guesses on the types and I wonder if it will throw a fit on the complex data.
@Ray,
I haven't tried doing query of queries with this particular setup only because the intent was designed to not have to do any querying at render time (on looping).
But, I know that Elliott Sprehn used the ColdFusion query object to do query-of-queries with complex data in one of his CFUNITED talks (the second one about trusting the application code). In that, he basically stored a complex object and then "sort" column and used the sort column to sort the list of complex objects. I don't recall him talking about any complications.
As awesome as ColdFusion query objects are, I know they are fraught with all kinds of type-casting headaches; however, I think those only apply to the column you are actually trying to do something within when queries. Meaning, ORDER BY and WHERE clauses where you are performing logic on the given column. I am not sure that ColdFusion does much with the data in column that only simply being retrieved.
If he used a sort col to allow for sorting of the complex data, then it must mean CF won't barf on sorting the other columns. That was my concern (since it wouldn't make sense to sort on the complex data itself). So if CF blissfully ignores those columns than it should be good to go.
@Ray,
I'm pretty sure that's what I remember happening.
Hi Ben,
More insight into QoQ, great topic, as it is innovative to build custom queries...Thanks for posting with good explanation as always...
@CFFan,
Glad you like. The ColdFusion query object is - beautiful - for lack of a better term :)
I have a question about QoQs that I can't quite resolve. I read your post about joining on the LOCAL var scope and having to escape it as [local].table...
But it absolutely will not allow me to join a QoQ like this:
SELECT *
FROM Application.MainCache1,Application.MainCache2
WHERE
Application.MainCache1.id = Application.MainCache2.id
It basically says (custom error catcher):
Date: {ts '2010-11-11 20:51:49'}
Message: Error Executing Database Query.
Detail:
Query Of Queries syntax error.
Encountered "Application . MainCache1 .. Incorrect conditional expression, Expected one of [like|null|between|in|comparison] condition,
Type: Database
SQL: SELECT * FROM Application.MainCache1 , Application.MainCache2 WHERE Application.MainCache1.id = Application.MainCache2.id (more stuff edited out)
I *think* the problem is in all that extra whitespace markup.. but in my code I have NONE of that whitespace. I've tried various escape techniques ( [name], ['name']), and even trimming the string.
Any thoughts?
It's CF8 running on a W2k3 Server. If I remove
Application.MainCache1.id = Application.MainCache2.id (more stuff edited out)
As expected it just spins it's wheels and goes nowhere, consuming ALL ram if allowed to.
In the meantime, I have just run two separate queries and passed the results to the other object, but it'd be nice to streamline it.
Is the problem with the extended struct? perhaps application?
@Brad,
Have you tied:
1. Escaping [Application] to see if that is messing things up?
2. Creating intermediary variables for the query values before trying to join them (ie. c1 = application.MainCache1, c2 = Application.MainCache2).
That might help.
Works for me.
I don't use this for cache oriented systems. When I freelance, I often get clients who change the spec a dozen times before we get to the final product.
A neat little 'RAD' trick is to serialize forms or data structures that change often. In this way, the application still works, and I don't spend time optimizing a table structure, or even changing a single variable. I just add or remove fields.
I use this to set nested structures as columns. This would probably require my own blog post, but we define forms as YAML structures. We parse them into Structs, render them and cache them. Structs are stored straight into the database and read like query.column.struct...
No problems at all, ever.
@Brad,
I've definitely used some XML-style approaches to data storage on small apps. It definitely makes for some sweet RAD approaches.
Have you looked into the NoSQL databases? I went to a preso recently on CouchDB and MongoDB. Both of them use "documented" databases in which each key can point to a non-uniform document. This sounds like something you might like to look into.
@Ben
I am still in my NoSQL discovery phase. I have a hundred ideas swimming in my 'lizard brain.' In fact, it's why I am up right now! The tactics from my previous post are from previous experience and are tried and tested (by me at least). Until I become more versed in NoSQL I refrain from selling my ideas.
I actually came across this today:
http://experts.adobeconnect.com/p93766981/?launcher=false&fcsContent=true&pbMode=normal
And it covers Railo + CouchDB. I fear Railo has stolen my heart. It's a far departure from what most of us are used to, but I am thinking this is great for almost any sort of user centric project.
There is simply so much new material to learn that it's crazy! The fact that couch is HTTP based means almost infinite distributed scalability ! and replication looks like it's a snap.
I'd be interested in anything you'd write on it if your thinking about learning it too.
@Brad,
Yeah, I know what you mean, re: "There is simply so much new material to learn that it's crazy". I couldn't agree more :) It's both at the same time super exciting and intensely daunting. Hopefully, I'll get some time to try out the NoSQL stuff for myself - if I can, you better believe I'll be writing about it.
I think most of the NoSQL engines are HTTP/REST based. It definitely makes it a very interesting setup. Heck, any time you add HTTP to invocation approach, it just makes for more possibilities.
Any demo of CouchDB has always been based on Apache. I am curious to see if Litespeed offers any performance gain ?
Going to try to bench the two this week =]
Ben,
Your examples have helped me tons! Thanks.
Regarding the "double check lock", this would lock out all users while the cache is repopulated, yes?
I'm trying to come up with a way to allow users to get data, even while a cache is being populated. Because, if it takes 5 minutes to populate the cache, as soon as one user hits the expired cache and locks the segment of code to repopulate, all users will be sitting for 5 minutes.
Seems like this would be a common issue.




