My web log results pages are now paginating. I have been meaning to get around to it, but have been too busy. Yesterday, Ray Camden pointed out that on pages with a LOT of entries, my code blocks were coming up all grey. I am not 100% sure, but I think the page was having trouble rendering because there was SO much information to display. This prompted me to get off my butt and paginate. Hopefully this has fixed that today.
I thought I would take this time to explain how I perform pagination as I know it is never an easy task in ColdFusion (or any other language for that matter). Over the years I have tried many different methods for pagination. Some were good, some were complete crap. What I have now seems to work the best of all things I have tried.
Before I dive into it, let me get out a few things that I consider when working on pagination:
That being said, let me quickly just outline my pagination, then I will go into detail about the individual steps:
That's the basic idea. Now, let's get down and dirty with some detail information. The player in the game is my pagination object. I start off by setting up default values for it:
Launch code in new window » Download code as text file »
First, I get the a new instance of the DataPager.cfc from my service factory. The service factory is just returning a CreateObject() call combined with the proper Init() method. Then I set my page offset. This either defaults to zero (or one - I can never remember what the proper thing is) or is based on what is passed via the URL / FORM object. Then, I set the page size of the pagination results (in this case 10 items per page). Then, I set up the URL for all the pagination links. The paginator uses a constant [OFFSET] to dynamically create url's when outputting the pages and the previous / next links.
So, regarding the [OFFSET], if we take the above example, where the script name is "blog-entries.htm", the static url would be:
Launch code in new window » Download code as text file »
... and the dynamic urls for pages 1 - 4 would be:
/blog-entries.htm?search_offset=1
/blog-entries.htm?search_offset=11
/blog-entries.htm?search_offset=21
/blog-entries.htm?search_offset=31
As you can see, the pager automatically substitutes the proper offset when outputting the urls.
I think now would be a good time to explore what is really going on under the pagination object. Here is a list of the methods available in the object (public and private):
Init()
Calculate()
GetBucket()
GetBucketEndPage()
GetBuckets()
GetBucketSize()
GetBucketStartPage()
GetEnd()
GetNext()
GetOffset()
GetPage()
GetPages()
GetPageSize()
GetPrev()
GetRecordCount()
GetStart()
GetURL()
HasMultiplePages()
ReplaceOffset()
SetBucketSize()
SetOffset()
SetPageSize()
SetRecordCount()
SetUrl()
And, here is the code for the DataPager.cfc (I will be adding this to snippets later):
Launch code in new window » Download code as text file »
I haven't updated this code in a bit, so it doesn't quite hold up to my new standards, but that is personal. If you take a quick look though, you will see that this object does a fair amount of validation to make sure that only valid pagination items get returned.
Now, let's get into the first database hit: getting the full set of IDs. This is the only part that returns extra data. If our search criteria matches 500 records, then we are returning 500 IDs (even if we are only outputting 10). I know this is lame, but it's the best way I could figure out how to perform and validation pagination in ColdFusion while duplicating the least amount logic and data returns.
Let's look at the query:
Launch code in new window » Download code as text file »
As you can see here, I using all the search criteria (years, months, tag ids) but I am ONLY returning ID's of the blog entries. You will also notice that I am returning the IDs in proper ORDER BY fashion. This will come into play later on, so keep it in the back of your mind.
Now that we have the default paginator and the ID list, I update the pagination object:
Launch code in new window » Download code as text file »
As you can see, I am telling the paginator how many records we are dealing with (via SetRecordCount()). Then we tell the paginator to calcualte(). This takes the record count and the offset and creates all the buckets, page numbers, starting and stopping offsets, urls, and performs all validation (ie. making sure the offset is not beyond the scope of the results set). For more in depth look, see the CFC code listed above.
Once the pager is done calculating we have everything we need to perform pagination. Except of course, the data for the current page of results. This accounts for my second call to the database. It returns ALL information requires for JUST THIS SET (ie. 10 items worth of information based on 10 items per page).
Launch code in new window » Download code as text file »
As you can see, this query gets a full SELECT statement. We are returning all relevant information here. However, as I stated before, we are ONLY returning information for the results we what to display. How is that done? Through the ID loop in the JOIN clause (would be in the WHERE clause if I didn't have a JOIN to work with):
Launch code in new window » Download code as text file »
The CFLoop tag here is just a simple index loop that asks the pager where to START and STOP. If we are on our first page of results, then REQUEST.Pager.GetStart() and REQUEST.Pager.GetEnd() would return 1 and 10 respectively. If we were on our second page of results, it would return 11 and 20 respective (and so on). These values are determined using the Calculate() method discussed above. So, just to recap, this CFLoop determines which records are going to be returned by only selecting the records whose IDs match the subset of IDs returned in the original ID query.
Determining what records to return is one thing, sorting them is another. Remember that the original ID query returned the IDs in proper sort order. We are using that information to order the second query:
Launch code in new window » Download code as text file »
This CFLoop creates a CASE statement in which records are ordered based on where their IDs fall in the order of the original query. If the current ID (in the query) matches the ID in the original query then it's ORDER BY value is the record number of the ID in the original query. Since record numbers are ascending, the ORDER BY will sort the records properly.
There is a lot of stuff going on there, so it might take some time to understand. Getting your head wrapped around the ORDER BY might be the hardest part. But the bottom line here is that I am returning a minimal amount of excess data (in this case lots of IDs even for a subset if IDs being display). And, I am NOT repeating any search logic. The search criteria is ONLY used in one query. The second query merely uses the IDs returned by the first. I think one of the coolest things though, just to drive it home, is that the main query (the second one that returns complete detail data) returns data ONLY FOR THE RECORDS BEING DISPLAYED.
Just really beat a dead horse, to contrast this method, if you were to use the CFOutput method using STARTROW and MAXROWS, or the CFLoop method using STARTROW and ENDROW, both of those would require returning more data from the database than what was required for display. My method, while returning excess ID data, minimizes the return of detailed information (such as large text fields).
Ok, so that is how the data is returned. Now, let me quick cover how I output my pagination code:
Launch code in new window » Download code as text file »
This code is a bit crammed and hard to read, but if you pick through it, it is using the pagination object to loop over the pages / buckets to display page links and previous / next links. The key to it all is the ReplaceOffset() method called on the paginator:
Launch code in new window » Download code as text file »
This is just one instance of the use, but it is a good example. The GetNext() method returns the offset of the next Page of results (would return 11 if we were on page one, showing results 1 - 10). The ReplaceOffset() method then takes the passed in offset and returns the dynamic URL based on the given offset:
/blog-entries.htm?search_offset=11
How easy is that? I most of my pagination display stuff in custom tag widgets so you deal with it once and then it is done. I know that this was a lot to take in. Hopefully I can put a demo of it up in the Snippets directory shortly.
Now, this is NOT the best method out there I am sure, but it is the best method that I feel comfortable with.
Download Code Snippet ZIP File
Comments (7) | Post Comment | Ask Ben | Permalink | Other Searches | Print Page
Project HUGE : Physical Therapy Is Over
Randomly Sort A ColdFusion Array (Updated) - Thanks Mark Mandel
Ben,
Here's a MSSQL Stored Proc I wrote for handling pagination:
http://blog.pengoworks.com/blogger/index.cfm?action=blog:546
This stored proc is very efficient and it never requires returning extra data. I used to handle it via CF, but it always ended up requiring doing something that really started causing performance hits on large datasets.
This solution has worked well even over millions of records.
Posted by Dan G. Switzer, II on Oct 9, 2006 at 3:23 PM
Dan,
That looks very impressive. Unfortunately, it's a bit beyond me at this point. I have never used a stored procedure. I tried to read through it, but I am getting a bit lost. How do you return the number of overall records (other than the current page of records ... ie. 100 of 1-10). Is that a feature of stored procedures (I know you can return more than one set of data) or are you returning it as an additional query column?
Thanks for the link though, excellent food for thought.
Posted by Ben Nadel on Oct 9, 2006 at 6:00 PM
@Ben:
The first 8 parameters are input parameters--values that are being passed in to the stored procedure.
The last 4 paramters are output parameters--which are values that will be returned from the stored procedure. It's the output parameters that contain information such as: total number of records, total number of pages, the current page (while this is usually the same value as the SetPage input parameter, it can be differently--like if you specified a SetPage value outside of the actual page boundaries.)
I recommend first playing around w/the stored procedure in Query Analyzer--just to get familiar with it. The usage example I give should run fine in Query Analyzer--provided you supply real table information.
The stored procedure always outputs a lot of information in the message blocks to tell you how it processed the query.
So the first step would be to create the stored procedure on the database you want to use it on. You should be able to cut and paste the code from my blog into query anaylzer and run it on your database.
Here would be an example of some CF code, I'm using the MSSQL Northwind Database as an example (which would need the stored proc in order for the code to work.)
<cfstoredproc procedure="spSelectNextN" datasource="Northwind">
<!---// pass in the following parameters //--->
<cfprocparam cfsqltype="cfsqlvarchar" type="in" value="Customers" />
<cfprocparam cfsqltype="cfsqlvarchar" type="in" value="CustomerID, CompanyName, ContactName, ContactTitle, Address, City, PostalCode, Phone, Fax" />
<cfprocparam cfsqltype="cfsqlvarchar" type="in" value="CustomerID" />
<cfprocparam cfsqltype="cfsqlinteger" type="in" value="2" />
<cfprocparam cfsqltype="cfsqlinteger" type="in" value="10" />
<cfprocparam cfsqltype="cfsqlvarchar" type="in" value="Country <> 'USA'" />
<cfprocparam cfsqltype="cfsqlvarchar" type="in" value="CompanyName ASC, ContactName ASC" />
<cfprocparam cfsqltype="cfsqlbit" type="in" value="0" />
<!---// return the following parameters //--->
<cfprocparam cfsqltype="cfsqlinteger" type="out" variable="iTotalRecords" />
<cfprocparam cfsqltype="cfsqlinteger" type="out" variable="iTotalPages" />
<cfprocparam cfsqltype="cfsqlinteger" type="out" variable="iCurrentPage" />
<cfprocparam cfsqltype="cfsqlvarchar" type="out" variable="sSqlString" />
<!---// return the following resultsets //--->
<cfprocresult name="GetCustomers" resultset="1" />
</cfstoredproc>
Here's what it all means:
1) The 1st parameter is the table name to use.
2) The 2nd parameter is the columns to retrieve.
3) The 3rd parameter is the identity column from the table
4) The 4th parameter is the page of results you'd like to display
5) The 5th parameter is the number of records each page contains
6) The 6th parameter is the WHERE clause you want to use--this can be NULL.
7) The 7th parameter is the ORDER BY clase--which can be null as well.
8) The 8th parameter tells the stored proc whether or not to use the DISTINCT keyword on the query (1=DISTINCT, 0=normal)
Now we get to the output parameters.
1) The 9th parameter would return the variable iTotalRecords which would contain the total number of records the query would return.
2) The 10th parameter would return the variable iTotalPages which would contain the total number of pages found.
3) The 11th parameter would return the variable iCurrentPage which would return the actual page number the stored procedure returned.
4) The 12th parameter would return the variable sSqlString which would contain the actual SQL query statement that was executed.
Lastly, the actual results of the query are returned in the variable GetCustomers.
It's actually much simiplier than it may look. If you have really complex SQL statements you need to run pagination on, convert them to views. Usually those complex statements are used in various spots in your application and views will give you better performance. Converting the complex SQL to a view will allow you to use this stored procedure to paginate the results.
Posted by Dan G. Switzer, II on Oct 10, 2006 at 8:50 AM
Dan... thanks for the most excellent explanation. Letting it set in :)
Posted by Ben Nadel on Oct 11, 2006 at 10:29 PM
Ben and Dan,
The ColdFusion method and the stored proc are both great, thank you!
I will certainly end up basing my own pagination logic on either one or another (or both methods). I am still unsure on why you choose to use a CASE statement in the second query... Can't we simply copy the ORDER BY clause from the query that gives us the ID's?
Thanks again!
Posted by Mark on Oct 26, 2006 at 10:48 PM
Mark,
You could just repeat the ORDER BY clause from the ID query. The idea of the CASE statement, however, was to try and keep it generic. I was doing my best not to repeat any logic. I did the ordering based on DB values when I got the IDs. Since I am doing this for all pagination methods, I can more or less copy and paste the ORDER BY in the final query and it works.
Basically, its a combo of not repeating myself and making copy/paste easier.
Posted by Ben Nadel on Oct 27, 2006 at 7:25 AM
Ben,
I like your simple SQL Stored Proc. However, you did not include any code on how you do things on the CF side. Can you give me an exmple of how you call the proc in your CF page and display the pages, etc?
Thanks,
Matthew
Posted by Matthew Mitchell on Oct 17, 2007 at 11:36 AM