I have never used the ColdFusion CFQuery blockfactor attribute before. I have heard about it, but never payed much attention. I came across it in an article the other day, so I thought I'd give it a spin to see how it performs... and I have to say, I am not really seeing any great results. To test it, I am doing a simple query that returns the union repeated union of the largest table I have in the database.
Here is the standard CFQuery tag, no blockfactor in use:
<!--- Test the speed of the query. ---> <cftimer label="Standard Query" type="outline"> <cfquery name="qHit" datasource="..."> SELECT t.id FROM ( ( SELECT id FROM web_stats_hit ) UNION ALL ( SELECT (id * 100) FROM web_stats_hit ) UNION ALL ( SELECT (id * 1000) FROM web_stats_hit ) UNION ALL ( SELECT (id * 10000) FROM web_stats_hit ) ) AS t ORDER BY t.id ASC </cfquery> <p> #NumberFormat( qHit.RecordCount, "," )# web hits found </p> </cftimer>
This returns 210,868 records in about a 800 ms (on average). That's a pretty good record set I think. Now, let's throw in the blockfactor attribute. In ColdFusion, this can be between 1 and 100 and stands for the number of records that the SQL server can return at a given time.
<!--- Test the speed of the query with BlockFactor. ---> <cftimer label="BlockFactor Query" type="outline"> <cfquery name="qHit" blockfactor="100" datasource="..."> SELECT t.id FROM ( ( SELECT id FROM web_stats_hit ) UNION ALL ( SELECT (id * 100) FROM web_stats_hit ) UNION ALL ( SELECT (id * 1000) FROM web_stats_hit ) UNION ALL ( SELECT (id * 10000) FROM web_stats_hit ) ) AS t ORDER BY t.id ASC </cfquery> <p> #NumberFormat( qHit.RecordCount, "," )# web hits found </p> </cftimer>
Again, this returns 210,868 records. This ran in about 900 ms on average. That's 100 ms SLOWER than the same query with no blockfactor. I tried adjusting the size of the blockfactor, but none of the numbers I used seemed to make a difference.
I understand that not all databases understand the blockfactor. But, I'm using Microsoft SQL Server. I don't know, but I am guessing that it understand blockfactor as it is like THE sql server out there (not intending to stir up conflict here).
So, operating under the assumption that MS SQL Server does indeed work with blockfactor, my testing shows (repeatedly) that the CFQuery that uses blockfactor performs slower than the non-blockfactor CFQuery. What gives? Is there something here that I am totally not getting? Is 211,000 records not enough for the blockfactor to make a dent? Why is the blockfactor running slower?
It's been a while, but as I remember it, blockfactor seemed to work best with oracle databases.
Damn you Oracle... Damn you :) Thanks for tip Rob.
A couple of things I've noticed over the years about blockfactor. First off, in Oracle they can make a HUGE difference. For very intense queries, where without the processing time could be 40,000+ milliseconds without blockfactor, I've seen them drop to 1,000+. Quite a difference.
Another thing, I think I remember about them, it isn't that MS SQL Server doesn't know how to use them, it might be that the driver you are using to connect to the database doesn't use them.
Last but not least, you don't always see the speed difference in a query if you only run it once. Sometimes you have to run it several time to see that there is in fact a difference.
Just my two cents, hope it sheds some light...
I appreciate the comment. I never thought about the actual driver having anything to do with it. I am a bit removed from all of that stuff. I also live in a MS SQL Server-centric universe :) Good to know though. Thanks.
One thing I remember about blockfactor that may be of large importance to you is that the blockfactor is not the NUMBER OF ROWS that the database should return in a single packet, but the SIZE OF THE DATA it should return in a single transfer packet.
You may find this quote from MM's coding guidlines useful:
Use blockFactor to increase query performance:
Adding blockFactor to a query can significantly improve performance. To add blockFactor, examine the data that is being returned. Determine the maximum size (in bytes) of each row. Take that size and determine how many times that number would divide into 32k. That number is your blockFactor, but be aware that the max blockFactor is 100. So, if for example you were getting 200 bytes per row, you could easily fit over 100 rows into the 32k buffer that CF 'grabs' at one time.
If you know at runtime that you will have less then 100 rows returned, for example you're writing a query that either returns 0 or 1 rows, do not bother adding the blockFactor attribute.
I have seen this attribute make a big difference when used correctly. Also consider how fast the network between your database and CF server are (did you mention where your DB is relative to your CF Server?)...on a faster network, the benefit may not show up until larger amounts of data are being passed or the complexity of the work blockFactor adds to the process is less than the benefit it provides to the data transfer process.
Anyway, I would suggest some more playing around on this, not that the CF Docs really give an accurate depiction of the blockFactor attribute in the first place.
Thanks for the good comment. Our local DB is on a different machine than our CF server. Same network... same shelf :) I will do some more playing around, but at over 200,000 records in my example, I was very suprised to only see no advantage, but to actually see a disadvantage to using blockfactor.
Working on a large query using the blockfactor tag attribute did make a difference in the return of the data. Using a data source connecting to a MS SQL Server 2005.
Thank you for the awesome site...