# Ask Ben: Querying Groups With Conditional Elements

Posted January 14, 2008 at 7:00 AM by Ben Nadel

I hav a situation in my academic project, and was struggling to get a proper approach towards the solution, then i decided to ask 'kind-hearted' ppl for help ... while browsing net for forums wid posts with similar situation, i 'stumbled' upon you site. it's a nice and great job u r doin here ben. thanks a ton... do consider to direct me towards the solution for the following situation.

The following table reflects the One-to-Many relationship between TransID and TagID

--------------------
|TransID | TagID |
--------------------
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 3 |
| 3 | 4 |
--------------------

i would need to know how exactly i should be aproaching the problem to find out solution for following situations:

STRICT MODE:
------------

* TransID with TagID (1 AND 4 AND Not 2)

Result : i should get only TransID:3 as the result. (ALL Conditions shud return true, so in the interface the user would have selected Having 1,4 and not having 2).

Relaxed Mode:
-------------

* TransID with TagID (1 OR 4 OR Not 2)

Result : i should get TransID:1,2,3 as the result, since each of the TransID satisfies one or other condition. (ANY Condition returning true the TransID qualifies to be returned, and in the interface the user would hav selected Having 1,4 and not Having 2)

Before we get into this, let's start out by building a SQL temp table into which I can insert the test data:

• <!---
• Create the SQL build scripts so that we can reuse it
• within the different CFQuery tags.
• --->
• <cfsavecontent variable="strBuildSQL">
• DECLARE @data TABLE (
• trans_id INT,
• tag_id INT
• );
•
•
• <!--- Populate data. --->
• INSERT INTO @data
• (
• trans_id,
• tag_id
• )(
• SELECT 1, 1 UNION ALL
• SELECT 1, 2 UNION ALL
• SELECT 2, 1 UNION ALL
• SELECT 2, 2 UNION ALL
• SELECT 2, 3 UNION ALL
• SELECT 3, 3 UNION ALL
• SELECT 3, 4
• );
• </cfsavecontent>

Now that we have that in place, our @data table should mimic what the user has described (although I have changed the column names to line up more appropriately with my own database naming conventions).

Ok, so let's take a look at the problem. For the Strict mode part, the first thing that jumps to mind is grouping. We want to check out the group of TransIDs and make sure that each group contains either TagID 1 or 4, but definitely not TagID 2. To examine groups in SQL Server, we can use the GROUP BY clause. Then, to examine the contents of each group, we can leverage the fact that the COUNT() aggregate can handle CASE statements. If you remember my exploration of the COUNT() / CASE scenario, you will remember that the COUNT() aggregate does not include NULL values in its tally. Therefore, if we can conditionally include values, we can use the COUNT() of each group to narrow down the selection.

Taking this understanding and applying it to the problem at hand, we want to make sure that each TransID group has at least one count of the TagID 1 or 4 and a zero count of the TagID 2:

• <!--- Query for TransID values. --->
• <cfquery name="qStrict" datasource="#REQUEST.DSN.Source#">
• <!--- Build SQL table and populate. --->
• #PreserveSingleQuotes( strBuildSQL )#
•
•
• <!--- Get the appropriate trans IDs. --->
• SELECT
• d.trans_id
• FROM
• @data d
• GROUP BY
• d.trans_id
• HAVING
•
• <!---
• Make sure that the group has either
• tag ID 1 or 4.
• --->
• COUNT(
• CASE
• WHEN
• d.tag_id IN ( 1,4 )
• THEN
• d.tag_id
• ELSE
• NULL
• END
• ) > 0
• AND
• <!---
• Make sure that the group does not also contain
• the tag id, 2.
• --->
• COUNT(
• CASE
• WHEN
• d.tag_id = 2
• THEN
• d.tag_id
• ELSE
• NULL
• END
• ) = 0
• ;
• </cfquery>
•
•
• <!--- Output out results. --->
• TransID: #ValueList( qStrict.trans_id, ', ' )#

Running this SQL, we get the following output:

TransID: 3

This works, but it looks really complicated. We can start to simplify the query if we look at the problem in a slightly different way. We can think of the 1 and 4 requirements as filter and then the NOT 2 as a secondary layer of filter. Let's run a query that includes just the first part of that filtering:

• <!--- Query for TransID values. --->
• <cfquery name="qStrict" datasource="#REQUEST.DSN.Source#">
• <!--- Build SQL table and populate. --->
• #PreserveSingleQuotes( strBuildSQL )#
•
•
• <!--- Get the appropriate trans IDs. --->
• SELECT
• d.trans_id
• FROM
• @data d
• WHERE
• d.tag_id IN ( 1,4 )
• GROUP BY
• d.trans_id
• ;
• </cfquery>
•
•
• <!--- Output out results. --->
• TransID: #ValueList( qStrict.trans_id, ', ' )#

This gives us the following output:

TransID: 1, 2, 3

Good, the first part works. Now, we want to apply the second filter, not having TagID 2. We can't just add another simple WHERE clause for != 2 since any TagID that is 1 or 4 is ALSO not equal to 2. What we need to do is apply the not 2 to each group of TransIDs. To do this, we are going to add a NOT EXISTS clause:

• <!--- Query for TransID values. --->
• <cfquery name="qStrict" datasource="#REQUEST.DSN.Source#">
• <!--- Build SQL table and populate. --->
• #PreserveSingleQuotes( strBuildSQL )#
•
•
• <!--- Get the appropriate trans IDs. --->
• SELECT
• d.trans_id
• FROM
• @data d
• WHERE
• d.tag_id IN ( 1,4 )
• AND
• <!---
• Make sure that this TransID value does not
• exists in a group that ALSO contains a TagId, 2.
• --->
• NOT EXISTS
• (
• SELECT
• 1
• FROM
• @data d2
• WHERE
• d2.trans_id = d.trans_id
• AND
• d2.tag_id = 2
• )
• GROUP BY
• d.trans_id
• ;
• </cfquery>
•
•
• <!--- Output out results. --->
• TransID: #ValueList( qStrict.trans_id, ', ' )#

Running the above code, we get the following output:

TransID: 3

This version works as well. To me, this is somewhat easier to read than the leveraged COUNT() aggregate; however, I am not sure exactly which one is going to be faster. I am sure that it will depend somewhat on the size of the table and on the indexes that are in place.

Now that we have the Strict Matching out of the way, let's take a look at the Relaxed Matching mode. Relaxed matching is much easier as it doesn't really apply to the group level, but rather to the individual records being examined. As such, we can simply use a straightforward WHERE clause to limit each returned TransID and then group by the TransID to make sure we don't get duplicate values:

• <!--- Query for TransID values. --->
• <cfquery name="qRelaxed" datasource="#REQUEST.DSN.Source#">
• <!--- Build SQL table and populate. --->
• #PreserveSingleQuotes( strBuildSQL )#
•
•
• <!--- Get the appropriate trans IDs. --->
• SELECT
• d.trans_id
• FROM
• @data d
• WHERE
• d.tag_id = 1
• OR
• d.tag_id = 4
• OR
• d.tag_id != 2
• GROUP BY
• d.trans_id
• ;
• </cfquery>
•
•
• <!--- Output out results. --->
• TransID: #ValueList( qRelaxed.trans_id, ', ' )#

Running the above code, we get the following output:

TransID: 1, 2, 3

Works fine and as you can see, relaxed mode matching is much easier and faster to perform.

### Looking For a New Job?

25% of job board revenue is donated to Kiva. Loans that change lives - Find out more »

Jan 15, 2008 at 10:11 AM // reply »

Very thanks for the solution. please consider the situation,

if the 'Strict mode' is made more strict,

i.e when TransID with TagID (1 AND 4 AND Not 2) means :
TransIDs are expected to have both TagID 1 AND 4 (in the solution you coded, using TagID (1,4), will result in '1 OR 4')

so only when user is searching for TransID with TagID (1 AND 2 AND NOT 3) only the TransID: 1 satisfies the condition and should be returned ...

i am using a query like

SELECT TransID
FROM Trans_Tags
WHERE TransID IN (SELECT TransID in Trans_Tags WHERE TagID = 1) AND
WHERE TransID IN (SELECT TransID in Trans_Tags WHERE TagID = 2)
AND
WHERE TransID NOT IN (SELECT TransID in Trans_Tags WHERE TagID = 3)

will this query run efficiently in a huge tables with say a million records ?! or could you suggest me any better approach ?!

Jan 25, 2008 at 9:47 AM // reply »

@Vasanth,

I have been told that it is more efficient to use the EXISTS construct rather that the IN construct. Also, if the tables are indexed on the proper columns, then the look up should be fairly fast.

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

 Author Name: Author Email: Author Website: Comment: Supported HTML tags for formatting: bold   italic   code Remember my information Subscribe to comments Send me a copy of this comment
InVision App - Prototyping Made Beautiful With Prototyping Tools Recent Blog Comments
Mar 11, 2014 at 6:20 PM
Reading In File Data One Line At A Time Using ColdFusion's CFLoop Tag Or Java's LineNumberReader
Thanks Ben for you very interesting articles on handling large files. As I am not a programmer, I am quite insecure whether my issue is directly related to your description or not. My problem is, tha ... read »
Mar 11, 2014 at 12:21 PM
Compound Transclusion Prevented In AngularJS 1.2
Yeah this hit our team too and locked us as 1.1.5. It seemed like an irresponsible unilateral decision, but our graver concern is the future decisions the Angular team will make with other fundamenta ... read »
Mar 11, 2014 at 2:00 AM
ColdFusion, jQuery, And "AJAX" File Upload Demo
Mar 10, 2014 at 8:24 PM
Nested Views, Routing, And Deep Linking With AngularJS
@Steven, the action property is not something from AngularJS but rather part of the concept that Ben is describing in his blogpost here. it is a custom added property, which is read by the requestCo ... read »
Mar 10, 2014 at 2:03 PM
Nested Views, Routing, And Deep Linking With AngularJS
Where is the angular documentation for 'action:' it is not here: http://docs.angularjs.org/api/ngRoute/provider/ \$routeProvider Thanks. ... read »
Mar 10, 2014 at 12:06 PM
Using Track-By With ngRepeat In AngularJS 1.2
I was hoping that this will work with pagination using ng-repeat. My use case scenario is that I have an images object. images[0] = [im1,im2,im3] // First Page images[1] = [im4, im5, im6] // Second ... read »
Mar 9, 2014 at 6:11 PM
For Better Security Use HtmlEditFormat() In Conjunction With JSStringFormat() In ColdFusion
It looks like htmleditformat() will be deprecated in CF 11 https://wikidocs.adobe.com/wiki/display/coldfusionen/New +in+ColdFusion ... read »
Mar 9, 2014 at 10:55 AM
\$.stop() vs. \$.finish() In jQuery Animations
Nice feature! Thanks for sharing. :) Good for when you are making a 100% AJAX controlled site. ... read »