Often times in an application, I have wanted to output a simple list of items that have a one-to-many relationship with other items in the database. One scenario that I come across very often involves attorneys and offices; many attorneys will work out of more than one office, but in an attorney directory, they only want to be listed with their primary office. This is an easy query IF (and only if) "primary office" is explicitly defined using some sort of "is_primary" flag or "sort" indicator, but often times, this filtering is not available.
As such, if you were just to create the SQL JOIN between attorneys and offices, you would get a lot of duplicate names (one for each office relationship), but this is way too much data. Yesterday, I was dealing with a very similar situation and was convinced that I could do all of this in a single query, rather than use query of queries or output shenanigans. What I came up with works, but only with small sets of data; it is not a very efficient method and should be avoided when dealing with large data sets.
To explore this idea, let's create the SQL tables we are going to be used. For this demo, I am going to be using a Contact table and Phone table. The contact table holds people's names. The phone table holds associated phone numbers:
<!--- SQL to create and populate the data tables. ---> <cfsavecontent variable="strSQL"> <!--- Create table for contact data. ---> DECLARE @contact TABLE ( id INT IDENTITY( 1, 1 ), name VARCHAR( 20 ) ); <!--- Create table for phone data (each phone number will be associated with exactly one contact). ---> DECLARE @phone TABLE ( id INT IDENTITY( 1, 1 ), name VARCHAR( 20 ), ext VARCHAR( 7 ), is_preferred BIT, contact_id INT ); <!--- Populate the contact data. ---> INSERT INTO @contact ( name )( SELECT 'Ben Nadel' UNION ALL SELECT 'Maria Bello' UNION ALL SELECT 'Jodie Foster' UNION ALL SELECT 'Christina Cox' ); <!--- Populate the phone data. ---> INSERT INTO @phone ( name, ext, is_preferred, contact_id )( <!--- Ben's numbers. ---> SELECT '212-555-BEN1', '', 1, 1 UNION ALL SELECT '212-555-BEN2', '123', 0, 1 UNION ALL SELECT '212-555-BEN3', '', 0, 1 UNION ALL <!--- Maria's phone numbers. ---> SELECT '917-555-MAR1', '', 0, 2 UNION ALL SELECT '917-555-MAR2', '', 0, 2 UNION ALL SELECT '917-555-MAR3', '', 0, 2 UNION ALL <!--- Christina's phone numbers. ---> SELECT '202-555-CHR1', '', 0, 4 UNION ALL SELECT '202-555-CHR2', '15', 1, 4 ); </cfsavecontent>
As you can see from this SQL code, we have 4 contacts. Three of them have phone numbers and only two of them have a phone number flagged as "preferred". This means that the "preferred" phone number will not always be easy to access, such as with Maria Bello, who has three numbers but no preference.
Before we get into anything tricky, let's look at a standard JOIN that would bring back both contacts and phone numbers:
<!--- Query for contacts and their phone numbers. ---> <cfquery name="qContact" datasource="#REQUEST.DSN.Source#"> <!--- Create the SQL data tables and populate. ---> #PreserveSingleQuotes( strSQL )# <!--- Select contacts as their numbers. ---> SELECT c.name, <!--- Phone data. ---> ( p.name ) AS phone_number, ( p.ext ) AS phone_ext, ( ISNULL( p.is_preferred, 0 ) ) AS is_preferred FROM @contact c LEFT OUTER JOIN @phone p ON c.id = p.contact_id ORDER BY c.name ASC </cfquery> <!--- Dump out contacts and their phone numbers. ---> <cfdump var="#qContact#" label="Contact Phone Numbers" />
As you can see, we just join the two tables based on the associated contact_id. Running the above SQL code, we get the following CFDump output:
In this query, we are getting back all contact-phone number associations. For a simple list, however, we don't want that; we only want one contact with a max of one phone number.
If we always had a preferred phone number or had some sort value that was always defined, getting this data would not be a problem - we would just add the sort=1 or is_preferred=1 logic to the LEFT OUTER JOIN. However, as you can see, that can't be done in our example. To get around this, our LEFT OUTER JOIN logic has to get a bit crazy. As part of our join, we want to only get the phone number that is most preferable (which does not mean is_preferred).
Before we look at the overall query, let's think about how we would get the most preferred number for any given contact. Well, we want to prioritize the phone numbers that are flagged as is_preferred. Of course, if that is not defined, or that is always zero, then we need to make an arbitrary decision and say that the first phone number created (as dictated by the auto-incrementing ID) is the most preferred.
Taking that logic, if we wanted to get the most preferred number for contact (ID:1), we would run SQL like this:
SELECT TOP 1 p.id, p.name, p.ext, p.is_preferred FROM @phone p WHERE p.contact_id = 1 ORDER BY p.is_preferred DESC, p.id ASC
This will return one (or zero) phone records with the most "preferred" number for contact:1. Now, we want to take this logic, and integrate it into our SQL LEFT OUTER JOIN logic:
<!--- Query for contacts and their phone numbers. ---> <cfquery name="qContact" datasource="#REQUEST.DSN.Source#"> <!--- Create the SQL data tables and populate. ---> #PreserveSingleQuotes( strSQL )# <!--- Select contacts as their numbers. ---> SELECT c.name, <!--- Phone data. ---> ( p.name ) AS phone_number, ( p.ext ) AS phone_ext, ( ISNULL( p.is_preferred, 0 ) ) AS is_preferred FROM @contact c LEFT OUTER JOIN @phone p ON ( c.id = p.contact_id AND <!--- As part of the JOIN condition, make sure that this ID of the phone record we return is equal to the *most* preferred one for that user. ---> p.id = ISNULL( ( SELECT TOP 1 p2.id FROM @phone p2 WHERE <!--- Tie to current user record. ---> p2.contact_id = c.id ORDER BY p2.is_preferred DESC, p2.id ASC ), 0 ) ) ORDER BY c.name ASC </cfquery> <!--- Show contact with max of ONLY one phone number. ---> <cfdump var="#qContact#" label="Contact Phone Numbers (Max: 1)" />
Now, our LEFT OUTER JOIN logic not only joins based on the contact ID, it also requires that the ID of the phone number in the join be the ID of the most preferred phone number for that contact. Running this code, you see that we get a much better and more useful query result set:
This query is easy to work with, in terms of results, but like I said, this is only good for small sets of data. Anytime you make a sub query be part of you JOIN logic, whether it's an INNER or OUTER JOIN, you've got problems; that's a lot of look-ups to perform and the query is going to be slow. But, like I said, if you just need a simple list with one record per group in a one-to-many query, this is an option that I discovered yesterday.
Want to use code from this post? Check out the license.