When it comes to SQL (Structured Query Language), I don't use the UNION construct all that often. But, when I do, one of the features that makes it so powerful is the fact that you can apply both an ORDER BY clause and a LIMIT clause to the derived result set of the union (at least in MySQL). Furthermore, the individual SELECT statements can also have their own ORDER BY and LIMIT clauses.
To see this in action, take a look at the following query. It's a nonsense example, I know, but I think it illustrates the point.
- NOTE: I am using a "UNION ALL" here instead of a "UNION" simply because I know that
- there will not being row conflicts (based on the pkey logic).
- <cfquery name="friends" datasource="testing">
- friend f1
- f1.id < 4
- UNION ALL
- friend f2
- f2.id >= 4
- ORDER BY
- f2.name DESC
- In this case, the ORDER BY applies to the result set derived from the UNION of
- both SELECT statements. Notice that it refers to the derived column name.
- ORDER BY
- name ASC
- <!--- In this case, the LIMIT applies to the derived result set. --->
- <!--- Output the query for debugging. --->
- <cfdump var="#friends#" label="friends" />
I happen to be using "UNION ALL," instead of "UNION" in this case, because I know that there will be no duplicate rows in the result set; the UNION ALL action saves a little bit of processing overhead. But, it wouldn't matter which one I chose for this demo.
That said, you can see that I am using an ORDER BY clause to sort the derived result set and then a LIMIT clause to limit the size of the returned result set. What's more, I am using an ASC sort even though one of the sub-selects is using a DESC sort. The sub-select sort is limited to the scope of that sub-select - the ORDER BY of the UNION will determine the sort of the final result set.
When we run the above code, we get the following CFDump output:
| || || |
| || |
| || || |
Definitely a small but very powerful SQL feature!
Looking For A New Job?
I would sometimes use this sort of functionality. For instance, I've used it in a sub-query to ensure that I had a default, non-NULL value.
Example: I wrote a query for tracking immunizations received against immunizations scheduled. The query needed to have a set number of rows for the immunization, but you may not always have that many immunizations received. The sub-query would be a SELECT that retrieved that cardinal immunization and performed a UNION with a SELECT that calculated when that cardinal immunization was scheduled. If the child had received the immunization, the sub-query got two results ("received" and "scheduled") and you could LIMIT it to the first result; if the child had not received the immunization, the sub-query only got one result and you would still LIMIT it to that one result.
That's one of the most frustrating scenarios - when a record may have zero-N associations. Zero or 1 is ok, since you can do a LEFT OUTER JOIN and know that, at MOST, you'll get a matching record. But, it gets more complicated when the LEFT OUTER JOIN may return 2 or more records. Then we gotta start jumping through hoops, like you're talking about.