Using ORDER BY And LIMIT Clauses In A UNION SQL Statement
When it comes to SQL (Structured Query Language), I don't use the UNION construct all that often. But, when I do, one of the features that makes it so powerful is the fact that you can apply both an ORDER BY clause and a LIMIT clause to the derived result set of the union (at least in MySQL). Furthermore, the individual SELECT statements can also have their own ORDER BY and LIMIT clauses.
To see this in action, take a look at the following query. It's a nonsense example, I know, but I think it illustrates the point.
<!--- NOTE: I am using a "UNION ALL" here instead of a "UNION" simply because I know that there will not being row conflicts (based on the pkey logic). ---> <cfquery name="friends" datasource="testing"> ( SELECT f1.id, f1.name FROM friend f1 WHERE f1.id < 4 ) UNION ALL ( SELECT f2.id, f2.name FROM friend f2 WHERE f2.id >= 4 ORDER BY f2.name DESC LIMIT 2 ) <!--- In this case, the ORDER BY applies to the result set derived from the UNION of both SELECT statements. Notice that it refers to the derived column name. ---> ORDER BY name ASC <!--- In this case, the LIMIT applies to the derived result set. ---> LIMIT 10 </cfquery> <!--- Output the query for debugging. ---> <cfdump var="#friends#" label="friends" />
I happen to be using "UNION ALL," instead of "UNION" in this case, because I know that there will be no duplicate rows in the result set; the UNION ALL action saves a little bit of processing overhead. But, it wouldn't matter which one I chose for this demo.
That said, you can see that I am using an ORDER BY clause to sort the derived result set and then a LIMIT clause to limit the size of the returned result set. What's more, I am using an ASC sort even though one of the sub-selects is using a DESC sort. The sub-select sort is limited to the scope of that sub-select - the ORDER BY of the UNION will determine the sort of the final result set.
When we run the above code, we get the following CFDump output:
Definitely a small but very powerful SQL feature!
Want to use code from this post? Check out the license.
I would sometimes use this sort of functionality. For instance, I've used it in a sub-query to ensure that I had a default, non-NULL value.
Example: I wrote a query for tracking immunizations received against immunizations scheduled. The query needed to have a set number of rows for the immunization, but you may not always have that many immunizations received. The sub-query would be a SELECT that retrieved that cardinal immunization and performed a UNION with a SELECT that calculated when that cardinal immunization was scheduled. If the child had received the immunization, the sub-query got two results ("received" and "scheduled") and you could LIMIT it to the first result; if the child had not received the immunization, the sub-query only got one result and you would still LIMIT it to that one result.
That's one of the most frustrating scenarios - when a record may have zero-N associations. Zero or 1 is ok, since you can do a LEFT OUTER JOIN and know that, at MOST, you'll get a matching record. But, it gets more complicated when the LEFT OUTER JOIN may return 2 or more records. Then we gotta start jumping through hoops, like you're talking about.
tnx ben, like it;
You code of UNION very good,
One thing that I've learned in the last year or two is that UNION, in general, can be not so great for performance. Especially when the UNION includes fields that are Text fields and have to be read off of disk.
At work, we were dealing with some query performance issues and the guys from Percona (MySQL consultants and builder of MySQL tools and databases) were helping us out. They explained that if you had a UNION (and an ORDER BY for that matter) that dealt with Text fields, the database would have to write a temp table to disk while performing the operation because it can't deal with Text in memory. If this is in a high-volume query, this can be a noticeable hit on performance.
As such, we ended up breaking a number of UNIONs apart and doing the zippering together in the code, rather than in the database.
Think about this:
I want to get results from two tables, but want to have a custom order for each of them.
Get all user.name from table User where id > 100 ORDER BY Name
Get all user.name from table User where user.name like 'Th%' ORDER BY ID