Skip to main content
Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.

Using SELECT And UNION ALL To JOIN SQL Tables To CSV Data In MySQL

By Ben Nadel on
Tags: SQL

I just love SQL. I've been writing web-applications on top of relational databases (RDBMS) for a while; and, SQL always feels like a luxurious combination of structure and flexibility. Yesterday, when working with some data scientists, I had to merge a few columns from a database table into a CSV (Comma-Separated Values) file that the data scientists had created. The CSV file was fairly small; so, I ended up using the awesome power of multiple-cursors in SublimeText 3 in order to write a SQL query that converted the CSV data into a derived table that I then consumed in an INNER JOIN in MySQL.

To be clear, this is not the only way to consume CSV data in a relational database. But, I was working with a database in which I had read only permissions. As such, I couldn't load the CSV file into a "real table". Given the relatively small size of the CSV payload, what follows just felt like the appropriate level of effort.

That said, let's paint a picture of what was going on. Imagine that I had a database table called friend with the following structure:

SHOW CREATE TABLE friend;
-- Generates the following SQL statement:

CREATE TABLE `friend` (
	`id` int(11) NOT NULL,
	`name` varchar(50) NOT NULL,
	`age` int(11) NOT NULL,
	`catchPhrase` varchar(250) NOT NULL,
	PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Then, the data scientists, using an external data warehouse, generated a CSV file that contained the friend.id value as the first column and a favorite meal as the second column:

id	favorite_meal
1	"Pizza"
4	"Bella's Chicken Burritos"
5	"Vegan, Multi-Grain Pancakes"

Given this CSV file, they wanted to merge it with the friend table so that they could see the name value alongside the existing id and favorite_meal values. To do this, I took their CSV data and - using SublimeText 3 - translated it into series of SELECT and UNION ALL statements:

( SELECT 1 AS id, "Pizza" AS favoriteMeal ) UNION ALL
( SELECT 4 AS id, "Bella's Chicken Burrito" AS favoriteMeal ) UNION ALL
( SELECT 5 AS id, "Vegan, Multi-Grain Pancakes" AS favoriteMeal )

This SQL generates a derived result-set in which each SELECT statement represents a unique row. One of the hottest features of SQL is that I can then use a derived result-set as input to another query. As such, I was able to take this series of SELECT statements and consume it as part of an INNER JOIN on the friend table:

This left me with the following SQL:

SELECT
	f.id,
	f.name,
	tmp.favoriteMeal
FROM
	-- Here, we are using the SELECT / UNION ALL technique to crate a derived table
	-- in which each CSV value becomes a column within the derived table. We can then
	-- JOIN to this derived table in order to augment it.
	(

		( SELECT 1 AS id, "Pizza" AS favoriteMeal ) UNION ALL
		( SELECT 4 AS id, "Bella's Chicken Burrito" AS favoriteMeal ) UNION ALL
		( SELECT 5 AS id, "Vegan, Multi-Grain Pancakes" AS favoriteMeal )

	) AS tmp
INNER JOIN
	friend f
ON
	f.id = tmp.id -- Joining `friend` to the derived table.
ORDER BY
	f.id ASC
;

As you can see, I'm using my CSV-data-cum-temporary-table as the first part of my INNER JOIN. This allows me to map the friend.id column to the tmp.id column, which subsequently gives me access to the friend.name column in the results. And, when we run the above SQL in the MySQL database, we get the following output:

mysql> SELECT
    ->     f.id,
    ->     f.name,
    ->     tmp.favoriteMeal
    -> FROM
    ->     (
    -> 
    ->         ( SELECT 1 AS id, "Pizza" AS favoriteMeal ) UNION ALL
    ->         ( SELECT 4 AS id, "Bella's Chicken Burrito" AS favoriteMeal ) UNION ALL
    ->         ( SELECT 5 AS id, "Vegan, Multi-Grain Pancakes" AS favoriteMeal )
    -> 
    ->     ) AS tmp
    -> INNER JOIN
    ->     friend f
    -> ON
    ->     f.id = tmp.id
    -> ORDER BY
    ->     f.id ASC
    -> ;
+----+-------+-----------------------------+
| id | name  | favoriteMeal                |
+----+-------+-----------------------------+
|  1 | Tina  | Pizza                       |
|  4 | Ralph | Bella's Chicken Burrito     |
|  5 | Sandi | Vegan, Multi-Grain Pancakes |
+----+-------+-----------------------------+
3 rows in set (0.00 sec)

Woot woot! As you can see, by merging the CSV data and the friend table in a single SQL query, I was able to pull in the friend.name column in order to augment the existing CSV column data.

Isn't SQL just thrilling?! I am sure there are good number of ways to accomplish what I demonstrated. But, in the "fog of war" on a Zoom call with several other people watching me live-type SQL, this felt like the right approach. But dang it, if SQL isn't just the cat's pajamas!

Mugatu saying, That relational database, so hot right now!


Reader Comments

What has two thumbs and hopes you leave a comment? This Guy! (Ben Nadel).

Post A Comment

You — Get Out Of My Dreams, Get Into My Blog
Live in the Now
Oops!
NEW: Some basic markdown formatting is now supported: bold, italic, blockquotes, lists, fenced code-blocks. Read more about markdown syntax »
Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.