Last week, I was trying to take an old, procedural style ColdFusion application and factor out queries into some sort of Service layer. I wasn't trying to create an Object Oriented ColdFusion application - I was merely trying to centralize some business logic such that it wasn't so distributed throughout the application. At a glance, this seems like a rather simple concept; and, when I've applied it to simple applications like OOPhoto, adding a Service layer was a snap. But, this is quite a complex application that I'm working with and I quickly found adding a Service layer to be an extremely painful task.
The first thing that tripped me up was figuring out what the "gesture" of the query was. Meaning, what was the query doing and how could I name the Service layer method to reflect this intention. Furthermore, how could I define the method in such a way that made it reusable in different situations since it's only in the reusable nature of centralized queries that make them worthwhile (otherwise, you might as well just have them inline).
It was this "reuse" part that really slammed me into a wall. As I examined my queries, I realized that they were almost all different in some way. And I don't mean that they had slightly different WHERE clauses - I mean that they had completely different JOIN clauses and calculated columns based on the page in which they were used. I was really stumped. The queries seemed to be right, but they all seemed to be a bit too different to reuse in any way.
This vexed me greatly as I started to become concerned that this same issue would challenge me on future projects. Then, after staring blankly at a query for no less than 20 minutes, it finally occurred to me why my queries were so distinct and non-reusable:
The queries themselves were used for a combination of both data retrieval and data validation in a contextual way.
It's the contextual data retrieval and validation aspects that was making them so page specific and non-reusable. Let me give you an example: Suppose I am on a page where a user get's to download a Asset with the given ID for a given Project. Well, rather than just getting the Asset at the given ID, I get the Asset at the given ID with it associated project details (project name) in the context of the given Project (untested code):
SELECT a.id, a.name, a.filename, a.project_id, ( p.name ) AS project_name FROM asset a INNER JOIN project p ON ( a.id = #asset_id# AND p.id = #project_id# AND a.project_id = p.id )
Yes, I am simply getting back the given Asset at the given ID with this query. But, the query doesn't just hit the Asset table; rather, it hits the asset table in the context of a Project-based JOIN on the project ID and returns project information as well. Meaning, it uses the query not only to get the Asset and the Project, but also to enforce the fact the given Asset belongs to the given project. In the end, I know that if the query doesn't return any rows, any of the following might be true:
- Asset ID was incorrect.
- Project ID was incorrect.
- Given Asset was not part of the given Project.
This was really useful because it made my post-query validation a snap. But, it is this type of contextual query that is making my Service layer a nightmare to create (I have, in fact, stopped for the time being). Because the query performs several actions (two explicit and one implicit), it really ties each query quite tightly to the page in which it is used, making it almost entirely non-reusable across the application.
So, how would I fix this going forward in future applications? I am not 100% sure that this is correct, but I suppose that I would have to separate multi-table data retrieval and pull the context validation out of the query and perform validation in a secondary step. For example (pseudo code):
qProject >> SELECT p.id, p.name FROM project p WHERE p.id = #project_id# qAsset >> SELECT a.id, a.name, a.filename, a.project_id FROM asset a WHERE a.id = #asset_id# Validation >> if (NOT qProject.RecordCount) OR (NOT qAsset.RecordCount) OR (qProject.id NEQ qAsset.project_id) then // Throw error that data was not valid.
Now that the asset query is just getting back the asset information and the project query is just getting back the project information and the context validation is performed after both queries have run, suddenly the two separated, cohesive queries become much more easily definable and reusable across the application.
Thinking back now, I have a serious habit of making my SQL queries complex with all kinds of JOINs to build data retrieval and data validation into a single step. Understanding this, I now realize that upgrading my existing applications to be less "spaghettified" is going to not only require factoring out queries, but also completely changing the way in which they were used. That's probably not going to happen; looks like a forward-only type of change.
As a final note, I just want to say that I really like the complex queries. They seemed to have so much more "intent". These little, table-access queries just seem so... I don't know... not exciting. But, I guess it's just that thinking that has come back to bite me.
Want to use code from this post? Check out the license.