Throughout my career, I've often heard that it is a best practice to design your "Domain Objects" and your "Business Logic" first and then, eventually, to design a database schema that allows your domain objects to be persisted. I've even seen many ORM (Object-Relational Mapping) systems that will happily churn-out database schemas based solely on your Objects (and their metadata). Personally, I've never done this. In fact, I find this approach to be antagonistic to how my brain operates. When I'm working on a ColdFusion application (or a feature therein), I always start with the database schema first and then layer the ColdFusion application upon it using an iterative, ground-up approach.
Relational Database Schemas Are Agile, Mostly Like Code
Relational database schemas are living, breathing organisms that have to evolve alongside the needs of the business. Adding tables, modifying tables, altering column definitions, adjusting database indexes - it's all par for the course. Some database operations are certainly more challenging than others: years ago, I used to manually perform online
ALTER TABLE migrations using shadow tables.
These days, however, even relational database systems (as opposed to NoSQL document databases) have greatly lowered the barrier to entry. More mutations can be applied without locking or downtime. And, for operations that do require a table rebuild, tools like Percona's
pt-online-schema-change exist to make this relatively painless (and transparent to your users).
Or course, locking really only becomes a concern with either high data volume or high user traffic - a "good problem" that most of don't actually have enough of. I'm sure for the majority of uses-cases, a good old
ALTER statement will just get the job done.
To be clear, I'm not trying to argue that a relational database schema is as dynamic as a NoSQL document database. But, we do need to dispense with any anachronistic notions that relational database schemas are poured in concrete and unable to be changed. That simply isn't the world we live in anymore.
Building Blocks, From The Ground-Up
When I say that I "build the database schema first", I don't mean that I build the entire schema first and then the entire application on top of it. That would be entirely foolish and a waste of time. After all, we never know as little about our applications as we do "right now". We simply don't have enough information on-hand to design out an entire anything at this moment, let alone a database schema.
When I say that I "build the database schema first", what I mean is that, for any facet or feature within an application, I design just enough database schema so that my brain has something on which to anchor. Like a toddler, my brain needs a "safe space" from whence it can start to explore. Without the safe space of a table definition, my brain feels lost - like it doesn't know which direction to go in; or even which directions it has available to it.
Once I have a
CREATE TABLE statement in front of me, I can start to build up the world around it in my head. I can start to think about data-access patterns that my application will likely use. And perhaps most importantly, I can start to consider uniqueness constraints and how those uniqueness constraints may be able to drive idempotent workflows.
Obviously, I won't know everything that I need to know about how this data will be used. But, there are some timeliness questions that always make sense:
Will I ever need to list-out the data in this table? If so, does the data need to be filtered or sorted? If so, should those columns be included as part of an index (spoiler alert, yes, most likely).
Will I ever need to aggregate any of the rows for different User Interfaces (UI) within the application? If so, do I have indexes that allow those aggregations to be calculated with good performance? Do I need different indexes to facilitate different UI aggregations?
If the table has columns that reference primary keys in another table, does it make sense for those values to be duplicated? Or, should I use a
UNIQUE KEYin order to enforce uniqueness as a safety fall-back beyond the business logic?
Does the table have any non-foreign references that need to be unique (such as an email address)? If so, should I use a
UNIQUE KEYin order to enforce that uniqueness for data integrity?
Will I ever need to "soft-delete" or "archive" any of the rows? If so, do I need to take that into consideration with my index design? Or, perhaps move the archived rows into an "archive table" in order to reduce the volume within the active data table?
Do any of the columns within one table merit a more robust data model? Meaning, should a column be "promoted" to it's own table so that I can store more relevant data about the concept being captured by that column?
Are any of the columns likely to change at a radically different tempo - perhaps I need to split my high-writes and my high-reads into different tables in order to reduce row-level locking?
Again, my goal here isn't to get it right; my goal is just to get the ball rolling - to give my brain something to start chewing on. I know that the database schema will change over time and I'm totally OK with that.
Data Access Layer
Once I have my
CREATE TABLE statement(s) in place, I start working on the next building block: my data access abstraction components (aka "Data Access Objects" (DAO), aka "Data Gateways", aka "Data Repositories"). These are the ColdFusion components that encapsulate the SQL statements for CRUD (Create, Read, Update, Delete) operations. Usually, these ColdFusion components will have methods like:
- GetThingByID( id )
- GetThingsByFilter( ...filters )
- DeleteThingByID( id )
- DeleteThingsByFilter( ...filters )
- UpdateThingByID( id, ...values )
Then, depending on much I've thought-through the application, I may have some more specialized operations like:
- IncrementThingColumnValue( id )
- DoesThingExistByID( id )
These data access objects are here for low-level CRUD-style operations only. These are not for reporting purposes! One of the biggest mistakes that I see teams make is an attempt to collocate CRUD queries and reporting queries into the same components - this becomes a maintenance nightmare (and often a performance nightmare) that only compounds over time. Reporting queries should be broken-out into their own specialized components.
Entity Access Layer (aka, Service Layer)
Once I have my data access components in place, I start working on the next building block: my entity access components. These ColdFusion components deal entirely with native ColdFusion data structures (think Structs and Arrays) and hide all low-level data-persistence operations through the use the data access layer.
I usually refer to this layer as the "Service Layer" of the application. It enforces business logic that is local to a given entity or "aggregate" (in Domain Driven Design parlance). Though, no all ColdFusion components in the service layer deal with "entities" - some are more infrastructure related (like sending emails and maintaining rate-limit counters).
Usually there is a 1-to-1 relationship between an Entity Access component and a Data Access component. So, for a theoretical
Thing entity, I would have:
ThingService.cfc- the entity access layer.
ThingGateway.cfc- the data access layer for the entity access layer.
... and, the
ThingGateway would be provided to the
ThingService using Inversion of Control (IoC), possibly wired together using a Dependency-Injection (DI) container; though, for simple applications, I manually wire this kind of stuff together in order to reduce complexity.
ASIDE: A data access object (DOA) should only ever be consumed by its relevant entity access object. The moment a data access object is being used by multiple higher-level components, I consider that a code-smell. Instead, the "entity access" component should be the shared point of interaction.
The methods on the entity access layer may be reminiscent of the underlying data access component, but they are usually more "business focused". Meaning, the entity access components never expose a
GetByFilter() style method; instead, they would expose several different
GetBy___() methods, each of which is focused on accessing the data through different facets and may have different constraints.
Commands (aka, Workflows) And Queries (aka, Partials) Layer
Once I have my
CREATE TABLE statements that feed my data access layer which is, in turn, utilized by my entity access layer, I have about as much "theoretical" code as I can create using nothing but my nascent mental model. At this point, I can start to work on the next building blocks: my commands and queries. Together, this is the layer that represents the "business rules" of the application. While the entity access layer may contain data-level constraints, the commands and queries contain business-level constraints; and, how those constraints need to be applied across multiple entities.
ASIDE: Entities don't know anything about each other. They may reference each other though foreign keys; but, this is just opaque data. All cross-entity constraints need to be enforced at the command layer.
The Command / Workflow layer is the set of ColdFusion components that tells the application to "do" something. Create this, delete that, upgrade this account, send out that alert. It's the way the state of the application is mutated over time.
The Query / Partial layer is the set of ColdFusion components that aggregates data for specific views within an application. And while I believe that this is an integral part of the application, if you squint hard enough, you might consider the Partials an "optimization technique." It is there to pull data out of the database with high-performance intended to answer a specific set of UI questions.
In theory, you could gather all your UI data using nothing but the entity access layer; though, I suspect this would quickly lead to N+1 problems. Which is why I consider the Query / Partial layer a necessity, not just an optimization.
This is where most of the iteration happens. All of the lower-layers within the application can be started using the power of thought and imagination. But, the Commands and Queries are where the rubber meets the road - where the theory meets the practice. At this point, I'll start to add missing entity layer methods that I need in order to power the business logic of the Command / Workflow components.
Much of the time, these new entity layer methods can turn around and consume existing data layer methods. However, sometimes new data layer methods need to be created as well. And, sometimes, new database table columns need to be created (or altered) in order to serve these new methods. A new table index may even have to be added in order to make sure that a newly-exposed data access pattern is performant.
The Query / Partial components also impact the structure of the database; however, this is usually more about indexes and performance. Though, sometimes, a partial will precipitate a rethinking of an existing structure and a potential decomposition of a single table into multiple tables.
Iterating Up And Down The Stack
When I start a new application or feature, I always start with the database design. This gives me a solid base on which to build my mental model and my ColdFusion code. I then work up the stack, creating larger "building blocks" that can be used to perform larger operations. As I build out these operations, though, I will often have to work back down the stack adding new capabilities. This in turn allows me to work back up the stack, fleshing out business logic.
And so on, up the stack and back down the stack, iterating as the needs of the application become more readily apparent; and, as the needs of the business change over time.
Certainly, I've made many mistakes in my ColdFusion application development. And, after maintaining a single, large application for a decade, I've learned a lot about what works and what doesn't work. And for me - and the way my brain functions - starting with the database schema first and then working my way up is a recipe for success.
Epilogue: ColdFusion Application Layering
I've been working with ColdFusion since 1999. And, in that time, I've learned a lot of ways to not build a maintainable application. This journey of discovery continues to this day, especially as technology evolves and the expectations of our users changes.
And while most this post talks about my iterative approach to designing a layered ColdFusion application, an older post of mine - Software Application Layers And Responsibilities, 2019 Edition - goes much more in depth on what each one of these layers does.
Of course all of this will continue to change as I get more experience. So, consider this my approach "as of now" - not a static concept frozen in time.