Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
Ben Nadel at cf.Objective() 2011 (Minneapolis, MN) with: Jason Long

Repositories And Data Access Layers Can Have As Many Methods As You Find Helpful

By Ben Nadel on
Tags: ColdFusion

When I was first learning about Abstractions in programming, one of the early patterns that I came across was the Data Access Layer (DAL), which attempted to hide the implementation details of the underlying data persistence mechanism. I believe that the Repository Pattern is a more specific type of Data Access Layer, relating to aggregate roots; however, I tend to use the two terms interchangeably to refer, generally speaking, to the "persistence abstraction".

As I was reading up on these patterns, I came to believe that the API for these layers had to be kept simple, revolving primarily around CRUD (Create, Read, Update, Delete) methods. As I've gotten older, however, I've come to understand that this constraint is silly and artificial; and, that my persistence APIs can contain as many methods in their interface as I find helpful to get the job done.

At first - and for many years - I thought that all data abstractions had to conform to an interface that looked something like this:

  • getById()
  • getByFilter()
  • create()
  • updateById()
  • updateByFilter()
  • deleteById()
  • deleteByFilter()
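
A minimal sketch of what such a CRUD-only interface might look like, assuming an in-memory Map as the persistence mechanism. The factory name createThingRepository() and the predicate-style "filter" argument are hypothetical, chosen only for illustration:

```javascript
// Hypothetical CRUD-only repository backed by an in-memory Map.
// The factory name and the predicate-style "filter" are assumptions.
function createThingRepository() {

	var things = new Map();
	var nextId = 1;

	// Collect every stored thing that satisfies the predicate.
	function findAll( filter ) {
		return( Array.from( things.values() ).filter( filter ) );
	}

	return {
		getById: function( id ) {
			return( things.get( id ) || null );
		},
		getByFilter: function( filter ) {
			return( findAll( filter ) );
		},
		create: function( data ) {
			var thing = Object.assign( {}, data, { id: nextId++ } );
			things.set( thing.id, thing );
			return( thing.id );
		},
		updateById: function( id, changes ) {
			var thing = things.get( id );
			if ( thing ) {
				Object.assign( thing, changes );
			}
		},
		updateByFilter: function( filter, changes ) {
			findAll( filter ).forEach( function( thing ) {
				Object.assign( thing, changes );
			});
		},
		deleteById: function( id ) {
			things.delete( id );
		},
		deleteByFilter: function( filter ) {
			findAll( filter ).forEach( function( thing ) {
				things.delete( thing.id );
			});
		}
	};

}
```

Note that every read in this sketch hands back whole objects, which is the limitation the rest of this post runs into.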

And that, if my business logic needed data to process a command, it would have to make do with the data returned by these methods, even if these methods returned way more data than was necessary to fulfill the request.

For example, if I had a workflow that wanted to check to see if an object existed, I'd have to call .getById() and then check to see if that method returned null (or threw an error, depending on the implementation):

function doSomething( id ) {

	var thing = repository.getById( id );

	if ( ! thing ) {

		// ... the object doesn't exist in the persistence layer.

	}

	// .... moar logic ....

}

After doing this a few times, I might try to remove the verbosity by creating a helper method in my business logic:

function doSomething( id ) {

	if ( ! isThingExists( id ) ) {

		// ... the object doesn't exist in the persistence layer.

	}

	// .... moar logic ....

}

function isThingExists( id ) {

	return( !! repository.getById( id ) );

}

This makes the intent of the logic a bit easier to parse. But, the unfortunate part here is that I'm still pulling back all of the data for "thing" even though all I want to do is to check to see if "thing" exists. This is bad for the underlying database because I'm probably not using a covering index to gather that data; it's bad for the network bandwidth because I'm transferring more bits than I need to over the wire; and, it's bad for the end-user because it will almost certainly result in greater request latency.

At some point, in order to maintain my own sanity, I ended up pushing that "exists" concept down into the Repository / Data Access Layer with a few methods like:

  • existsById()
  • existsByFilter()

Instead of returning Objects, these methods would return Booleans. And it immediately made life better for me and for my users!
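
A minimal sketch of the idea, assuming a hypothetical "storage" object that stands in for the real persistence mechanism. The names hasKey(), readRecord(), and getRecordsRead() are invented for this example and don't belong to any real library:

```javascript
// Hypothetical repository factory. The "storage" argument stands in for any
// persistence mechanism; hasKey() and readRecord() are assumed names.
function createThingRepository( storage ) {

	return {
		// Loads the full record (or null if it doesn't exist).
		getById: function( id ) {
			return( storage.hasKey( id ) ? storage.readRecord( id ) : null );
		},
		// Answers the existence question without loading the record.
		existsById: function( id ) {
			return( storage.hasKey( id ) );
		}
	};

}

// In-memory stand-in that counts how many full records get read.
function createMemoryStorage( records ) {

	var recordsRead = 0;

	return {
		hasKey: function( id ) {
			return( records.has( id ) );
		},
		readRecord: function( id ) {
			recordsRead++;
			return( records.get( id ) );
		},
		getRecordsRead: function() {
			return( recordsRead );
		}
	};

}

var storage = createMemoryStorage( new Map([
	[ 1, { id: 1, name: "Thing", payload: "...lots of data..." } ]
]));
var repository = createThingRepository( storage );

function doSomething( id ) {

	if ( ! repository.existsById( id ) ) {

		return( "missing" );

	}

	return( "found" );

}
```

In this sketch, calling doSomething() never reads a full record; in a SQL-backed implementation, the equivalent method could presumably be satisfied by an index lookup alone.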

This was the tipping point. This was the moment that made me step back and re-evaluate my perspective on what a Data Access Layer was there to do.

Before that moment, I had believed that a Data Access Layer was this finite concept that I had to "fit my logic into". Essentially, the Data Access Layer existed as something that I was, at least to some degree, always "fighting against". Once I added a few methods that could return a non-object result, however, it completely flipped my notions on their head. Suddenly, this abstraction was working for me, not against me.

What I came to understand over time was that I had burdened my original understanding of the Data Access Layer with too many constraints. By removing these constraints, I allowed the Data Access Layer to remain a simple abstraction: one that hides the persistence implementation. Nothing more, nothing less.

What this means is that, as long as a method doesn't leak implementation details about the underlying persistence mechanism, the method is completely valid. So now, I can have methods in my Data Access Layer that look like this:

  • getAirspeedVelocityOfUnladenAfricanSwallow()

And, as long as I can swap my Data Access Layer with an implementation that uses SQL, or stored procedures, or MongoDB, or Redis, or an in-memory cache, or a flat file system, and still consistently implement a method, then, that method fits securely within the constraints of the abstraction.
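
A rough sketch of that swap, using two hypothetical implementations of the same contract: one backed by a Map, and one backed by a string of "id,isActive" lines standing in for a flat file. The method getActiveCount() and both factory names are assumptions made for illustration:

```javascript
// Implementation 1: records held in a Map (hypothetical in-memory cache).
function createMapRepository( things ) {

	return {
		existsById: function( id ) {
			return( things.has( id ) );
		},
		getActiveCount: function() {
			var count = 0;
			things.forEach( function( thing ) {
				if ( thing.isActive ) {
					count++;
				}
			});
			return( count );
		}
	};

}

// Implementation 2: records held as "id,isActive" lines, standing in for a
// flat file. The line format is an assumption made for this sketch.
function createFlatFileRepository( fileContent ) {

	var rows = fileContent.split( "\n" ).filter( Boolean ).map( function( line ) {
		var parts = line.split( "," );
		return({ id: Number( parts[ 0 ] ), isActive: ( parts[ 1 ] === "true" ) });
	});

	return {
		existsById: function( id ) {
			return( rows.some( function( row ) {
				return( row.id === id );
			}));
		},
		getActiveCount: function() {
			return( rows.filter( function( row ) {
				return( row.isActive );
			}).length );
		}
	};

}

// The business logic depends only on the contract, not on the storage.
function describeThings( repository ) {

	return( repository.getActiveCount() + " active; thing 2 exists: " + repository.existsById( 2 ) );

}
```

Neither method name hints at Maps, files, or SQL, so describeThings() can't tell which implementation it was handed.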

Of course, you could argue that having these types of methods in a Data Access Layer is going to mean pushing some business logic down into the abstraction itself. And, you might be right. But, I'm not sure that it matters all that much. As long as a consistent contract is upheld by the abstraction, the location of some filtering and aggregation logic seems inconsequential.

My point here is that an abstraction of the data access layer should do just that: abstract the data access. This doesn't dictate how that's done, or limit the logic that the abstraction can actually encapsulate; it only requires that the implementation details don't leak up into a higher layer of the application architecture. Once I began to embrace this more focused understanding of what a Repository / DAL does, my applications became easier to create and maintain.



Reader Comments

I did the same, as if the people in data layer white lab coats were going to come and take me away. Not enough tutorials mention that the concepts are guides or starting points. You summed it up perfectly with this - "As long as a consistent contract is upheld by the abstraction, the location of some filtering and aggregation logic seems inconsequential."

@Paul,

I'm glad I'm not crazy and that this rings true. Though, I suppose as with all things, the older I get, the more likely I am to question why I've always done something some way.

Hi Ben
Just saying I eventually came to the same conclusion in my projects. Just create repository methods that actually do what you need.

In our project, assuming repository methods should all be CRUD methods led to a further trap of repositories being forced to implement a "Generic repository" base class, which turned out to be a major design flaw.

@Rob,

Totally! When you feel like you have to bend to fit into some ill-conceived notion of how it's supposed to work, you end up building a lot of stuff that actually makes it less flexible and harder to maintain in the long run... at least, in my experience. When you keep it simple and build things that deliver specific value, it makes:

  • The code easier to understand and maintain.
  • The code easier to delete because it's more clear how things are being used.
  • The code easier to migrate because each unit-of-work has clear boundaries and implementation requirements.

And, in general, I am finding more and more that my historic use of "Base Classes" comes back to haunt me, having nothing even to do with the Repository pattern. The "Base" class becomes this place where people just feel free to jam anything they want and then use it anywhere they want without considering the long-term fall-out of their decisions (myself included).
