Repositories And Data Access Layers Can Have As Many Methods As You Find Helpful

Published 2020-01-09 in ColdFusion — Comments (7)

When I was first learning about Abstractions in programming, one of the early patterns that I came across was the Data Access Layer (DAL), which attempted to hide the implementation details of the underlying data persistence mechanism. I believe that the Repository Pattern is a more specific type of Data Access Layer, relating to aggregate roots; however, I tend to use the two terms interchangeably to refer, generally speaking, to the "persistence abstraction".

As I was reading up on these patterns, I came to believe that the API for these layers had to be kept simple, revolving primarily around CRUD (Create, Read, Update, Delete) methods. As I've gotten older, however, I've come to understand that this constraint is silly and artificial, and that my persistence APIs can contain as many methods in their interface as I find helpful to get the job done.

At first - and for many years - I thought that all data abstractions had to conform to an interface that looked something like this:

getById()
getByFilter()
create()
updateById()
updateByFilter()
deleteById()
deleteByFilter()

And that, if my business logic needed data to process a command, it would have to make do with the data returned by these methods, even if these methods returned way more data than was necessary to fulfill the request.

For example, if I had a workflow that wanted to check to see if an object existed, I'd have to call .getById() and then check to see if that method returned null (or threw an error, depending on the implementation):

function doSomething( id ) {

	var thing = repository.getById( id );

	if ( ! thing ) {

		// ... the object doesn't exist in the persistence layer.

	}

	// .... moar logic ....

}

After doing this a few times, I might try to remove the verbosity by creating a helper method in my business logic:

function doSomething( id ) {

	if ( ! isThingExists( id ) ) {

		// ... the object doesn't exist in the persistence layer.

	}

	// .... moar logic ....

}

function isThingExists( id ) {

	return( !! repository.getById( id ) );

}

This makes the intent of the logic a bit easier to parse. But, the unfortunate part here is that I'm still pulling back all of the data for "thing" even though all I want to do is to check to see if "thing" exists. This is bad for the underlying database because I'm probably not using a covering index to gather that data; it's bad for the network bandwidth because I'm transferring more bits than I need to over the wire; and, it's bad for the end-user because it will almost certainly result in greater request latency.

At some point, in order to maintain my own sanity, I ended up pushing that "exists" concept down into the Repository / Data Access Layer with a few methods like:

existsById()
existsByFilter()

Instead of returning Objects, these methods would return Booleans. And it immediately made life better for me and for my users!
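As a rough sketch of that difference, here is a small TypeScript example (not the CFML from the post). The `Thing` shape, the `ThingRepository` class, and its in-memory `rows` array are all hypothetical stand-ins for a real persistence layer. The only point is that `existsById()` answers the question with a Boolean instead of handing the whole record back to the caller:

```typescript
// Hypothetical record shape; the fields are illustrative only.
interface Thing {
	id: number;
	label: string;
	payload: string; // Stands in for the "expensive" columns.
}

// Hypothetical in-memory repository standing in for a real database.
class ThingRepository {

	private rows: Thing[];

	constructor( rows: Thing[] ) {
		this.rows = rows;
	}

	// Returns a copy of the full record, or null when there is no match.
	public getById( id: number ): Thing | null {
		const matches = this.rows.filter( ( row ) => row.id === id );

		if ( ! matches.length ) {
			return null;
		}

		return { id: matches[ 0 ].id, label: matches[ 0 ].label, payload: matches[ 0 ].payload };
	}

	// Returns only a Boolean; no record is copied or handed back to the caller.
	// Against SQL, this could be backed by an index-only existence check.
	public existsById( id: number ): boolean {
		return this.rows.some( ( row ) => row.id === id );
	}

}

const thingRepository = new ThingRepository([
	{ id: 1, label: "first", payload: "lots of data" }
]);

console.log( thingRepository.existsById( 1 ) ); // true
console.log( thingRepository.existsById( 2 ) ); // false
```

With a method like this in place, the business logic no longer needs its own `isThingExists()` helper; it can ask the repository directly.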

This was the tipping point. This was the moment that made me step back and re-evaluate my perspective on what a Data Access Layer was there to do.

Before that moment, I had believed that a Data Access Layer was this finite concept that I had to "fit my logic into". Essentially, the Data Access Layer existed as something that I was, at least to some degree, always "fighting against". Once I added a few methods that could return a non-object result, however, it completely flipped my notions on their head. Suddenly, this abstraction was working for me, not against me.

What I came to understand over time was that I had burdened my original understanding of the Data Access Layer with too many constraints. By removing these constraints, I allowed the Data Access Layer to remain a simple abstraction: one that hides the persistence implementation. Nothing more, nothing less.

What this means is that, as long as a method doesn't leak implementation details about the underlying persistence mechanism, the method is completely valid. So now, I can have methods in my Data Access Layer that look like this:

getAirspeedVelocityOfUnladenAfricanSwallow()

And, as long as I can swap my Data Access Layer with an implementation that uses SQL, or stored procedures, or MongoDB, or Redis, or an in-memory cache, or a flat file system, and still consistently implement a method, then, that method fits securely within the constraints of the abstraction.
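One way to picture this swap test is sketched below in TypeScript. Everything in it is hypothetical: the `SwallowGateway` interface, the two implementations, the storage key, and the sample numbers are invented for illustration. The calling code depends only on the contract, so either implementation can be dropped in:

```typescript
// Hypothetical contract: callers see only this, never the storage details.
interface SwallowGateway {
	getAirspeedVelocityOfUnladenAfricanSwallow(): number;
}

// Hypothetical implementation that averages raw observations held in memory.
class InMemorySwallowGateway implements SwallowGateway {

	private observations: number[];

	constructor( observations: number[] ) {
		this.observations = observations;
	}

	public getAirspeedVelocityOfUnladenAfricanSwallow(): number {
		let total = 0;

		for ( let i = 0 ; i < this.observations.length ; i++ ) {
			total += this.observations[ i ];
		}

		return ( this.observations.length ? ( total / this.observations.length ) : 0 );
	}

}

// Hypothetical implementation that reads a pre-computed value from a key-value
// store (a plain object stands in for something like a cache client).
class KeyValueSwallowGateway implements SwallowGateway {

	private store: { [ key: string ]: string };

	constructor( store: { [ key: string ]: string } ) {
		this.store = store;
	}

	public getAirspeedVelocityOfUnladenAfricanSwallow(): number {
		return parseFloat( this.store[ "swallow:african:unladen:velocity" ] || "0" );
	}

}

// Business logic depends on the contract, not on either implementation.
function describeSwallow( gateway: SwallowGateway ): string {
	return ( "Velocity: " + gateway.getAirspeedVelocityOfUnladenAfricanSwallow() );
}

// The sample values are made up; both calls print "Velocity: 11".
console.log( describeSwallow( new InMemorySwallowGateway( [ 10, 12 ] ) ) );
console.log( describeSwallow( new KeyValueSwallowGateway( { "swallow:african:unladen:velocity": "11" } ) ) );
```

Neither the method name nor its return type hints at how the value is stored or computed, which is what keeps the method inside the abstraction.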

Of course, you could argue that having these types of methods in a Data Access Layer is going to mean pushing some business logic down into the abstraction itself. And, you might be right. But, I'm not sure that it matters all that much. As long as a consistent contract is upheld by the abstraction, the location of some filtering and aggregation logic seems inconsequential.
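To illustrate that trade-off, the TypeScript sketch below pushes a small piece of filtering and aggregation into a repository method. The `Order` shape, `OrderRepository`, and `getTotalForCustomer()` are hypothetical names chosen for the example. A SQL-backed version could presumably compute the same value with a `SUM()` and a `WHERE` clause without changing the contract:

```typescript
// Hypothetical record shape for the example.
interface Order {
	customerID: number;
	total: number;
	isCancelled: boolean;
}

// Hypothetical in-memory repository; an array stands in for the database table.
class OrderRepository {

	private orders: Order[];

	constructor( orders: Order[] ) {
		this.orders = orders;
	}

	// Filtering (non-cancelled orders for one customer) and aggregation (sum)
	// both live inside the abstraction; the caller just gets a number back.
	public getTotalForCustomer( customerID: number ): number {
		return this.orders
			.filter( ( order ) => ( order.customerID === customerID ) && ! order.isCancelled )
			.reduce( ( sum, order ) => sum + order.total, 0 )
		;
	}

}

const orderRepository = new OrderRepository([
	{ customerID: 1, total: 20, isCancelled: false },
	{ customerID: 1, total: 5, isCancelled: true },
	{ customerID: 1, total: 30, isCancelled: false },
	{ customerID: 2, total: 99, isCancelled: false }
]);

console.log( orderRepository.getTotalForCustomer( 1 ) ); // 50
```

Whether "ignore cancelled orders" counts as business logic is debatable, but the caller's contract (a customer ID in, a number out) stays the same wherever that rule lives.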

My point here is that abstraction of the data access layer should do just that: abstract the data access. This doesn't imply how that's done; or limit the logic that the abstraction can actually encapsulate; only that the implementation details don't leak up into a higher layer of the application architecture. Once I began to embrace this more focused understanding of what a Repository / DAL does, my applications became easier to create and maintain.

Short link: https://bennadel.com/3754

Reader Comments

Paul Speranza Jan 10, 2020 at 9:08 AM

I did the same, as if the people in data layer white lab coats were going to come and take me away. Not enough tutorials mention that the concepts are guides or starting points. You summed it up perfectly with this - "As long as a consistent contract is upheld by the abstraction, the location of some filtering and aggregation logic seems inconsequential."

Ben Nadel Jan 10, 2020 at 10:12 AM

@Paul,

I'm glad I'm not crazy and that this rings true. Though, I suppose as with all things, the older I get, the more likely I am to question why I've always done something some way.

Rob Jan 14, 2020 at 6:19 AM

Hi Ben
Just saying I eventually came to the same conclusion in my projects. Just create repository methods that actually do what you need.

In our project, assuming that repository methods should all be CRUD methods led to a further trap: repositories were forced to implement a "Generic repository" base class, which turned out to be a major design flaw.

Ben Nadel Jan 21, 2020 at 7:17 AM

@Rob,

Totally! When you feel like you have to bend to fit into some ill-conceived notion of how it's supposed to work, you end up building a lot of stuff that actually makes it less flexible and harder to maintain in the long run... at least, in my experience. When you keep it simple and build things that deliver specific value, it makes:

The code easier to understand and maintain.
The code easier to delete because it's more clear how things are being used.
The code easier to migrate because each unit-of-work has clear boundaries and implementation requirements.

And, in general, I am finding more and more that my historic use of "Base Classes" comes back to haunt me, even in places that have nothing to do with the Repository pattern. The "Base" class becomes this place where people just feel free to jam anything they want and then use it anywhere they want without considering the long-term fall-out of their decisions (myself included).

Almas May 15, 2020 at 3:08 PM

And now your Repository layer will violate the Interface Segregation Principle: https://en.wikipedia.org/wiki/Interface_segregation_principle
Of course, it's important to get help from the Repository layer, such as an exists method instead of a null check. However, the main question is how to split it.

Amit Shrinivas gujar Aug 14, 2021 at 9:53 AM

Sorry for asking silly questions, but I really want to understand how the data access layer integrates with the service layer. I mean, at least a demo app.

Ben Nadel Aug 16, 2021 at 5:30 AM

@Amit,

The data access layer is just a way to read data into memory through a set of APIs. The main goals of the data access layer are to:

Make consuming the data easier.
Hide the implementation details of the data persistence.

I wouldn't think too much more about it than that. I used to think that I had to follow all sorts of special rules about data access layers. But, then I remembered that the data access layer works for me, not the other way around. It's there to make my life easier - not to force me to fit into some set of arbitrary rules.
