It Walks Like A Duck, It Quacks Like A Duck, But It's A Donkey - The Problem Of Inappropriately DRY Code

By Ben Nadel

Published 2015-03-06 in ColdFusion, JavaScript / DHTML, Work — Comments (26)

I spend a lot of time refactoring old code. And, as I do this, I keep seeing the same problem over and over again. As developers try to keep their code DRY, as in "Do-not Repeat Yourself," they often ignore a critical point of evaluation - the code's reason for change. When this property is ignored, code - that would otherwise be unique - is mistaken as duplicate code and is reused inappropriately. This makes future change much more difficult and error prone.

Imagine that you have a Pizza Delivery Guy and a Stunt Driver. Both of them need a car to do their job. And, from appearances alone, both of their cars might look exactly alike. But, you wouldn't want the Pizza Delivery Guy to use the same car that the Stunt Driver uses. In fact, even if the cars are 100% the same, in every single detail, you still wouldn't want them to use the same car. The reason: each car will change for very different reasons.

Let's pretend, for a moment, that we think the two cars are duplicates and we want to keep our parking lot "DRY" (Do-not Repeat Yourself); so, we ditch one and have the Pizza Delivery Guy and the Stunt Driver use the same car. Now, imagine that the Stunt Driver gets a job where he has to make the car skid-out. In order to do this, he has to remove the Anti-locking Brake Systems (ABS) so that his wheels will lock up while driving.

Of course, we don't want the 17-year-old Pizza Delivery Guy driving a car without the ABS system; so, rather than removing it altogether, we implement a switching system in the car so that the ABS can be turned on and off. Sure, it sucks that each driver now has to A) know about that switch and B) remember to turn it on or off; but, hey, our parking lot is still "DRY".

At first, you might think this is a victory. But really, this is just the first step in a slow, painful death. Each time the car changes, whether it be fuzzy-dice for the Pizza Delivery Guy or removing the air bags for the Stunt Driver, the car becomes more complex and much harder to configure for each driver. Until, one day, the car is no longer satisfactory for anyone who uses it.

This is the same problem that I see in software: code that should be able to change independently, cannot change. Or, at least, it cannot be changed without accidentally breaking something else that mistakenly references it.

Now, don't get me wrong - sometimes breaking changes are required; that's just a fact of software evolution. And, I'm certainly not saying that DRY code is bad. On the contrary - removing duplication is the cat's pajamas - the bee's knees! All I am saying is that when you are evaluating whether or not code is, in fact, duplicated, you have to take into account the reasons that the code may change. And, if that code has to be able to change independently, that code is unique and does not violate the DRY principle.

Some subtle reasons that code might have to change independently:

It is heavily coupled to the user interface (UI).
It has different performance constraints.
It has a different level of tolerance for accuracy (ex, uses a replica data-source or cached data).
It side-steps certain business rules.

The good news is, there are very likely to be aspects of the code that can be factored out and reused, minimizing the amount of so-called "duplicate" code. And, I will say that more often than not, this subtly-unique code is related to data access, not to data mutation.

As a final caveat, I will gladly admit that I'm not a terribly good programmer. But, I do know the pain of refactoring. And, refactoring code that was built to change independently is a lot more enjoyable and results in far fewer errors.

Short link: https://bennadel.com/2796

Reader Comments

Miguel Ulloa Mar 7, 2015 at 4:38 AM

8 Comments

I could see you may be able to use the Strategy Pattern to address the issue. That way you can have two unique concrete implementatione but still use the two cars just the same.

Gion Mar 7, 2015 at 6:08 AM

6 Comments

Hi Ben. Nice post and I share your view to some extent. But Id classify it generally as over engineering or anti patterns. Like devs who create 20 folders for components but each folder only contains one file. 10 different locations where you build different levels of configuration astraction etc. There are many anti patterns like this. For the problem you describe a strategy pattern, adapter pattern, decorator pattern or generally just simple inheritance based abstraction solves this problem quite nicely. When I'm facing shared code that will need to be branched out, the first attempt is always to find the next level of abstraction that solves the problem for both branches and build a base for inheritance. Then the stunt car and pizza car can inherit from the general car and you stay DRY and cost of change stays low. If the abstraction its on method or compution level a strategy pattern can help.

Ben Nadel Mar 7, 2015 at 7:48 AM

16,020 Comments

@Miguel, @Gion,

I definitely agree that there are a number of patterns here that work - for the first one or two changes, at least. And, maybe that's the end of it. The problem that I see is that as the software continues to evolve people try way too hard to shoe-horn things together and almost never consider ripping them apart.

Going back to the car example, what happens if we decide the Pizza Delivery Guy needs to use an electric car rather than a combustion engine? This has engine and battery-weight considerations. Then, later on, we decided that actually the PDG should drive a manually powered Rickshaw?

If you try to have these to part of the software using the same car, at this point, that's where you get into the over-engineering. At some point, you have to stop saying "how can I make these two things the same?" and start saying, "how I can make these two things simple?".

Clearly, there's a lot of nuance! But, I just find that people who worship DRY'ness like its a religion will end up making as many bad decisions as people who don't care about DRY'ness at all.

Gion Mar 7, 2015 at 7:57 AM

6 Comments

Hey Ben. I totally agree with you. Complexity grows anyway and its important to stay simple. If that means a bit less DRY for the sake of simplicity I'm fine.

Andy Mar 7, 2015 at 11:35 AM

10 Comments

I wholeheartedly agree with the part about people treating certain coding principles as religion: same with languages, IDE's, OS's, etc. - they're all just tools in your box. A craftsman learns to use them all.

However, one of the other problems is treating these principles (i.e. DRY) in a vacuum. In a vacuum, your scenario has legs. However, most of your scenarios above are mitigated by proper use of the SOLID principles and/or other patterns as suggested by other posters. Specifically open/closed.

I realize that you are refactoring older code, so moving target and all that... yeah.

Miguel Ulloa Mar 7, 2015 at 2:43 PM

8 Comments

As @Andy suggested. I guess the lesson is to be suspicious our coding strategies. Communicating these to other developers could have resulted in better more maintenable code.

RIAstar Mar 7, 2015 at 7:02 PM

1 Comments

As I was reading your story about Cars and Pizzas, I couldn't help but thinking: "this has OOP written all over it". From the chosen examples to the actual problem. The reason that there is a conflict between keeping things DRY and keeping your code maintainable, is the object-oriented paradigm itself. As long as you keep thinking about such problems in terms of cars and all their possible variations, you'll encounter the same issues. And frankly, some solutions proposed in these comments - especially reliance on inheritance - only make things worse: once one starts creating an entire taxonomy of cars to be able to implement shared features DRY-ly, anyone but the creator of the hierarchy (especially you, the refactoring guy) will be in a world of pain.

I thought about expanding on your metaphor to provide an alternative solution but I couldn't, because thinking about code in metaphors *is* the problem. You won't hear me saying that OOP is fundamentally flawed, but anyone who's learned OOP has learned it through metaphors (I am one of them). @Andy rightfully suggests that adherence to SOLID principles will save you, however the reality in the wild is that you will find Cars everywhere (you can't really say that a Car has a single responsibility, can you?). Because we were all trained that way. It's in our genes, so much so that it has lead some to believe that inheritance is the crux of reusability, whereas in reality it usually makes code unnecessarily brittle.
Inheritance can look like a good solution because it's easy to reason about - initially -, but there is no better place for a single responsibility than a pure function that you can compose, decorate and otherwise manipulate until you build the required driving experience.

Jean Mar 7, 2015 at 11:35 PM

16 Comments

I love you philosophical posts. They lead to the best comments conversations and they really make your readers think about their own code. Keep up the good work!

Mark Gregory Mar 7, 2015 at 11:39 PM

10 Comments

Amen! I have seen more sins committed in the name of DRY than any other development principle. I *might* have committed a few of those myself.

Miguel Ulloa Mar 8, 2015 at 11:40 AM

8 Comments

Hey @Mark, I guess I have been less lucky. I think I've seen more crimes with people not using DRY more than with over using it. I used to work for a company where implementing a new client meant copying the entire app's code and pasting it into another folder. After doing this hundreds of times making the smallest of changes across the different instances of the app was very painful and avoided. From that extreme scenario Ben's problem seems like a nicer problem to have. At least the developers where trying to apply a good principles. Of course there are no silver bullets and to @RIAstar's point even all of OO practices themselves should be carefully questioned when applying them to our code. Is this the right solution? Am I over engineering and therefore over complicating things? As @Ben suggested at some point we have to stop trying to make things the same and start making them simpler.

I blame the Java influence on ColdFusion for this. CF has always had an influx of Java developers. These guys lived in a world of NOUNs, a very unnatural world where NOUNs are first class citizens and VERBs must be wrapped within a NOUN in order to do the things we need to do i.e. register() within a warper class RegistrationManager. This leads to huge inheritance trees and it is very taxing as @RIAstar suggested. Now we know many of these former Java guys who are now part of our community have created great tools for us so we cannot deny that OO has good practices which allow us to create large, complex and maintainable systems, we just need to understand when it isn't the best solution. It is a bit outdated since Java 8, but for more on read this great blog post: http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html.

@RIAstar, while I agree OO isn't always the best solution, I think in this case OO is the best solution. @Ben does not need a special stuntDrive or pizzaDeliveryDrive VERBs. No mater how many vehicles get added to the system, we just need one VERB, drive(). I suggested the Strategy Pattern because it address the issue quite nicely. Rather than inheriting functionality you inherit behavior. This'll allow you to implement the Stunt car without ABS and/or air bags and the Pizza Delivery car to be electric or any future changes without removing the drive VERB/method so that the app can still polymorphically execute drive().

Gion Mar 8, 2015 at 6:49 PM

6 Comments

@RIAstart interesting thoughts and I need to admit, I'm also coming from an OO background. I was a Java developer for quite some time. My day to day job was to create new classes in an endless ocean of already existing classes. Trying to solve problems with transfer objects, DAOs generated by DAO factories initialized by builders. Classes with 5 lines of code. 10 interfaces with one method each. And even classes with 10 levels of abstract super classes. Yes, that's what people made out of OO. And it was taught us to do so. Just look at the whole Java Enterprise ecosystem.

You were praising SOLID principles but actually most of them are used in conjunction with inheritance (e.g. liskov substitution). I believe inheritance is nothing bad in general but its heavily overused and not used correctly. Also one of the biggest issue I see is that people tend to write their software too abstract from the beginning ending up in a huge inheritance chain and complexity they will actually never really need.

I love functional programming and functional composition and use it wherever possible. Its my first choice. But there are things that can't be solved with functional programming (at least I've never came across solutions like this). Tell me how to solve complex polymorphic problems without OO? One example are visual charts. If I implement this with OO I can easily design a generic chart and implement my bar chart and line chart. If I want to combine them into one chart type I can create a Compound Chart type from the abstract base that will delegate to the two chart types and unite them into one combined chart. These kind of problems are quite common in IT systems and OO has a perfect way of addressing them.

I started to look into functional programming with Clojure and learned to apply the concepts of higher order functions, closure scope, reconsidering the meaning of a list, recursion with tail call optimization and building complex functionality by functional composition. I do the same today with JavaScript (at least for what's possible) and I write functionality more efficient than ever (lines of code and maintainable complexity, maybe not performance).

Functional style programming and functional composition is great and provides re-usability on every level of abstraction, however, that's not how the real world works. I'd love to learn from someone who can teach me how to write my whole application with pure functional code. The UI elements, database connection, behavior on screens, game loops, user input... I'd love to know how this can be modeled with simple functional composition.

@Miguel, I think human think in objects. Objects with attributes. Objects that interact with each other. Some people learn OO by designing classes with nouns and methods with verbs. I think that's not how to think about Classes and Object. I think it's directly related to nature and how we interact with it. That's how our universe is built right? A huge pile of composite objects. Atoms, molecules, proteins, organs, body, earth, milky way yadda yadda yadda... Maybe it's wrong to build software like this but I can't really think of an other way to design large systems. If you look at big software systems and visualize them, they look like a building plan of a big factory.

Miguel Ulloa Mar 8, 2015 at 10:17 PM

8 Comments

@Gion, I have not discussed my personal design process, i was only using the language used on the reference I provided. I actually attempt to use whatever design paradigm I think best fits my Story. BTW, that resource Is great. I often recommend it to devs with OO fiver. Umm, you actually reinforced many of my points . Perhaps, you did not read whole comment???

Ben Nadel Mar 9, 2015 at 8:39 AM

16,020 Comments

@All,

To be completely transparent, I am NOT good at OOP style programming. Not only have I long had trouble wrapping my head around it, I also come from a programming language - ColdFusion - in which the "cost" of instantiating new objects is rather high (performance-wise). As such, my journey down the OOP path has been rocky, at best.

I find that I can *sort of* think about Objects when it comes to building utility objects / classes / features. But, the second that I have to build anything that is related to the *business model*, I quickly revert to my "service layer / workflow layer" habits.

And, for full transparency, the software aspect that spawned this post was closely related to data-access, and not so much related to anything about modifying the domain. When you have to make data access generic, you have to start building much complexity around filters and conditional relationships. The more UIs that need to access the *so-called same* data, the more I see this complexity getting even worse. Until, it feels like if you modify it at all, everything will break.

And, what's worse is that when too many use-cases get shoe-horned into the same code, it becomes unclear if the logic is even needed. Unless every developer is extremely diligent about deleting code that is no longer needed, code can remain complex even as the application becomes more simple.

That said, it's conversations like this that make me realize how much more I have left to learn :D

Kjell-Åke Gafvelin Mar 10, 2015 at 4:00 AM

4 Comments

I understand what you talk about as I'm in a project right now where we have this issue now that we have a lot of code that tries to solve every other problem you throw at it. And it's a complete nightmare to try to understand and then add new functionality to it!

The main issue is that they have looked at the code and said "If I only add another parameter to this function it will work for this use case as well!"! This makes it become more of less a black box where no one has a friggin clue on what it does.

This is not something that you just can fix by throw a pattern at it. No, you have to change the mentality of the person doing this so that they understand how they should break apart their code without the fear of code duplication.

I have a golden rule that I try to follow and that is that a function should be only so large that it fits on one screen. When you start to break down a huge function with lots of control flow parameters into smaller chunks you will soon see that those extra control flow parameters can be turned into several smaller functions with less parameters.

And now when you have all these smaller functions it will be a lot easier to refactor the code and add new use cases that do not affect the other use cases!

Ben Nadel Mar 10, 2015 at 8:19 PM

16,020 Comments

@Kjell-Åke,

I'm glad you feel my pain :D The worst part of about creating this "solves every problem" is that as the software continues to evolve, calling-code is likely to be removed. And when that happens, the chances are good that the developer doesn't go in to the "black box" and remove code that may have only solved a problem for that particular context.

This means that you can have code, in that black box, that adds complexity but is NOT adding any value because nothing in your application every calls it.

Going back to the Pizza Delivery Guy and Stunt Driver; if they are two completely different cars and you fire the Pizza delivery guy, it's *very clear* that you no longer need that car. Throw it away.

But, if the single car was built to accommodate all kinds of drivers, then when one gets fired, it becomes very unclear to whether or not you even need configuration options for the car.

David L. Mar 12, 2015 at 12:16 PM

4 Comments

I have been on both sides of this dilemma. I have seen (and written) CF code, especially before the era of CFC's, where completely identical queries were cut and paste across several pages. I have also seen, and been guilty of creating, "one function / query that does ALL THE THINGS!"

As usual, the best answer usually lies somewhere in the middle. The question we need to ask ourselves when creating new code, or streamlining old code, is "What is reasonable / responsible?"

Anna Mar 12, 2015 at 3:05 PM

383 Comments

I think this is a really good discussion. I've run across this before with the quandary: should I create a completely different object so as not to change the current object or require it to have to change too much, or should I just add to this object so that I don't have repeat myself or my code at all. As a part of my current job, I have delved into the .net framework, which I NEVER thought I would do and CF was my first and what I always loved and all. It seems like with the way the .net framework is set up, there's an awful lot of repetition. Or it may just be that it's the way the projects I am working on right now have been developed thus far and I'm just not experienced enough in .net to know the difference? At any rate, it's been interesting learning a whole different framework, language, etc., but I definitely like to try to keep up to date with CF, and there's no guarantee I won't be back at it CF'ing again to some point in the future.

Kjell-Åke Gafvelin Mar 13, 2015 at 3:51 AM

4 Comments

What I think is important as well as DRY is Single Responsibility Principle (SRP http://en.wikipedia.org/wiki/Single_responsibility_principle).

Kjell-Åke Gafvelin Mar 13, 2015 at 4:03 AM

4 Comments

** Previous comment was posted by accident **

What I think is important as well as DRY is Single Responsibility Principle (SRP http://en.wikipedia.org/wiki/Single_responsibility_principle). Since CF is so forgiving and allows devs to include all kinds of code within one page you get business logic, data access, rendering as a complete mess!

If you apply SRP you have to think on what is the main purpose of this file. Is it page rendering, is it business logic, is it data access? And once you start to divide the responsibility you will get more maintainable code.

Miguel Ulloa Mar 18, 2015 at 11:09 AM

8 Comments

I am doing some refactoring of some Gateway methods and it reminded me, not so much in my current job but other jobs in the past, I seen these huge generic queries with lots of arguments passed in and conditional logic generating different dynamic queries depending on the parameters passed. They were so hard to wrap your head around and unfortunatelly the guy who made it was an ass, and the SCM's log was no help with comments like "genetic change" you were not able to figure out why the change was put in place. It didn't even stop there were a few of them passed actual table names as parameters themselves so some of the JOINs were dinamically determined at run time. Needless to say that made those particular queries particularly difficult to trace. Of course the creator of the query knew exactly what conditions cause what table or another will be passed but of course there were no comments unti I placed them there.

After looking at the mess I started to pretty format them in a effort to at least make them more readable. That make them grow a lot longer. The longest one was close to 9 pages long. Good luck trying to understand that bad boy.

The lessons I learned from that are that it is OK to have separate fairly similar queries on my Gateway methods. If I am passing extra parameters to modify the query maybe I should ask myself "should I instead create a separate method with its own query?" Some times parameters are OK like on a Search field which if it is passed it can used to further filter the query, but most of the time they are not.

Ben Nadel Mar 18, 2015 at 8:43 PM

16,020 Comments

@Kjell-Åke,

I think the Single Responsibility Principle (SRP) is super interesting in this kind of a context. This rings especially true for me when you think about rendering a User Interface (UI). Consider two different UIs that make use of a method called `.getMostActiveUsers()`. For sake of argument, let's say one is in a public dashboard and the other is in private Admin dashboard.

At first, you probably use the same data-access method in both places because, hey, it's the same data.

Ok, now some time later, the Admins want the admin-facing list to also contain the "last request IP address" in the table. This tiny change fundamentally changes the concept of "most active users". No longer is it one unified concept. Now, you have a "PUBLIC list of most active users" and an "ADMIN list of most active users."

Going back to the SRP, these two different lists have different reasons for changing - one based on needs of the user, one based on needs of the admins. At this point, I would split the original data-access method into two data-access methods, each which can now evolve independently.

This might sound crazy. Maybe you could add a "includeIPAddress" flag in the query. Or, gather the IPs in a different way. But, how long is it before the two different UIs continue to evolve in separate directions? In my experience, interfaces tends to only get more dissimilar over time.

Ben Nadel Mar 18, 2015 at 8:47 PM

16,020 Comments

@Miguel,

Oh man, you should see some of the monstrosities I've created in the past! I'm talking a query of a derived query that consists of like 7 different queries combined with UNION ALL. There's still a few of those beasts in my code. I think one query was -- NO JOKE -- like 900 lines of code. Granted, my style of SQL uses a lot of white-space; but, you can imagine trying to wrap your head around it. A nightmare.

Recently, I complete refactored one of those massive queries to be like 6 different queries (and then I combine the data in the service layer). Now, each query is really small and easy to understand. And, what's more, it's actually faster because I don't have to worry about the penalties of GROUP BY and ORDER BY with large TEXT fields and other problems.

All to say, I feel your pain 100% :D

Ben Nadel Mar 18, 2015 at 8:50 PM

16,020 Comments

@Anna,

I played with .net a bunch of years ago, when I was just getting serious about programming, but don't remember much about it at all. I've been trying to get into Node.js on the server-side. So, it will be interesting to see how I have to shift my thinking in that new context.

Kjell-Åke Gafvelin Mar 20, 2015 at 10:36 AM

4 Comments

You can definitely practice SRP and divide one function into two rather than add a "magic" flag and I strongly suggest that you do. And by doing it like that you will actually get more efficient code re-use than by adding a "magic" flag to a function.

Imagine this kinda pseudo-code:

public Customer[] getCustomers(string name, numeric id) {

... Accquire db Connection...

if (name NEQ "") {
result = ...get customers from db by name...
} else if (id NEQ 0) {
result = .. get customers from db by id...
}

... Close Connection to db...

... map result to Array of customer objects and return....

}

and replace it with this:

public Customer[] getCustomersById(numeric id) {

... Accquire db Connection...

result = .. get customers from db by id...

... Close Connection to db...

return mapResultToCustomer(result);
}

public Customer[] getCustomersByName(string name) {

... Accquire db Connection...

result = ...get customers from db by name...

... Close Connection to db...

return mapResultToCustomer(result);
}

private Customer[] mapResultToCustomer(Array result) {

... map result to Array of customer objects ....

}

This way you get a more clear and concise interface that explains its intents and is less error prone. Maintenance is a lot easier as you will understand what

customers = getCustomersByName("Joe Blow");

rather than

customers = getCustomers(0, "Joe Blow");

If you have, for example, a function that gets data from db, creates and excel spreadsheet and saves the file to disk and is pain in the rear to maintain and understand due to the size of it. By applying SRP to this you could easily make this into four functions where three of them are for getting data, create spreadsheet from data and save the spreadsheet as a file. The fourth function is for combining the other three and becomes then a business logic function.

Anna Apr 10, 2015 at 3:15 PM

383 Comments

Node.js hmmm. . .I had to look that one up! :) I always think that's a good thing when I am reading a technical document and come across something I have to look up. I love expanding my knowledge about various technical methodologies. I do remember reading some posts you had written in the past pertaining to Node.js, but when you program every day and in a different language than the ones you've read about, they all run together a little bit sometimes. From what I could see, I liked it and it did look a little like c# programs I am writing now.

One of the greatest assets to me right now in "learning" this "new" language (and assets is probably not the right word, but after programming in C# all day, my brain is kind of fried right now on coming up with the most brilliant and accurate words, so I'll have to use asset until I can remember which word is more accurate) is the fact that I "almost" have a computer science degree, but studied computer science a lot, and still have an Applications Development Associate's Degree, so I studied Java a lot and loved it and learned quite a bit of it and remember it. And C# does in fact remind me a lot of Java. As well, I also studied Visual Basic, so when doing VB.net on the rare occasion, I'm also able to hang there at least sufficiently. But it is hard "learning" a new language at this "old" age. lol. Especially when I had dedicated so many years to ColdFusion. I think, from my experience, most people who have studied Java or know it at least a little bit would be able to do OK at least working a C# job. That's my $.02, but at this point, it's kind of more like $2.00. C# isn't exactly like Java, but I can see that it's a lot like it, and having known Java helps me a lot.

Ben Nadel Jun 4, 2015 at 8:33 AM

16,020 Comments

@Kjell-Åke,

I completely agree with what you're saying. And, to touch on one of your last points - having a fourth method that pulls the other three methods together, this is something that I've really been thinking about from a general application architecture standpoint. As I maintain an application, it becomes clear where things weren't architected properly because there is much duplication. Imagine a "workflow" function Foo() that does (pseudo-code):

Foo() {
assertPermissionsToExecute();
var x = doThis();
var y = doThat();
var z = compute( x, y );
return( z );
}

The way much of my application works now is that three of those methods - doThis(), doThat(), compute() - ALL check for security access. Because they do too much work! Really, the security should be pulled out and UP in the call stack to make the lower-level functions more single-responsibility. But, they weren't designed well to begin with.

All to say, when you start to see duplication in your method composition, it's probably a good indicator that you original design is trying to do too much.

Oh my chickens, this post is old!

Hit me up on Twitter if you want to discuss it further.