OOPhoto - If Object.Validate(), Why Not Object.Save()?
Posted August 18, 2008 at 10:15 AM by Ben Nadel
In OOPhoto, my latest attempt at learning object oriented programming in ColdFusion, I call Validate() on my objects. This, in turn, calls Validate() on the appropriate Service object and passes itself (the original bean) in as one of the arguments. Validate() is a function that decidedly must know more about the "world" than the object at hand. Therefore, there are parts of validation that must happen outside of the target object. And yet, I have allowed myself to call Validate() on the bean itself as a matter of convenience.
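A minimal sketch of that arrangement, written in Python rather than ColdFusion for brevity. The names PhotoService and Photo and the title rules are hypothetical illustrations, not code from OOPhoto; the only point is that the bean's validate() hands the bean to the service.

```python
class PhotoService:
    """Hypothetical service that knows the rules of the wider 'world'."""

    def __init__(self, existing_titles):
        self.existing_titles = set(existing_titles)

    def validate(self, photo):
        # Return a list of error messages; an empty list means valid.
        errors = []
        if not photo.title:
            errors.append("title is required")
        elif photo.title in self.existing_titles:
            errors.append("title is already in use")
        return errors


class Photo:
    """Bean: holds its own data plus a reference to its service."""

    def __init__(self, title, service):
        self.title = title
        self._service = service

    def validate(self):
        # Convenience method: the bean passes itself to the service.
        return self._service.validate(self)


service = PhotoService(existing_titles=["Sunset"])
print(Photo("Sunset", service).validate())   # ['title is already in use']
print(Photo("Sunrise", service).validate())  # []
```

The caller only ever sees photo.validate(); the knowledge about other photos stays in the service.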
Regardless of the implementation (the indirect call to the Service layer behind the scenes), from the API's viewpoint, it appears that the Bean must know more about the world than just its own data. So, this begs the question: if I do this for the Validate() method, why not for the Save() method as well? Save() is an action that requires not only information about the object but also about the implementation of data persistence in the application. As such, Save() is not something that the object does or should need to know about. Right? I mean the object just needs to know about its own data; persistence should be the responsibility of a different "aspect" of the model.
So, to reiterate, if both the Validate() and Save() aspects of the model need to know about more than just the given bean, why is it that I put Validate() in the bean and leave Save() in the service layer?
The quick answer is, "I don't know."
This is a perfect example of uninformed decision making - of ignorance. I made an architectural decision in my application and I can't explain why I did it. I have brought much shame on family.
I think what this really exemplifies is that I don't understand the principles at play. I believe that what I did was put a Validate() method in the bean to make my objects appear "smart" because, hey, aren't objects in true object oriented programming supposed to be smart and idealized? Perhaps I hoped that if my objects looked enough like OOP-style objects, then maybe my application would be an OOP application.
But the fact is, I can't have it both ways. I can't mix and match the placement of my functionality. I need consistency. I crave it and the sense of order that it provides. I need to either have all of my "smart" methods in the service layer or have them all in the beans themselves. But which is it? If I put them all in the object, then my programming will be shorter and easier to write. But, I don't want my decisions to be based on laziness - if I have to get a reference to the Service layer whenever I want to do something "smart," then so be it. If I put all of my "smart" methods in the Service layer, then my programming will be more verbose, but there will be a much better separation of concerns.
Is it 6 of one, half a dozen of the other? I think both approaches have object oriented programming principles behind them. With smart methods in the Bean, my objects appear smarter and more idealized, a clear aspect of OOP. However, with the smart methods in the Service layer, I am dealing more with a processing-based mentality in which I ask objects to "process" the beans, another aspect of object oriented programming.
You may think I am making too big a deal of such a small decision, but really it is quite appropriate. What we are discussing here is a core architectural question that will influence every object that gets created going forward. I will need to go back and think about this carefully.
"You may think I am making too big a deal of such a small decision, but really it is quite appropriate."
I've read, listened to, and talked to more than a few very skilled application architects that would disagree with the notion that this is making a lot out of a small decision.
Or that this is a small decision.
I would consider the Wikipedia article on "syntax sugar"
I tend to look for solutions that generally follow that model of building a quick "high level" method, like obj.save() through the use of "intelligent defaults" with "low level" methods underneath. The idea is to provide methods of writing concise code without giving up any of the flexibility offered by the verbose alternative.
I struggled with these very same issues in my journey into OOP and the answer is (of course) "it depends." The object.save() idiom is popular in other languages (see "active record pattern"). For CRUD apps (which most essentially are), it's probably fine. The use of a service layer is a "J2EE enterprise" pattern, and a lot of communities outside of Java and CF get on fine without them apparently. Are they "less OO" for it? I don't think so (but I'm no expert). Check out this comment by Elliott Sprehn on Brian Kotek's blog:
Your gut tells you to put the Save function in a service object, not your bean object because you know that you may save the same data in multiple ways (xml, database, rss), and you may want to change how the data is saved at run-time (eg. in response to user preferences).
I'd suggest treating your validate function in the same way:
- you may want to validate the data in multiple ways, against different rules depending on what you are currently doing with the data (e.g. allowed characters for a specific database, vs. correct numeric precision to meet a business need).
- you may want to reuse the rules for different data sets
- you may want to change which rules are applied at run-time. (Strategy pattern)
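As a rough sketch of those three points (in Python rather than ColdFusion, with hypothetical rule and class names), each rule can be a small function and the validator simply applies whichever rules it currently holds:

```python
def allowed_characters(value):
    # Rule for a storage constraint: only ASCII letters, digits and spaces.
    ok = all(ch.isascii() and (ch.isalnum() or ch == " ") for ch in value)
    return [] if ok else ["contains characters the data store cannot hold"]


def max_two_decimals(value):
    # Rule for a business constraint: at most two digits after the point.
    _, _, fraction = value.partition(".")
    return [] if len(fraction) <= 2 else ["too many decimal places"]


class Validator:
    """Applies whichever rules it was given; rules can be swapped at run-time."""

    def __init__(self, rules):
        self.rules = list(rules)

    def validate(self, value):
        errors = []
        for rule in self.rules:
            errors.extend(rule(value))
        return errors


storage_check = Validator([allowed_characters])
business_check = Validator([max_two_decimals])

print(storage_check.validate("Bob <b>"))   # ['contains characters the data store cannot hold']
print(business_check.validate("12.345"))   # ['too many decimal places']

# Changing the strategy at run-time is just a reassignment.
business_check.rules = []
print(business_check.validate("12.345"))   # []
```

The same rule functions can be reused across data sets because they know nothing about the object that owns the value.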
Ben, just because you have a convenience method where you can ask an object to persist itself doesn't mean that the object "knows" about how to persist itself. What you have here is two ways to do the same thing:
user.save()
gateway.save(user)
Because internally the user will be calling other objects (like a service or an ORM) to actually perform the save, these are virtually identical. The main benefit to being able to ask the object to persist itself is the level of control you have. If the object itself has control when you call user.save(), it means you can have your own logic there if necessary. This can be harder to do if you go the gateway.save(user) route. So I prefer to be able to call save() on the object. Not because I think one syntax is really superior on its own, but just because of the level of control it offers.
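A small sketch of the two call styles, in Python rather than ColdFusion. UserGateway, its in-memory list, and the name-trimming step are assumptions made up for illustration; the trimming just stands in for "your own logic" inside save().

```python
class UserGateway:
    """Hypothetical gateway; a list stands in for real storage."""

    def __init__(self):
        self.saved = []

    def save(self, user):
        self.saved.append(user.name)


class User:
    def __init__(self, name, gateway):
        self.name = name
        self._gateway = gateway

    def save(self):
        # The object controls the call, so it can add its own logic first.
        self.name = self.name.strip()
        self._gateway.save(self)


gateway = UserGateway()
user = User("  Bob ", gateway)

user.save()          # ask the object to persist itself
gateway.save(user)   # or hand the object to the gateway directly
print(gateway.saved)  # ['Bob', 'Bob']
```

Both calls end up in the same gateway method; only the first gives the object a chance to intervene.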
And again, none of this is really going against OO theory. Objects are indeed supposed to be idealized versions of things. What is wrong with asking an object to persist itself? The object doesn't know HOW or WHERE it is persisting itself. It could be an XML file, a database, a remote web service call, a RAM disk....who cares? Certainly not the object.
It's the same if you were to ask an object to render itself. The object itself doesn't know about HTML. It just knows that something configured it with a renderFactory or something, and that it will pass itself into that to actually do the rendering.
"Bob, here's your address. Is it valid?"
"Bob, persist yourself somewhere!"
"Bob, render a summary of yourself."
Nothing wrong here! :-)
I guess, right now, I am feeling that they are more or less the same. I still feel a little awkward about the object.Save() mentality. To me, that is like doing something like:
Problem.Solve()
How can a problem solve itself? Yes, it knows all about itself, but solving? Something about that "feels" like it required a greater entity.
That being said, I guess, right now, what I'm really trying to do is just decide on which side of the fence I want to fall. What I do know, from a comfort level, is that I need to be consistent. I either go all one way or all the other way.
While it makes me feel funny, I am leaning more towards the object.Save() methodology. I know you are doing it for "control" purposes. I wish that my motives were more pure like that; I am leaning that way simply because it provides a shorter syntax which I believe will make the code easier to write as well as easier to read.
What you are saying is valid. I feel the same way. However, I think a lot of that can be accomplished via dependency injection and factory creation to work with a different syntax.
Mostly, I just need to pick a strategy and go with it in a consistent manner.
As I'm looking at what you're saying, I see your point about Problem.Solve(). But isn't that typically because you *think* about problems? Typically, you have to apply outside information (how to detect relevant information, formulas to use, etc.) for that to happen. In my mind, Object.Save() is completely internal - it doesn't have to have any information about how to go about saving; it's just saying "Okay, put myself into some sort of box here."
(I'm now looking at this and asking if it makes sense - I'm not quite sure. What's your opinion?)
I know that I personally am a fan of the object knowing how to persist itself in most situations, though I'm sure that there can be other situations where having a service object persist another object would be preferable. That being said, I tend to have my objects understand how to persist themselves to the application's storage medium.
As for the comment about "database, xml, rss", in this case I call that transformation. All of my objects support reflection, enabling me to transform a given object, and its state, at any time necessary.
Think about the problem of purchasing food at a grocery store. Let's say I grab a carton of Ben & Jerry's Phish Food. But then I stop and think to myself, "Hey, do I already have a tub of this at home?" "Should I be buying another"? This is "validation" logic for the purchase. Now, should the tub of ice cream at the store be responsible for knowing whether I already have one at home?
My gut says No. It is the responsibility of a greater entity (me, the service) to validate the decision to purchase the ice cream at the store. The ice cream at the store just has to know how to sit there and chill ;)
Saving, to me, seems to go along the same lines. To get it home, first I persist it into a paper bag. Then I persist it into my car. Then I persist it to my table counter. Then I persist it to my freezer. Should the ice cream know about that process?
Now, I am putting those save() and validate() methods in the object, but not because I think the object really knows about how to do that. In fact, I *want* the object to be dumb. I want it to turn around and say, "I have no idea what you are asking, I better ask my 'service' object what to do."
So, while I put the methods in the objects, my gut tells me that the service layer should know how to actually implement the requests.
Okay, that makes sense, I think. I'm getting the picture of an object kind of floating along, chanting, "I'm chocolate ice cream, I'm chocolate ice cream!" All the while, there's a Service Monster hanging over it, saying "Feed Me!" and waiting for the object to give it chocolate ice cream. :-)
But by the same token that the object knows all about itself, doesn't it also have to know that it can be persisted/validated/saved?
Well, I guess you could say that since the Business object knows to call the Service object to have itself saved or persisted, then Yes, it does need to have a sense that those things are possible.
I suppose you could have the Business object send any "unhandled" messages up to its service (via OnMissingMethod()), but that just seems too vague.
So, yes, the business object does need to know that those actions are possible. But, it doesn't necessarily need to know the implementations - just where to send itself.
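For comparison, here is roughly what the "forward anything unhandled" idea looks like, sketched in Python, where `__getattr__` plays a role similar to OnMissingMethod(). The Account and AccountService names are hypothetical.

```python
class AccountService:
    """Hypothetical service that implements the 'smart' operations."""

    def save(self, bean):
        return f"saved {bean.name}"

    def validate(self, bean):
        return bool(bean.name)


class Account:
    def __init__(self, name, service):
        self.name = name
        self._service = service

    def __getattr__(self, method_name):
        # Only called when normal lookup fails, much like OnMissingMethod().
        if method_name.startswith("_"):
            raise AttributeError(method_name)
        target = getattr(self._service, method_name)
        # Forward the call, passing this object as the first argument.
        return lambda *args, **kwargs: target(self, *args, **kwargs)


account = Account("Bob", AccountService())
print(account.save())      # saved Bob
print(account.validate())  # True
```

The vagueness shows up quickly: nothing in Account says which operations exist, and a misspelled call such as account.sav() is only rejected once the lookup on the service fails.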
Ben when you say "Hey, do I already have a tub of this at home?" "Should I be buying another", I think you're asking the wrong question, because in that case you're talking about two different instances of two different objects (homeIceCream vs. storeIceCream).
With a User, we're talking about the same instance (let's pretend for a moment that the objects exist in memory forever). So if I say "Bob do you contain valid state?", that is a valid question and I think it's perfectly fine to expect Bob to be able to answer that question.
Back to the ice cream example, even there I don't think what you're talking about is validation, but rather exposing object behavior. If I have a Product object and I say "Hey Book, has the user already purchased a copy of you?", again I think an idealized Book would be able to answer that question! It definitely would mean that internally the Book would probably go ask a Service to check for an existing purchase from that user, etc. But the key here is just adding useful behavior to the object. If it makes things easier to be able to ask a Book if the user already bought a copy, then by all means, make it so! book.alreadyPurchased() is a perfectly acceptable thing to ask a book if your application benefits! :-)
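A sketch of that book.alreadyPurchased() idea, in Python rather than ColdFusion. PurchaseService and its set of (user, ISBN) pairs are assumptions standing in for whatever the real application would query.

```python
class PurchaseService:
    """Hypothetical service; a set of (user, isbn) pairs stands in for a database."""

    def __init__(self, purchases):
        self._purchases = set(purchases)

    def has_purchased(self, user_id, isbn):
        return (user_id, isbn) in self._purchases


class Book:
    def __init__(self, isbn, purchase_service):
        self.isbn = isbn
        self._purchases = purchase_service

    def already_purchased(self, user_id):
        # The Book answers the question by asking the service behind the scenes.
        return self._purchases.has_purchased(user_id, self.isbn)


purchases = PurchaseService([("bob", "isbn-1")])
book = Book("isbn-1", purchases)
print(book.already_purchased("bob"))    # True
print(book.already_purchased("alice"))  # False
```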
I guess the real question is how idealized is "idealized." I believe the hard part about this question is that we are asked to make judgment calls about the benefits of something that cannot exist in the real world. As such, not only are we guessing at what idealized is, we are basing that on the assumption that our ideas are beneficial.
I think THAT is the biggest principle that I am missing! I don't have a solid understanding of what it actually means to be idealized.
Hmmmmmm. Sounds like a whole other blog post....
That's true. Idealized really equates to usefulness. Consider:
Should I be able to ask a user object if their user name is already in use? I would say it's certainly fine.
I heard this one somewhere and thought it was very interesting. Consider a ClassRoom object and a Student object. In the real world, if the Students don't know where their next class is, some other object, maybe a Teacher or a SchoolMap object, would have to be asked about directions to the next class for every Student to tell them where to go. This would be a procedural approach. But in OO land, there's no problem saying student.goToNextClass() and have the Student be able to determine themselves how to do that.
This is done in real life all the time. As I understand it, places like UPS have very robust object models where a Package determines the best route for it to take to get to a customer. This involves all sorts of game theory and decision trees, where things like available space, package shipping level, cost, mode of transportation, etc. all play into the determination. Each package essentially "bids" on ways to get where it needs to go, and whatever route is the cheapest and meets the needs of that package wins. Clearly in real life a Package can't do any of this, but we're talking about the ideal Package. It doesn't get much better than saying "Package, figure out the best way to get where you need to go and then go there"!
I'm certainly not an expert - but I'd say that an "idealized" object is one that simply does what we need it to do. (which may vary between applications) Which means that, provided we've thought hard enough about what the objects really need to do, we're not "guessing" about what an idealized object is - we've validated our understanding of the object's functionality enough that we don't have to assume that our ideas are beneficial, we know they are.
A book can't, in the real world, know that you already have a copy of it at home. And, for some applications (a bookstore inventory program, for example), an idealized Book (a Book object) wouldn't need to know that. But for another program, the ideal Book might be one that knew whether you had a copy of it or not. Perhaps the difficulty is in the belief that you can have an objectively idealized Book, rather than a Book that is idealized for a given purpose?
Just random thoughts.
I think Brian's point Bob is basically correct (even though that's not how I currently program). Basically, as far as objects go, we can pretend that they have an inkling of an idea that they can be rendered, validated, and persisted.
Sure, they don't know anything about HOW they get rendered, validated, or persisted. But when they were created, they were handed a persistence service object (or maybe, later on, a new persistence object was handed to them), and they just save themselves using that service layer. You swap out these services based on the strategy pattern (or at least, the way I understand the strategy pattern) as needed.
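Something like the following, as a sketch in Python rather than ColdFusion. The two persistence classes are hypothetical and return strings instead of touching a real database or file, so the swap is easy to see.

```python
class DatabasePersistence:
    def save(self, bean):
        return f"INSERT {bean.name}"


class XmlPersistence:
    def save(self, bean):
        return f"<user>{bean.name}</user>"


class User:
    def __init__(self, name, persistence):
        self.name = name
        self.persistence = persistence  # handed in at creation

    def save(self):
        # The bean does not know how or where; it only delegates.
        return self.persistence.save(self)


bob = User("Bob", DatabasePersistence())
first = bob.save()
bob.persistence = XmlPersistence()   # a new service handed in later
second = bob.save()
print(first)   # INSERT Bob
print(second)  # <user>Bob</user>
```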
I remember this was a discussion Hal Helms was having a long time ago, about whether an object could display itself. It's something he wanted to try to do, but felt that conventional OO logic was against him.
One of the questions we also have to ask, is -- does choice of language matter? Is it easier in one language to pass the object off to a service layer vs. having an internal persistence call, or does it not really make any difference in terms of construction?
I'm interested to see the ongoing conversation surrounding this.
Uh, guess that should be "Brian's point about Bob". Sorry.
Ben, I think that maybe you're asking two different questions, and rolling them into one is getting confusing.
The first question is, "What do I want my object to be able to do?" Technically, that corresponds to, "What do I want in my object's API?"
It sounds like you'd like to be able to ask an object to save itself, by calling myObject.save(), so save() should be part of the object's API.
The second question is, "How should that behaviour be implemented?", meaning exactly where should the code to do the save be located. You could choose to write that code in the object itself, allowing it to talk to an ORM for example, or you could implement it elsewhere (e.g., in a gateway) and then inject the gateway into the object. The choice is up to you.
So really, those two questions can be asked and answered independently, giving you your cake and allowing you to eat it too. At least that's how it seems to me.
The object doesn't have to know anything about saving itself - it just has to know exactly where to find that information, or, where to pass itself to complete the requested operation.
I've found a good way to handle this is to pass in, via the object service, a pointer to your DAO to every instance of the object, and set that pointer as an instance variable. So in the object bean, you might have something like this:
<cfargument name="dao" type="any" required="true" hint="Reference to this object's DAO" />
<cfset variables.dao = arguments.dao />
And then...when the save method is called, the object passes itself to the dao:
<cfset variables.dao.save( this ) />
I wasn't going to post anything else here after having made myself look like an ass before. :P
But in retrospect I will add a couple more comments. Brian's description of the idealized package object at UPS is the underlying motivation behind my creation of the CacheBox project, which I've only barely started working on (none of the code works yet). I wanted to create a system by which you could create a naive CacheBox which can internally determine whether or not various caching services are available and what its own best strategy for caching is, independent of the rest of your system. So that while you might initialize it with an "occupancy" (limit of the number of objects in cache), you're not spending any time working on configuring or developing the service layer that performs the actual caching. You just say "here's my cache -- cache, store this x" and the CacheBox then determines the best available strategy (which may even change periodically throughout the day based on usage trends and available resources). I would love to get some more folks involved in that project as well to help me figure out methods for strategy selection, resource analysis, etc. :)
I will however also throw out another comment in the "this may sound egotistical" category with regard to persistence. While it's all fine and good to say that you want the flexibility and separation of concerns to be able to save any given object to XML or various other places that are *not* in a database, any of those things (yes, even XML) are very obscure edge-cases and easily dealt with on the rare occasion that they crop up with any number of various strategies from "okay, I usually call object.save(), but I'll call xmlPersistenceLayer.save(object) here instead", to inheritance (com.myCompany.xmlPersistedX -> extends com.myCompany.X) to a decorator which gets wrapped around the original object and replaces the original persistence methods. The latter two solutions even allow you to reuse existing integration code.
Which is why in my mind, I don't see an advantage in terms of "separation of concerns" to avoiding object.save(), because well over 99% of the time the object is going to be saved to a database.
Avoiding it simply because you *might* (but in all probability won't), means you're creating more work for yourself simply because there's a remote possibility that you might actually need to do more work. So the options are "more work all the time" or "more work on rare occasions". I opt for "more work on rare occasions" personally... ymmv of course. :)
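The decorator option mentioned above might look something like this, sketched in Python rather than ColdFusion. Invoice and XmlPersisted are hypothetical names, and the save() methods return strings rather than writing anywhere.

```python
class Invoice:
    """Object that normally saves itself to the database."""

    def __init__(self, number):
        self.number = number

    def total_label(self):
        return f"Invoice {self.number}"

    def save(self):
        return f"db:{self.number}"


class XmlPersisted:
    """Decorator: wraps an object and replaces only its save() method."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Everything except save() falls through to the wrapped object.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._wrapped, name)

    def save(self):
        return f"<invoice number='{self._wrapped.number}'/>"


plain = Invoice(42)
wrapped = XmlPersisted(plain)
print(plain.save())           # db:42
print(wrapped.save())         # <invoice number='42'/>
print(wrapped.total_label())  # Invoice 42
```

Code that already calls save() on the object keeps working unchanged, which is the reuse of existing integration code described above.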
And speaking of alternatives... creating an XMLFacadeAgent.cfc for the DataFaucet ORM which would behave like a SQL-Agent but interact with an XML file under the hood is another in a long string of cool projects I don't have the time to work on right now. Would love to see someone else give it a shot tho! :)
Those UPS package objects sound meatier than my entire application :) This has given me much to think about.
Also, a very good point. To think that, in the middle of trying to find "one" answer of idealization, I am faced with the possibility that ideal is context-sensitive. Uggg :)
You are correct in that the API and the implementation are two different questions. However, my real question now is not what I want my objects to do.... the question - the understanding - that I need to obtain is what they should do! A slight, but important difference. Wanting implies understanding, wondering implies confusion (which is what I have).
This has turned out pretty long, so I'll start with my conclusion:
1. save() does not belong on the bean
2. validate() does belong on the bean - but not as much of it as you think.
OK, my reasoning:
I don't think bean.validate() needs to know about more than just the bean. Things like uniqueness checks belong to the collection(s) to which you are trying to add the bean. A bean can be OK to add to one collection and at the same time not OK to add to a different collection. In some circumstances for "collection" you can read "database table".
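A sketch of that split, in Python rather than ColdFusion, with hypothetical class names. The bean checks only its own data, and each collection decides for itself whether the bean may be added.

```python
class User:
    def __init__(self, username):
        self.username = username

    def validate(self):
        # Only checks the bean's own data.
        return [] if self.username.strip() else ["username is required"]


class DuplicateUsername(Exception):
    pass


class UserCollection:
    """Stands in for a database table; uniqueness is its concern."""

    def __init__(self):
        self._by_name = {}

    def add(self, user):
        if user.username in self._by_name:
            raise DuplicateUsername(user.username)
        self._by_name[user.username] = user


staff = UserCollection()
customers = UserCollection()
staff.add(User("bob"))

bob = User("bob")
print(bob.validate())   # [] : valid on its own terms
customers.add(bob)      # fine: no "bob" among customers yet
# staff.add(bob) would raise DuplicateUsername
```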
WRT save(), I like to think of persistence scopes as being in the spectrum of variable scopes. We're familiar with function local, source file, package, request, application etc. scopes. A datasource is just another scope. So, if I have a local variable in a function, and I want it to go into the Request scope, in general I won't do that inside the function - I'll return the value and the caller can then put the result where they want it. In the same way, I have whoever owns the reference to my bean decide where to put it, perhaps into a persistent scope by saving it.
As an exercise, rename your bean.save() function to bean.addToPersistentScope() and see how you like it. Maybe add bean.addToRequestScope() and bean.addToApplicationScope(). Not so pretty IMHO.
An extension to this thought is transitive persistence. If I put a value into a struct that is in the Application scope, that value is then also in the Application scope. I don't need to explicitly assign the value to the Application scope, in fact I don't even need to know that the struct is in the Application scope.
In the same way, if I add a bean to a persistent collection, or set a bean as a property of a persistent parent, then my bean should also be persisted. I shouldn't need to explicitly save it. So the only beans that ever get explicitly saved are what the domain-driven design folk would call an aggregate root - some top level bean that has no parent.
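A hand-rolled sketch of that idea, in Python rather than ColdFusion. Store, Gallery and Photo are hypothetical, and a list stands in for the database; a real ORM would do the walking for you.

```python
class Store:
    """Hypothetical persistent scope; a list records what was written."""

    def __init__(self):
        self.written = []

    def save(self, root):
        # Saving the aggregate root walks its children too.
        for bean in root.walk():
            self.written.append(bean.label)


class Photo:
    def __init__(self, label):
        self.label = label

    def walk(self):
        yield self


class Gallery:
    """Aggregate root: the only bean that is saved explicitly."""

    def __init__(self, label):
        self.label = label
        self.photos = []

    def walk(self):
        yield self
        for photo in self.photos:
            yield from photo.walk()


gallery = Gallery("Holiday")
gallery.photos.append(Photo("Beach"))   # no explicit save of the photo
store = Store()
store.save(gallery)
print(store.written)   # ['Holiday', 'Beach']
```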
It takes a pretty serious ORM framework to do all of this seamlessly. Transfer isn't there yet, Hibernate is further along but still not quite there. In the meantime we have to add some of the plumbing ourselves. I just keep that end goal in sight, and don't expect the plumbing to look like anything other than plumbing. But right now that end goal tells me:
1. save() does not belong on the bean
2. validate() does belong on the bean - *and* on the collection, *and* a couple of other places that I won't go into again.
I'm sorry - I think your discussion is way too advanced for my low-level understanding.
I think part of my confusion comes from you referring to different types of persistence. On one hand, we have the APPLICATION scope. On the other hand we have the database. You are referring to both of these as persistence (which is accurate). However, the database is not a "live" scope like APPLICATION is. It's not plugged into the ColdFusion framework. I can't just refer to Database.[table].RowNumber or something to reference a record.
As such, I don't think that they can be treated in the same way. Something, somewhere needs to explicitly save the data back into the database. After each query, that relationship dies. You need something to explicitly save it again.
Plus, what if you want to make changes to the object before you save it to the database? If you have this automatic DB persistence, how can you ensure that one user doesn't set values that corrupt the use of another user? After all, there might be some intermediary validation that needs to take place before the data is actually committed.
Anyway, thanks for sharing your thoughts, but it pretty much went over my head :)
One of the problems with these types of questions is precisely that the answer changes based on the context. A user object should do what that specific application needs it to do. A different application will have different needs.
Should you be able to ask a persistent object to save itself? Much as I don't like the Active Record pattern, it does have the convenience that once you have a business object, you can do whatever you need with it and then tell it to save changes without needing to reach back to the (user) service to handle the save operation. That becomes useful when you have code that asks a (user) service for a user object and then passes that off to another service (S) which may or may not perform some operations on that user. If save() is only on the user service, you have to tie your S service to that user service - which increases dependencies.
I view validation in much the same way: I'd rather just go to the object and ask it "Are you valid?" and not have to care about where the actual validation logic really lives.
Going back to the ice cream example, the model that springs to mind is shopper.shouldPurchase(product) and the shopper object is responsible for deciding whether to purchase the product or not. However, there is no one true way to approach this problem - it all depends on what elements are important to model in your scenario and what behaviors are important.
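As a tiny sketch of that model (in Python rather than ColdFusion, with hypothetical classes), the knowledge of what is already at home lives with the shopper, not with the product:

```python
class Product:
    def __init__(self, name):
        self.name = name


class Shopper:
    def __init__(self, at_home):
        self.at_home = set(at_home)   # names of products already owned

    def should_purchase(self, product):
        # The shopper, not the product, knows what is in the freezer.
        return product.name not in self.at_home


shopper = Shopper(at_home=["Phish Food"])
print(shopper.should_purchase(Product("Phish Food")))  # False
print(shopper.should_purchase(Product("Vanilla")))     # True
```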
Some more thoughts on what it means for an object to be "Ideal". While I know the implementation of this will change from situation to situation, I am trying to explore the principles behind such decisions: