OOP Philosophy: Invalid State vs. Invalid Method Call
Posted March 23, 2009 at 9:15 PM by Ben Nadel
Today, on a conference call with Hal Helms, we started to talk about domain modeling and which objects in the application would be responsible for what logic. The example we had on hand was that of an account transfer. We decided that it would be good to model the Transfer itself as a domain object that might have the following subset of class properties:
- OriginatingAccount (Account)
- TargetAccount (Account)
- Amount (Money)
One of the benefits of having a Transfer class was that you could subclass it for specialization. For example, if your application would only allow for a given account to have 4 transfers a month, then we could have some sort of StandardTransfer class that extends the Transfer base class and has logic to allow for only 4 transfers per month.
I have gotten into the mentality of thinking about classes as data types and with that, I have accepted the idea that a data type can only exist in a valid state (otherwise, it cannot uphold the contract of the given data type). As such, I asked Hal if this subclass, StandardTransfer, should throw an exception in the constructor (Init() method) if the given OriginatingAccount has already executed four transfers in the current month. To me, it seems that in order for the StandardTransfer to be in a "valid state", it would need to contain an originating account property that could perform the transfer. If the OriginatingAccount could not perform the transfer, then it would seem to me that the StandardTransfer would not be in a valid state.
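To make that position concrete, here is a minimal sketch (in Python rather than ColdFusion) of a constructor that refuses to build the object once the limit is reached. The Account fields, the integer amount standing in for Money, and the TransferLimitExceeded exception are assumptions for illustration only.

```python
from dataclasses import dataclass


class TransferLimitExceeded(Exception):
    """Raised when the originating account has no transfers left this month."""


@dataclass
class Account:
    # Hypothetical stand-in: only tracks transfers made in the current month.
    number: str
    transfers_this_month: int = 0


class StandardTransfer:
    MONTHLY_LIMIT = 4  # the 4-transfers-per-month rule from the example

    def __init__(self, originating: Account, target: Account, amount: int):
        # "Valid state" here includes the ability to perform the transfer,
        # so construction itself fails once the limit has been reached.
        if originating.transfers_this_month >= self.MONTHLY_LIMIT:
            raise TransferLimitExceeded(originating.number)
        self.originating = originating
        self.target = target
        self.amount = amount
```

Under this reading, a StandardTransfer that exists is always one that may be executed.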
Hal disagreed with this. He believed that the validity of the class was defined by its data types and not their capabilities. Meaning, as long as the subclass was passed two Account classes and a Money class, then it was in a valid state. What he argued would be "invalid" would be the execution of the Transfer's Execute() method. The Execute() method would throw an exception if called. The thinking here was that the StandardTransfer class would have some sort of IsValidTransfer() method that would check the transfer properties in the context of the 4-transfer-limit business logic. To say it another way, the object was valid, but some of its method executions would not be valid.
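A rough sketch of Hal's version, again in Python rather than ColdFusion: the constructor checks only data types, and the limit is enforced when Execute() is called. The balance field, the integer amount standing in for Money, and the TransferNotAllowed exception are assumed names, not part of the original discussion.

```python
class TransferNotAllowed(Exception):
    """Raised when execute() is called on a transfer that is not permitted."""


class Account:
    def __init__(self, number, balance=0, transfers_this_month=0):
        self.number = number
        self.balance = balance
        self.transfers_this_month = transfers_this_month


class StandardTransfer:
    MONTHLY_LIMIT = 4

    def __init__(self, originating, target, amount):
        # Construction checks data types only; business rules are not consulted.
        if not isinstance(originating, Account) or not isinstance(target, Account):
            raise TypeError("originating and target must be Account instances")
        if not isinstance(amount, int):
            raise TypeError("amount must be an int (stand-in for Money)")
        self.originating = originating
        self.target = target
        self.amount = amount

    def is_valid_transfer(self):
        # The 4-transfer-limit rule lives here, not in the constructor.
        return self.originating.transfers_this_month < self.MONTHLY_LIMIT

    def execute(self):
        if not self.is_valid_transfer():
            raise TransferNotAllowed("monthly transfer limit reached")
        self.originating.balance -= self.amount
        self.target.balance += self.amount
        self.originating.transfers_this_month += 1
```

The object can always be created from well-typed arguments; only the operation can be refused.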
I have to say that this rocked my world in a huge way and I'm having trouble reconciling prior Hal Helms teachings with this concept.
I think this really took the legs out from under what I considered a "valid state" object. To me, the validity of an object was modeled as the combination of each data type and its meaning within the application. For example, in order for an "Account" class to be in a valid state, it would need to have an "account number" string property AND that account number property could not be the empty string (for example).
But if we take what we see in the Transfer / StandardTransfer example, and apply it to the smaller example of the Account object, what we see is that an object's validity is a function of its composed data types and NOT what the values of those data types actually are. So, if you can create an object whose data types are valid, but whose composed values are NOT valid in the context of the application, then somebody, somewhere needs to know if the object is valid from a contextual standpoint.
At this point, you might be asking yourself where I am going with this line of reasoning. Really, what I'm having trouble reconciling is Hal's belief that you should never call IsValid() (or Validate()) on an object (a point that was made very clear in my first Real World OO class). His reasoning behind this was that an object should never be able to exist in an invalid state, so calling Object.IsValid() would be a non sequitur. However, if the "valid state" that he is referring to is a function of data type, not data value, then validity, in the way that we generally think about it - in the context of the application - is completely unrelated to the concept of object validity.
As such, either the service class that is creating and populating objects needs to have some sort of IsValid() method call (to which we would pass an object instance for business-logic validity) or, the object itself needs to have an IsValid() method that checks business-logic validity.
So, which one is it?
If we have an IsValid() method on the object, then this really goes directly against what Hal taught me previously?
If we have an IsValid() method on the service class, then we start to create "bloated" service layers and anemic domain objects?
(I use question marks because I am not sure if this logic is sound)
The other option we have is to put business-logic validation in the Controller. But, if we do that, then we lose out on code reuse, forcing us to duplicate validation logic anywhere the Controller needs to create a given object.
And so it is that a seemingly simple conference call has inadvertently rocked my world and turned my mental model of Object Oriented Programming on its head.
Why wouldn't you extend the Account class with a TransferAmountTo(Amount, TargetAccount) method? I'm having trouble rationalizing why a transfer (an action verb) would be an object, since objects are traditionally nouns. You wouldn't want to take that transfer, serialize it, and move it someplace else, nor does the transfer mean anything outside of the context of the Account.
Subclassing the Transfer, then, would instead apply to the Account. (Though I'm not saying that you would actually subclass the account just to specify that you could have 4 transfers per month -- that seems just wrong.)
Making it a method instead of an object then solves any sort of invalid state issues.
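As a sketch of that alternative (Python rather than ColdFusion, with assumed names): the transfer becomes a method on Account, and the monthly limit is a constructor parameter rather than a subclass. Using RuntimeError for the refusal is an arbitrary choice for this example.

```python
class Account:
    def __init__(self, number, balance=0, monthly_limit=4):
        self.number = number
        self.balance = balance
        self.monthly_limit = monthly_limit  # a parameter, not a subclass
        self.transfers_this_month = 0

    def transfer_amount_to(self, amount, target):
        # There is no Transfer object, so no Transfer can be in an invalid
        # state. The call either succeeds or raises.
        if self.transfers_this_month >= self.monthly_limit:
            raise RuntimeError("monthly transfer limit reached")
        self.balance -= amount
        target.balance += amount
        self.transfers_this_month += 1
```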
I feel like I'm in a swimming pool up to my eyes, it's not quite over my head, but still a bit deeper than I'm comfortable with.
I think I see Hal's point: init'ing an object with technically valid data should not throw an error, even if there is business logic that might not like that particular set of data. I began to think.. what if you had another method in there.. say "TransferStatus()" that, instead of sending a transfer, checks on its status? If the init died on the data, you wouldn't be able to get to that method, even though calling it in that context is valid. Or maybe even a Rollback() method.
And following up on that example: if you create a transfer for the 5th time (from=Ben,To=Tim,Amount=5000), the init should let that through because the data is sound. If you wanted to check on the status of that transfer, you get "Ben Refused", if you try to roll it back you get "Amount Sent Back", and if you try to Execute that transfer you get "Whoa! you've hit your limit tonight, why not go back home and spend some time with the wife and kids".
OR.. am I completely missing something?
Another thought would be, maybe you should have business logic that prevents StandardTransfer.init() from happening if you've hit 4.
It does raise a pretty fundamental question though, that I struggle with at times. Where does the business logic validation belong? Because I could imagine a case where 4 becomes the standard limit, but then certain people are allowed 5.. or 6. And checking what a given person is allowed muddies up the transfer Object quite a bit. Maybe you need another Object called "TransferPolice" that wraps around the transfer and won't run Execute() on the transfer if it's not valid.
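One way the "TransferPolice" idea might look, sketched in Python rather than ColdFusion. Everything here is hypothetical: the per-owner override table, the in-memory counter, and returning False instead of raising are all assumptions made to keep the example small.

```python
class Transfer:
    def __init__(self, owner, amount):
        self.owner = owner
        self.amount = amount
        self.executed = False

    def execute(self):
        # The transfer itself knows nothing about limits.
        self.executed = True


class TransferPolice:
    def __init__(self, default_limit=4, overrides=None):
        self.default_limit = default_limit
        self.overrides = overrides or {}  # e.g. {"Tim": 5}
        self.counts = {}                  # transfers executed per owner

    def limit_for(self, owner):
        return self.overrides.get(owner, self.default_limit)

    def execute(self, transfer):
        used = self.counts.get(transfer.owner, 0)
        if used >= self.limit_for(transfer.owner):
            return False  # refused; the transfer is left untouched
        transfer.execute()
        self.counts[transfer.owner] = used + 1
        return True
```

The per-person limits stay out of the Transfer object, which is the muddying the comment worries about.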
Anyway, no need for me to write a book, but I am very interested in this topic.
To be honest, I don't think the Transfer example was the best example; however, I think the example was secondary to the confusion that it caused in my head. Transfer example or not, what it came down to was the concept of:
data type validity vs. business logic validity
... and where this validation takes place and what it means for an object to be in a "valid state."
I am not necessarily disagreeing with the fact that you should be able to create a Transfer object that has accounts that would not be able to execute the given transfer. What I am saying, however, is that this concept goes against my previous understanding of OOP and object validity. But, I am not so sure that my previous understanding was correct at all.
But, I agree that this does raise the fundamental question: where does the business logic validation go? If we want our domain objects to be idealized, then they should know a thing or two about business logic validation; however, if they are going to know that AND be able to be created in a business-logic-invalid state, then they are going to need to have some sort of IsValid() or Validate() method to get any potential errors about this business-logic context.
And, to get back to my major point, this idea of an idealized object knowing about its own invalid business state goes against my previous acceptance that objects could not exist in an invalid state.
You may have passed good data to the object, but that does not make the object valid. This is an interesting thought.
Running through some thoughts...
Should the constructor only check its properties for validity?
If the object has valid properties, but cannot perform a function, then is it a valid object? What if it can perform 3 functions, but not 1 function?
Transfer.init() throws error: over transfer limit
Transfer.execute() throws error: over transfer limit
My opinion is if an error will occur in all method calls (excluding getters/setters) or fundamentally limit the object's purpose, then the constructor should check for it. Otherwise, the function should check.
Since a Transfer object would have no purpose without an execute function, then it seems like it should throw an error during construction.
Ben, I don't think you should ever ask the object isValid() -- that would be wrong since the object should never be in an invalid state. Instead, we're asking "can I perform this operation?" If you're thinking that an object that already has 4 transactions is in an invalid state, then what would happen if we started with 0 transactions and worked our way up to 4? Is the object now in an invalid state? That can't be right.
The point of creating an AccountTransfer object is exactly to *hold* business logic that does not belong in either a SavingsAccount or a CheckingAccount object. Here we have a very important concept: an object that expresses behavior (rather than primarily holding data).
Should we be able to create that object if there are already 4 transactions? Certainly. It may be that all Customers are required to have an AccountTransfer object. It could be that the AccountTransfer object holds some history that we need access to. (Bad example, I know...) The point is that the TransferObject isn't in an invalid state: it's that a certain operation (execute, in your example) can't be performed.
There could be other reasons why "execute" can't be performed. Perhaps a temporary hold has been placed on an account. Is the object in an invalid state? By no means. Again, the important point is that we've encapsulated a behavior as an object rather than a thing (such as an account).
I can see and accept the idea you can create a transfer object that cannot execute. Really, the philosophical journey of exploration was not that this was a bad idea in any way, but rather that it totally changed what I consider a "valid state" object.
Let's take it to a super low-level example, imagine that a user has to create an account and that account has only two string properties:
* Username (string)
* Password (string)
Now, if we are saying that the object can only exist in a valid state based on data type, not business logic execution, then having an account with a zero-length username and password would be considered valid and I could therefore create that object instance without error.
But, now let's say that my application requires that a "valid" account actually have a username and a password of non-zero length and furthermore that the selected username is unique to the system. In that context, the above described Account instance would not be valid in the business context.
So, what rocked my world was really that when I was talking to you about "valid state" objects, I was referring to business-context validation and you were referring to data-type validation.... we were not on the same page.
Now, objects with behavior are one thing. BUT, I think many of us create objects that do not have any definable behavior. Despite this lack of behavior, they still need to exist in a valid state in the context of the application (such as just prior to persistence). As such, someone needs to validate them from a business standpoint.
If you want to encapsulate this business logic in the object (as you said above), then the business object (ie, Account) would need to have a method perhaps to check for method execution:
Of course, now we run into the conundrum that persistence is NOT a concern of the business object, as persistence is a byproduct of the application environment, not of the "what it means to be an object" existentialist question.
So, where does that leave us?
* We cannot ask an object if it's ready to be persisted as it doesn't know what persistence is.
* But, we also cannot simply ask it if it is valid as you are saying that this would never make sense.
So, I guess, I just don't know where to go with this?
Let's use your example of username and password. Is it fundamental to the notion of a User object that the username MUST exist? Let's assume it is. How about for a password? Yes, again. The data types are great, so is the object in a valid state? Not necessarily.
As you said, we want to restrict the username and password to be non-zero length strings. Let's say, further, that you've said that for an object to be in a valid state, it MUST have a username and password on its creation. In that case, this call...
<cfset session.user = CreateObject('component', 'User').init('', '') />
... must fail: no object will be created. So, it's not just data types that determine a valid object, but possible restrictions on those data types.
Now, in your example of the MoneyTransfer, we have to ask, Is it fundamental to the nature of a MoneyTransfer that x obtain (whatever "x" is)? If so, a MoneyTransfer object should NEVER be allowed to exist such that x does NOT obtain.
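The same restriction, sketched in Python rather than ColdFusion. The choice of TypeError for a wrong data type and ValueError for an empty string is an assumption of this example, not something stated in the thread.

```python
class User:
    def __init__(self, username, password):
        for label, value in (("username", username), ("password", password)):
            # Data type check: both must be strings.
            if not isinstance(value, str):
                raise TypeError(f"{label} must be a string")
            # Restriction on the data type: the string must be non-empty.
            if value == "":
                raise ValueError(f"{label} must not be empty")
        self.username = username
        self.password = password
```

Calling User("", "") raises before any object is returned, so no User exists in that state.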
Perhaps this has gotten so murky because the idea of the limitation on only four transactions per month was so ill-advised. That might well be a business rule, but should it be in the MoneyTransfer? No.
We might be better choosing a different example to work with.
I'm not understanding why datatype validity is an issue. If you give the properties of a Class strict datatypes, there should be no need to validate the datatype. For example, if Amount is typed as "Money" (which I assume is a Class itself???) then if you pass an Array, or some other Class that is not a Subclass of Money, it should throw an exception. Am I way off here, or am I missing the point?
I think the money transfer example is too complex to discuss well. To me, it's the more complex version of a simpler concept. As such, let's stick to the Account example.
Let's say that Account.Init( "", "" ) fails because what it means to be an account requires a username and password. However, let's say that in the application, we have username restrictions. For example, let's just say that the username "Admin" for whatever reason is restricted and no one can use it. This is clearly a limitation of the business logic and not what it means to be an account domain object.
Account.Init( "admin", "xyz" )
So, now, we are back to the place we were before - the "valid state" of the object is correct if you base this purely on data type restrictions. However, in the context of the given application, this data type is not valid for use.
Now, assume the only way this will be used is to be persisted. Someone needs to know if this object is valid in the context of the business logic.
Who does it? Bringing back and modifying my text above:
* We cannot ask an object if it's ready to be persisted as it doesn't know what persistence is.
* We cannot simply ask it if it is valid as you are saying that this would never make sense.
* We cannot put this validation in the service layer as it creates a thick service layer and a potentially anemic domain model.
That is correct. What I am driving at is that my understanding of what a "Valid" object was went beyond just data type checking and into the business logic validation of the passed-in data. It was on this premise that I formulated other OO concepts.
However, now that I am beginning to see that "valid state" objects were actually just referring to data type validation, then my other premises fall apart.
I think I'm still missing the concept here. Why not validate the data before creating an instance of the Object? For example, when a user is creating an account - they hit the submit button - call a function that validates that the username is valid - it is not null, it is unique, it is not a disallowed name. If validation passes, then you create the instance.
FYI, I'm not trying to tell you that you are wrong, I'm just asking "why?".
What you just asked is exactly what I'm trying to figure out :) The question is, however, if you put that validation in the layer above the domain object (ie. Account), then, does the Account become nothing more than a data container (with built-in type checking)?
And, if so, is that a *bad* thing?
@Ben: That business logic restriction (no "Admin" username) is NOT fundamental to what it means to be a User, is it? (If it is, it should be enforced by the object.) Most of the time, we have business logic as a superset of the logic required by the object to be valid.
Here are some more super-object restrictions:
* no duplicate usernames
* no people with a username ending in "hotmail.com"
* no employees of the company may be users (they're Employees)
When we say that an object must be in a valid state it does not mean that it conforms to these business rules. How, for example, would a User object know if the username was unique? We see that we have business logic that is a superset of the object's validity constraints.
So where should that go? In a UserCreation object perhaps that could make all these and any other constraint checks BEFORE we try to create a user. Note (again) the use of an object that models BEHAVIOR rather than statefulness.
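A sketch of such a UserCreation object, in Python rather than ColdFusion, covering the three restrictions listed above. The in-memory sets of existing and employee usernames are stand-ins for whatever lookup a real system would use, and UserCreationError is an invented name.

```python
class UserCreationError(Exception):
    """Raised when a business rule forbids creating the user."""


class User:
    def __init__(self, username, password):
        # The object's own validity: both values must be present.
        if not username or not password:
            raise ValueError("username and password are required")
        self.username = username
        self.password = password


class UserCreation:
    def __init__(self, existing_usernames, employee_usernames):
        self.existing = {name.lower() for name in existing_usernames}
        self.employees = {name.lower() for name in employee_usernames}

    def create(self, username, password):
        name = username.lower()
        # Business rules are checked BEFORE any User is constructed.
        if name in self.existing:
            raise UserCreationError("username already exists")
        if name.endswith("hotmail.com"):
            raise UserCreationError("hotmail.com usernames are not allowed")
        if name in self.employees:
            raise UserCreationError("employees may not be users")
        user = User(username, password)
        self.existing.add(name)
        return user
```

User never learns about uniqueness or employees; those checks sit in the object that models the behavior.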
So, are you saying that the business-restriction validation on top of a data type can go in something like the Service layer (assuming there is no benefit to creating a new object to model every action with business logic)?
You COULD put it in a service layer, but I'd prefer to put it in a separate object whose job it is to know how to do things, like UserCreator. That way, I can swap those out at run time. Imagine that we have some wacky business logic that alternates between weekdays and weekends. At runtime, I can swap out WeekdayUserCreator with WeekendUserCreator. Now, that's a silly example, but I bet you can think of some non-silly things you can do with this technique. Oh -- and isn't there a design pattern for this sort of thing?
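Swapping interchangeable behavior objects at run time is commonly called the Strategy pattern. A minimal Python sketch follows (the thread itself is about ColdFusion). The differing password lengths are invented purely to give the two creators different behavior, and creator_for is a hypothetical helper.

```python
from datetime import date


class UserCreator:
    min_password_length = 1

    def create(self, username, password):
        if len(password) < self.min_password_length:
            raise ValueError("password too short")
        return (username, password)  # stand-in for a real User object


class WeekdayUserCreator(UserCreator):
    min_password_length = 8  # invented "wacky" weekday rule


class WeekendUserCreator(UserCreator):
    min_password_length = 4  # invented "wacky" weekend rule


def creator_for(day: date) -> UserCreator:
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return WeekdayUserCreator() if day.weekday() < 5 else WeekendUserCreator()
```

Calling code asks creator_for(date.today()) for a creator and never needs to know which one it received.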
That's what I say. "Account", as a Class should simply be a Value Object - simply a class with properties. No functions, save maybe some getter/setters that may set other properties. For example, let's say I have a Class that has the properties prefix, name, and label. When I set the name or prefix property, I can set the label property to prefix + name.
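The prefix / name / label example could be sketched like this in Python (not ColdFusion). The class name Labeled is made up for the illustration.

```python
class Labeled:
    def __init__(self, prefix="", name=""):
        self._prefix = prefix
        self._name = name
        self.label = prefix + name

    @property
    def prefix(self):
        return self._prefix

    @prefix.setter
    def prefix(self, value):
        self._prefix = value
        self.label = self._prefix + self._name  # keep label in sync

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value
        self.label = self._prefix + self._name  # keep label in sync
```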
Fair enough on the object for the add behavior. But, at some point, someone has to return a collection of reasons as to why an Account cannot be created:
WeekdayUserCreator.Init( Account )
WeekdayUserCreator.Validate() :: Errors
This way, validity within the business logic context can be validated before an exception is thrown (as there might be many reasons why an account cannot be created, but only one exception type can be thrown at a time).
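A sketch of that Validate() idea in Python rather than ColdFusion: the creator collects every failed rule and returns the whole list instead of raising on the first one. The restricted "admin" name comes from earlier in the thread; the minimum password length and the existing_usernames argument are assumptions of this example.

```python
class Account:
    def __init__(self, username, password):
        self.username = username
        self.password = password


class WeekdayUserCreator:
    RESTRICTED_USERNAMES = {"admin"}  # the restricted name from the thread
    MIN_PASSWORD_LENGTH = 6           # hypothetical business rule

    def __init__(self, account, existing_usernames=()):
        self.account = account
        self.existing_usernames = {name.lower() for name in existing_usernames}

    def validate(self):
        # Collect ALL problems so the caller can report them together.
        errors = []
        username = self.account.username.lower()
        if username in self.RESTRICTED_USERNAMES:
            errors.append("username is restricted")
        if username in self.existing_usernames:
            errors.append("username is already taken")
        if len(self.account.password) < self.MIN_PASSWORD_LENGTH:
            errors.append("password is too short")
        return errors
```

An empty list means the account may be created; anything else is the full set of reasons why not.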
@Hal, small aside about the "design pattern for this sort of thing".
One of the smartest guys I've ever worked with was a longtime Perl programmer, and as he was getting into J2EE (this was back in 02/03), I remember talking on the phone with him and he said "What's with all these damn Factories!?"
What was his take on the Factory? Did he just create objects inline?
I suspect at the time that's probably the case. At that time, he was a director and was doing very little coding; however, he was responsible for reviewing a lot of the stuff that came from the company to whom we had outsourced a large chunk of the development. The joke at the time was that "everything's a factory. they have factories for their factories".
As interesting as discussions like this one are, in my opinion it would be best to sum up different feasible solutions to the problem, instead of trying to find the silver bullet.
So what exactly is the problem here? I understand it as follows: "how do we handle the possibility that the (combination of) property value(s) inside an object prevents one or more operations from being performed correctly by the object, or on the object?"
The desired outcome to me would be confidence that the system will not produce unexpected errors, perform operations that are considered illegal in the business domain, or cause corruption of data in the persistence layer as a result of the property values inside an object.
This might not be a problem definition that is agreed upon, but at least I tried :)
The point is: when we can agree on the definition of the problem and the desired outcome, we can start to produce an overview of possible solutions. The description of such solutions and their tradeoffs and the underlying design principles are a good basis for sound decision making in your software designs.
Anyone feel they understand the problem deeply enough to sum up the alternatives coherently? I am afraid I have to pass on that for now; my understanding is not deep enough yet. :)
I have taken this conversation and formulated some more thoughts:
A week away from reading blogs and I've missed Ben having a "crisis of faith". Oh noes!
This is a very interesting problem to analyze because it really speaks to some fundamental concepts of OO that go beyond just "object = data + methods" which is where most people start.
First off, classes/objects are useful for modeling "stuff" when solving a problem. That stuff doesn't necessarily have to be physical entities, it can also be concepts or pure behavior. In fact, something that is core to both the account/transfer problem and user/password problem listed above is the idea of a strategy.
The TransferObject is a reasonable model of an intended operation but it is the "transfer strategy" of the specific account that determines whether or not the transfer operation is allowed. Not "valid" but "allowed". You are not really asking "Can I create this TransferObject?" but "Will the account's strategy allow me to execute this operation?"
In the case of the user/password, whether the password is valid or not may depend on the "password strategy" for the system itself. That strategy might include rules such as: cannot reuse any of the last five passwords, cannot include the username or certain portions thereof. Those rules cannot be inherent in the password itself and since the password policy can change - and may even be different for different users in the system - it cannot always be inherent in the user either.
In both cases, you need a class to represent the policy / strategy / rules and you apply an instance of that to the object(s) in play within your model.
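For the password case, such a policy object might look like the following Python sketch (the thread itself is about ColdFusion). It covers only the two rules mentioned above; the method name violations and the history_size parameter are assumptions.

```python
class PasswordPolicy:
    def __init__(self, history_size=5):
        self.history_size = history_size  # how many past passwords to compare

    def violations(self, username, new_password, previous_passwords):
        problems = []
        # Guard the slice: a size of 0 means "no history check at all".
        recent = previous_passwords[-self.history_size:] if self.history_size else []
        if new_password in recent:
            problems.append("password was used recently")
        if username and username.lower() in new_password.lower():
            problems.append("password contains the username")
        return problems
```

Because the rules live in the policy instance, different users (or the same system at a later date) can be given a different PasswordPolicy without touching the user or the password.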
Usually, if you dig deep enough, these classes *do* relate to the business domain directly - but they may not be obvious nouns (recasting password validity as a Policy may make it easier to spot as a noun?).
Finally, of course, there's no One True Way(tm) to do OO "right". OO is always an evolutionary process, trying to find a better way to model the problem domain in order to make the solution both more accurate and easier to maintain.
I really like the strategy idea. I tried to explore that more in this context with the above link (binding components and methods to form commands). I'm not saying that my implementation is the solution, but rather as a concept to have exchangeable behaviors.
How those get integrated into the domain model in a clean way... still thinking it out.
I got some time this morning to play around more with the idea of Domain objects as "data types". Also tried out a "Form Helper" object for the first time:
Would love feedback :)
I fall on the side of allowing the bean to be filled with data regardless of the validity. Then something validates it based on business rules: either the bean itself, a service layer, or sometimes a validation object that I create for that data type if there are many forms of validation. To me this also offers more reuse since the bean can be used to carry user-entered data to the business layer for validation but also used for model-to-model data transfer. This also allows me to always return that object type in methods vs sometimes an object and sometimes not. Usually, since my business layer already returns an error collection, that is where I do the validation.
I am wrestling with this concept right now. I am not sure where I will end up just yet.