For this update, I have created two configuration objects. One of the things that I want to do in this iteration of the application is pull the configuration information out of the Application.cfc (which is where it is currently) and encapsulate it in a configuration component. There is some debate in the ColdFusion community about XML configuration files vs. programmatic configuration files / objects. I am a huge proponent of programmatic configuration files as they are much more flexible and powerful (in my opinion).
This is the second ColdFusion component that we are writing for the application (Application.cfc being the first). This component raises some new issues to think about. Unlike the Application.cfc component, this component is persisted throughout the life of the application as a singleton, and to some degree, can be considered "stateful".
The persisted nature of the component is not such a big issue. Right now, we are going to store it directly in the APPLICATION scope during application initialization and then reference it directly in the APPLICATION scope whenever it is needed. This will probably change to some degree as we get more into frameworks, but this will be fine for now.
The real issue here is the idea of stateful data and data retrieval. There is a lot of intense debate over this sort of thing and I am not sure where I stand just yet. There are a lot of people out there that feel that stateful data should be stored in some sort of an instance data object. Then stateful data (as well as much of the non-stateful data) should be accessed and manipulated via "getter" and "setter" methods. Apparently, it is frowned upon to allow programmers to access these values directly.
I am not completely sold on this idea. I have a strong feeling that if a programmer is going to be stupid (and do things like programmatically set the wrong DSN values somewhere in the code), then they should be allowed to be stupid. Common sense is what separates the thinkers from those of us that walk into walls. It's kind of like natural selection at the digital level.
Now, despite my feelings, I wanted to demonstrate both ideas (as best as I know them). To do so, I have created two configuration objects: Config.cfc and PublicConfig.cfc. Config.cfc stores all of its stateful data in a private instance object: VARIABLES.Instance. This is available only internally to the config object itself. Any outside access to this data must be done through "getter" methods such as Config::GetUrl(). On the other hand, PublicConfig.cfc stores all the stateful data directly into its public "THIS" scope where anyone in the world can both get it and set it at will (provided they have a reference to the config object).
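The difference between the two styles can be sketched in a few lines. The example below is in Python rather than ColdFusion, purely as a language-neutral illustration. The class names mirror Config.cfc and PublicConfig.cfc, but the keys (`dsn`, `url`) and their values are made up, and neither class is the actual Skin Spider code.

```python
class Config:
    """Private-instance style: all data lives in one internal dict."""

    def __init__(self, dsn, url):
        # Stand-in for VARIABLES.Instance (private by convention in Python).
        self._instance = {"dsn": dsn, "url": url}

    def get_dsn(self):
        return self._instance["dsn"]

    def get_url(self):
        return self._instance["url"]


class PublicConfig:
    """Public style: data sits directly on the object, like the THIS scope."""

    def __init__(self, dsn, url):
        self.dsn = dsn
        self.url = url


private_config = Config("spider_dsn", "http://example.com/")
public_config = PublicConfig("spider_dsn", "http://example.com/")

# Both can be read, but only the public one invites accidental writes.
public_config.dsn = "oops"
```

With the getter style there is no public field to overwrite, so the "wrong DSN" mistake has to be made on purpose rather than by accident.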
I have not started using the configuration objects yet, as I wanted to demonstrate them before I moved them into full production. As I said, I am not sold firmly on either of the ways, but right now I am leaning towards going with the PublicConfig.cfc as accessing public data directly is going to be faster (processing-wise) than going through member methods. And, as I said before, if the programmer (me in this case) is going to be stupid and screw with my own data, well then, that's my fault.
Before I get into the rest of the config object, I just wanted to touch upon something called the "Transfer Object". I am not sure what the actual definition of the transfer object is, and I am sure there is more than one, but I use the transfer object to refer to the data-only version of an object. The config objects have data and methods and can be heavy to pass around. The "Transfer Object" on the other hand, would contain ONLY the data parts (as a structure) and would be faster (processing-wise) to pass around.
How you get the transfer object is the same in both types of configuration bean: Config::GetTransferObject() (struct). How the transfer object (sometimes referred to as the Light Weight Transfer Object, or LTO) is created within the various config objects is different. In the privately-scoped object, Config.cfc, all the GetTransferObject() method does is return a duplicate of the "stateful" data structure, VARIABLES.Instance. In the public configuration object, PublicConfig.cfc, the GetTransferObject() method has to loop over the THIS scope and duplicate some of the data, but not all (we don't want class methods in the transfer object).
The latter method can be a pain. You can either do the looping as I have done or you can explicitly duplicate set values. Either way, there is some guesswork or some manual labor involved. The nice thing about using an instance data object is that it is so simple to return a duplicate of it.
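The two ways of building the transfer object can be sketched as follows. This is a Python illustration under assumptions, not the Skin Spider code: the keys `app_name` and `paths` are invented, and `copy.deepcopy` stands in for ColdFusion's Duplicate().

```python
import copy


class Config:
    def __init__(self):
        self._instance = {"app_name": "SkinSpider", "paths": {"root": "/site/"}}

    def get_transfer_object(self):
        # One call: deep-copy the private instance data.
        return copy.deepcopy(self._instance)


class PublicConfig:
    def __init__(self):
        self.app_name = "SkinSpider"
        self.paths = {"root": "/site/"}

    def get_transfer_object(self):
        # Walk every public member and keep only the data,
        # skipping methods and internal (underscore) names.
        result = {}
        for key in dir(self):
            value = getattr(self, key)
            if key.startswith("_") or callable(value):
                continue
            result[key] = copy.deepcopy(value)
        return result


config = Config()
public_config = PublicConfig()
private_copy = config.get_transfer_object()
public_copy = public_config.get_transfer_object()
```

Both calls return the same plain data, but the public version has to decide member by member what counts as data, which is where the guesswork comes in.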
Creating Portable Applications
One of the things that I strive for in all my applications is high portability. I love being able to pick up an application and plop it down anywhere and just have it work. It is so sexy when that happens. One of the ways in which I make that possible is by using relative paths to describe the structure of the application and then dynamically calculating all site URLs and full server paths based on those relative paths (and other environmental data).
This is possible because I know where the configuration object lives within the application itself. Once I know that, I can calculate the site URL as well as the root path of the application on the server. Take a look at either of the config objects to see how this works. Notice that I am never making any reference to where the application lives, but rather, only to where sub-parts of the application live relative to the root of the application itself.
Of particular interest, take a look at the CalculateSiteUrl() method. This method takes the CGI object and the server root path of the application and dynamically determines the URL of the site itself. This means no more having to hard-code links that you send in emails or provide in a public manner.
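The body of CalculateSiteUrl() is not reproduced in this post, so here is one plausible way such a calculation could work, sketched in Python as an assumption. The function name, parameters, and sample paths are all hypothetical. The idea is to find where the running template sits relative to the application root, then strip that relative part off the end of the request's script name (the value of CGI.script_name in ColdFusion).

```python
def calculate_site_url(server_name, script_name, template_path, root_path, secure=False):
    # Normalize Windows separators and make sure the root ends in a slash.
    root = root_path.replace("\\", "/").rstrip("/") + "/"
    template = template_path.replace("\\", "/")
    if not template.startswith(root):
        raise ValueError("template is not inside the application root")

    # Location of the running template relative to the application root.
    relative = template[len(root):]
    if not script_name.endswith(relative):
        raise ValueError("script name does not match the template path")

    # Whatever is left of the script name is the web path to the app root.
    web_root = script_name[: len(script_name) - len(relative)]
    scheme = "https" if secure else "http"
    return f"{scheme}://{server_name}{web_root}"


site_url = calculate_site_url(
    server_name="www.example.com",
    script_name="/apps/spider/admin/index.cfm",
    template_path=r"D:\sites\apps\spider\admin\index.cfm",
    root_path=r"D:\sites\apps\spider",
)
# site_url is "http://www.example.com/apps/spider/"
```

This sketch compares paths case-sensitively and ignores ports, so it would need hardening before being trusted on a real server.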
I am not sure how all of this is affected by mapped paths. Again, I don't like mapped paths. While I understand their use, I think there are much more clever ways of doing things that allow you to maintain a low level of coupling to the server setup.
Keeping It All Together
You will probably notice that the config objects have the XML database schema stored in them. There are those who would see this and be tempted to say "Wait, that's XML. Why not move it into its own XML file and then pass it into the configuration object?" Well, because there's really no point to that; it would serve no purpose. I like to keep all the configuration information in one place. There's no need to break things out into separate files if you don't gain anything from it. Remember, the XML database schema is as much a part of the application configuration as setting up the file paths or the application name. No matter what I change, it's going to be a "configuration" change, so why not keep it all in one place?
Remember, just because you can abstract something out or encapsulate it, doesn't mean that you should or that it is right for your application. This goes for the XML database schema as well as the privately-scoped instance data. I am not saying that these are wrong; they always have a place. I am only saying that before you commit to any idea, stop and think about what you are gaining by doing it. In the case of this application, if we broke out the XML schema to its own file, there would be zero up side, and the down side is that we would have to maintain two configuration files instead of one.
Also, if anyone sees how this could be accomplished (everything mentioned above) in an xml-file configuration setup, PLEASE let me know. I think there would be a lot of people interested in seeing that.
Great article, bunch of comments, so I'm just going to post a couple of smaller comments so I can remember what I'm thinking!
Benefit of config over publicconfig is not just to stop you overwriting your own data source but also to abstract any possible logic in getters and setters. If I getURL() I can add RegEx processing or clean up or concatenation from sub variables or change my rules for figuring it out or whatever without breaking my code - I just change a single getter.
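As a small illustration of that point (a Python sketch with made-up keys, not anyone's actual config code): callers keep calling the same getter while the rule behind it changes from "return the stored URL" to "assemble and clean the URL from parts".

```python
class Config:
    def __init__(self, host, base_path):
        self._instance = {"host": host, "base_path": base_path}

    def get_url(self):
        # The URL is no longer stored; it is assembled and cleaned up here.
        # Callers of get_url() never see the difference.
        path = "/" + self._instance["base_path"].strip("/") + "/"
        return "http://" + self._instance["host"] + path.replace("//", "/")


url_with_path = Config("example.com", "spider/").get_url()
url_at_root = Config("example.com", "").get_url()
```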
Of course, I'd use an IBO so I didn't have to manually write getters for everything :->
For small project, short time period, no biggie, but I think it is worth playing with config object as once you start to need it, you really appreciate the benefits of the encapsulation of getting and setting logic.
Going straight to memory to get configobject.MyDSN vs calling a one line method in configobject.getDSN() is pretty trivial in terms of performance. If you're optimizing that hard, use C++!!!
Question: Are you EVER under any circumstances going to need to change the way the DSN is calculated? If so, consider config object with getters. If it is really unlikely you can handle the issue with custom config code when you load the bean, but I'd strongly recommend trying getters and setters. As the projects get more custom and complex it'll make the code base much more maintainable!
TO vs. "real" object
They're not actually being passed around at all. Complex objects including cfc's are passed by reference, so they're sitting at 0xCCAA or whatever the memory address is and it is just the pointer to that address that is being passed around. I'd be surprised if there was any performance hit in passing around a large object with methods vs. a TO. Instantiation is another matter as the methods obviously have to be read from disk and compiled down to java byte code.
Any way I could ask you to post a simple article on using expandpath and cgi variables and the like to lower the amount of config data you need? If you don't, I will some time as I need that documented somewhere, but you just seem to be king of the self configuring application and if you ever got the chance to write such an article I'd love to read it!!!
Two reasons to break out into separate config files: independent variability and file size (these are general comments - may not relate to SS).
If you have an app that is going to need 10,000 lines of config, it is time to break it down logically into smaller files if at all possible as it's a bear to open, scroll through and edit 10,000 lines of config.
If you have (say) db config and (say) directory config (I know, bad example, but bear with me). If I have two different db configs for dev and live and three different directory configs for three different web servers (two live, one dev) I have two options. Most of the time I'll use conditional logic in config script and have one file, but sometimes it is better to have a directory config for each web server and a db config for each db server and then to load the appropriate config files for a given web/db server combination, so it can make sense to break them down.
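The second option can be sketched like this, in Python and with entirely hypothetical server names, keys, and values. Each dict stands in for what would be a separate config file per db server or web server, and the loader merges the pair that matches the current machine.

```python
# Stand-ins for separate per-server config files (all values hypothetical).
DB_CONFIGS = {
    "dev": {"dsn": "spider_dev"},
    "live": {"dsn": "spider_live"},
}
DIRECTORY_CONFIGS = {
    "devbox": {"upload_dir": "/home/dev/uploads/"},
    "web1": {"upload_dir": "/srv/web1/uploads/"},
    "web2": {"upload_dir": "/srv/web2/uploads/"},
}


def load_config(web_server, db_server):
    # Merge the directory settings for this web server with the
    # database settings for this db server into one config struct.
    config = {}
    config.update(DIRECTORY_CONFIGS[web_server])
    config.update(DB_CONFIGS[db_server])
    return config


live_config = load_config("web2", "live")
```

Because the two kinds of settings vary independently, two db files plus three directory files cover all six combinations without any conditional logic.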
Only other reason to break XML out is if you want to more easily run it through DTD, allow people to edit in XML editor, etc.
NONE of the comments/posts disagree with choices for the use case, but I did want to highlight some of the things I'd be thinking of when making these decisions for a given project.
Keep up the great work - when I finally get round to having a blog roll you're gonna be on it!
From what I skimmed, great comments. Bit of an emergency at work. Will address comments at lunch :) Thanks!
Don't you just hate when work gets in the way of blogging :->
Take your time, good luck with the emergency.
Seriously! Clients are so not cool that way :)
Gonna try and respond one at a time. As far as the IBO, I like the idea of having generic getters and setters. Hopefully in one of my next iterations, I can give it a go. I can picture it now, like one of those Movie Phone commercials... "This application is good but I wish it had some more Peter Bell IBOs in it".
I think the configuration bean is not the best case for topic discussion about this stuff. The config is one of those things that really should be set once then just referenced going forward. That is why I refer to it as being somewhat stateful, not truly stateful. I see what you are saying about making a change in the "Getter" method, just one place, and having it still work in the application, and I think that makes more sense for some objects, but the config is probably just not the best case.
Premature optimization... true. The getters / setters have not caused any problems, so why rule them out as a performance issue? I am tackling this debate right now because I am about to start building out components and, among other things, consistency is very important to me. While it might come across as optimization (and in part it is), I also just want to nail down a methodology that I can start applying to other objects. Basically I want to come up with a boilerplate that I can use going forward.
Of course, the idea of consistency raises another issue. There ARE going to be times for sure when I need to use getters / setters (such as adding a collection object to another collection where it has to work things out internally)..... so that begs the question, if some times I NEED getters / setters, then to be consistent, shouldn't I just always use getters / setters?
But just to go back to premature optimization, again, ColdFusion is and always was designed to be "fast enough". Right, C++ for super optimization, ColdFusion for ease of use and faster development.
Also, as far as DSN, I was thinking about perhaps having a DSN bean that gets created and stored within the Config bean. Not sure yet, and nothing I can really play around with at the moment (as there is no DSN type database in this application).
As far as performance for passing around a struct (passed by reference) and passing around a ColdFusion component (passed by reference), I have not done any testing myself. I have, however, read that passing around large CFCs can actually cause a noticeable performance hit. I don't have a link to show; I think it was something I may have been told in a CFUG meeting. That would actually be something interesting to test out.
But yeah, I suspect that for me, and my size components, it wouldn't make much of any difference for performance. However, when it comes to functionality, there might be issues. The Transfer Objects, as I have them set up, are designed to be duplicates of the instance data, not references to it. Of course, this depends on what the instance data is and ColdFusion's ability to duplicate certain objects, but there might be times when you want to queue an instance or hold onto the data in some way without acting on it right away. Using a reference to the original object, you might have to be concerned that the values have changed from the time of queuing to the time of action... is that an issue? It will depend on what you are doing. If you use a true duplicate of the data, however, you can pass them off and they never have to worry about "data corruption."
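The queuing concern is easy to show in miniature. This is a Python sketch with made-up data, where `copy.deepcopy` plays the role of ColdFusion's Duplicate(): the reference sees a change made after queuing, while the duplicate keeps the values it had at queue time.

```python
import copy

instance = {"to": "user@example.com", "subject": "Hello"}

queued_reference = instance              # same object, sees later changes
queued_copy = copy.deepcopy(instance)    # snapshot taken at queue time

# Something else modifies the original before the queue is processed.
instance["subject"] = "Changed after queuing"
```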
I could certainly write something up about expand path and CGI and what not. I would probably end up tying together a few other things that I have written, but it would be nice to see it all in one place.
ExpandPath() / GetCurrentTemplatePath()
Plus, the CalculateSiteUrl() in the Config objects listed above. In the config objects as well, there is a neat little function called TravelUpDirectory() which just takes a path and returns the (N) next one up the path list.
Let me see what I can put together.
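For readers who want a feel for what a TravelUpDirectory()-style helper does, here is a rough Python sketch. It is an assumption about the behavior based only on the description above (take a path, return the directory N levels up), not a port of the real function, and the sample paths are invented.

```python
def travel_up_directory(path, levels=1):
    # Use whichever separator the incoming path already uses.
    separator = "\\" if "\\" in path else "/"
    trimmed = path.rstrip(separator)
    for _ in range(levels):
        # Cut everything after the last separator; stop when none is left.
        cut = trimmed.rfind(separator)
        if cut < 0:
            break
        trimmed = trimmed[:cut]
    # Always hand back a directory path with a trailing separator.
    return trimmed + separator


one_up = travel_up_directory("/var/www/spider/config/")
two_up = travel_up_directory("D:\\sites\\spider\\config\\", 2)
```

Starting from the directory that holds the config object, a helper like this is enough to walk back to the application root without ever hard-coding where the application lives.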
Good points about why to break out the configuration into smaller files. But, just to re-iterate the point you made yourself, it depends on the project.
There's always going to be a blurry line between creating a scalable system, a maintainable system, a fast-to-develop-system, a small file-set, and X number of other factors to consider. Take for example an OnRequestStart() method in the Application.cfc. This might have several "areas" of functionality:
- Perform actions based on URL (ie. reset application)
- Prepare framework for request
- Generic cleaning of form scope
- Combine FORM / URL scope
Now, with all this going on, OnRequestStart() might start to get very big. With this size, it might become less readable and less maintainable and less understandable. I might want to break each of those sections out into a helper function of the Application.cfc. Or maybe I want to have a helper object that facilitates some of those actions. Or maybe I just want to have different templates that get CFIncluded into the OnRequestStart() method.
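The helper-function option might look something like the following. This is a Python sketch rather than an Application.cfc; the function names and the `reset` URL flag are hypothetical, and the "prepare framework" step is left out because the post does not say what it involves.

```python
def reset_application_if_requested(url, app):
    # Perform actions based on URL: here, a "reset" flag clears app data.
    if "reset" in url:
        app.clear()


def clean_form(form):
    # Generic cleaning of the form scope: trim every submitted value.
    return {key: value.strip() for key, value in form.items()}


def combine_scopes(url, form):
    # Combine URL and FORM data; form values win on a name clash.
    combined = dict(url)
    combined.update(form)
    return combined


def on_request_start(url, form, app):
    # The entry point now reads as a list of named steps.
    reset_application_if_requested(url, app)
    return combine_scopes(url, clean_form(form))


app_scope = {"cached": True}
attributes = on_request_start({"reset": "1", "id": "5"}, {"name": " Ben "}, app_scope)
```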
None of these are good or bad on their own. It all depends on how it affects the system over all.
All that to just simply say that I agree with you :)
LOL re: the movie commercial! Yeah. More OO goodness in every byte :->
All comments make sense. I THINK there would be no performance hit from referencing objects with methods vs. objects without methods, but I could be wrong. Just can't figure out why a reference to a point in memory would cost more than a reference to a different point in memory, but not knowing what's happening under the hood . . .
Agree 100% there will be performance penalty on instantiation and on duplication. Rule of thumb - if you're doing a handful, no problem. If you're instantiating and/or duplicating a bunch of cfc's on each page request that may become a problem no matter how simple they are, but it'll definitely be worse if more methods.
I will check out the link you provided - thanks, but also looking forwards to the "pull together" if you get a chance!
I was initially dubious about getters and setters. I then had a painfully complex project with a bunch of business objects with complex rules for calculating attributes and now I'd never go back. But it is nominally less performant and as with any pattern it is unnecessary for use cases that don't need it!!!
"as far as DSN, I was thinking about perhaps having a DSN bean that gets created and stored within the Config bean"
What do you see as the advantage of doing the above over having a variable in a persistent scope which stores the DSN info?
Great blog BTW, keep up the good work :)
Muhaha victory is mine! Thanks so much for sharing your approach Ben. If there's one thing I hate, it's complicated application config files. It should be as easy as plug and play. And that's why I love the approach you have. Keep it up! I'm sure you will pick up OO in CF real soon and start to see the benefits as we all have. I personally use generic getters and setters although hardcoded getters and setters do have their place. Thanks again. Great to see you working together with Peter.
Thanks for reading the blog. As for the DSN stuff, the idea goes back to the data access security that I was talking about before. Let's say that I had DSN in a struct and that struct was stored in the APPLICATION or the REQUEST or even in the Config bean. Structs are passed by reference, so any time I passed the DSN object to someone, they would have the ability to mess with the DSN values that everyone else was using (since everyone would have pointers to the same DSN struct).
If, however, I had the DSN info in a DSN bean, then I could have "Getter" methods:
... that would return values. Now, while CFCs are passed around by reference the same as structs, the difference would be that the DSN bean would not have any SETter methods. All DSN info would be stored in the DSN bean during initialization. After that, the DSN bean could only give out info, never take it in. That way, you never have to worry about people messing with the data or corrupting it in some way.
Of course, this goes back to the idea that if someone should mess with their own data then you have a whole other set of issues to worry about and blah blah blah. But, that's the general idea.
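A read-only bean along those lines could be sketched as follows. This is Python with hypothetical getter names and values, since the getter list from the original comment is not shown above. All values are set once in the constructor, and there is simply no setter to call afterwards.

```python
class DsnBean:
    def __init__(self, name, username, password):
        # All DSN info is stored once, during initialization.
        self._instance = {"name": name, "username": username, "password": password}

    # Getters only: the bean gives out values but never accepts new ones.
    def get_name(self):
        return self._instance["name"]

    def get_username(self):
        return self._instance["username"]

    def get_password(self):
        return self._instance["password"]


dsn = DsnBean("spider_dsn", "web_user", "secret")
shared_reference = dsn  # passed by reference, but still has no setters
```

Everyone holding `shared_reference` points at the same bean, yet none of them is offered a way to change what it reports.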
Glad you are liking the stuff I am posting. Please feel free to drop me any suggestions. I will continue to update the Skin Spider project, so please, any advice you have might make it in.
Yeah, Peter has some great ideas. I am looking forward to integrating some of his ideas (which you share) such as the generic getters and setters. But, I don't want to "jump the shark." I want to get there when it gets there so that others can one day follow through the application example and see the logic.
It's amusing to see you not rant but the whole blah blah blah thing.
"Of course, this goes back to the idea that if someone should mess with their own data then you have a whole other set of issues to worry about and blah blah blah. But, that's the general idea."
The reason I point that out is that many people who start picking up OO don't really dig the idea all that much. When it comes to configuration I love your approach as I want the plug and play, but I still don't think something like configuration needs to be so complex as to have 1 or 2 XML files, a datasource bean, and an application bean.
That's currently done at my job but that's a decision the senior architect takes. Although he knows that irks me; that's probably why he implements it. ;) With the application scope everything is perfect for config and it should go there. Maybe once I really start working on a large application I will see the benefit of splitting things apart but if I do it would be very much in the mold you have done with a config object, that's how I would go about it at least.
As a side note I love your de-spam. Prevents devs from making quick comments because now they have to do math, doh! I personally prefer this approach though as from an accessibility point of view it's better than an image with hard to read text. Great stuff!
One more benefit for getters - you can use AOP to wrap them in all kinds of cool stuff. Want to log every request for the DSN to a file to see what is going on or set security requirements for calling a setter? Just add some advice and define the pointcuts and you're done.
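To give a flavor of the idea without any framework: the Python sketch below uses a plain decorator as a stand-in for AOP advice. It is not ColdSpring or LightWire code, and the names `logged` and `call_log` are invented for the example. The getter itself stays untouched while every call to it gets recorded.

```python
import functools

call_log = []


def logged(method):
    # "Advice" that runs before the wrapped method on every call.
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        call_log.append(method.__name__)
        return method(self, *args, **kwargs)
    return wrapper


class Config:
    def __init__(self, dsn):
        self._instance = {"dsn": dsn}

    @logged
    def get_dsn(self):
        return self._instance["dsn"]


config = Config("spider_dsn")
value = config.get_dsn()
```

None of this is possible with direct field access, because there is no call to intercept.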
Obviously right now that is CS only which isn't optimized for transient objects, but when I add simple AOP to LightWire next week, we'll be able to do this performantly for all of our transient objects as well!
Glad to see I still have just enough math skills to be able to post :->
It's funny you mention the idea of wrappers because like two years ago when I first scratched the surface of OOP in CF, it was this very idea that somewhat sold me on the whole idea.
I was building a system that had to log system emails and this idea of wrappers seemed very appealing. How cool would it have been to be able to have some sort of EmailService and EmailBean (sent via the EmailService) and then have a wrapper that you "wrap" around the EmailService that stores the emails in the DB before it passes them off to the EmailService.
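That wrapper idea can be sketched like this, in Python with hypothetical class names, and with a plain list standing in for the database table. The wrapper exposes the same send() method as the service it wraps, stores a copy of the email first, and then hands the email off.

```python
class EmailService:
    def __init__(self):
        self.sent = []

    def send(self, email):
        # Stand-in for actually delivering the message.
        self.sent.append(email)


class LoggingEmailService:
    """Wraps an EmailService and records each email before sending it."""

    def __init__(self, service, store):
        self._service = service
        self._store = store  # stand-in for a database table

    def send(self, email):
        self._store.append(dict(email))  # store a copy first
        self._service.send(email)        # then pass it along


email_table = []
real_service = EmailService()
service = LoggingEmailService(real_service, email_table)
service.send({"to": "user@example.com", "subject": "Welcome"})
```

Code that sends email only ever sees an object with a send() method, so the logging can be added or removed without touching the callers.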