My ColdFusion User Defined Function Library Structure
Posted September 11, 2006 at 8:50 AM by Ben Nadel
I was recently asked how I build my library of ColdFusion user defined functions (UDFs). I sometimes post code samples that will break if you copy them directly, as they contain references to parts of the library that you do not have. I cache all my UDFs in the APPLICATION scope in a single ColdFusion Component (CFC). Then, for each page request, I get a reference to it and store it in the REQUEST scope:
- REQUEST.UDFLib = APPLICATION.ServiceFactory.GetUDFLib();
I put it into the REQUEST scope more for personal reasons (as opposed to going to the APPLICATION scope each time). The ServiceFactory is just a cached object instance in my APPLICATION scope that houses an interface for getting and creating objects in my application. Without a ServiceFactory, you could just as easily create and cache the library directly:
- <cfset APPLICATION.UDFLib = CreateObject( "component", "UDFLib" ).Init() />
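A minimal sketch of how the caching and the per-request reference might fit together without a ServiceFactory. It assumes an Application.cfc (ColdFusion MX 7 or later) and a UDFLib.cfc that can be resolved from the application root; the application name and the choice of event handlers are illustrative assumptions, not taken from the original application:

```cfml
<!--- Application.cfc: hypothetical sketch, not the original application code. --->
<cfcomponent output="no">

	<!--- Assumed application name. --->
	<cfset THIS.Name = "UDFLibDemo" />

	<cffunction name="OnApplicationStart" access="public" returntype="boolean" output="no"
		hint="Creates the UDF library once and caches it in the APPLICATION scope.">

		<cfset APPLICATION.UDFLib = CreateObject( "component", "UDFLib" ).Init() />

		<cfreturn true />
	</cffunction>

	<cffunction name="OnRequestStart" access="public" returntype="boolean" output="no"
		hint="Stores a reference to the cached library in the REQUEST scope.">

		<cfargument name="TargetPage" type="string" required="yes" />

		<cfset REQUEST.UDFLib = APPLICATION.UDFLib />

		<cfreturn true />
	</cffunction>

</cfcomponent>
```

Because CFC instances are passed by reference, REQUEST.UDFLib points at the same cached object rather than at a per-request copy.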
Now, my UDF library is one object, but it is actually composed of many different objects. I like to house each of my UDF library categories (e.g. Text, XML) in a different component. This makes adding and editing methods much easier and it forces me to think out my code more. Plus, it cuts down on naming conflicts and allows for shorter method names. Here is the code for the main UDF library object:
- <cfcomponent
- hint="Houses all the user defined functions for this application.">
- <cffunction name="Init" access="public" returntype="UDFLib" output="no"
- hint="Returns an initialized user defined library instance.">
- <cfscript>
- // Since this is a library and NOT a real entity bean, we are going to
- // store all library references in the THIS scope so that they are
- // easily accessible to anyone who has an instance reference.
- // System contains core system functions.
- THIS.System = CreateObject( "component", "UDFLib.SystemLib" ).Init( THIS );
- // AJAX contains AJAX utility functions.
- THIS.AJAX = CreateObject( "component", "UDFLib.AJAXLib" ).Init( THIS );
- // Array contains array manipulation functions.
- THIS.Array = CreateObject( "component", "UDFLib.ArrayLib" ).Init( THIS );
- // DateTime contains date and time manipulation functions.
- THIS.DateTime = CreateObject( "component", "UDFLib.DateTimeLib" ).Init( THIS );
- // List contains list manipulation functions.
- THIS.List = CreateObject( "component", "UDFLib.ListLib" ).Init( THIS );
- // IO contains input and output related functions.
- THIS.IO = CreateObject( "component", "UDFLib.IOLib" ).Init( THIS );
- // Query contains query manipulation functions.
- THIS.Query = CreateObject( "component", "UDFLib.QueryLib" ).Init( THIS );
- // Struct contains struct manipulation functions.
- THIS.Struct = CreateObject( "component", "UDFLib.StructLib" ).Init( THIS );
- // Text contains text manipulation functions.
- THIS.Text = CreateObject( "component", "UDFLib.TextLib" ).Init( THIS );
- // Validation contains data validation functions.
- THIS.Validation = CreateObject( "component", "UDFLib.ValidationLib" ).Init( THIS );
- // Xml contains xml manipulation functions.
- THIS.Xml = CreateObject( "component", "UDFLib.XmlLib" ).Init( THIS );
- // The custom library is stuff that is decidedly NOT part of the main function
- // library as it has been created specifically for the current application. This
- // library would not be used for a different application.
- THIS.Custom = CreateObject( "component", "UDFLib.CustomLib" ).Init( THIS );
- // Return THIS object.
- return( THIS );
- </cfscript>
- </cffunction>
- </cfcomponent>
As you can see, each type of library method is stored in its own component in the THIS scope of the library. Therefore, when I need to reference a method, I cannot just call the UDF library object; I have to call a sub-component of it:
- <cfset qNew = UDFLib.Query.Append( qOne, qTwo ) />
I like this not only because it makes updating the library easier, but also because it makes the code a bit more self-explanatory. Seeing which "library" you are calling will immediately tell you more about what the method does.
As you can see above, when I create my sub libraries, I always pass in a THIS reference to the main UDF library object. This way, all the child libraries can have a reference to the parent library and therefore can call methods in sibling libraries. Each Init() method of a child library looks similar to this:
- <cffunction name="Init" access="public" output="no"
- hint="Returns an initialized library instance.">
- <!--- Define arguments. --->
- <cfargument name="Library" type="struct" required="yes" />
- <cfscript>
- // The library variable is a reference to the main UDF
- // library. This is going to be used to reference other
- // parts of the library that are not in this specific
- // component.
- VARIABLES.Library = ARGUMENTS.Library;
- // Return THIS object.
- return( THIS );
- </cfscript>
- </cffunction>
The child library stores the reference to the parent library. This way, if I am in one method, I can easily call methods from another library:
- <cfset strText = VARIABLES.Library.Text.ToAsciiString( "-" ) />
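To make the sibling call concrete, here is a rough sketch of two child libraries, shown as two files in one listing. The methods CollapseSpaces() and CleanItems() are hypothetical helpers invented for this illustration, and the Library argument is typed "any" here only to keep the sketch version-neutral:

```cfml
<!--- File: UDFLib/TextLib.cfc (hypothetical contents) --->
<cfcomponent output="no" hint="Text manipulation functions.">

	<cffunction name="Init" access="public" returntype="any" output="no">
		<cfargument name="Library" type="any" required="yes" />
		<!--- Keep a reference to the parent library. --->
		<cfset VARIABLES.Library = ARGUMENTS.Library />
		<cfreturn THIS />
	</cffunction>

	<!--- Hypothetical helper: collapses runs of whitespace into single spaces. --->
	<cffunction name="CollapseSpaces" access="public" returntype="string" output="no">
		<cfargument name="Text" type="string" required="yes" />
		<cfreturn Trim( REReplace( ARGUMENTS.Text, "\s+", " ", "all" ) ) />
	</cffunction>

</cfcomponent>

<!--- File: UDFLib/ListLib.cfc (hypothetical contents) --->
<cfcomponent output="no" hint="List manipulation functions.">

	<cffunction name="Init" access="public" returntype="any" output="no">
		<cfargument name="Library" type="any" required="yes" />
		<cfset VARIABLES.Library = ARGUMENTS.Library />
		<cfreturn THIS />
	</cffunction>

	<!--- Hypothetical helper: cleans each list item using the sibling Text library. --->
	<cffunction name="CleanItems" access="public" returntype="string" output="no">
		<cfargument name="List" type="string" required="yes" />
		<cfargument name="Delimiter" type="string" required="no" default="," />

		<cfset var strResult = "" />
		<cfset var strItem = "" />

		<cfloop index="strItem" list="#ARGUMENTS.List#" delimiters="#ARGUMENTS.Delimiter#">
			<cfset strResult = ListAppend(
				strResult,
				VARIABLES.Library.Text.CollapseSpaces( strItem ),
				ARGUMENTS.Delimiter
				) />
		</cfloop>

		<cfreturn strResult />
	</cffunction>

</cfcomponent>
```

Note that Init() only stores the parent reference. VARIABLES.Library.Text is not looked up until CleanItems() actually runs, so it does not matter that the main library creates List before Text.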
So there you have it, my UDF library approach.
I use a very similar structure for my libraries, with one small difference. Where you have your ServiceFactory, I have a PluginManager base class. All of the helper object CFCs then inherit from a Plugin base class, while the specific factories inherit from the PluginManager class. The PluginManager base class uses reflection to look at the inheriting class's namespace and tries to load all CFCs present in a Plugins directory beneath it. Thus, no explicit loading. I just have to drop a new file into a Plugins directory and the PluginManager will load it. The plugins don't really have to inherit from the Plugin base class, but if I add the extra extends attribute in the <cfcomponent>, they get nifty things like knowing they are a plugin, figuring out who their parent Manager is, etc.
Thus, visually, I have a file structure like so:
PluginManager.cfc (base class)
Plugin.cfc (base class)
FooManager.cfc (extends PluginManager)
SockManager.cfc (extends PluginManager)
bar.cfc (extends Plugin)
quux.cfc (maybe extends Plugin)
red.cfc (extends Plugin)
blue.cfc (extends Plugin)
Then, in my Application.cfc, I only have to do one explicit object creation for each factory:
<cfset FooManager = CreateObject("component", "org.rickosborne.FooManager").init()>
<cfset SockMgr = CreateObject("component", "org.rickosborne.SockManager").init()>
And the rest is just automagic. The PluginManager base class does a <cfdirectory> listing of the \org\rickosborne\foo directory and tries to load each of the plugins.
The main benefit in my case is that if I don't want someone to have a specific plugin, I just don't give it to them. No code changes necessary. Conversely, adding new plugins to a client's site generally requires no code, just dropping a new CFC into a folder.
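A rough sketch of what such a directory-scanning loader could look like. This is a simplification under stated assumptions: the plugin directory and its dot path are passed in as arguments (PluginDirectory and PluginPackage are invented names) rather than derived by reflection, every plugin is assumed to expose an Init() method that accepts its manager, and GetPlugin() is a hypothetical accessor:

```cfml
<!--- File: PluginManager.cfc (hypothetical sketch of a directory-scanning loader) --->
<cfcomponent output="no">

	<cffunction name="Init" access="public" returntype="any" output="no">
		<!--- Assumed arguments; the approach described above derives these itself. --->
		<cfargument name="PluginDirectory" type="string" required="yes"
			hint="Absolute path to the folder that holds the plugin CFCs." />
		<cfargument name="PluginPackage" type="string" required="yes"
			hint="Dot path of that folder, e.g. org.rickosborne.foo." />

		<cfset var qFiles = "" />
		<cfset var strName = "" />

		<!--- Loaded plugin instances, keyed by component name. --->
		<cfset VARIABLES.Plugins = StructNew() />

		<cfdirectory
			action="list"
			directory="#ARGUMENTS.PluginDirectory#"
			filter="*.cfc"
			name="qFiles"
			/>

		<cfloop query="qFiles">
			<cfif qFiles.type EQ "File">
				<!--- Strip the ".cfc" extension to get the component name. --->
				<cfset strName = Left( qFiles.name, Len( qFiles.name ) - 4 ) />
				<cfset VARIABLES.Plugins[ strName ] = CreateObject(
					"component",
					ARGUMENTS.PluginPackage & "." & strName
					).Init( THIS ) />
			</cfif>
		</cfloop>

		<cfreturn THIS />
	</cffunction>

	<!--- Hypothetical accessor for a loaded plugin. --->
	<cffunction name="GetPlugin" access="public" returntype="any" output="no">
		<cfargument name="Name" type="string" required="yes" />
		<cfreturn VARIABLES.Plugins[ ARGUMENTS.Name ] />
	</cffunction>

</cfcomponent>
```

With a loader like this, removing a plugin is just a matter of deleting its file and re-initializing the manager. A more forgiving version might wrap the CreateObject() call in <cftry> so that one broken plugin does not stop the rest from loading.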
That is pretty cool. It goes a little over my head, but I think I understand it. I like the idea though. I am slowly trying to learn more about OOP and clever CFC ideas.
How do you handle the idea of passing in arguments during object instantiation? Is that part of the reflection stuff to figure out what is being asked for?
Very cool though.
Wonderful explanation for a simple question. It is a nice way of grouping all UDFs into one CFC and accessing them all through one reference. I understand almost all of the code you posted, except the part that instantiates components inside the UDFLib CFC. The question I need to ask is:
When creating the object instances as you wrote here:
# // Array contains array manipulation functions.
# THIS.Array = CreateObject( "component", "UDFLib.ArrayLib" ).Init( THIS );
Does UDFLib in "UDFLib.ArrayLib" refer to (map to) a real folder under the CFC mapping folder, or do you use it because you are creating the object from inside the UDFLib CFC?
Sorry about that. It is just a directory structure. I keep all of my CFCs in a folder called "extensions". I don't like mappings and try not to use them. A lot of people will yell at me for that, but mappings, in my experience, only cause hardship.
So, as for directory structure I have:
So when I say that I am creating a component in path "UDFLib.Text", it merely means that it is the Text.cfc component within the directory UDFLib. I don't need any mappings for this since this is being called from a component in the "Extensions" directory. Hence, UDFLib.Text is relative to the calling directory.
Hope that helps.
I got it
Thank you, Ben. I appreciate your work and your explanation.
Any time :)
Ben asked: "How do you handle the idea of passing in arguments during object instantiation? Is that part of the reflection stuff to figure out what is being asked for?"
Generally, I don't have to worry about it. Most of these plugin-type libraries, for me anyway, fall into one of two categories:
1. Filter (transform x into y, such as a crypto hash)
2. Action (take thing x and mail it to me, logging it to file y)
In both cases, I then have 1 of 4 approaches:
1. I don't really need any initialization data, as the very existence of the plugin means it is doing something specifically different than all of the other filters (SHA vs MD5 vs ROT-13, etc)
2. The plugin is a Black Box, and thus persists nothing, thus not really needing any initialization data, and everything must be passed into it for each function (a Crypto manager would give you back a SHA black box, but you would then have to call a SetKey function to do any initialization)
3. If it really needs some kind of initialization data, the plugin can ask its Manager for it when it is init()ed. Thus, if all of the plugins should share one datasource, the Manager would know about it.
4. That last point can be really abstracted to the case where I have a "shim" class that extends Plugin (maybe CryptoPlugin) and then the other classes inherit from it (SHACryptoPlugin). Thus, the shim could be the one that asks the Manager for any sort of initialization data, and we're back to the plugins being very dumb/Black Box and getting their initialization data through inheritance instead of explicitly. (Certainly not my favorite option, but I've been forced to do it at least once in the name of Refactoring Mercilessly.)
I know it's a quirky design decision, but I have been very much into Black Box objects lately. It means a lot more function arguments (you are explicitly passing the datasource/DAO each time), but it has also led to a lot more code reuse for me. It's certainly not for everyone, though. I'm sure there are others that would look at my code and say "wtf? why not just have one object per datasource?", but for the types of applications I'm working on, it works out very well. (Most of my apps have to work against multiple datasources of multiple radically-different DBMS types, sometimes at once.)
I also like it because it leads to very simple-to-read and simple-to-upgrade code, as you're never trying to figure out where-t-f the frickin' datasource is coming from -- it's right there in the argument list. (Or Reactor object vs datasource or whatever.)
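As a small illustration of that Black Box style, here is a sketch of a component that persists nothing and receives its datasource on every call. The component name, method name, table, and columns are all invented for the example:

```cfml
<!--- File: UserGateway.cfc (hypothetical; table and column names are invented) --->
<cfcomponent output="no" hint="Stateless gateway: nothing is stored between calls.">

	<cffunction name="GetUserByID" access="public" returntype="query" output="no">
		<!--- The datasource arrives with every call instead of being set in Init(). --->
		<cfargument name="DataSource" type="string" required="yes" />
		<cfargument name="UserID" type="numeric" required="yes" />

		<cfset var qUser = "" />

		<cfquery name="qUser" datasource="#ARGUMENTS.DataSource#">
			SELECT
				id,
				name
			FROM
				users
			WHERE
				id = <cfqueryparam value="#ARGUMENTS.UserID#" cfsqltype="cf_sql_integer" />
		</cfquery>

		<cfreturn qUser />
	</cffunction>

</cfcomponent>
```

The same instance can then serve several datasources, for example GetUserByID( "siteOneDSN", 42 ) followed by GetUserByID( "siteTwoDSN", 42 ), where both datasource names are placeholders. The cost is the extra argument on every call.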
The application I threw out when I got here was insidious because it looked like the previous developer tried to do OOP, but then would get frustrated 4-layers deep and break the object model horribly by hard-coding a datasource or table name or something. And it was Fusebox. Really nasty Fusebox. It made debugging a living hell. Hence, my retaliation with uber-Black-Boxes.
Awesome explanation. I think I am sort of understanding what you are talking about. I too like the black box idea. I am doing my very best not to break encapsulation rules. That is why I always pass my data source around and in the above example, why I pass in references to the main library to all child libraries so that all calls are made through whatever it was passed and no guess work has to be done.
I am slowly learning more OOP, but it is not an easy journey. I learn very well by doing but I find it hard to find a really good, but small example of great OOP design. But as time goes on, more stuff is starting to make sense.
Okay, I reread that comment and realized it was hopelessly vague. Let me give you a specific example.
The directory structure (vastly simplified) and object model look like this:
. . SiteManager.cfc
. . Site/
. . . site.cfc (generic baseclass/shim/fallback)
. . . rickosborne.cfc (actually extends site.cfc, not plugin!)
. . . rixsoft.cfc
. . . corri.cfc
. . IdentityManager.cfc
. . Identity/
. . . noauth.cfc (generic baseclass/shim/fallback)
. . . session.cfc
. . . client.cfc
. . . cookie.cfc
. . . url.cfc
. . PageManager.cfc
. . Page/
. . . static.cfc
. . . store.cfc
. . . news.cfc
. . . LoginManager.cfc
. . . Login/
. . . . local.cfc (generic baseclass/shim/fallback)
. . . . inames.cfc
. . . . openid.cfc
. . . . sxip.cfc
. . . . google.cfc
. . Response.cfc
The Application.cfc initializes the 3 managers, which automagically initialize all of their plugins. In the onRequestStart, I have the following code:
The CurrentSite() function in the SiteManager just loops through each of its plugins and asks isCurrentSite(), which is actually only ever defined in the site.cfc base class, along with a HostNames array. Each of the derived classes then only adds their host names to their local copy of HostNames during init(), and with no code in the inherited classes, I automagically get the right one anyway. (I'm thinking about converting this to XML, but I'm on the fence about it.) There's absolutely zero duplicated code. The generic Site object is the only one to override this method, and always returns True, but then does not perform any branding later on (or returns an "Unknown Site" message). Sites are view-level objects.
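A minimal sketch of that host name lookup, shown as two files in one listing. The names isCurrentSite() and HostNames come from the description above; the HostName argument, the example host name, and the derived file name are assumptions, and the always-true fallback behavior of the generic site is left out:

```cfml
<!--- File: Site/site.cfc (sketch of the shared base class) --->
<cfcomponent output="no">

	<cffunction name="Init" access="public" returntype="any" output="no">
		<!--- Each instance gets its own list of host names. --->
		<cfset VARIABLES.HostNames = ArrayNew( 1 ) />
		<cfreturn THIS />
	</cffunction>

	<!--- Defined once here; derived sites inherit it unchanged. --->
	<cffunction name="IsCurrentSite" access="public" returntype="boolean" output="no">
		<!--- Assumed argument: the host name of the current request, e.g. CGI.server_name. --->
		<cfargument name="HostName" type="string" required="yes" />

		<cfset var i = 0 />

		<cfloop index="i" from="1" to="#ArrayLen( VARIABLES.HostNames )#">
			<!--- EQ compares strings without regard to case. --->
			<cfif VARIABLES.HostNames[ i ] EQ ARGUMENTS.HostName>
				<cfreturn true />
			</cfif>
		</cfloop>

		<cfreturn false />
	</cffunction>

</cfcomponent>

<!--- File: Site/examplesite.cfc (hypothetical derived site) --->
<cfcomponent extends="site" output="no">

	<cffunction name="Init" access="public" returntype="any" output="no">
		<!--- Let the base class create the array, then add this site's host names. --->
		<cfset super.Init() />
		<cfset ArrayAppend( VARIABLES.HostNames, "www.example.com" ) />
		<cfreturn THIS />
	</cffunction>

</cfcomponent>
```

A derived component shares the VARIABLES scope with the component it extends, which is why the derived Init() can append to the array that the base Init() created.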
The IdentityManager is model-level code that is half an abstraction for Session and Client variables and half implementation of the various login methods. Do they have a Session that is authenticated? Maybe it's in Client variables instead? Okay, try looking for Cookies. What about a URL nonce? (The view-level parts with the actual login forms are actually plugins handled by a LoginManager, which is itself also a Page plugin.)
Again, the same thing happens for the PageManager. The kicker is that CurrentPage isn't actually rendering the page, it's just returning the object that can render the page. Since the Page rendering may be different depending on the Identity and the Site, you'd have a chicken-and-egg problem unless you had all of your objects in place and rendered them all at once. The generic Page object always returns a really simple object that ignores the Identity and Site and just says "No such page". Pages are view-level objects, but can request model-level objects from the Response object (such as a Store or Cart object, which the Response object asks the Application for).
The controller part is of course the Response object, which doesn't do much more than figure out the current context (Where/Site, Who/Identity, What/Page) and get them to talk to each other long enough to render the page.
But, hopefully you can see that having everything as dynamically-loaded plugins is extremely useful. If I want to add a new sitewide branding design, I create a new blank Site CFC, set the few tiny host names and branding specifics, drop it in the folder, then re-init() the SiteManager. Same goes if I want to add a new authentication scheme or page type (Blog, News, Calendar, whatever).
This comes in even more handy for incremental development. I can easily create a Store_v2 object that only answers true to CurrentPage() if it knows it is me, thus I'm the only one who sees it. I can then develop v2 side-by-side with v1 and then remove the v1 code when v2 is ready. No muss, no fuss, no worrying about "oh crap! v2 requires FooBar and I forgot to put it on the production server!".
I think where you are is basically where I aim to get. Methodology or not, I am talking about higher level design. I really like plug-in idea and the ease with which things are swapped out.
I am gonna let all that stew in my head for a while. Thanks for posting such in-depth explanations of your stuff. It's super helpful.
Have I missed something? Wouldn't it be
<cfset qNew = Request.UDFLib.Query.Append( qOne, qTwo ) />
You are absolutely correct. My mistake. Since the UDFLib instance is cached in the REQUEST scope, I would have to use REQUEST in the code. That was a typo.
I was thinking that one of your reasons for not using the APPLICATION scope was not typing out application.UDFLib all the time, then I thought you would have to type out request.UDFLib anyway.
Is the VARIABLES scope not accessible from some of your code? Or are there other reasons why the VARIABLES scope isn't useful?
The reason I don't use the VARIABLES scope to store my page data is that I just never really used it. Back in the day, I never scoped any of my stuff. Now that I am scoping things properly and getting more and more into CFCs, I am not in the habit of using the page VARIABLES scope. Furthermore, I don't want to get confused between the CFC VARIABLES scope and the page VARIABLES scope.
At this point, I like the way REQUEST looks (yeah, I am that shallow). I don't see VARIABLES as adding anything over the REQUEST scope.
Hi Ben -
I've been looking around for CF UDF Library availability/management techniques and found precious little out there.
Simple me. I've been <cfincluding> my UDF libraries and as my code becomes more granular and reused - perhaps even a bit OO - I find that this basic technique is not adequate for a number of reasons. I don't fully understand the CF memory model either.
Your post above is very helpful. As it's two years old, I'm wondering if you have an update of more recent wisdom.
If so, would you share it?
I still use this basic idea. I wish there was a way to make globally accessible UDFs act like built-in functions, but I have not found a way; plus, for name conflict reasons, it's probably not the best goal :) I now have a UDF object (like above) that I just pass around as needed.
I really like this implementation, and have already started using it.
One question: How do you determine which methods go in which Library? For instance, I have a method that parses a date into a specific format needed for certain database queries. Would you put this in "Date"/Date.cfc Library, or a "String"/String.cfc Library. I know I should just do what's easiest for me to understand. I'm just curious about your opinion on this.
Ah, yes - memory usage versus performance.
How often is the function called? How important is its performance? Is it distributed globally in your code, or only in an isolated area, where it could alternatively be placed?
The answers to these questions will inform your decision, but I don't believe there's a way to calculate an answer with any specificity.
As I posted above (quite some time back), the CF memory model is a bit of a mystery to me. FWIW, my function libraries total about 125k on disk and work very well, even on massively shared hosts.
@Don my question doesn't really have anything to do with memory or architecture... i'm more thinking about what functions to put in what libraries....
To answer my own question, I have decided to put methods in libraries based on what the input is. If I'm modifying a structure, that method goes in StructLib.cfc. If I'm parsing a date or formatting a date, that method goes in DateTimeLib.cfc. I think I was confusing myself, because, let's say I have a method that converts one type of object (let's say a "list") into another type of object (let's say an "array"), I was unsure if that was a List method or an Array method. But, now, I've got it down - basing it on the INPUT type.
@Ben, how do you handle UDF Libraries that are specific to an application on your website. For instance, let's say you have a public-facing e-commerce site. You might have a My Account application, and a Checkout application, etc. If you have UDFs specific to your website application, you would likely put them in your CustomLib.cfc. But, that could get big and out of control, so would you create a CheckoutLib.cfc and a MyAccountLib.cfc to store methods specific to those apps?
"...I wish there was a way to make globally accessible UDFs to act like built-in functions, but I have not found a way..." (Oct 6, 2008)
I just wanted to inject a later post of yours into this discussion where you answer your own wish through the use of the URL scope. This works like a charm even though it seems like rather a strange workaround.
Thanks for all of these great tricks and treats!
Don't know why I hadn't searched your site for this before, Ben. This is fantastic, all the best practices I'm already doing plus a few things I didn't think of.
The application source -> request reference is particularly elegant. I like it a lot.
Then I followed the link Roger posted... to something I had even thought about trying before... holy crap... it's like Christmas in here today...
(Though, I wish I would have searched on this 3 years ago... I could have historically made really good use of these!)