XML Config Files And The Heuristic Exception
Posted November 30, 2006 at 8:02 AM by Ben Nadel
There always seems to be a conversation going on somewhere about the use of XML config files versus programmatic config files in ColdFusion. I personally believe that programmatic is the way to go, but that is not the point of this post. I wanted to talk about heuristics for a moment. A heuristic (in a general sense) is a rule that is used to increase the probability of solving a problem. In ColdFusion, there have been many heuristics (perhaps thought of as best practices) passed around. A few that come to mind are:
- If you have a file that has thousands of lines of code, it can probably be broken up into smaller files that are more cohesive.
- If you have a ColdFusion component that has a ton of methods, it's probably a "God" Component and should be broken up into smaller, more cohesive components.
These make a lot of sense to me. Maybe I am crazy, maybe I am not.
When someone argues for a programmatic configuration file, the counterargument is often that it might work for small applications, but what happens when you have thousands of objects and a config file that is thousands of lines long? This scenario is apparently much better suited to an extremely large XML config file than to an extremely large programmatic config file.
Instead of arguing for one or the other, I would like to argue that NEITHER of those is a good approach. Heuristically speaking, large files are not as maintainable, whether they are XML or programmatic. There seems to be this underlying sense that frameworks should be built around and suited to enormous XML configuration files. How is this any different than building a framework to be suited to one ginormous ColdFusion controller.cfc? It's not really; abstractly, they are all just system files that cover one aspect of the application's functionality.
There is a better way. I have not fleshed it out yet, but it involves much smaller configuration files that should work equally well as XML or programmatic. It will be more organic, more natural feeling... well, at least hopefully. I will try to get what is in my head down on paper over the next few months. It may all end in disaster, or it may end with the creation of a sleek little framework. Who knows :)
I'm by no means an expert at CF, so I have no idea if it's "the fastest" way to generate pages, but the ListGetAt() function makes for very compact configuration files.
A quick example: querying a database and returning the records in a four-column grid layout.
Looping through the list determines that two of the four columns are visible, and that one of those columns has search enabled.
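To make the idea concrete, here is a minimal sketch, written in Python rather than ColdFusion. The column names, the meaning of the four positions (name, label, visible, searchable), and the `list_get_at` helper are all hypothetical stand-ins, not the definitions from the actual application. The helper only imitates the 1-based indexing of `ListGetAt()`.

```python
# Hypothetical column definitions. Each string is a comma-delimited list
# whose positions mean: name, label, visible, searchable.
COLUMN_DEFS = [
    "id,ID,false,false",
    "name,Name,true,true",
    "email,Email,true,false",
    "created,Created,false,false",
]


def list_get_at(value, position, delimiter=","):
    """Return the item at a 1-based position, in the spirit of ListGetAt()."""
    items = value.split(delimiter)
    if position < 1 or position > len(items):
        raise IndexError(f"position {position} is out of range for {value!r}")
    return items[position - 1]


def parse_column(definition):
    """Turn one positional definition into a dictionary of named settings."""
    return {
        "name": list_get_at(definition, 1),
        "label": list_get_at(definition, 2),
        "visible": list_get_at(definition, 3) == "true",
        "searchable": list_get_at(definition, 4) == "true",
    }


# Loop through the definitions and work out what the grid should show.
columns = [parse_column(definition) for definition in COLUMN_DEFS]
visible = [c["name"] for c in columns if c["visible"]]
searchable = [c["name"] for c in columns if c["visible"] and c["searchable"]]
```

Each definition stays on one short line, which is where the compactness comes from. The cost is that the meaning of each position lives only in the code that reads it.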
I am also no expert. My argument goes off of a gut feeling (which may, in the long run, be completely wrong). I am not sure I quite understand how ListGetAt() ties in exactly to a config file, but I assume you mean that you would have LOTS of those definitions in a config file?
Yes, there are lots, and there is quite a bit of code in a custom tag to generate pages from all those definitions. I posted only three lines to convey the idea and keep it readable in a blog.
There are two major problems with using ListGetAt() so far, but I think I have a solution.
First problem: some of these values are strings of text and some are booleans. It's supposed to be that way, but the definitions are not exactly easy to scan and modify, because the values don't line up perfectly. I find myself having to count 1, 2, 3... a lot.
Second problem: as the application becomes more complete, naturally some functions depend on global properties. As in my first posting, if Search is false globally, turning on searching for column #2 makes no difference. This isn't a huge problem, but I've goofed a couple of times, thinking something was broken when I had simply turned it off.
Which brings me to the solution. While it's possible to edit these configuration files by hand, more than four columns is distracting to the eye. Eventually you need some kind of interface for it, because what you really want as an end user is to click a button to generate the file of properties.
The stage I'm at now: I've successfully gotten configuration files to generate a grid based on output from a query, but an interface is needed to configure the properties. I have an HTML prototype I could upload if you're interested in seeing what I've got. I don't have an online CF server, so nothing saves to a database, but you can click around and see how it works.
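The second problem can be sketched in a few lines. This is a hypothetical illustration in Python, not the custom tag itself; the setting names and both helper functions are assumptions made up for the example. The idea is that a column-level switch only takes effect when the global switch is also on, and that a small check can report the switches that are being silently ignored.

```python
# Hypothetical settings: one global switch plus per-column switches.
GLOBAL_SETTINGS = {"search": False}
COLUMN_SETTINGS = {
    "name": {"search": True},
    "email": {"search": False},
}


def effective_search(column):
    """A column is searchable only if the global and column switches are both on."""
    return GLOBAL_SETTINGS["search"] and COLUMN_SETTINGS[column]["search"]


def overridden_settings():
    """List columns whose search switch is on but has no effect globally."""
    if GLOBAL_SETTINGS["search"]:
        return []
    return [name for name, cfg in COLUMN_SETTINGS.items() if cfg["search"]]
```

With the values above, `effective_search("name")` is false even though the column switch is on, and `overridden_settings()` names that column, which is the kind of hint that would have saved the "is it broken or just turned off?" moment.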
For me the discussion between XML and programmatic configuration comes down not to syntax but to the type of configuration that you are doing.
If you are passing in logic, then you are doing programmatic configuration (worded another way, if your configuration has conditionals, then it is essentially programmatic).
If you are passing in data, then you don't need programmatic configuration (data, that is, without conditionals).
If you are passing in data, then the requirements of your data will help determine your format. If you can use a database, that is usually the best format (except, of course, for information about the database).
If your data is very simple (no complex data or data sets), then an ini file will usually work.
If your data has complex relationships (data that contains other data or contains multiples of some kind), then XML (while not perfect) is a nice format that is very universal.
When dealing with lists, for example, where the value of the list at a given position matters, often XML is a better format. It is more verbose, but that adds clarity.
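As a rough illustration of that trade-off, here is the same made-up column definition written both ways and parsed in Python. The field names and XML element names are assumptions chosen for the example, not part of any real framework.

```python
import xml.etree.ElementTree as ET

# Positional form: compact, but the reader has to know what each slot means.
POSITIONAL = "name,Name,true,true"

# Named form: more verbose, but every value says what it is.
NAMED_XML = """
<column name="name">
    <label>Name</label>
    <visible>true</visible>
    <searchable>true</searchable>
</column>
"""


def from_positional(definition):
    """Parse the positional form; the order of the slots is implicit."""
    name, label, visible, searchable = definition.split(",")
    return {
        "name": name,
        "label": label,
        "visible": visible == "true",
        "searchable": searchable == "true",
    }


def from_xml(text):
    """Parse the XML form; each value is looked up by its element name."""
    node = ET.fromstring(text.strip())
    return {
        "name": node.get("name"),
        "label": node.findtext("label"),
        "visible": node.findtext("visible") == "true",
        "searchable": node.findtext("searchable") == "true",
    }
```

Both functions return the same dictionary. The difference is that swapping two values in the positional string silently changes the meaning, while the XML form can be reordered freely.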
You are also introducing another dimension which I have not tackled, but that I think could be very valuable if done right. That is whether your configuration data should be centralized or distributed.
As you have referenced, centralized data is handy but can grow too large.
If you can figure out a good mechanism for distributing configuration data, I would certainly be interested.
I use an XML file to define the components in use for a site. This works very well, but doesn't really allow me to have a component easily register itself in the configuration (I like my components to self-install as much as possible).
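One possible shape for self-registration, sketched in Python under heavy assumptions: the `REGISTRY` dictionary, the `register` function, and the two example components are all hypothetical, and in a real application each install function would live in its own component's file rather than side by side.

```python
# A shared registry that components add themselves to, standing in for
# the central XML file. Everything here is hypothetical.
REGISTRY = {}


def register(name, **settings):
    """Record a component's configuration; refuse to overwrite an existing entry."""
    if name in REGISTRY:
        raise ValueError(f"component {name!r} is already registered")
    REGISTRY[name] = dict(settings)


def install_gallery():
    """The gallery component describes its own configuration."""
    register("gallery", table="gallery_items", page_size=20)


def install_news():
    """The news component describes its own configuration."""
    register("news", table="news_items", page_size=10)


# The application only needs to know which installers to run,
# not what each component's settings are.
for install in (install_gallery, install_news):
    install()
```

The central piece shrinks to a list of components, while the details stay next to the code they configure. Whether that scales better than one XML file is exactly the open question.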
I don't have anything that I am totally satisfied with yet, but to me, distributed configuration just feels more natural and waaay more cohesive. But, it might be a pipe dream.