CFParam And Regex-Pattern Is Quite Awesome
I have been using ColdFusion's CFParam tag for a long time and have been really happy with it. Normally, however, I just use it to parameterize the existence and data type of string and numeric form values. In my Exercise List project, however, one of the form values that I was expecting was a comma delimited list of numeric IDs. This cannot be a numeric value because it contains non-numeric characters (the comma). It can be a string value, but this certainly doesn't help me to ensure that the data type is correct; meaning, making sure that it is a string does not mean that it will be a list of numeric IDs.
Enter the CFParam tag type of Regex and its partner in crime, the Pattern. Using a regular expression, we can not only make sure that the list of IDs was submitted with the form, we can also make sure that it is truly a comma delimited list of numeric values. Take a look at this CFParam tag:
<cfparam name="FORM.lst_item_id" type="regex" pattern="[\d,]*" default="" />
Here, we are saying that the form value, lst_item_id, must conform to the pattern "[\d,]*", or if it doesn't exist, that it is defaulted to the empty string. For those of you not versed in regular expressions, this pattern means "zero or more characters from the set composed of the comma and all numeric digits."
This is the first time that I have ever used the regex / pattern CFParam type and I have to say that I am really excited about it. I love regular expressions and I am always happy to see that they can be put to a good use. But more importantly, I am finally utilizing the ColdFusion CFParam tag as it was intended to be used; if you look at the ColdFusion documentation, you will see that CFParam has three responsibilities:
- Tests for the existence of a parameter (that is, a variable).
- Validates its data.
- Provides a default value to the variable if one is provided.
Up until now, when it came to a list of IDs, I was using CFParam, but only for responsibilities #1 and #3 - existence and default value. I was guilty of not using the CFParam tag properly. I was guilty of not making its power work for me. But no longer - I see the light, baby, and I see it shining brightly.
Here is what I am thinking: If you have a list of CFParam tags at the top of your FORM processing action, then by the time those CFParam tags have finished executing, all of your FORM data should be of the proper data type; no more TYPE validation should need to be done, only value validation.
At first, I thought that there might be this exception: a text input field that requires a numeric value. Yes, you could perform all of the type validation via CFParam, and yes you could make sure that only a numeric value is available after the CFParam tags, but really, this is not a valid use case. A text input should NOT and cannot be paramed as a numeric data type as a text input box is not for numeric inputs - it's for text input. It's just by our own business logic that we then say that that text input must be numeric. This is a value validation, not a type validation.
Enforcing that a select box has a numeric value or that a checkbox has a numeric value is a valid data type validation as these interfaces are not supposed to have user-value manipulation outside of selection and checking. Text inputs, on the other hand, can only be paramed as text because it is an interface that does not force a user to enter only a certain type of data.
Very cool! I'm going to have to dust off my RegEx in 10 Minutes book!
Luckily, I don't think you will have too many different types of expressions to use. When I think about form data type validation, here are the different data types I can think of:
* Text input (text, textarea, file)
* Numeric input (select, checkbox, radiobox)
* Text list
* Numeric list
Text inputs can be CFParamed with [type = string]. Numeric input can be CFParamed with [type = numeric]. Text lists can pretty much also be CFParamed with [type = string]. The numeric list, the one I am talking about in this post can use the regex / pattern CFParam.
There's probably others, but that's all I can think of at this moment. Shouldn't be too taxing for the regex part of your brain :)
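To make those four cases a bit more concrete, here is a rough sketch of what the CFParam tags might look like at the top of a form action page. Only lst_item_id comes from the post itself; the other field names (title, category_id, lst_tag) and the default values are made up purely for illustration.

```cfm
<!--- Text input (text, textarea): any string passes the TYPE check. --->
<cfparam name="FORM.title" type="string" default="" />

<!--- Numeric input (select, checkbox, radio): hypothetical field name. --->
<cfparam name="FORM.category_id" type="numeric" default="0" />

<!--- Text list: still just a string as far as the type is concerned. --->
<cfparam name="FORM.lst_tag" type="string" default="" />

<!--- Numeric list: the regex / pattern CFParam from the post. --->
<cfparam name="FORM.lst_item_id" type="regex" pattern="[\d,]*" default="" />
```

Anything beyond this (a required title, a category ID that actually exists) would be value validation and would happen after these tags.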
What does cfparam do if the value is defined but doesn't match the regex? Throw an error, or use the default value?
It throws an error. Therefore, you should be wrapping them in some sort of CFTry / CFCatch tag group. Sometimes, I do it around all of them, sometimes I break out special CFParam tags into their own group. This is a bit of a pain, which is why I suggested that Adobe come up with a CFParam tag that has a "catch" attribute:
Interesting. I like the idea conceptually, but as mentioned by a few commenters on your previous post, the idea of throwing an exception for a validation failure seems a little extreme (even if the exception is caught in the same tag).
Realize this, though - the data TYPE validation should never throw an exception unless someone has messed with the form submission process. This is an exception-worthy case; the data type that we expected was not the data type that was submitted with the FORM.
Part of this confusion (and I am guilty of this as well) is not knowing the difference between data TYPE validation and data VALUE validation. Data types are not up for discussion - you set them and the FORM submits them (if no one has messed with the data). It is only in data VALUE validation that one can say, Ok, this "text" input is really supposed to be a "numeric" value. That has nothing to do with data types, only with data values. A subtle but hugely important differentiation.
Can you think of a time when your data types should not line up with what you expect them to be? The only thing I can think of is user-tampering; and, if that is the case, who cares if Exception handling is an expensive process? If a user is going to hack our forms, why should we even care if they are having the quickest user experience possible?
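For what it's worth, a minimal sketch of catching a failed TYPE validation might look something like the following. The flag variable (blnTypesAreValid), the category_id field, and the response to a bad submission are assumptions for illustration only; how you actually respond to a tampered form is up to you.

```cfm
<!--- Assume the types are fine until a CFParam tag says otherwise. --->
<cfset blnTypesAreValid = true />

<cftry>
    <!--- Hypothetical numeric field, plus the list of IDs from the post. --->
    <cfparam name="FORM.category_id" type="numeric" default="0" />
    <cfparam name="FORM.lst_item_id" type="regex" pattern="[\d,]*" default="" />

    <cfcatch type="any">
        <!--- A CFParam tag rejected the submitted data type. --->
        <cfset blnTypesAreValid = false />
    </cfcatch>
</cftry>

<cfif NOT blnTypesAreValid>
    <!--- Most likely a tampered form; stop processing the request. --->
    <cfoutput>The form submission was not valid.</cfoutput>
    <cfabort />
</cfif>
```

Since the exception should only fire when the submitted types are wrong, the cost of the CFTry / CFCatch is paid only by requests that were already broken.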
Okay, now I'm confused. I'm talking about the case when the form input doesn't match the regex. According to your previous reply, that throws an exception, which has to be caught. My comment is that I'd prefer to catch that failed validation without throwing an exception.
I agree that throwing an exception when we know the form has been tampered with is reasonable. But I can think of lots of cases where input might not match the correct regex, but the form hasn't been tampered with.
Be careful; you might be talking about validating the "value" of a "text box" input. If that is the case, what are you really doing there? Are you validating the "type" of data (string)? or are you validating the "value" of the data, which is not quite the same thing.
If you are talking about doing this for text inputs, I would say do not use regex / pattern. The user has far too much opportunity to mess it up.
As far as exception catching, this is for the TYPE validation, not the value validation.
I see what you're saying now. In fact I think we're in agreement about this.
Sweeeet :) I have only recently come to this mental separation between type and value validation, so it might not be spot on, but something about it just makes me feel warm and fuzzy inside.
Another thing along these lines that I have been thinking about and doing lately is:
If a url param is malformed, then logically that url is wrong, or if there is no record in the db for that id, the page does not exist...
<cfif NOT query.recordCount>
my custom 404.cfm page sets the http response code to 404 and provides a friendly interface. During regular use of the application, all links should go to pages that exist...
Just another step in the validating and data checking processes...
BTW, I always like your blog posts. good work
Along the same lines, I usually do something like this:
<cfif NOT qData.RecordCount>
...<cflocation url="#CGI.script_name#" addtoken="false" />
</cfif>
Basically the same as what you are doing, except you probably have better 404 handling than I do.
And thanks for the kind words :)