Ask Ben: Adding A Query String Pair Value To Existing HTML Using ColdFusion (Flagrant Badassery Version)

Posted April 17, 2007 at 8:02 AM by Ben Nadel

Tags: ColdFusion, Ask Ben

I have a challenge for you. I would like to find every link in a block of html and add a name/value pair to the end of the query string. For example: It might find a link <a href="www.google.com">google</a> I want it to change the link to <a href="www.google.com?SID=123">google</a> We need to keep in mind that some links might already have a '?' and we need to use '&' while others will not have the '?'.

Steve of Flagrant Badassery accepted my challenge to modify the regular expression from Version One to handle all the use cases described (and discovered) in the problem domain. As I expected, he did a fantastic job. After picking apart his regular expression I can understand how it works:

  • (<a\s[^>]*?href\s*=\s*""[^?#""]*)\??(?![^#""]*?\bsource=)

At first, I couldn't understand how it handled Hash signs in the URL as it never seemed to replace them back in. But then, I realized that he handled them by not handling them at all. He quite rightly stopped matching the URL at the point it contained a hash sign. The reason for this is that we simply don't need to know what the anchor value is, so why bother even gathering it (especially since it might not even be present). Quite clever!

He also doesn't match any of the query string that is already in the URL. He simply checks to make sure that the URL value "source" is not already present (via a negative look ahead). He then adds our name-value pair to the beginning of the query string. And, just as he did with the Hash sign above, since we are not altering the existing query string, no need to match it. Again, very clever! I never think to NOT match things :)
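To see those two "don't match it" decisions in isolation, here is a minimal sketch in plain Java (the engine that ColdFusion's `ReplaceAll()` call hands off to). The class name `AddSourceParam`, the method `addSource`, and the example.com URLs are made up for illustration; the pattern is Steve's, written as a Java string literal, so the ColdFusion `""` and `##` escapes become `\"` and `#`.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern itself comes from the post.
final class AddSourceParam {

    // Group 1 stops at the first "?", "#", or closing quote. The optional "?"
    // is consumed but not captured, and the negative look-ahead rejects any
    // href whose query string already contains "source=".
    static final Pattern LINK = Pattern.compile(
        "(?i)(<a\\s[^>]*?href\\s*=\\s*\"[^?#\"]*)\\??(?![^#\"]*?\\bsource=)"
    );

    // Re-emit group 1, then put our pair at the FRONT of the query string.
    // Whatever followed (old query, #anchor) was never matched, so it is
    // left in place untouched.
    static String addSource(String html) {
        return LINK.matcher(html).replaceAll("$1?source=bennadel.com&");
    }

    public static void main(String[] args) {
        String[] samples = {
            "<a href=\"http://example.com/search/\">no query</a>",
            "<a href=\"http://example.com/search/?q=mature#links\">query and anchor</a>",
            "<a href=\"http://example.com?source=bennadel.com\">already tagged</a>",
            "<link href=\"style.css\">"
        };
        for (String sample : samples) {
            System.out.println(addSource(sample));
        }
    }
}
```

The first two samples pick up the new pair, while the already-tagged link and the LINK tag should come back unchanged.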

Here is Steve's bad-ass regular expression solution applied to the example:

  • <html>
  • <head>
  • <title>Alter URL Demo</title>
  • </head>
  • <body>
  •  
  • <p>
  • Hey man, if you are looking for some good images, you should probably try out the search page on <a href="http://www.searchgalleries.com?source=bennadel.com" target="_blank">Search Galleries</a>. It's pretty darn comprehensive and seems to keep track of all the free galleries that you will ever need. If you want to mess with the URL, its easy; just add a "q" query string value to the search url. The general site search URL is <a href="http://www.searchgalleries.com/search/" target="_blank">http://www.searchgalleries.com/search/</a>. So, then, to add a query value to it, such as "mature", you would simply add the query string "q=mature" to the url: <a href="http://www.searchgalleries.com/search/?q=mature#links" target="_blank">http://www.searchgalleries.com/search/?q=mature</a>. You can even search for more than one value at a given time. So, for instance, if you want to search for mature brunette women, you would put go to the URL:
  • <a href="http://www.searchgalleries.com/search/?q=mature+brunette#links" target="_blank">http://www.searchgalleries.com/search/?q=mature+brunette</a>. Notice that "mature" and "brunette" are separated by a "+" sign. This is the URL encoded form of a space.
  • </p>
  •  
  • </body>
  • </html>
  •  
  • <!--- Get the page context. --->
  • <cfset objPageContext = GetPageContext() />
  •  
  • <!--- Get the page buffer. --->
  • <cfset objBuffer = objPageContext.GetOut().GetBuffer() />
  •  
  • <!---
  • Get the content buffer string. This will give us everything
  • that has NOT yet been flushed to the browser. This is just
  • how I am doing it for this demo and is NOT the only way to
  • perform this task. Since this page is small, (and is being
  • tested), we can safely assume that the content has not yet
  • been flushed to the client.
  • --->
  • <cfset strContent = objBuffer.ToString() />
  •  
  • <!---
  • Steve of Flagrant Badassery has taken my challenge to modify
  • the regular expression in order to handle this replace in
  • one swoop rather than using the Java Pattern / Matcher. Here
  • is my attempt to break down his regular expression:
  •  
  • (?i)
  • -- Case insensitive (I added this to use the Java regex
  • -- replace rather than REReplaceNoCase()).
  •  
  • -- First Group:
  • (
  • <a\s[^>]*?
  • -- Only match the Anchor tag followed by a space
  • -- followed by a lazy match of non-">" characters.
  • -- The lazy nature of this regular expression will
  • -- try to match the next token (href) when possible.
  •  
  • href\s*=\s*""[^?##""]*
  • -- The href attribute with possible spaces around
  • -- the equals sign (nice call! I always forget
  • -- that). Then, quotes followed by any characters
  • -- not including ?, #, or ".
  • )
  •  
  • \??
  • -- Matches the literal "?" zero or one times (an
  • -- optional character in the URL).
  •  
  • -- Negative look ahead:
  • (?!
  • [^##""]*?\b
  • -- A lazy match for any character not including
  • -- # and " followed by a word boundary.
  •  
  • source=
  • -- The URL param that we DON'T want to add if it
  • -- already exists (hence the negative look ahead
  • -- that we are currently in).
  • )
  • --->
  • <cfset strContent = strContent.ReplaceAll(
  • "(?i)(<a\s[^>]*?href\s*=\s*""[^?##""]*)\??(?![^##""]*?\bsource=)",
  • "$1?source=bennadel.com&"
  • ) />
  •  
  • <!--- Clear the existing content buffer. --->
  • <cfset objPageContext.GetOut().ClearBuffer() />
  •  
  • <!--- Output the updated HTML. --->
  • <cfset WriteOutput( strContent ) />

Running that indeed gives us the desired output. In the following, you will notice that it leaves a trailing "&" on URLs that had no query string to begin with. This might not be considered the cleanest, but it will in no way cause any harm and I am absolutely content with this solution:

  • <html>
  • <head>
  • <title>Alter URL Demo</title>
  • </head>
  • <body>
  •  
  • <p>
  • Hey man, if you are looking for some good images, you should probably try out the search page on <a href="http://www.searchgalleries.com?source=bennadel.com" target="_blank">Search Galleries</a>. It's pretty darn comprehensive and seems to keep track of all the free galleries that you will ever need. If you want to mess with the URL, its easy; just add a "q" query string value to the search url. The general site search URL is <a href="http://www.searchgalleries.com/search/?source=bennadel.com&" target="_blank">http://www.searchgalleries.com/search/</a>. So, then, to add a query value to it, such as "mature", you would simply add the query string "q=mature" to the url: <a href="http://www.searchgalleries.com/search/?source=bennadel.com&q=mature#links" target="_blank">http://www.searchgalleries.com/search/?q=mature</a>. You can even search for more than one value at a given time. So, for instance, if you want to search for mature brunette women, you would put go to the URL:
  • <a href="http://www.searchgalleries.com/search/?source=bennadel.com&q=mature+brunette#links" target="_blank">http://www.searchgalleries.com/search/?q=mature+brunette</a>. Notice that "mature" and "brunette" are separated by a "+" sign. This is the URL encoded form of a space.
  • </p>
  •  
  • </body>
  • </html>
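If the dangling "&" does bother you, one possible variation (not part of Steve's solution as posted) is to capture the optional "?" in its own group and only append the "&" when a query string was actually there. This is the same idea Steve uses in his JavaScript comment below. The sketch assumes Java 9 or later for `Matcher.replaceAll(Function)`; the class name `AddSourceClean`, the method `addSource`, and the example.com URLs are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variation: same pattern, but the optional "?" is now group 2.
final class AddSourceClean {

    static final Pattern LINK = Pattern.compile(
        "(?i)(<a\\s[^>]*?href\\s*=\\s*\"[^?#\"]*)(\\??)(?![^#\"]*?\\bsource=)"
    );

    static String addSource(String html) {
        // Matcher.replaceAll(Function) requires Java 9+ (an assumption here).
        return LINK.matcher(html).replaceAll(match -> {
            // Only add the "&" separator if the URL already had a "?".
            String separator = match.group(2).isEmpty() ? "" : "&";
            // The function's result is still treated as a replacement string,
            // so quote it in case the URL contains "$" or "\".
            return Matcher.quoteReplacement(
                match.group(1) + "?source=bennadel.com" + separator
            );
        });
    }

    public static void main(String[] args) {
        System.out.println(addSource("<a href=\"http://example.com/search/\">no query</a>"));
        System.out.println(addSource("<a href=\"http://example.com/search/?q=mature#links\">query</a>"));
    }
}
```

The trade-off is that the replacement is no longer a single string, which is part of what makes the one-line `ReplaceAll()` version so appealing.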

Nicely done. Also, as one final note, his solution is WAY smaller than mine and will not match the LINK tag (which mine shamefully would). This just goes to demonstrate how amazingly powerful regular expressions are once you fully understand how they work and can see where they can be applied. Looking at the regular expression above, I see that where I really fell short was not in understanding the regular expression - I see how it works. Where I fell short was that I simply didn't see how simple it could be if I didn't bother to match the extraneous parts of the URL. I hope that that sort of skill just comes with time and experience.




Reader Comments

Apr 17, 2007 at 10:14 AM // reply »
172 Comments

Regex skillz0rz is one of the lesser-known side-effects of following the great hawk (along with acute eyesight and fire breath). Unfortunately, none of these work very well with the ladies.


Apr 17, 2007 at 10:41 AM // reply »
11,238 Comments

Ha ha ha ha :)


Apr 17, 2007 at 1:18 PM // reply »
3 Comments

I feel this is highly appropriate:

http://xkcd.com/c208.html

:)


Apr 17, 2007 at 1:25 PM // reply »
11,238 Comments

Ha ha, a classic. Steve can definitely save the day.


Apr 17, 2007 at 3:31 PM // reply »
43 Comments

Quick. Somebody send Steve the t-shirt: http://xkcd.com/store/


Apr 17, 2007 at 3:35 PM // reply »
11,238 Comments

Awesome. I didn't know that site had a store. Steve, email me your address, you're getting a t-Shirt ;)


Apr 17, 2007 at 3:37 PM // reply »
43 Comments

Nah. Steve, just post your address here. You'll get lots of cool stuff. Really... :-)


Apr 17, 2007 at 5:49 PM // reply »
46 Comments

Couldn't this also be done client-side? jquery-yo?


Apr 17, 2007 at 8:30 PM // reply »
172 Comments

@Jim Curran:

Heheh! That strip is a classic. :)

@Ben Nadel:

Thanks dude, but don't worry about getting me any free shiznit. (I just might have to buy that No Velociraptors shirt for myself though.)

@Rob Wilkerson:

123 Sesame Street NW


Apr 17, 2007 at 8:41 PM // reply »
172 Comments

One more case where it would make sense to avoid modifying anything is URLs which simply point to a page anchor (e.g., href="#top"). That's easy to do by adding "(?!#)" immediately after the opening quote character for the href attribute in the regex. So, with all the ColdFusion-style escapings, etc., the search pattern would become "(?i)(< a\s[^>]*?href\s*=\s*""(?!##)[^?##""]*)\??(?![^##""]*?\bsource=)" (remove the space between "<" and "a", which was added to avoid anti-spam measures).


Apr 17, 2007 at 8:53 PM // reply »
172 Comments

@Glen Lipka:

It wouldn't make a lot of sense to do it the same way client-side, unless you were updating hrefs within a block of source code that you subsequently insert into the page using document.write or innerHTML. To pull this off on the client side, I'd imagine you'd do something like the following (which doesn't use any particular JavaScript library):

-----------------------------
(function(){
var source = encodeURIComponent("bennadel.com");
var links = document.getElementsByTagName("a");
for (var i = 0; i < links.length; i++) {
<em style="color:green">// URLs which simply point to an anchor within the page shouldn't be modified.
// However, element.href returns an absolute URL regardless of the actual source code,
// so we check if the href contains "#", and that the href and the current page are
// the same after removing any anchors from each. If both conditions are true, the
// browser won't request anything from the server when following the link, so we don't
// want to mess with that by adding a new query key.
if (!(
links[i].href.indexOf("#") > -1 &&
links[i].href.replace(/#.*/, "").toLowerCase() == location.href.replace(/#.*/, "").toLowerCase()
)) {
links[i].href = links[i].href.replace(/^([^?#]*)(\??)(?![^#]*?\bsource=)/, function($0, $1, $2) {
<em style="color:green">// Only include "&" at the end of the replacement string if the URL contained a query
return $1 + "?source=" + source + ($2 == "" ? "" : "&");
});
}
}
})();
-----------------------------

That does not modify any URLs which point to anchors on the current page, but it handles it differently than the ColdFusion/regex-only version because in JavaScript anchorElement.href always returns an absolute URL. Additionally, the above code avoids adding any unnecessary ampersands within URL queries by using a function to determine the replacement string.

Feel free to tighten that up using jQuery, although I'm not sure how useful this code really is.


Apr 17, 2007 at 8:59 PM // reply »
172 Comments

Crap, the code didn't come out very well. Note that in addition to the indentation problem, the two instances of "<em style="color:green">" were not meant to show (their closing tags were stripped out, however).

