Creating Tiny URLs Using ColdFusion

Posted March 25, 2009 at 9:44 AM by Ben Nadel

Tags: ColdFusion

After building my custom ColdFusion radix conversion methods yesterday, I decided to give creating tiny URLs with ColdFusion a shot. Just as with most of the tiny URL services you see on the internet, the demonstration below either automatically generates a tiny URL for you, or it lets you specify a custom short URL (which it will use if the suggested URL has not already been taken). Then, of course, clicking on the short URL tracks click counts and forwards the user to the target URL (forwarding is not done in my demonstration).

 
 
 
 
 
 
 
 
 
 

As you will see in the code below, the short URL that gets stored in the database is actually a primary key on the data table. This is a very important point because it means that the database will throw a primary key violation if we try to insert a duplicate short URL. Because of this, we are able to offload some of the heavy lifting to the database itself.

To understand the beauty of this, let's think about how we might ordinarily handle this kind of a situation. Traditionally, when worrying about inserting unique values into a datatable, we might do the following:

  1. Query database for value.
  2. Check to see if query returned a value.
  3. If no record returned, insert new value.

Of course, in this model, we run a large risk of having a parallel thread insert our target value just after we tried to read it; then, we go to write something that was just written (creating dirty data). To get around this, however, we might serialize the database read/writes:

  1. Lock datatable for serialized access.
  2. Check to see if query returned a value.
  3. If no record returned, insert new value.
  4. Release lock on datatable.

This works, but now we have ColdFusion handling some threading that one might argue is really best left to the database (as the database is more highly optimized to handle conflicts). Also, this assumes that every write to the database will cause a conflict, which it will not. In cases where no conflict will exist, we are single threading for no reason. And, I believe it also locks any unrelated reads to the same datatable (but I am not well educated at this level). I am sure there are transaction settings that you can use to tweak this on the ColdFusion side, but ultimately, I think by leveraging the database's own conflict resolution, we can speed up the process for every process involved.

As such, you will see that the following code actually leverages and embraces primary key errors thrown by the database. And, in doing so, you will see that in a best case scenario, only one database call will need to be made to execute the given request.

And so, without further delay, my Tiny URL ColdFusion demo:

  • <!--- Include utility functions. --->
  • <cfinclude template="_functions.cfm" />
  •  
  •  
  • <!--- Param form variables. --->
  • <cfparam name="FORM.url" type="string" default="" />
  • <cfparam name="FORM.short_url" type="string" default="" />
  • <cfparam name="FORM.submitted" type="boolean" default="false" />
  •  
  • <!--- Create an array of errors. --->
  • <cfset arrErrors = [] />
  •  
  •  
  • <!---
  • Create a variable to hold the generated url. This will
  • only be populated if the form gets processed.
  • --->
  • <cfset strShortURL = "" />
  •  
  •  
  • <!---
  • Get our collection of available characters that can be used
  • when creating character-sensitive tiny URLs.
  • --->
  • <cfset arrCharacters = ListToArray(
  • "a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 1 2 3 4 5 6 7 8 9",
  • " "
  • ) />
  •  
  •  
  • <!--- Check to see if the form was submitted. --->
  • <cfif FORM.submitted>
  •  
  • <!--- Validate our form data. --->
  • <cfif NOT Len( FORM.url )>
  •  
  • <cfset ArrayAppend(
  • arrErrors,
  • "Please enter your target URL to shorten."
  • ) />
  •  
  • </cfif>
  •  
  • <!---
  • Check to see if our suggested short url has any
  • characters that we don't want to handle.
  • --->
  • <cfif REFind( "[^\w\-_]", FORM.short_url )>
  •  
  • <cfset ArrayAppend(
  • arrErrors,
  • "Your tiny url can only contain letters, numbers, dashes, and underscores."
  • ) />
  •  
  • </cfif>
  •  
  •  
  • <!---
  • Check to see if we have any errors so far. If not,
  • then we have enough data to process.
  • --->
  • <cfif NOT ArrayLen( arrErrors )>
  •  
  • <!---
  • Now that we are going to create the new URL, we
  • need to see if they suggested a tiny URL. If so,
  • we are going to TRY to use that.
  • --->
  • <cfif Len( FORM.short_url )>
  •  
  • <!---
  • Try to insert the short url. If the short URL is
  • already taken, then the database will throw an
  • exception (its a primary key).
  • --->
  • <cftry>
  •  
  • <!--- Insert the new short url. --->
  • <cfquery name="qInsert" datasource="ben">
  • INSERT INTO short_url
  • (
  • is_custom,
  • short_url,
  • numeric_value,
  • target_url,
  • click_count,
  • date_last_clicked,
  • date_created
  • ) VALUES (
  • <cfqueryparam value="1" cfsqltype="cf_sql_tinyint" />, <!--- is_custom. --->
  • <cfqueryparam value="#FORM.short_url#" cfsqltype="cf_sql_varchar" />, <!--- short_url. --->
  • <cfqueryparam value="0" cfsqltype="cf_sql_integer" />, <!--- numeric_value. --->
  • <cfqueryparam value="#FORM.url#" cfsqltype="cf_sql_varchar" />, <!--- target_url. --->
  • <cfqueryparam value="0" cfsqltype="cf_sql_integer" />, <!--- click_count. --->
  • <cfqueryparam value="" cfsqltype="cf_sql_timestamp" null="true" />, <!--- date_last_clicked. --->
  • <cfqueryparam value="#Now()#" cfsqltype="cf_sql_timestamp" /> <!--- date_created. --->
  • );
  • </cfquery>
  •  
  •  
  • <!---
  • If we have gotten this far, then we have
  • successfully inserted the new short URL.
  • Store the code that was used.
  • --->
  • <cfset strShortURL = FORM.short_url />
  •  
  •  
  • <!--- Catch primary key exception. --->
  • <cfcatch>
  •  
  • <cfset ArrayAppend(
  • arrErrors,
  • "The suggested tiny url label is already in use."
  • ) />
  •  
  • </cfcatch>
  • </cftry>
  •  
  • <cfelse>
  •  
  • <!---
  • Now short URL was suggested, so, we need to
  • generate one ourselves. To do so, we are going to
  • grab the largest non-custom value and then keep
  • incrementing it until we insert one successfully.
  • --->
  •  
  • <!--- Grab the largest custom value. --->
  • <cfquery name="qMaxCustom" datasource="ben">
  • SELECT
  • MAX( numeric_value ) AS numeric_value
  • FROM
  • short_url
  • </cfquery>
  •  
  • <!---
  • Get our base numeric value by adding one to the
  • max numberic value.
  • --->
  • <cfset intNextValue = (Val( qMaxCustom.numeric_value ) + 1) />
  •  
  •  
  • <!---
  • Now that we have this value, we are going to
  • create a short URL based on our custom-radix
  • encoding of this unique value. Because someone
  • else might be submitting this form at the same
  • time, we are going to keep looping until the
  • database stops throwing primary key errors (on
  • the short_url).
  • --->
  • <cfloop condition="true">
  •  
  • <!---
  • Get the short-url encoding of our next
  • numeric value.
  • --->
  • <cfset strShortURL = FormatBaseNData(
  • intNextValue,
  • arrCharacters
  • ) />
  •  
  • <!---
  • Try to insert the new value. If this short URL
  • is already taken, database will throw an error
  • (primary key violation).
  • --->
  • <cftry>
  •  
  • <!--- Insert the new short url. --->
  • <cfquery name="qInsert" datasource="ben">
  • INSERT INTO short_url
  • (
  • is_custom,
  • short_url,
  • numeric_value,
  • target_url,
  • click_count,
  • date_last_clicked,
  • date_created
  • ) VALUES (
  • <cfqueryparam value="0" cfsqltype="cf_sql_tinyint" />, <!--- is_custom. --->
  • <cfqueryparam value="#strShortURL#" cfsqltype="cf_sql_varchar" />, <!--- short_url. --->
  • <cfqueryparam value="#intNextValue#" cfsqltype="cf_sql_integer" />, <!--- numeric_value. --->
  • <cfqueryparam value="#FORM.url#" cfsqltype="cf_sql_varchar" />, <!--- target_url. --->
  • <cfqueryparam value="0" cfsqltype="cf_sql_integer" />, <!--- click_count. --->
  • <cfqueryparam value="" cfsqltype="cf_sql_timestamp" null="true" />, <!--- date_last_clicked. --->
  • <cfqueryparam value="#Now()#" cfsqltype="cf_sql_timestamp" /> <!--- date_created. --->
  • );
  • </cfquery>
  •  
  •  
  • <!---
  • If we have gotten this far, then we have
  • successfully inserted the new short URL.
  • Break out of loop and proceed with the
  • selected short url.
  • --->
  • <cfbreak />
  •  
  •  
  • <!--- Catch primary key exception. --->
  • <cfcatch>
  •  
  • <!---
  • Short url was already taken. Increment
  • the next value index and try again.
  • --->
  • <cfset intNextValue++ />
  •  
  • </cfcatch>
  • </cftry>
  •  
  • </cfloop>
  •  
  • </cfif>
  •  
  • </cfif>
  •  
  • </cfif>
  •  
  •  
  • <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  • <html>
  • <head>
  • <title>ColdFusion Tiny URL Demo</title>
  • </head>
  • <body>
  •  
  • <h1>
  • ColdFusion Tiny URL Demo
  • </h1>
  •  
  • <cfoutput>
  •  
  • <!--- Check to see if a short URL was generated. --->
  • <cfif Len( strShortURL )>
  •  
  • <p>
  • <strong>Generated URL:</strong>
  • <a
  • href="./go/?#strShortURL#"
  • target="shorturl"
  • >./go/?#strShortURL#</a>
  • </p>
  •  
  • </cfif>
  •  
  •  
  • <!--- Check to see if we have any errors to display. --->
  • <cfif ArrayLen( arrErrors )>
  •  
  • <p>
  • <strong>Please review the following:</strong>
  • </p>
  •  
  • <ul>
  • <cfloop
  • index="strError"
  • array="#arrErrors#">
  •  
  • <li>
  • #strError#
  • </li>
  •  
  • </cfloop>
  • </ul>
  •  
  • </cfif>
  •  
  •  
  • <form action="#CGI.script_name#" method="post">
  •  
  • <!--- Submission flag. --->
  • <input type="hidden" name="submitted" value="true" />
  •  
  • <p>
  • <label>
  • Target URL:<br />
  •  
  • <input
  • type="text"
  • name="url"
  • value="#FORM.url#"
  • size="60"
  • maxlength="500"
  • />
  • </label>
  • </p>
  •  
  • <p>
  • <label>
  • Tiny URL <em>(optional)</em>:<br />
  •  
  • <input
  • type="text"
  • name="short_url"
  • value="#FORM.short_url#"
  • size="30"
  • maxlength="30"
  • />
  • <br />
  • </label>
  • </p>
  •  
  • <p>
  • <input type="submit" value="Get Short URL" />
  • </p>
  •  
  • </form>
  •  
  • </cfoutput>
  •  
  • </body>
  • </html>

At the top of the code, you will see that I am including a _functions.cfm template. This is where I am including the FormatBaseNData() ColdFusion UDF that I wrote yesterday. Since I am only doing a proof of concept, my short URL is being tested as a query string rather than a 404 / URL rewrite handler. Here is the local page that handles the short URL if you click on the generated link (no error checking has been added to this portion of the demo):

  • <!--- Query for the tiny url. --->
  • <cfquery name="qShortURL" datasource="ben">
  • <!--- Increment the click count. --->
  • UPDATE
  • short_url
  • SET
  • click_count = (click_count + 1),
  • date_last_clicked = <cfqueryparam value="#Now()#" cfsqltype="cf_sql_timestamp" />
  • WHERE
  • short_url = <cfqueryparam value="#CGI.query_string#" cfsqltype="cf_sql_varchar" />
  • ;
  •  
  • <!--- Gath short URL data. --->
  • SELECT
  • u.short_url,
  • u.target_url,
  • u.click_count
  • FROM
  • short_url u
  • WHERE
  • short_url = <cfqueryparam value="#CGI.query_string#" cfsqltype="cf_sql_varchar" />
  • ;
  • </cfquery>
  •  
  •  
  • <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  • <html>
  • <head>
  • <title>ColdFusion Tiny URL Demo</title>
  • </head>
  • <body>
  •  
  • <h1>
  • ColdFusion Tiny URL Demo
  • </h1>
  •  
  • <cfoutput>
  •  
  • <p>
  • <strong>Short URL:</strong>
  • #qShortURL.short_url#
  • </p>
  •  
  • <p>
  • <strong>Target URL:</strong>
  • #qShortURL.target_url#
  • </p>
  •  
  • <p>
  • <strong>Click Count:</strong>
  • #qShortURL.click_count#
  • </p>
  •  
  • </cfoutput>
  •  
  • </body>
  • </html>

So there you have it - my first attempt to create a tiny URL system in ColdFusion. As a final note, I just wanted to point out because I am using both lowercase and uppercase values in my short URLs, it is important that my database table treat the short_url column as case sensitive. In MySQL, in order to do that, I had to change the COLLATION of the short_url field to be (UTF_bin). This treats the string as a binary value which MySQL will always compare using case sensitive operations.




Reader Comments

Mar 25, 2009 at 10:47 AM // reply »
29 Comments

Nice idea with the primary key exception! Still, the larger the table the slower this approach gets. A table with a thousand entries would throw 1000 exceptions in a row before being able to insert a new value, right?
So some kind of hashing with few collisions would be advisable, that's what all these tinyurl services seem to do?


Mar 25, 2009 at 10:55 AM // reply »
11,238 Comments

@Martin,

Before it starts to insert primary key values, the code gets the largest numeric value (for non-custom values). At that point, it starts to try inserting, incrementing the value each time it fails. So, in a best case scenario, it would be:

1. Get highest value.
2. Increment value.
3. Insert value.

If there were conflicts being created, the page flow would be:

1. Get highest value.
2. Increment value.
3. Insert value.
4. Catch exception (goto #2)


Mar 25, 2009 at 11:12 AM // reply »
44 Comments

although this approach is good. i think a better approach would be to use to use an algorithm to generate the tinyurls instead of letting the user enter them.

you could generate a SHA1 and use the first 7 or so character from it (similar to what you can do in git to access commits) an use that as the url ids.


Mar 25, 2009 at 11:14 AM // reply »
11,238 Comments

@Tony,

I think you can alter the code to use a different auto-generation algorithm AND still allow the user to enter their own. I don't know if you watched the video, but the ability to enter your own tiny URL is an optional feature; if left blank, the system will auto-generate one.


Mar 25, 2009 at 11:20 AM // reply »
29 Comments

Ah, I see! Amazing :)


Mar 25, 2009 at 11:22 AM // reply »
11,238 Comments

@Martin,

I am not sure if it's the best way to handle the primary key conflicts - but, there is something that feels really elegant about it.


Mar 25, 2009 at 11:25 AM // reply »
3 Comments

These last two posts have been directly in line with some work I've been doing with generating keys to make urls simpler.

The url shortener is on docket for later this week, but I'll be keeping it simple with just 36 radix.

I do random generation so people can't just step up the sequence.

It's good to see your work for similar solutions.


Mar 25, 2009 at 11:26 AM // reply »
11,238 Comments

@Greg,

Awesome. When you post your thoughts, be sure to drop a link in here so we can take a look.


Mar 25, 2009 at 11:27 AM // reply »
2 Comments

I built something like this last year at work and used a slick trick a colleague of mine showed me: autoincrement your url's id column of regular old type integer, but then lookup or output on the fly in base 36, using formatBaseN and inputBaseN. Clever eh?


Mar 25, 2009 at 11:30 AM // reply »
11,238 Comments

@Jason,

This uses a similar technique. However, since people can suggest their own small url value, I didn't want to use the auto-increment column as then I would be skipping over available incremental values. As such, I do keep a secondary "numeric_value" column in which I am doing a BaseN custom radix conversion.


Mar 25, 2009 at 4:02 PM // reply »
19 Comments

Ben, you should do more video's! or even a kinky podcast! Would be a great addition to the blog


Mar 25, 2009 at 4:39 PM // reply »
11,238 Comments

@Will,

Thank man. I will try to put more videos up here.


Mar 26, 2009 at 8:48 AM // reply »
131 Comments

How about the EXISTS test to cut down on the overhead of leaving the DB engine to determine the PK clash (referring to Martin's first comment)? Not sure if it's more performant over large sets or not, but something like this would still generate an error for the CFTRY block with more control over the DB handling:

IF NOT EXISTS (
SELECT short_url
FROM short_url
WHERE short_url = <cfqueryparam value="#strShortURL#" cfsqltype="cf_sql_varchar" />
)
BEGIN
INSERT INTO short_url
(
...
) VALUES (
...
)
END
ELSE
BEGIN
RAISERROR('The suggested tiny url label is already in use.', 20, 1)
END


Mar 26, 2009 at 8:58 AM // reply »
11,238 Comments

@Jason,

That might work for the case where the user suggests a short url. But, is manually raising an error in the SQL really any different from letting the database raise an PKEY duplication error?


Mar 26, 2009 at 9:21 AM // reply »
131 Comments

Yep, that's the question. As I said, not sure it's an improvement, but I do think Martin was on to something with his concern that waiting for the DB to bubble through all possible PK values before throwing may become a performance issue as the table size grows. It is also possible, however, that EXISTS also takes just as much overhead ... I'll give this a test if I get a few minutes today.


Mar 26, 2009 at 1:26 PM // reply »
29 Comments

@Jason & @Ben
I could imagine that the database internally holds the max value of each column separately. Thus every DB query of that kind should be processed in a constant amount of time. But I'm interessted in your test results!


Mar 27, 2009 at 9:09 AM // reply »
131 Comments

Well, testing with 100,000 UUIDs in the PK, it really appears that both approaches (using EXISTS to pre-test vs simply throwing the repeat value in and waiting for the thrown error) consistently take 0 milliseconds on the DB side. If you scale that out to several million maybe it makes a difference, but my guess is that the PK indexing is good enough to make performance a wash in either case.


Mar 27, 2009 at 12:12 PM // reply »
11,238 Comments

@Jason,

Thanks for testing that! Good to hear that it's fast either way.


Oct 13, 2010 at 7:22 PM // reply »
11 Comments

Although this post didn't help me directly, the concept got me started so thanks (as usual) Ben!

I thought I'd share my final function here and get some feedback.

  • <cfscript>
  • function get_shorty(length){
  • short = '';
  • va = "A,B,C,D,E,F,G,H,I,J,K,L,M,N,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,m,n,o,p,q,r,s,t,u,v,w,x,y,z";
  • numpos = randrange(2,length-1) & ',' & randrange(2,length-1);
  • for (i=1;i lte length; i=i+1){
  • if(listfind(numpos,i)){
  • short = short & randrange(1,9);
  • }
  • short = short & listgetat(va,randrange(1,listlen(va)));
  • }
  • return trim(left(short,length));
  • }
  • </cfscript>
  • <cfoutput>#get_shorty(7)#</cfoutput>

This (obviously) wouldn't check for duplicates but should get someone going.

Finally, I'm doing a simple BINARY call in MySQL to do a case sensitive query to my object table.

  • ... WHERE short_url = binary('#shortid#')

- Brian.


Oct 24, 2010 at 1:50 PM // reply »
11,238 Comments

@Brian,

Glad to, at the very least, help you get the motor running. I need to look into binary-comparisons in SQL. I always see them used and I can never remember how they work exactly. I know people have explained it to me before, but it's one of those things I never use (and therefore never commit to memory).

Perhaps it's time for a blog post on the matter to drive the point home.


Jul 20, 2012 at 4:04 PM // reply »
2 Comments

Not sure if you are still looking at this post or not, but here goes.
I want to use this, but need to have the ?shortword
appear just after the www.aiag.org/?shortword.
However, it is not recogizing the /?shortword, it only sees the default file.
Any ideas you could pass back would be helpful.



Post A Comment

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

Please review the following issues:

Author Name:


Author Email:

Author Website:

Comment:

Supported HTML tags for formatting: <strong>bold</strong>   <em>italic</em>   <code>code</code>







  • Help Wanted - Find Your Next ColdFusion Job
Ben Nadel's Company - Epicenter Consulting Recent Blog Comments
May 20, 2013 at 11:45 AM
Using jQuery's Animate() Step Callback Function To Create Custom Animations
This is really useful. I found out that you don't actually have to use a dummy css property (surprisingly). To animate a property in a linear-gradient for instance I did this this.css('someLinearGra ... read »
May 20, 2013 at 10:51 AM
Using A Dynamic Column Name With ValueList() In ColdFusion
@Josh, Oh snap! You're totally right! I'm not sure I've ever tried that. I did know that you can call a number of other array-methods on ColdFusion query columns: http://www.bennadel.com/blog/167 ... read »
May 20, 2013 at 10:45 AM
Using A Dynamic Column Name With ValueList() In ColdFusion
@Ben - I believe you can achieve the same functionality with ColdFusion's built in ArrayToList() function. ArrayToList( users[ "id" ] ); ... read »
May 20, 2013 at 10:21 AM
My Experience With AngularJS - The Super-heroic JavaScript MVW Framework
Is there any error logging and handling framework in angularjs, if not then in what way I can do this. ... read »
May 19, 2013 at 2:31 PM
My Experience With AngularJS - The Super-heroic JavaScript MVW Framework
It's funny really just how well that image describes the way I would imagine most people that go with angular for some project is. I have had a similar roller-coaster ride with it as well, but not qu ... read »
May 17, 2013 at 7:42 PM
HashKeyCopier - An AngularJS Utility Class For Merging Cached And Live Data
Ben - thanks so much for posting these Angular articles and findings, they've been a huge help towards learning one of the more 'complex' JavaScript frameworks out there (IMO). I have been using Angu ... read »
May 16, 2013 at 5:01 PM
UPDATE: Parsing CSV Data Files In ColdFusion With csvToArray()
Your code was the closest thing I've found to obtaining some direction for converting ISO fields to values that CF can translate properly. Thank you for posting! ... read »
May 15, 2013 at 6:07 PM
Making SOAP Web Service Requests With ColdFusion And CFHTTP
Ben, you once again saved my bacon at work. Thank you, thank you, thank you! ... read »
InVision App - Prototyping Made Beautiful With Prototyping Tools