Ben Nadel

Why Do I Protect People Who Want To Break My Site?

By Ben Nadel

When I was working on a new ColdFusion and SQL pagination algorithm the other day, I dropped a lot of the validation logic that went into the pagination values. For instance, normally, if someone entered -30 for an offset, I would take the time to make sure that became 1. Or, if someone put in 9999999999 (past the size of the data table), I would make sure that the offset pointed to the last possible page, and not 9999999999. I would even snap the offset to a page boundary if someone put in an offset that was not the beginning of a page (i.e., they put in offset 8 and I adjust it to offset 6 [for a 5-item page]).
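
To make those clean-up rules concrete, here is a minimal sketch of the three adjustments (clamp low, clamp high, snap to the start of a page). It is written in Python rather than ColdFusion, it assumes 1-based row offsets (which matches the 8-to-6 example for a 5-item page), and the function name `normalize_offset` and its parameters are hypothetical, not taken from the actual pagination code.

```python
def normalize_offset(offset: int, page_size: int, total_rows: int) -> int:
    """Snap a 1-based row offset to the first row of a valid page.

    Hypothetical helper for illustration only; assumes offsets start at 1.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    if total_rows < 1:
        # Nothing to page through, so fall back to the first offset.
        return 1
    # Clamp the requested offset into the range [1, total_rows].
    clamped = max(1, min(offset, total_rows))
    # Snap down to the first row of the page that contains that offset.
    return ((clamped - 1) // page_size) * page_size + 1
```

With 23 rows and 5 items per page, an offset of -30 becomes 1, 8 becomes 6, and 9999999999 becomes 21 (the first row of the last page).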

Basically, what I was doing was taking on this attitude:

"If you try to break my site, I am going to bend over backwards to make sure that the site not only works but returns the closest possible data that matches your request".

But, as I was working on the pagination stuff, I realized that this is such a silly thing to do. If someone is going to be malicious, why should I care about putting in the effort? Now, I am not talking about error checking - that should always be done - the site should never error out. I am just talking about cleaning malicious data.

The only thing that I could think of was the problem of false positives. For instance, what if someone pastes in a URL, but not completely, and some value gets only partially copied? This is not malicious; this is just user error. In a case like that, I would want to try and return the closest possible "guesstimated" target result.

Anyway, not sure where I was going with this. Just thinking out loud. Cleaning data requires more effort than it might be worth in SOME cases. Just trying to find the balance I am comfortable with.




Reader Comments

Here's a classic example that just popped into my head - TO and FROM dates. If someone puts in a FROM date that is after the TO date, then this will never return records. Should I put up an error message? Or should I just return the data (which will be empty)? Is it worth cleaning that?

Ok, maybe not the best example, since that's not really malicious.


Ben,

I normally tell them something like "The end date you selected occurs before the begin date. Please select a valid one."

If you guess, and guess wrong, that can be a huge problem. For this case in particular, I think you'd almost always be guessing wrong, because people don't generally do something like that on purpose.

I'm not a big fan of trying to correct data like that - it can only lead to trouble if you guess wrong, and the worst part is that the user may never notice.

If I wanted to be extra friendly, I might provide a "The end date came before the begin date. Did you mean...?" style of thing where they are alerted to the problem and have a "guessed" chance to fix it.
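
As a rough illustration of that "alert, then offer a guess" idea, the Python sketch below flags a reversed range and suggests the swapped dates without silently applying them. The function name `check_date_range` and the exact message wording are hypothetical.

```python
from datetime import date
from typing import Optional


def check_date_range(begin: date, end: date) -> Optional[str]:
    """Return None for a valid range, or a message with a suggested fix.

    Hypothetical helper: it never corrects the input itself, it only
    proposes the swapped range so the user can confirm or reject it.
    """
    if end >= begin:
        return None
    return (
        f"The end date ({end}) came before the begin date ({begin}). "
        f"Did you mean {end} to {begin}?"
    )
```

The caller decides whether to show the message as an error or as a clickable suggestion; either way the user sees that something was off.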

I could care less either on the paging issue. Sometimes I like to change the page in the URL rather than clicking "next" so I might accidentally go to a higher than highest page. I wouldn't be upset if you gave me the highest available, but I might like to know about it.


I have a project manager who *always* does that!
Show him an entry field that takes a number and he'll enter a million or two in it if he can, preferably a negative one.
I've learnt to anticipate it by now, but I'm *so* tempted to raise an alert of "F*^* off Jose" for the really stupid values :) But that would be very very stupid...


"I could care less either" should have been "I could care less either /way/"

Sorry - it sounded bad the first way.


I would definitely throw an error message, and not try to guess the user's intent.

If you guess wrong, the user may get information they think means one thing, but in fact means something else.

Let's take a mapping application. Let's say you meet a woman that lives in a brand new development at 123 Man Street. You check your mapping application for directions because she is very hot, and you do not want to be late. The map database has not yet been updated with the new development and finds no results for Man St. so the app guesses you really meant "Main St." and serves up directions to 123 Main St.

Next thing you know you are spending another Friday night bowling on your Wii and your hot new friend is now someone else's hot new friend.


if they're clearly doing something malicious then send them a virus.. or write their IP address to a temporary blacklist.. or redirect them to a gov site or something..


Ha ha, I see someone just tried to break my pagination with 9999999999999. It does break, but this is a data type issue (the value entered was TOO large for the cf_sql_integer CFQueryParam). This raises another very interesting, but unrelated, issue: that better data type validation is required... or is it? Is showing a "friendly" error page an OK thing to do for something like this, which is very clearly malicious?
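
One way to keep an oversized value from ever reaching the query is to check it against the column's integer range first. The sketch below is Python, not ColdFusion, and it assumes the parameter is bound as a 32-bit signed integer; `parse_int32` is a hypothetical helper name. What the caller does with a rejected value (friendly error page, 404, default page) is left open.

```python
import re
from typing import Optional

# Assumption: the query parameter is bound as a 32-bit signed integer.
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1

# Optional minus sign followed by 1 to 10 ASCII digits.
_INT_PATTERN = re.compile(r"-?[0-9]{1,10}")


def parse_int32(raw: str) -> Optional[int]:
    """Return the value as an int if it fits in 32 bits, otherwise None."""
    text = raw.strip()
    if not _INT_PATTERN.fullmatch(text):
        return None
    value = int(text)
    if INT32_MIN <= value <= INT32_MAX:
        return value
    return None
```

Here `parse_int32("9999999999999")` returns `None`, so the request can be rejected before the database ever sees it, while `parse_int32("-30")` still returns `-30` for any later range checks.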


In the case of pagination, you should be chucking a 404 if something's out of range.

This might be a valid page: index.cfm?page=2

As might this: index.cfm?page=300

But index.cfm?page=9999 might not return any records - so you should 404 it.

Otherwise, if you return a valid page, you'll be telling search engines, browsers etc that the page *really does exist*, and you might end up being indexed in a page you don't want...

I guess what I'm trying to say is you should be treating url parameters as if they're "real" pages (page-2.html, page-300.html), hence you'd be forced to 404 page-9999.html as it wouldn't exist...
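
A minimal sketch of that rule, in Python with hypothetical names (`page_status`, `page_size`, `total_rows`): treat each page number as if it were its own document and return 404 for anything outside the range that actually has records. Treating page 1 of an empty result set as valid is an assumption of this sketch.

```python
def page_status(page: int, page_size: int, total_rows: int) -> int:
    """Return the HTTP status code a paginated URL should respond with.

    Hypothetical helper: 200 if the page holds records, 404 otherwise.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    # Number of pages needed to hold every row, rounding up.
    # Assumption: an empty result set still has a valid (empty) page 1.
    last_page = max(1, (total_rows + page_size - 1) // page_size)
    return 200 if 1 <= page <= last_page else 404
```

With 3,000 rows at 10 per page, `page_status(2, 10, 3000)` and `page_status(300, 10, 3000)` both give 200, while `page_status(9999, 10, 3000)` gives 404.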


Yeah, I agree with everyone here, I think, about not guesstimating the user's intent. I also really like Geoff's concept of throwing a 404. I mean, you can still display a nice friendly error to the user saying that the page they requested does not exist; just add an HTTP response header for the 404 so the browser/search bot knows what is going on.

Rob

