Wrestling With My Dogmatic Fear Of The REQUEST Scope And Accessing Global Variables In ColdFusion

By Ben Nadel

Published 2020-08-26 in ColdFusion — Comments (3)

For years, I've had a general notion that "global variables" are a "Bad Thing" ™. And, I've come to love Inversion of Control (IoC); and, I believe that Dependency Injection (DI) is one of the greatest things since sliced bread. But, I fear that I've become blind to the pragmatic use-cases in which dirty code is actually better code. Even now as I type that out, it makes me uncomfortable - but, that's where the personal-growth happens! As such, I wanted to sit down and talk about my fears so that I may possibly overcome them, learn to accept the request scope, and accept that some globally-accessible variables in ColdFusion will make my life better.

I first became aware of this fear when InVision introduced engineering standards around request tracing. This is the idea that an incoming request may have properties like:

Request-ID
Calling-Service

And, that these properties have to be propagated throughout the request. Meaning, these tracing values have to be:

Included in any log-data generated by the current request.
Embedded in any message-queue payloads formulated by the current request.
Passed-along with any CFHttp requests spawned by the current request.
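To make those three requirements a bit more concrete, here is a minimal sketch of what each one might look like in CFScript. Only the header names (Request-ID and Calling-Service) come from the requirements above; the component itself, its method names, the log file name, and the payload shape are all assumptions made up for illustration.

```cfml
component
	output = false
	hint = "I sketch how tracing values might be propagated (illustrative only)."
	{

	// Log data: the tracing values are serialized alongside the message.
	public void function logMessage(
		required string message,
		required string requestID,
		required string callingService
		) {

		writeLog(
			type = "information",
			file = "application",
			text = serializeJson({
				"message": message,
				"requestID": requestID,
				"callingService": callingService
			})
		);

	}

	// Message-queue payload: the tracing values are embedded next to the data.
	public string function buildQueuePayload(
		required struct data,
		required string requestID,
		required string callingService
		) {

		return(
			serializeJson({
				"data": data,
				"tracing": {
					"requestID": requestID,
					"callingService": callingService
				}
			})
		);

	}

	// Outbound HTTP call: the tracing values are forwarded as request headers.
	public struct function callDownstream(
		required string targetUrl,
		required string requestID,
		required string callingService
		) {

		cfhttp( url = targetUrl, method = "GET", result = "local.response" ) {
			cfhttpparam( type = "header", name = "Request-ID", value = requestID );
			cfhttpparam( type = "header", name = "Calling-Service", value = callingService );
		}

		return( local.response );

	}

}
```

Note that every method in this sketch has to be handed the tracing values by its caller; nothing here reaches into a global scope.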

In the discussions about how something like this should be implemented, I was adamant that whatever we did, it should not be magical! Meaning, everything should be explicit: clearly identified values that are passed-around as needed down through the call-stack.

When I thought about this type of implementation, I took a purely academic perspective, thinking about Inversion of Control, Dependency Injection, Low Coupling, High Cohesion, and all the other buzz words that you can think of. What I did not think about was how practical any of it was. What I did not think about was how verbose every method call would have to become. What I did not think about was how every invocation would have to change when the tracing requirements within the engineering department changed.
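As a rough illustration of that verbosity, consider a service method that has nothing to do with tracing but still has to accept and forward the tracing values. Everything in this sketch is hypothetical: the component, the createProject() method, and the injected projectGateway and eventQueue collaborators (along with their insertProject() and publish() methods) are invented purely to show the shape of the problem.

```cfml
component
	output = false
	hint = "I sketch the explicit pass-through approach (illustrative only)."
	{

	// Both collaborators are hypothetical; they are injected only so that
	// this sketch is self-contained.
	public any function init(
		required any projectGateway,
		required any eventQueue
		) {

		variables.projectGateway = arguments.projectGateway;
		variables.eventQueue = arguments.eventQueue;
		return( this );

	}

	// Only userID and name relate to creating a project. The two tracing
	// arguments exist solely so that they can be handed to the next layer.
	public numeric function createProject(
		required numeric userID,
		required string name,
		required string requestID,
		required string callingService
		) {

		var projectID = projectGateway.insertProject( userID, name, requestID, callingService );

		eventQueue.publish( "project.created", projectID, requestID, callingService );

		return( projectID );

	}

}
```

If the tracing standard were to gain a third value, the signature of createProject(), the signatures of both collaborator methods, and every call site would all have to change along with it.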

But mostly, what I did not think about was how hypocritical I was being. Because, the reality is, I already reference a lot of globally-accessible data in my ColdFusion applications. There is no more clear example of how beneficial globally-accessible data can be than in my error logging.

An error can happen anywhere. It can happen in the Application.cfc; it can happen in a Controller method; it can happen in a Service method; it can happen in a CFThread; it can happen in a scheduled-task execution. No line of code is truly safe - even bug-free code is still vulnerable to "sudden thread death".

Which means, I have some try/catch blocks in my ColdFusion applications. And, where it makes sense - where an error can't bubble-up to a higher call-stack - I have logging inside my catch blocks. And, when I log an error, here's what my ColdFusion logging component includes:

The sanitized contents of the url scope.
The sanitized contents of the form scope.
A subset of the HTTP headers (pulled from getHttpRequestData()).
The request IP address (pulled from the cgi scope or getHttpRequestData() when using IP-forwarding).
The request method (pulled from the cgi scope).
The request host (pulled from the cgi scope).
The request path (pulled from the cgi scope).
The request referrer (pulled from the cgi scope).
The request user-agent (pulled from the cgi scope).

Heck, I even attempt to pull the aforementioned Request-ID out of a variety of globally-accessible locations!

ASIDE: When I say "sanitized contents", what I mean is that my error-logging components strip-out sensitive user data and other PII (Personally Identifiable Information). Things like bearer tokens, authorization headers, passwords, credit card expiration dates, and other information that my Security team would hate to see show up in a remote log-aggregation system.
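A minimal sketch of that kind of sanitization might look like the following. This is not my actual logging component; the sanitizeScope() method name, the key pattern, and the placeholder text are assumptions chosen for the example.

```cfml
component
	output = false
	hint = "I sketch one way to mask sensitive values before logging (illustrative only)."
	{

	/**
	* I return a copy of the given scope in which any value whose key looks
	* sensitive has been replaced with a placeholder. The key pattern is an
	* assumption; a real deny-list would need to be agreed with the security team.
	*/
	public struct function sanitizeScope( required struct scope ) {

		var sensitivePattern = "password|token|authorization|secret|card";
		var sanitized = {};

		for ( var key in scope ) {

			if ( reFindNoCase( sensitivePattern, key ) ) {

				sanitized[ key ] = "[redacted]";

			} else {

				sanitized[ key ] = scope[ key ];

			}

		}

		return( sanitized );

	}

}
```

The logger could then record sanitizeScope( url ) and sanitizeScope( form ) rather than the raw scopes. A key-based filter like this only catches values stored under suspicious-looking keys; sensitive data typed into an innocently-named field would still slip through.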

So here I am, up on my high-horse, talking about how we can't use globally-accessible data to implement "request tracing" while, at the same time, I am happily using globally-accessible data - for great good, mind you - in my error logging.

I am sure that I have always rationalized this disconnect by looking at the "error logging" as a "dirty task". Meaning, errors are messy and can happen anywhere; so, it's OK that the logging of those errors be commensurately messy.

But, I fear that this rationalization has been a disservice.

If I try to dig-deep, I believe that much of my fear comes from the fact that I've had to maintain a lot of bad code in my life. Specifically in contexts where Dependency Injection (DI) is less popular. In Node.js / Express.js applications, for example, there is this concept of the "request object", where an open-ended Object / Hash / Struct just gets passed-down through every request-handler; and, every line-of-code that touches this object has the freedom to write anything it wants to that object. Which means, trying to figure out where any given value comes from is a complete nightmare!

ASIDE: Honestly, I find the concept of "middleware", in general, to be problematic. And, I find that middleware can often be removed by creating more intelligent objects that encapsulate logic for what would have been middleware implementations. But, that's unrelated to this post - a rant for another time.

I believe that the trauma of dealing with such unreadable code has led me to the extreme opposite stance: all code must be explicit! And, while I think that most code should be explicit; and, as much as it makes sense, we should avoid globally-accessible data; I am finally ready to come down off my high-horse and reach a more happy medium.

I think my biggest fear has been the "unknowable state" of globally-accessible data. And, I'm wondering if the best way to counteract that fear is to simply make the unknowable more knowable through a consistent API. What if, instead of referencing globally-accessible data, like the request scope, directly, I do so through an interface that I can inject into other components?

I think this would allow me to keep the "clean parts" clean and better encapsulate the "dirty parts" within an easy-to-understand API. Just thinking out loud here, but what if I created some sort of RequestMetadata.cfc ColdFusion component that could be "initialized" at the start of every incoming HTTP request; and then, would expose methods for accessing data anywhere in the call-stack that would otherwise be globally-accessible?

I haven't fully thought this through, but an abbreviated version of such a ColdFusion component might look like this:

component
	output = false
	hint = "I provide a strong API around globally-accessible data."
	{

	/**
	* This method would be called as the very first action within the request
	* processing, allowing for some request-specific data to be initialized as
	* needed.
	*/
	public void function setupRequest() {

		request.requestID = getOrCreateRequestID();

	}

	// ---
	// PUBLIC METHODS.
	// ---

	// .... truncated example ....
	// .... truncated example ....
	// .... truncated example ....

	public string function getRequestCountryCode() {

		var headers = getRequestHeaders();

		// Using CloudFlare headers.
		return( headers[ "CF-IPCountry" ] ?: "" );

	}


	public string function getRequestCallingService() {

		var headers = getRequestHeaders();

		return( headers[ "Calling-Service" ] ?: "None" );

	}


	public struct function getRequestHeaders() {

		return( getHttpRequestData().headers );
		
	}


	public string function getRequestID() {

		return( request.requestID );

	}


	public string function getRequestIP() {

		var headers = getRequestHeaders();

		// Since we are behind multiple load-balancers, we should look for the
		// IP address in the HTTP headers first, and then fall back to using the
		// more-common CGI remote address.
		var ipAddress = ( headers[ "X-Forwarded-For" ] ?: cgi.remote_addr )
			.trim()
			.lcase()
			.listFirst()
		;

		return( ipAddress );

	}


	public numeric function getUserID() {

		// Try to pull the user ID out of the FW/1 request-context if it has
		// been defined.
		return( val( request.context?.user?.id ?: 0 ) );

	}

	// .... truncated example ....
	// .... truncated example ....
	// .... truncated example ....

	// ---
	// PRIVATE METHODS.
	// ---

	private string function createRequestID() {

		return( "inv-#createUuid()#".lcase() );

	}


	private string function getOrCreateRequestID() {

		var headers = getRequestHeaders();

		return( headers[ "Request-ID" ] ?: createRequestID() );

	}

}

With a component like this, I could now use Dependency-Injection (DI) to make the RequestMetadata instance available to anything within my ColdFusion call-stack; and with that, I could "more cleanly" reach into the global scope - using a predictable API - to grab values like the "User ID", the "IP Address", and the "Request ID".
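To sketch what that injection might look like, here is a hypothetical error-logging component that receives the RequestMetadata instance through its constructor. The logger component, its logError() method, and the log file name are assumptions; only getRequestID(), getRequestIP(), and getUserID() come from the component outlined above.

```cfml
component
	output = false
	hint = "I log errors along with request metadata (hypothetical consumer)."
	{

	// The RequestMetadata instance is injected rather than created here.
	public any function init( required any requestMetadata ) {

		variables.requestMetadata = arguments.requestMetadata;
		return( this );

	}

	// The caller passes in only the error; all of the request-specific
	// values are pulled through the injected API.
	public void function logError( required any error ) {

		var payload = {
			"message": error.message,
			"requestID": requestMetadata.getRequestID(),
			"requestIP": requestMetadata.getRequestIP(),
			"userID": requestMetadata.getUserID()
		};

		writeLog(
			type = "error",
			file = "application",
			text = serializeJson( payload )
		);

	}

}
```

Since RequestMetadata reads from the request scope and getHttpRequestData() at call time, a single cached instance should be able to serve every request, provided that setupRequest() is called at the start of each one (from onRequestStart() in the Application.cfc, for example).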

An added benefit of this approach is that I would also get to encapsulate some of the more convoluted logic surrounding some of the values. For example, none of the calling code would have to understand that the IP-address is being propagated through load-balancer HTTP headers; or, that we can do some high-level geolocation using CloudFlare country-code headers (but only in a production environment).

With an encapsulation of globally-accessible data like this, I think I would be able to squash the emotional and dogmatic fears that I've had about accessing global data. And, remove some of the hypocritical tendencies that I have in my application architecture. I can sure tell you that I'd love to never pass-down another IP-address or User-Agent string through my method calls!


Reader Comments

Ben Nadel Aug 26, 2020 at 7:24 AM

@All,

This post was, in large part, inspired by a recent episode of the Go Time podcast from the Changelog:

https://changelog.com/gotime/143

In that episode, they talk about Go's Context package, and how it can be used -- and abused -- when making data available to a particular execution context.

Dominic Aug 26, 2020 at 7:25 AM

Absolutely :).

This is what ColdBox does with their RequestContext object, which is made available as the event variable in views and handlers (controllers). You can decorate it exactly as you sketched out there. So you could have:

event.getRequestId()
event.getCallingService()

(and it already includes things like getting headers out of the java request, etc.).

Ben Nadel Aug 26, 2020 at 7:35 AM

@Dominic,

I would also say that historically, I've been "OK" with the Controller layer messing with global data since I've always seen the Controller as being "the messy" part of the application - the translation layer that takes all that contextual stuff and translates it into something that is meaningful for the "Application core".

Where I've run into a lot of emotional hurdles is when that same "messiness" leaks down into the "application core". But, that's where I'm starting to re-think my vision.