Wrestling With My Dogmatic Fear Of The REQUEST Scope And Accessing Global Variables In ColdFusion
For years, I've had a general notion that "global variables" are a "Bad Thing" ™. And, I've come to love Inversion of Control (IoC); and, I believe that Dependency Injection (DI) is one of the greatest things since sliced bread. But, I fear that I've become blind to the pragmatic use-cases in which dirty code is actually better code. Even now as I type that out, it makes me uncomfortable - but, that's where the personal-growth happens! As such, I wanted to sit down and talk about my fears so that I may possibly overcome them and learn to accept the request scope, and to accept that some globally-accessible variables in ColdFusion will make my life better.
I first became aware of this fear when InVision introduced engineering standards around request tracing. This is the idea that an incoming request may have properties like:
- Request-ID
- Calling-Service
And, that these properties have to be propagated throughout the request. Meaning, these tracing values have to be:
- Included in any log-data generated by the current request.
- Embedded in any message-queue payloads formulated by the current request.
- Passed-along with any CFHttp requests spawned by the current request.
In the discussions about how something like this should be implemented, I was adamant that whatever we did, it should not be magical! Meaning, everything should be explicit: clearly identified values that are passed-around as needed down through the call-stack.
When I thought about this type of implementation, I took a purely academic perspective, thinking about Inversion of Control, Dependency Injection, Low Coupling, High Cohesion, and all the other buzz words that you can think of. What I did not think about was how practical any of it was. What I did not think about was how verbose every method call would have to become. What I did not think about was how every invocation would have to change when the tracing requirements within the engineering department changed.
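To make that verbosity concrete, here is a minimal sketch of what fully explicit propagation might look like. The component, its cancelOrder() method, and the orderGateway and logger collaborators are all hypothetical names invented for illustration; only the two tracing values come from the requirements described above.

```cfml
component
	output = false
	hint = "I sketch explicit propagation of tracing values (hypothetical example)."
	{

	// ASSUMPTION: orderGateway and logger are hypothetical collaborators,
	// injected here only to illustrate the shape of the calls.
	public any function init(
		required any orderGateway,
		required any logger
		) {

		variables.orderGateway = arguments.orderGateway;
		variables.logger = arguments.logger;
		return( this );

	}

	// Every public method has to accept the tracing values, even though this
	// method's own logic never looks at them...
	public void function cancelOrder(
		required numeric orderID,
		required string requestID,
		required string callingService
		) {

		// ... and every downstream call has to pass them along.
		orderGateway.markAsCancelled( orderID, requestID, callingService );
		logger.info( "Order #orderID# cancelled.", requestID, callingService );

	}

}
```

If a third tracing value were ever added to the standard, every signature and every call-site in a chain like this would have to change with it.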
But mostly, what I did not think about was how hypocritical I was being. Because, the reality is, I already reference a lot of globally-accessible data in my ColdFusion applications. There is no more clear example of how beneficial globally-accessible data can be than in my error logging.
An error can happen anywhere. It can happen in the Application.cfc; it can happen in a Controller method; it can happen in a Service method; it can happen in a CFThread; it can happen in a scheduled-task execution. No line of code is truly safe - even bug-free code is still vulnerable to "sudden thread death".
Which means, I have some try/catch blocks in my ColdFusion applications. And, where it makes sense - where an error can't bubble-up to a higher call-stack - I have logging inside my catch blocks. And, when I log an error, here's what my ColdFusion logging component includes:
- The sanitized contents of the url scope.
- The sanitized contents of the form scope.
- A subset of the HTTP headers (pulled from getHttpRequestData()).
- The request IP address (pulled from the cgi scope or getHttpRequestData() when using IP-forwarding).
- The request method (pulled from the cgi scope).
- The request host (pulled from the cgi scope).
- The request path (pulled from the cgi scope).
- The request referrer (pulled from the cgi scope).
- The request user-agent (pulled from the cgi scope).
Heck, I even attempt to pull the aforementioned Request-ID out of a variety of globally-accessible locations!
ASIDE: When I say "sanitized contents", what I mean is that my error-logging components strip-out sensitive user data and other PII (Personally Identifiable Information). Things like bearer tokens, authorization headers, passwords, credit card expiration dates, and other information that my Security team would hate to see show up in a remote log-aggregation system.
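As a rough illustration of the list above, a logging component might gather that data along these lines. This is a simplified sketch, not my actual logger: the method names, the payload keys, and the blocked-key list are assumptions made up for the example, and it masks values across all headers rather than selecting a subset.

```cfml
component
	output = false
	hint = "I sketch how an error logger might gather request data from global scopes."
	{

	// ASSUMPTION: this key list is illustrative only; a real deny-list would
	// be driven by the Security team's requirements.
	variables.blockedKeys = [ "password", "authorization", "token" ];

	public struct function buildErrorPayload( required any error ) {

		var httpData = getHttpRequestData();

		// None of these values are passed-in; they are all read from
		// globally-accessible scopes at the moment the error is logged.
		return({
			message: ( error.message ?: "" ),
			urlScope: maskSensitiveKeys( url ),
			formScope: maskSensitiveKeys( form ),
			headers: maskSensitiveKeys( httpData.headers ),
			method: cgi.request_method,
			host: cgi.server_name,
			path: cgi.script_name,
			referrer: cgi.http_referer,
			userAgent: cgi.http_user_agent
		});

	}

	// I return a shallow copy of the given struct with blocked keys masked.
	private struct function maskSensitiveKeys( required struct data ) {

		var result = {};

		for ( var key in data ) {

			result[ key ] = isBlocked( key )
				? "[redacted]"
				: data[ key ]
			;

		}

		return( result );

	}

	// I determine if the given key contains any of the blocked terms.
	private boolean function isBlocked( required string key ) {

		for ( var blockedKey in blockedKeys ) {

			if ( findNoCase( blockedKey, key ) ) {

				return( true );

			}

		}

		return( false );

	}

}
```

The calling code only hands over the error itself; everything else comes from the url, form, and cgi scopes and from getHttpRequestData().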
So here I am, up on my high-horse, talking about how we can't use globally-accessible data to implement "request tracing" while, at the same time, I am happily using globally-accessible data - for great good, mind you - in my error logging.
I am sure that I have always rationalized this disconnect by looking at the "error logging" as a "dirty task". Meaning, errors are messy and can happen anywhere; so, it's OK that the logging of those errors be commensurately messy.
But, I fear that this rationalization has been a disservice.
If I try to dig-deep, I believe that much of my fear comes from the fact that I've had to maintain a lot of bad code in my life. Specifically in contexts where Dependency Injection (DI) is less popular. In Node.js / Express.js applications, for example, there is this concept of the "request object", where an open-ended Object / Hash / Struct just gets passed-down through every request-handler; and, every line-of-code that touches this object has the freedom to write anything it wants to that object. Which means, trying to figure out where any given value comes from is a complete nightmare!
ASIDE: Honestly, I find the concept of "middleware", in general, to be problematic. And, I find that middleware can often be removed by creating more intelligent objects that encapsulate logic for what would have been middleware implementations. But, that's unrelated to this post - a rant for another time.
I believe that the trauma of dealing with such unreadable code has led me to the extreme opposite stance: all code must be explicit! And, while I think that most code should be explicit; and, as much as it makes sense, we should avoid globally-accessible data; I am finally ready to come down off my high-horse and reach a happier medium.
I think my biggest fear has been the "unknowable state" of globally-accessible data. And, I'm wondering if the best way to counteract that fear is to simply make the unknowable more knowable through a consistent API. What if, instead of referencing globally-accessible data, like the request scope, directly, I do so through an interface that I can inject into other components?
I think this would allow me to keep the "clean parts" clean and better encapsulate the "dirty parts" within an easy-to-understand API. Just thinking out loud here, but what if I created some sort of RequestMetadata.cfc ColdFusion component that could be "initialized" at the start of every incoming HTTP request; and then, would expose methods for accessing data anywhere in the call-stack that would otherwise be globally-accessible?
I haven't fully thought this through, but an abbreviated version of such a ColdFusion component might look like this:
component
	output = false
	hint = "I provide a strong API around globally-accessible data."
	{

	/**
	* This method would be called as the very first action within the request
	* processing, allowing for some request-specific data to be initialized as
	* needed.
	*/
	public void function setupRequest() {

		request.requestID = getOrCreateRequestID();

	}

	// ---
	// PUBLIC METHODS.
	// ---

	// .... truncated example ....
	// .... truncated example ....
	// .... truncated example ....

	public string function getRequestCountryCode() {

		var headers = getRequestHeaders();

		// Using CloudFlare headers.
		return( headers[ "CF-IPCountry" ] ?: "" );

	}

	public string function getRequestCallingService() {

		var headers = getRequestHeaders();

		return( headers[ "Calling-Service" ] ?: "None" );

	}

	public struct function getRequestHeaders() {

		return( getHttpRequestData().headers );

	}

	public string function getRequestID() {

		return( request.requestID );

	}

	public string function getRequestIP() {

		var headers = getRequestHeaders();

		// Since we are behind multiple load-balancers, we should look for the
		// IP address in the HTTP headers first, and then fall back to using the
		// more-common cgi.remote_addr value.
		var ipAddress = ( headers[ "X-Forwarded-For" ] ?: cgi.remote_addr )
			.trim()
			.lcase()
			.listFirst()
		;

		return( ipAddress );

	}

	public numeric function getUserID() {

		// Try to pull the user ID out of the FW/1 request-context if it has
		// been defined.
		return( val( request.context?.user?.id ?: 0 ) );

	}

	// .... truncated example ....
	// .... truncated example ....
	// .... truncated example ....

	// ---
	// PRIVATE METHODS.
	// ---

	private string function createRequestID() {

		return( "inv-#createUuid()#".lcase() );

	}

	private string function getOrCreateRequestID() {

		var headers = getRequestHeaders();

		return( headers[ "Request-ID" ] ?: createRequestID() );

	}

}
With a component like this, I could now use Dependency-Injection (DI) to make the RequestMetadata instance available to anything within my ColdFusion call-stack; and with that, I could "more cleanly" reach into the global scope - using a predictable API - to grab values like the "User ID", the "IP Address", and the "Request ID".
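As a minimal sketch of what a consumer could look like, here is a hypothetical service that receives the RequestMetadata instance through its constructor. The service, its cancelOrder() method, and the logger collaborator are assumptions for illustration; the only calls taken from the component above are getRequestID(), getRequestCallingService(), and getRequestIP().

```cfml
component
	output = false
	hint = "I sketch a service that consumes an injected RequestMetadata instance."
	{

	// ASSUMPTION: constructor injection is used here for clarity; the DI
	// framework and the logger collaborator are hypothetical.
	public any function init(
		required any requestMetadata,
		required any logger
		) {

		variables.requestMetadata = arguments.requestMetadata;
		variables.logger = arguments.logger;
		return( this );

	}

	// Note that no tracing values appear in the method signature.
	public void function cancelOrder( required numeric orderID ) {

		// .... cancellation logic would go here ....

		logger.info(
			"Order #orderID# cancelled.",
			{
				requestID: requestMetadata.getRequestID(),
				callingService: requestMetadata.getRequestCallingService(),
				ipAddress: requestMetadata.getRequestIP()
			}
		);

	}

}
```

For this to work, setupRequest() would have to run before any of these methods are called; the onRequestStart() event handler in Application.cfc seems like one plausible place for that call.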
An added benefit of this approach is that I would also get to encapsulate some of the more convoluted logic surrounding some of the values. For example, none of the calling code would have to understand that the IP-address is being propagated through load-balancer HTTP headers; or, that we can do some high-level geolocation using CloudFlare country-code headers (but only in a production environment).
With an encapsulation of globally-accessible data like this, I think I would be able to squash the emotional and dogmatic fears that I've had about accessing global data. And, remove some of the hypocritical tendencies that I have in my application architecture. I can sure tell you that I'd love to never pass-down another IP-address or User-Agent string through my method calls!
Reader Comments
@All,
This post was, in large part, inspired by a recent episode of the Go Time podcast from the Changelog:
https://changelog.com/gotime/143
In that episode, they talk about Go's Context package, and how it can be used -- and abused -- when making data available to a particular execution context.
Absolutely :).
This is what ColdBox does with their RequestContext object that is made available as the event variable in views and handlers (controllers). You can decorate it exactly as you sketched out there. So you could have:
event.getRequestId()
event.getCallingService()
(and it already includes things like getting headers out of the java request, etc.).
@Dominic,
I would also say that historically, I've been "OK" with the Controller layer messing with global data since I've always seen the Controller as being "the messy" part of the application - the translation layer that takes all that contextual stuff and translates it into something that is meaningful for the "Application core".
Where I've run into a lot of emotional hurdles is when that same "messiness" leaks down into the "application core". But, that's where I'm starting to re-think my vision.