Ben Nadel at CFUNITED 2008 (Washington, D.C.) with: Joe Rinehart ( @joeRinehart )

Always Identify Your Calling Service Within The HTTP User Agent Header

As part of a large cost-cutting initiative, we recently cancelled our Pingdom uptime monitoring service (which is no knock against the Pingdom service itself). But, when I checked our request logs, I saw that we still had steady traffic hitting our Pingdom end-point from the US, Japan, and Ireland. There was nothing in the request logs to help identify who was initiating these requests. And, after some additional debugging code (that I deployed to production), I saw that the requests were coming from a New Relic Synthetic Monitor. To their credit, New Relic included an X-Abuse-Info HTTP header with more information; however, this felt like a good lesson: your calling service should always be identified in the HTTP user agent header.

I don't know the origin of this practice, but it seems that every single HTTP request logging agent / feature that I've ever encountered defaults to recording the User Agent header of the incoming request. As web developers, we often think of the User Agent as the "browser" making the request (i.e., Chrome, Firefox, Safari, Edge, etc.); however, the User Agent is just the standard way for a client, of any kind, to identify itself to the target server. This includes server-to-server communication.

Now, I'm not saying that you should only use the User Agent header to identify the calling service; but, I am saying that, at a bare minimum, some meaningful identification needs to be included in the User Agent header so that it can be found in most generic logging configurations.

If our calling service is written in ColdFusion / Lucee CFML, we can easily configure the User-Agent header using the CFHttpParam tag:

<cfscript>

	http
		result = "httpResponse"
		method = "head"
		url = "#cgi.http_host#/ua-identity/target.cfm"
		timeout = 5
		{

		// At a bare minimum, your bot / client / monitoring service should be identified
		// as part of the HTTP User Agent. This way, most default logging configurations
		// will provide insight into where these requests are coming from.
		httpparam
			type = "header"
			name = "user-agent"
			value = "Uptime Bot (see X-Uptime-Bot | example.com for more info)"
		;
		// As a measure of enhanced identification, you can provide a secondary HTTP
		// header that provides more low-level detail about the service and / or account
		// that is responsible for the incoming request.
		httpparam
			type = "header"
			name = "X-Uptime-Bot"
			value = "Account ID: 1234 | Adjust monitors using https://example.com/account/1234/monitors"
		;
	}

	echo( "Done" );

</cfscript>

Notice that I'm including two different HTTP headers in the above code: the User-Agent header, which will be logged just about everywhere; and, the X-Uptime-Bot header, which will only be logged if a given service is expecting it and wants to record it.

When we populate the User-Agent header in our outbound request, it populates the cgi.http_user_agent value in the corresponding inbound request. We can see this in our target.cfm page (the end-point being requested via the CFHttp tag above):

<cfscript>

	// Implicitly provided via the `User-Agent` header.
	systemOutput( "REQUEST FROM: #cgi.http_user_agent#", true );

	// For other headers (non User-Agent), we have to explicitly grab the incoming request
	// data and examine the headers collection.
	headers = getHttpRequestData( false ).headers;
	botHeader = "X-Uptime-Bot";

	if ( headers.keyExists( botHeader ) ) {

		systemOutput( "#botHeader#: #headers[ botHeader ]#", true );

	}

</cfscript>

If we now run the first Lucee CFML page and then look at the logs, we see:

[INFO ] REQUEST FROM: Uptime Bot (see X-Uptime-Bot | example.com for more info)
[INFO ] X-Uptime-Bot: Account ID: 1234 | Adjust monitors using https://example.com/account/1234/monitors

As you can see, the User-Agent header clearly identified our requesting service; and, the X-Uptime-Bot header provided more information about the calling service. Even if we hadn't logged the secondary header, the logs would still give us meaningful information.

You might think that this is only helpful when an external service is hitting your ColdFusion application. But, I can't tell you how many times I've had trouble debugging internal service calls—that is, services that we own calling other services that we own.

Internally, we have a standard in which the calling service must be identified in the X-Calling-Service HTTP header. But, this information only gets logged in a context that expects it, which doesn't include layers like our ALBs (Application Load Balancers), ELBs (Elastic Load Balancers), or WAF (Web Application Firewall). If we had established a standard in which both the User-Agent and the X-Calling-Service headers were involved in identifying the calling service, it would have made debugging so much easier.
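
As a rough sketch of what such a dual-header standard might look like in Lucee CFML, a small wrapper function could populate both headers from a single value. To be clear, this is not our actual internal code: the callInternalService() function, the "invoice-worker" service name, and the target URL are all hypothetical and only here for illustration:

<cfscript>

	// HYPOTHETICAL helper (not from our real codebase): I make an HTTP request to
	// another internal service and identify the caller in BOTH the User-Agent header
	// and the X-Calling-Service header.
	function callInternalService(
		required string targetUrl,
		required string callingService
		) {

		http
			result = "local.httpResponse"
			method = "get"
			url = "#targetUrl#"
			timeout = 5
			{

			// The User-Agent is the header that generic layers (load balancers,
			// firewalls, access logs) are most likely to record by default.
			httpparam
				type = "header"
				name = "User-Agent"
				value = "#callingService#"
			;
			// The custom header is only logged by services that know to look for it.
			httpparam
				type = "header"
				name = "X-Calling-Service"
				value = "#callingService#"
			;
		}

		return( local.httpResponse );

	}

	// ASSUMPTION: both the URL and the service name are made-up placeholders.
	callInternalService( "https://example.com/internal/health", "invoice-worker" );

</cfscript>

Sending the same value in both headers is just one possible choice. The point is that the service name travels in the header that generic tooling already records, while the custom header remains available for services that expect it.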

All to say, when you're making a request using the CFHttp tag (or whatever HTTP mechanism your application is using), always identify yourself first in the User-Agent header; and then, as a secondary means of identification, consider populating additional HTTP headers.


Ben Nadel