Ben Nadel at CFUNITED 2009 (Lansdowne, VA) with: George Murphy

Working Around Hot Linking Restrictions


I was testing out my new functions, JREGetNoCase() and JREGet() (which use Java regular expressions to return all matching substrings of a given string), by attempting to grab IMG tags off of random web sites:

<!--- Get the images for this page. --->
<cfset arrImages = JREGetNoCase(
objHttpRequest.FileContent,
"<a[^>]+href=""?([^"">]+)[0-9]+\.jpg""?[^>]*>[\s]*<img[^>]+src=""?([^"">]+)[0-9]+\.jpg""?[^>]*>[\s]*</a>"
) />

This gets all A tags that link to a JPG and have a JPG IMG as the only child element. The functions work perfectly. I am actually totally excited about them. But, as I was dumping out the data, I realized that only some of the images loaded on my page; however, when I pasted a captured IMG source into another browser window, the image loaded just fine.

Very curious. After some research, I see that Apache can block access to files (and maybe IIS can too) based on request header data, among other criteria. It seems that some sites were blocking my image "grabs" because the requests were coming from pages on my site rather than theirs.
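To make the blocking concrete, here is a rough sketch of the kind of referer check a hot-link-protected server might run before serving an image. It is written in CFML purely for illustration; the sites in question were presumably doing this in their web server configuration, and the variable name strReferer is my own assumption, not something from their setup.

```cfm
<!--- Hypothetical hot-link check, for illustration only. --->
<cfset strReferer = CGI.http_referer />

<cfif (
	Len( strReferer ) AND
	NOT FindNoCase( "://" & CGI.server_name, strReferer )
	)>

	<!--- The request came from a page on another site; refuse it. --->
	<cfheader statuscode="403" statustext="Forbidden" />
	<cfabort />

</cfif>

<!--- Otherwise, fall through and serve the image as usual. --->
```

Note that a check like this lets requests with an empty referer through, which would be consistent with the images loading fine when the URL is pasted directly into a new window.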

To get around this, I had to create a separate page that grabs the image binary using a falsified CGI value (the HTTP referer) and streams that binary data to the browser:

<!--- Kill extra output. --->
<cfsilent>

	<!--- Set page settings. --->
	<cfsetting showdebugoutput="false" />

	<!--- Param url variables. --->
	<cfparam name="URL.src" type="string" default="" />

	<!---
		Get the domain of the image by stripping everything after
		the first ".com" or ".net" (other TLDs are not handled).
	--->
	<cfset strDomain = REReplace( URL.src, "(\.(com|net)).+", "\1", "ONE" ) />

	<!--- Grab the source image. --->
	<cfhttp
		url="#URL.src#"
		method="GET"
		useragent="ua"
		getasbinary="yes"
		result="objHttp">

		<!--- Set referrer params. --->
		<cfhttpparam type="CGI" name="http_referer" value="#strDomain#" encoded="false" />
	</cfhttp>

</cfsilent>

<!--- Clear the buffer so that no whitespace precedes the binary data. --->
<cfset GetPageContext().GetOut().ClearBuffer()
/><cfcontent
type="image/jpeg"
variable="#objHttp.FileContent#"
/>

As you can see above, I grab the domain from the actual SRC value, then I pass that domain as the CGI.http_referer for the CFHttp GET. This works like a charm (95% of the time). It doesn't have any error checking, but that could easily be worked in via the StatusCode of the CFHttp return data.
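As a minimal sketch of that error checking, something like the following could stand in for the closing CFContent call. It assumes the objHttp result from the CFHttp call above; the choice to answer a failed grab with a 404 is my own assumption, and a placeholder image would work just as well.

```cfm
<!--- Sketch only: validate the CFHttp result before streaming it. --->
<cfif (
	(Val( objHttp.StatusCode ) EQ 200) AND
	IsBinary( objHttp.FileContent )
	)>

	<!--- The remote server returned the image; stream it. --->
	<cfcontent
		type="image/jpeg"
		variable="#objHttp.FileContent#"
		/>

<cfelse>

	<!--- The grab failed or was blocked; don't stream junk. --->
	<cfheader statuscode="404" statustext="Not Found" />

</cfif>
```

StatusCode comes back as a string such as "200 OK", so Val() is used to pull off the leading number; on a connection failure it evaluates to zero and the request falls through to the 404. On the page that lists the images, each IMG would then point at this proxy page rather than at the remote file. The file name image_proxy.cfm and the variable strImageUrl here are hypothetical:

```cfm
<cfoutput>
	<img src="image_proxy.cfm?src=#URLEncodedFormat( strImageUrl )#" />
</cfoutput>
```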

Reader Comments

Unethical and downright impractical. If you were to hotlink images, it means that you have to put processing and the data transfer time into every single image that you display.

I don't ever see this type of thing being used to "Steal" content but rather to download content such as by an Offline-Explorer / archiving type of application.
