Working Around Hot Linking Restrictions

By Ben Nadel

Published 2006-04-27 in ColdFusion — Comments (4)

I was testing out my new functions, JREGetNoCase() and JREGet() (which use Java regular expressions to return all matching substrings of a given string), by attempting to grab IMG tags off of random web sites:

<cfset arrImages = JREGetNoCase(
	objHttpRequest.FileContent,
	"<a[^>]+href=""?([^"">]+)[0-9]+\.jpg""?[^>]*>[\s]*<img[^>]+src=""?([^"">]+)[0-9]+\.jpg""?[^>]*>[\s]*</a>"
	) />

This gets all A tags that have an IMG as the only child element. The functions work perfectly. I am actually totally excited about them. But, as I was dumping out the data, I realized that only some of the images worked on my page; however, when I pasted the captured IMG source into another window, the image loaded just fine.
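As an aside, the JREGetNoCase() function itself is not shown in this post. The following is only a minimal sketch of how a function like it might lean on Java regular expressions from ColdFusion. The name SketchJREGetNoCase and its argument names are hypothetical, and this is not necessarily how the real function is written:

```cfm
<cffunction name="SketchJREGetNoCase" access="public" returntype="array" output="false"
	hint="Hypothetical stand-in for JREGetNoCase(): returns every case-insensitive match.">

	<cfargument name="Text" type="string" required="true" />
	<cfargument name="Pattern" type="string" required="true" />

	<!--- The array of matched substrings that gets returned. --->
	<cfset var arrMatches = ArrayNew( 1 ) />

	<!--- Handle on the Java Pattern class (no init() needed for static members). --->
	<cfset var objPatternClass = CreateObject( "java", "java.util.regex.Pattern" ) />

	<!--- Compile the pattern as case-insensitive and point a matcher at the text. --->
	<cfset var objMatcher = objPatternClass.compile(
		JavaCast( "string", ARGUMENTS.Pattern ),
		JavaCast( "int", objPatternClass.CASE_INSENSITIVE )
		).matcher( JavaCast( "string", ARGUMENTS.Text ) ) />

	<!--- Each successful find() moves the matcher to the next matching substring. --->
	<cfloop condition="objMatcher.find()">
		<cfset ArrayAppend( arrMatches, objMatcher.group() ) />
	</cfloop>

	<cfreturn arrMatches />
</cffunction>
```

A case-sensitive version would presumably be the same thing with the single-argument form of compile().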

Very curious. After some research, I see that Apache (and maybe IIS too) can block access to files based on header data (among other criteria). It seems that some sites were blocking my file "grabs" since the image requests were coming from my site rather than from their own pages.

To get around this, I had to create a separate page that would grab the image binary using a falsified CGI value (HTTP referer) and stream that binary data to the browser:

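<cfsilent>

	<!--- Keep debug output out of the response. --->
	<cfsetting showdebugoutput="false" />

	<!--- The URL of the remote image to grab. --->
	<cfparam name="URL.src" type="string" default="" />

	<!--- Trim the image URL down to its domain (only handles .com and .net). --->
	<cfset strDomain = REReplace( URL.src, "(\.(com|net)).+", "\1", "ONE" ) />

	<!--- Grab the image binary, sending the image's own domain as the referer. --->
	<cfhttp
		url="#URL.src#"
		method="GET"
		useragent="ua"
		getasbinary="yes"
		result="objHttp">

		<cfhttpparam type="CGI" name="http_referer" value="#strDomain#" encoded="false" />
	</cfhttp>

</cfsilent>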

As you can see above, I grab the site domain from the actual SRC value, then I set that domain as the CGI.http_referer for the CFHttp GET. This works like a charm (95% of the time). It doesn't have any error checking, but that could easily be worked in via the StatusCode of the CFHttp result.
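The snippet above stops at the CFHttp call, so the streaming step and the error checking are not shown. Here is a minimal sketch of what they might look like, assuming the code sits directly after the closing CFSilent tag on the same page. The 404 fallback is my own assumption for illustration, not something the original page did:

```cfm
<!--- Sketch only: meant to follow the closing CFSilent tag of the page above. --->
<cfif (
	IsBinary( objHttp.FileContent ) AND
	( Val( objHttp.StatusCode ) EQ 200 )
	)>

	<!--- Stream the image bytes using the MIME type the remote server reported. --->
	<cfcontent type="#objHttp.MimeType#" variable="#objHttp.FileContent#" />

<cfelse>

	<!--- Assumption: a bare 404 is good enough when the grab fails. --->
	<cfheader statuscode="404" statustext="Not Found" />

</cfif>
```

The StatusCode value comes back as a string such as "200 OK", so Val() is used to pull out the leading number. If the connection fails outright, FileContent is not binary and the page falls through to the 404. The calling page would then point its IMG tags at this page, passing the original image URL (run through URLEncodedFormat()) as the src query string parameter.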

Short link: https://bennadel.com/33

Reader Comments

Steven Levithan Feb 3, 2007 at 1:14 AM

Hmm, that's somewhat unethical, since people disable hotlinking for a reason (bandwidth costs, etc.).

Steven Levithan Feb 3, 2007 at 1:16 AM

Also, the regexes could be improved. ;-)

Ben Nadel Feb 4, 2007 at 12:48 PM

Unethical and downright impractical. If you were to hotlink images this way, you would have to put the processing and the data transfer time into every single image that you display.

I don't ever see this type of thing being used to "Steal" content but rather to download content such as by an Offline-Explorer / archiving type of application.

Gsm Lobby Mar 21, 2008 at 12:24 PM

thanks for the code.
