I am getting errors when I try to grab Google results with CFHttp. But when I go to the page with my browser, it works just fine. What am I doing wrong?
You are not doing anything wrong. Google wants to be used by regular web users, and CFHttp does not announce itself as one. When you do a CFHttp page grab, it passes along a non-browser value as its User Agent. I have not checked offhand, but I believe it sends "ColdFusion" by default. A regular CFHttp request to Google will return this error:
Your client does not have permission to get URL /search?hl=en&lr=&q=Girls+Gone+Wild&btnG=Search from this server.
This restriction is there for a reason: you might be violating the Google Terms of Service (I have not read them, nor do I condone working around this). If you want to get past it, you can trick Google into thinking you ARE a web browser by sending a standard browser user agent through the useragent attribute of CFHttp.
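As a minimal sketch of that kind of request: the search URL, the exact Firefox user agent string, and the variable name googleResponse below are illustrative assumptions, so swap in your own values.

```cfm
<!---
	Sketch only. The URL, the user agent string, and the result
	variable name are assumptions made for illustration.
--->
<cfhttp
	url="http://www.google.com/search?hl=en&q=coldfusion"
	method="get"
	useragent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11"
	result="googleResponse"
	/>

<!--- Output the status line and the first part of the returned HTML. --->
<cfoutput>
	<p>#htmlEditFormat( googleResponse.statusCode )#</p>
	<pre>#htmlEditFormat( left( googleResponse.fileContent, 500 ) )#</pre>
</cfoutput>
```

The only change from a plain CFHttp call is the useragent attribute. Checking statusCode before parsing fileContent is a cheap way to see whether the request was still refused.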
Notice that I am sending the Firefox / Mozilla user agent. This should work just fine. But again, I am not aware of the legality of such an action, so proceed with caution.
Great post on pulling Google search results pages. Now if I could only figure out how to get it to only pull the result for a particular site's listing. Trying to analyze the different result text for one domain for different keyword searches.
Posted by Dr Adam on Dec 9, 2007 at 9:57 PM
@Dr. Adam,
Just put "site:yourdomain.com" in the Google query and it should only pull results for that particular site.
Posted by Ben Nadel on Dec 10, 2007 at 7:06 AM
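For anyone combining the two ideas, here is a rough sketch of a site-restricted query sent through CFHttp. The keywords, the domain example.com, the user agent string, and the variable names are all hypothetical placeholders, not values from the post.

```cfm
<!--- Hypothetical values, for illustration only. --->
<cfset keywords = "coldfusion cfhttp" />
<cfset targetDomain = "example.com" />

<!--- URL-encode the whole query so the colon and the spaces survive. --->
<cfset searchQuery = urlEncodedFormat( "site:#targetDomain# #keywords#" ) />

<cfhttp
	url="http://www.google.com/search?hl=en&q=#searchQuery#"
	method="get"
	useragent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11"
	result="siteResponse"
	/>

<!--- Confirm the request went through before parsing the HTML. --->
<cfoutput>#htmlEditFormat( siteResponse.statusCode )#</cfoutput>
```

Looping this over a list of keywords with the same targetDomain would give one result page per keyword to compare, with the same Terms of Service caution as above.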