Earlier this week, Nick G over on the CF-Talk list asked about searching through the content of the files in a given directory. I would say this is a task best performed by something other than ColdFusion... but of course, I am not one to turn down the chance to write some sweet ass ColdFusion code. And so, last night, I wrote this ColdFusion user-defined function (UDF). It takes either a directory path or an array of file paths, along with a phrase to search for, and returns an array of the file paths whose content contains that phrase. The search can be done either as a literal text search or as a regular expression search.
Here is the SearchFiles() ColdFusion UDF:
There is no real magic going on here. The algorithm just loops over the file paths, checks each one against the file extension filter, reads in the content, searches it for the phrase, and then returns all matching file paths. The only difference between a standard search and a regular expression search is that the standard search uses FindNoCase() whereas the regular expression search uses REFindNoCase().
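The steps above can be sketched roughly as follows. This is a minimal reconstruction from the description, not the original listing: the argument names (Path, Criteria, Filter, IsRegex), the default extension list, and the local variable names are all assumptions made for illustration.

```cfm
<!---
	Sketch only. The argument names, the default extension list, and the
	variable names are assumptions; they are not taken from the original UDF.
--->
<cffunction name="SearchFiles" access="public" returntype="array" output="false"
	hint="Returns the paths of the files whose content contains the given phrase.">

	<cfargument name="Path" type="any" required="true" />
	<cfargument name="Criteria" type="string" required="true" />
	<cfargument name="Filter" type="string" required="false" default="cfm,css,htm,html,js,txt,xml" />
	<cfargument name="IsRegex" type="boolean" required="false" default="false" />

	<!--- Keep all working variables local to the function. --->
	<cfset var Paths = ArrayNew( 1 ) />
	<cfset var Matches = ArrayNew( 1 ) />
	<cfset var FileQuery = "" />
	<cfset var FilePath = "" />
	<cfset var Content = "" />
	<cfset var Position = 0 />
	<cfset var i = 0 />

	<!--- Normalize the input into an array of file paths. --->
	<cfif IsArray( ARGUMENTS.Path )>
		<cfset Paths = ARGUMENTS.Path />
	<cfelse>
		<cfdirectory action="list" directory="#ARGUMENTS.Path#" name="FileQuery" />
		<cfloop query="FileQuery">
			<!--- Skip sub-directories; only files get searched. --->
			<cfif FileQuery.type EQ "File">
				<cfset ArrayAppend( Paths, FileQuery.directory & "/" & FileQuery.name ) />
			</cfif>
		</cfloop>
	</cfif>

	<cfloop index="i" from="1" to="#ArrayLen( Paths )#">
		<cfset FilePath = Paths[ i ] />

		<!--- Only read files that exist and whose extension is in the filter list. --->
		<cfif FileExists( FilePath ) AND ListFindNoCase( ARGUMENTS.Filter, ListLast( FilePath, "." ) )>

			<cffile action="read" file="#FilePath#" variable="Content" />

			<!--- Both functions return 0 when nothing is found. --->
			<cfif ARGUMENTS.IsRegex>
				<cfset Position = REFindNoCase( ARGUMENTS.Criteria, Content ) />
			<cfelse>
				<cfset Position = FindNoCase( ARGUMENTS.Criteria, Content ) />
			</cfif>

			<cfif Position>
				<cfset ArrayAppend( Matches, FilePath ) />
			</cfif>

		</cfif>
	</cfloop>

	<cfreturn Matches />
</cffunction>
```

In this sketch, files without a listed extension are skipped before they are read, so the filter also keeps the function from pulling large binary files into memory.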
Here is an example of how to search the current directory:
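A call along these lines would do it. This is a hedged sketch that assumes a SearchFiles() UDF is already defined on the page and that it accepts named arguments called Path and Criteria; those argument names, the search phrase, and the arrFiles variable are hypothetical.

```cfm
<!---
	Assumes SearchFiles() is already defined and accepts the hypothetical
	named arguments Path and Criteria.
--->
<cfset arrFiles = SearchFiles(
	Path = GetDirectoryFromPath( GetCurrentTemplatePath() ),
	Criteria = "cffunction"
	) />

<!--- Output the matching file paths. --->
<cfdump var="#arrFiles#" label="Files containing the phrase" />
```

GetCurrentTemplatePath() returns the full path of the executing template, and GetDirectoryFromPath() strips the file name from it, which leaves the current directory.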
... and here is how you might call it using an array of file paths:
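As a hedged sketch, again assuming a SearchFiles() UDF is already defined with hypothetical named arguments Path, Criteria, and IsRegex: the file names below are placeholders, so swap in files that exist on your server.

```cfm
<!---
	Assumes SearchFiles() is already defined and accepts the hypothetical
	named arguments Path, Criteria, and IsRegex.
--->

<!--- Build the array of file paths (placeholder file names). --->
<cfset arrPaths = ArrayNew( 1 ) />
<cfset ArrayAppend( arrPaths, ExpandPath( "./index.cfm" ) ) />
<cfset ArrayAppend( arrPaths, ExpandPath( "./about.cfm" ) ) />

<!--- Search the listed files using a regular expression. --->
<cfset arrFiles = SearchFiles(
	Path = arrPaths,
	Criteria = "cf(set|param)",
	IsRegex = true
	) />

<cfdump var="#arrFiles#" label="Files matching the pattern" />
```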
So that's that. I am sure there are many ways of doing this in ColdFusion that have already been done, but you know me - I love to reinvent the wheel (no matter what Sean might say - I love getting the machinery firing full blast). One modification that could be neat would be to search the file name itself. This would be an easy modification (perhaps for the next attempt).
Why not just use Verity (or Lucene for the BlueDragon crowd)?
Posted by Christopher Wigginton on Mar 1, 2007 at 4:44 PM
Overhead... and that sort of a demo would be beyond my area of expertise. Plus, Verity requires duplicating data (for the index). This UDF can take random directories / file paths on the fly, and it doesn't require any planning.
Posted by Ben Nadel on Mar 1, 2007 at 5:03 PM
Hi all,
Thanks for the code!!
I tried searching file content using your code and it works well! :)
However, it only works when searching for English text; it cannot find Asian-language content (e.g. Chinese, Japanese, etc.).
Just wondering if it is possible to search for Asian-language content in a file? Would it need any change to the code itself?
Regards,
Ronald
Posted by Ronald Heng on Mar 18, 2007 at 9:10 AM
@Ronald,
I am not sure how you would go about this. My thought, and this is probably NOT the way to go, would be to run regular expression searches and replace each of the foreign extended characters with something like .{1}, which matches exactly one character.
So, for something like "Español", where it has the "n" with the tilde, you would maybe search for "Espa.{1}ol". Of course, this does not guarantee a good match. There has got to be a much better way to do this.
Posted by Ben Nadel on Mar 19, 2007 at 7:36 AM