Earlier this week, Nick G over on the CF-Talk list asked about searching through the content of the files in a given directory. I would say this is a task best performed by something other than ColdFusion... but of course, I am not one to turn down the chance to write some sweet ass ColdFusion code. And so, last night, I wrote this ColdFusion user-defined function (UDF). It takes either a directory path or an array of file paths, plus a phrase to search for, and returns an array of the file paths whose contents contain that phrase. The search can be done either as a literal text search or as a regular expression search; both are case-insensitive.
Here is the SearchFiles() ColdFusion UDF:
<cffunction
    name="SearchFiles"
    access="public"
    returntype="array"
    output="false"
    hint="Searches files for the given values. Returns an array of file paths.">

    <!--- Define arguments. --->
    <cfargument
        name="Path"
        type="any"
        required="true"
        hint="This is either a directory path or an array of file paths which we will be searching."
        />

    <cfargument
        name="Criteria"
        type="string"
        required="true"
        hint="The values for which we are searching the file contents."
        />

    <cfargument
        name="Filter"
        type="string"
        required="false"
        default="cfm,css,htm,html,js,txt,xml"
        hint="List of file extensions that we are going to allow."
        />

    <cfargument
        name="IsRegex"
        type="boolean"
        required="false"
        default="false"
        hint="Flags whether or not the search criteria is a regular expression."
        />

    <!--- Define the local scope. --->
    <cfset var LOCAL = StructNew() />

    <!---
        Check to see if we are dealing with a directory path.
        If we are, we want to get the files in that directory
        and convert them to an array of file paths.
    --->
    <cfif IsSimpleValue( ARGUMENTS.Path )>

        <!---
            Get everything in the given directory. The listing
            can include sub-directories, so we filter those out
            below when we build the array of file paths.
        --->
        <cfdirectory
            action="LIST"
            directory="#ARGUMENTS.Path#"
            name="LOCAL.FileQuery"
            filter="*.*"
            />

        <!---
            Now that we have the query, we want to create an
            array of the file paths.
        --->
        <cfset LOCAL.Paths = ArrayNew( 1 ) />

        <!--- Loop over the query and set up the values. --->
        <cfloop query="LOCAL.FileQuery">

            <!--- Only keep files; skip any directories. --->
            <cfif (LOCAL.FileQuery.type EQ "File")>

                <!--- NOTE: This uses the Windows path separator. --->
                <cfset ArrayAppend(
                    LOCAL.Paths,
                    (LOCAL.FileQuery.directory & "\" & LOCAL.FileQuery.name)
                    ) />

            </cfif>

        </cfloop>

    <cfelse>

        <!---
            For consistency's sake, just store the path argument
            in our local paths value so that we can refer to this
            and the query route the same way (see above).
        --->
        <cfset LOCAL.Paths = ARGUMENTS.Path />

    </cfif>

    <!---
        ASSERT: At this point, whether we were passed a directory
        path or an array of file paths, we now have an array of
        file paths to search in the variable LOCAL.Paths.
    --->

    <!---
        Create an array in which we will store the file paths
        that had matching criteria.
    --->
    <cfset LOCAL.MatchingPaths = ArrayNew( 1 ) />

    <!---
        Clean up the filter to be used in a regular expression.
        We are going to turn the list into an OR reg ex.
    --->
    <cfset ARGUMENTS.Filter = ARGUMENTS.Filter.ReplaceAll(
        "[^\w\d,]+",
        ""
        ).ReplaceAll(
        ",",
        "|"
        ) />

    <!--- Loop over the file paths in our paths array. --->
    <cfloop
        index="LOCAL.PathIndex"
        from="1"
        to="#ArrayLen( LOCAL.Paths )#"
        step="1">

        <!---
            Get a short hand to the current path. This is not
            necessary but just makes referencing the path easier.
        --->
        <cfset LOCAL.Path = LOCAL.Paths[ LOCAL.PathIndex ] />

        <!---
            Check to see if this file path is allowed. Either we
            have no file filters, or we do and this file ends
            with a dot followed by one of the allowed extensions.
        --->
        <cfif (
            (NOT Len( ARGUMENTS.Filter )) OR
            REFindNoCase( "\.(#ARGUMENTS.Filter#)$", LOCAL.Path )
            )>

            <!---
                This is a file that we can use. Read in the
                contents of the file.
            --->
            <cffile
                action="READ"
                file="#LOCAL.Path#"
                variable="LOCAL.FileData"
                />

            <!---
                Check to see what kind of search we are doing. Is
                it a straight-up value search or is it a regular
                expression search?
            --->
            <cfif (
                    (
                        ARGUMENTS.IsRegex AND
                        REFindNoCase( ARGUMENTS.Criteria, LOCAL.FileData )
                    )
                OR
                    (
                        (NOT ARGUMENTS.IsRegex) AND
                        FindNoCase( ARGUMENTS.Criteria, LOCAL.FileData )
                    )
                )>

                <!---
                    This is a good file path. Add it to the list
                    of successful file paths.
                --->
                <cfset ArrayAppend( LOCAL.MatchingPaths, LOCAL.Path ) />

            </cfif>

        </cfif>

    </cfloop>

    <!--- Return the array of matching file paths. --->
    <cfreturn LOCAL.MatchingPaths />

</cffunction>
As you can see, there is no real magic going on here. The algorithm loops over the file paths, checks each one against the file extension filter, reads in the content, and searches it for the phrase; the function then returns all matching file paths. The only difference between a standard search and a regular expression search is that the standard search uses FindNoCase() whereas the regular expression search uses REFindNoCase().
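To make that difference concrete, here is a small sketch of the two function calls side by side. The sample string and the variable names are made up for illustration; they are not part of the UDF.

```cfml
<!--- Hypothetical sample content; not read from any real file. --->
<cfset strContent = "She pondered. Then she LICKED the spoon." />

<!---
    Literal search: the criteria is matched character for
    character, ignoring case. Returns the position of the
    match, or 0 if there is none.
--->
<cfset intLiteral = FindNoCase( "she licked", strContent ) />

<!---
    Regex search: the criteria is interpreted as a pattern,
    so the alternation can match any of the three verbs.
--->
<cfset intRegex = REFindNoCase(
    "she (pondered|licked|kissed)",
    strContent
    ) />

<!---
    The same pattern treated as a literal phrase. The text
    "she (pondered|licked|kissed)" does not appear in the
    content, so this comes back as 0.
--->
<cfset intPatternAsLiteral = FindNoCase(
    "she (pondered|licked|kissed)",
    strContent
    ) />

<cfoutput>
    Literal match found: #YesNoFormat( intLiteral )#<br />
    Regex match found: #YesNoFormat( intRegex )#<br />
    Pattern-as-literal match found: #YesNoFormat( intPatternAsLiteral )#
</cfoutput>
```

The flip side is worth keeping in mind too: with IsRegex turned on, a phrase like "index.cfm" is a pattern in which the dot matches any character, so it will match more than the literal file name.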
Here is an example of how to search the current directory:
<!--- Search entire directory. --->
<cfset arrMatchingPaths = SearchFiles(
    Path = ExpandPath( "./" ),
    Criteria = "she pondered"
    ) />
... and here is how you might call it using an array of file paths:
<!--- Create an array of file paths to search. --->
<cfset arrPaths = ArrayNew( 1 ) />

<!--- Add paths to the array. --->
<cfset ArrayAppend( arrPaths, ExpandPath( "./file_search_data.htm" ) ) />
<cfset ArrayAppend( arrPaths, ExpandPath( "./file_search_data.html" ) ) />
<cfset ArrayAppend( arrPaths, ExpandPath( "./file_search_data.txt" ) ) />

<!--- Search given files for the regular expression match. --->
<cfset arrMatchingPaths = SearchFiles(
    Path = arrPaths,
    Criteria = "she (pondered|licked|kissed)",
    Filter = "txt",
    IsRegex = true
    ) />
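Since the function hands back a plain array of paths, doing something with the results is just a loop. Here is a minimal sketch, assuming arrMatchingPaths holds the result of either call above; the cfparam fallback and the loop index name intI are my own additions so the snippet can run on its own.

```cfml
<!---
    Assumption: arrMatchingPaths was set by a SearchFiles()
    call. If it was not, fall back to an empty array so that
    this snippet still runs.
--->
<cfparam
    name="arrMatchingPaths"
    type="array"
    default="#ArrayNew( 1 )#"
    />

<cfif ArrayLen( arrMatchingPaths )>

    <cfoutput>

        <!--- Output just the file name portion of each path. --->
        <cfloop
            index="intI"
            from="1"
            to="#ArrayLen( arrMatchingPaths )#"
            step="1">

            #HTMLEditFormat( GetFileFromPath( arrMatchingPaths[ intI ] ) )#<br />

        </cfloop>

    </cfoutput>

<cfelse>

    No files matched.

</cfif>
```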
So that's that. I am sure there are many ways of doing this in ColdFusion that have already been done, but you know me - I love to reinvent the wheel (no matter what Sean might say - I love getting the machinery firing full blast). One modification that could be neat would be to search the file name itself. This would be an easy modification (perhaps for the next attempt).
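As a rough idea of what that file name modification might look like, here is a sketch of a helper that runs the same kind of check against just the file name. FileNameMatches() is a hypothetical name of my own, not part of the UDF above, and this is only one way it could be done.

```cfml
<cffunction
    name="FileNameMatches"
    access="public"
    returntype="boolean"
    output="false"
    hint="Hypothetical helper: checks the file name portion of a path against the criteria.">

    <!--- Define arguments. --->
    <cfargument name="Path" type="string" required="true" />
    <cfargument name="Criteria" type="string" required="true" />
    <cfargument name="IsRegex" type="boolean" required="false" default="false" />

    <!--- Strip the directory so we only test the file name. --->
    <cfset var FileName = GetFileFromPath( ARGUMENTS.Path ) />

    <!--- Same split as the content search: regex vs. literal. --->
    <cfif ARGUMENTS.IsRegex>
        <cfreturn (REFindNoCase( ARGUMENTS.Criteria, FileName ) GT 0) />
    </cfif>

    <cfreturn (FindNoCase( ARGUMENTS.Criteria, FileName ) GT 0) />

</cffunction>

<!--- Example usage with a made-up path. --->
<cfset blnMatch = FileNameMatches(
    Path = ExpandPath( "./file_search_data.txt" ),
    Criteria = "search_data"
    ) />
```

Inside SearchFiles(), a check like this could be OR'd with the content check. Running it before the CFFile read would also let a name match skip reading the file altogether.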