Java Exploration in ColdFusion: java.io.LineNumberReader

Posted September 18, 2006 at 11:38 AM by Ben Nadel

Tags: ColdFusion

As always, I am trying to learn more about the Java libraries that live underneath the surface of ColdFusion MX 7. One class caught my eye: LineNumberReader. This is a utility that reads one line of data at a time from a file while keeping track of the line number of the data it has just read. It considers lines to be separated by \n, \r, or \r followed by \n.
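To make the line-terminator rule concrete, here is a minimal sketch in plain Java (not CFML) that feeds the reader a string mixing all three terminator styles. The class name, method name, and sample text are my own for illustration; only LineNumberReader, StringReader, readLine(), and getLineNumber() come from the JDK.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original post.
class LineTerminatorDemo {

    // Reads every line from the text and prefixes it with the line
    // number that the reader reports right after that read.
    static List<String> numberLines(String text) {
        List<String> numbered = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                numbered.add(reader.getLineNumber() + ") " + line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return numbered;
    }

    public static void main(String[] args) {
        // Sample text using \n, \r, and \r\n as separators.
        for (String entry : numberLines("one\ntwo\rthree\r\nfour")) {
            System.out.println(entry);
        }
    }
}
```

Run as-is, this should print the four lines numbered 1 through 4, with the \r\n pair counted as a single line break rather than two.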

Now, when would you use something like this? Why not just read in an entire file and split it on line breaks? It's all about memory usage. Reading an entire file into RAM can slow the machine down considerably, and with a large enough file you might even run out of memory, at which point I am sure bad things happen. By using a line reader, you only hold a small piece of the file in memory at a time. It's not going to be as fast as reading in the whole file and breaking it up, but it's going to be a lot nicer on the overall system.

Let's take a look at an example.

<!--- Create the line reader. --->
<cfset objLineReader = CreateObject(
    "java",
    "java.io.LineNumberReader"
    ).Init(

    <!--- Buffered Reader. --->
    CreateObject(
        "java",
        "java.io.BufferedReader"
        ).Init(

        <!--- File reader. --->
        CreateObject(
            "java",
            "java.io.FileReader"
            ).Init(

            <!--- File path. --->
            JavaCast( "string", ExpandPath( "./data.txt" ) )

            )

        )

    ) />


<!--- Get first line. --->
<cfset REQUEST.LineData = objLineReader.ReadLine() />

<!--- Loop while we still have line data. --->
<cfloop condition="StructKeyExists( REQUEST, 'LineData' )">

    <!--- Get the line number. --->
    <cfset intLineNumber = objLineReader.GetLineNumber() />

    <!--- Output line data. --->
    #intLineNumber#) #REQUEST.LineData#<br />

    <!--- Read the next line. --->
    <cfset REQUEST.LineData = objLineReader.ReadLine() />

</cfloop>

As you can see, at the center of it all, we are creating a FileReader instance. This is going to get the actual data from the file we specify (in this case, "data.txt"). Then, we wrap the FileReader in a BufferedReader. The buffered reader makes data retrieval from the file much more efficient by bulk loading file data then passing back bits of pre-read data. It only goes back to the file itself when it runs out of loaded data to return. Then, we wrap the BufferedReader in the LineNumberReader.
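For comparison, here is a rough plain-Java sketch of the same reader chain. In the JDK, LineNumberReader is itself a subclass of BufferedReader, so this version hands the FileReader straight to the LineNumberReader and still gets buffered reads. The class name, the writeSample() helper, and the temp file standing in for data.txt are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the original post.
class ReaderChainDemo {

    // True when LineNumberReader inherits from BufferedReader.
    static boolean extendsBufferedReader() {
        return BufferedReader.class.isAssignableFrom(LineNumberReader.class);
    }

    // Counts lines by wrapping the FileReader directly in a
    // LineNumberReader, with no separate BufferedReader in between.
    static int countLines(String filePath) {
        try (LineNumberReader reader = new LineNumberReader(new FileReader(filePath))) {
            while (reader.readLine() != null) {
                // Nothing to do here; the reader keeps the count.
            }
            return reader.getLineNumber();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Illustration-only helper: writes the text to a temp file and
    // returns its path.
    static String writeSample(String text) {
        try {
            Path sample = Files.createTempFile("data", ".txt");
            Files.write(sample, text.getBytes());
            return sample.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String path = writeSample("first\nsecond\nthird\n");
        System.out.println("Extends BufferedReader: " + extendsBufferedReader());
        System.out.println("Lines counted: " + countLines(path));
    }
}
```

The try-with-resources block also closes the reader (and the file handle underneath it) when the loop finishes, which is worth mirroring with a Close() call in long-running ColdFusion code.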

Looping over the lines in the data can be a bit confusing if you don't understand how ColdFusion handles NULL values passed back from Java. In the example above, you will see that we read the first line into a REQUEST-scoped variable, LineData. We then keep reading lines until the key "LineData" no longer exists in the REQUEST scope. This might seem very odd, but it is how a lot of readers will work in ColdFusion (such as reading in ZIP entries from a ZipInputStream). The LineNumberReader keeps returning lines until it reaches the end of the data, at which point it returns NULL. Since ColdFusion doesn't have a NULL data type, it attempts to create a NULL value by just destroying the variable reference itself. So, it keeps reading data, then hits a NULL, and as a result, it strips the variable "LineData" right out of the REQUEST scope.
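On the Java side, the same loop is just a null check. The sketch below (plain Java, with a hypothetical class and method name) mirrors the read-first, loop, read-next shape of the ColdFusion example and shows that an empty line comes back as an empty string, not as null, so blank lines in the middle of a file do not end the loop early.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the original post.
class EndOfStreamDemo {

    // Returns how many lines were read before readLine() returned null.
    static int linesBeforeNull(String text) {
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text))) {
            int count = 0;

            // Read the first line, then loop until the null sentinel.
            String line = reader.readLine();
            while (line != null) {
                count++;
                line = reader.readLine();
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The blank second line is returned as "" and still counted.
        System.out.println(linesBeforeNull("a\n\nb\n"));

        // Empty input: the very first read already returns null.
        System.out.println(linesBeforeNull(""));
    }
}
```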

So, nothing special here, just a little example of how something like that will work.




Reader Comments

Sep 18, 2006 at 10:01 PM
304 Comments

You should wrap this baby up into a simple UDF and submit it to cflib. :)


Sep 19, 2006 at 9:38 AM
10,638 Comments

Which part would be in the UDF? The creating of the line number reader object? What were you envisioning?


Apr 15, 2009 at 3:20 PM
7 Comments

So, I am attempting to use this method for importing a file that is about 1MB and is just over 34,000 lines and I run out of memory every time. Any advice? If you'd like to see my code, you can check it out here:

http://pastie.org/447660


Apr 15, 2009 at 4:51 PM
10,638 Comments

@Luke,

I am not sure that forcing garbage collection actually does anything within a single request processing. I don't think it can because the garbage collector can't be sure that the given value isn't going to be referred to later down in the code perhaps? I think the actual request needs to finish executing before GC works as you intend it to (just a theory).

That said, if the file is only 1MB, you really shouldn't be running out of memory! That's really not a large file. What happens if you simply read in the entire file at one time and then break it up by line break:

<cfset arrLines = ListToArray( fileData, "#Chr( 13 )##Chr( 10 )#" )>

That might not run out of memory? Or, are you already having memory issues?


Aug 7, 2009 at 10:55 AM
2 Comments

I see this post is almost 3 years old but all the same, I'm wondering why you nested the BufferedReader inside the LineNumberReader; isn't LineNumberReader an extension of BufferedReader? It seems like it's an extra step. Perhaps it's just preference but wouldn't something like this be just as good, if not a tad more efficient:

<cfscript>
a = createObject('java', 'java.io.FileReader').init('file_n_path.name');
b = createObject('java', 'java.io.LineNumberReader').init(a);
</cfscript>


Sep 6, 2009 at 2:58 PM
10,638 Comments

@Nick,

I think you might be correct. Nice call!


GB
Jan 9, 2012 at 12:13 PM
1 Comments

I get this error (CF 8).. Any tips?

Object Instantiation Exception.
An exception occurred when instantiating a Java object. The class must not be an interface or an abstract class. Error: ''.

1 : <!--- File path. --->
22 : JavaCast( "string", ExpandPath( "./data.txt" ) )

