Skip to main content
Ben Nadel at CFUNITED 2008 (Washington, D.C.) with: Yaron Kohn
Ben Nadel at CFUNITED 2008 (Washington, D.C.) with: Yaron Kohn

Reading A Microsoft Excel File In ColdFusion Using JExcel API

By on
Tags:

I just started playing around with the JExcel Java API created by JNIWrapper. The first thing I have done is just read in the data from an Excel file. This is pretty neat stuff. But, before I get into it, let me just touch upon the JExcel Java files. When you download the JExcel Java files, they come as a JAR file. From what I have been told, this is basically a "zipped" file of Java classes. Now, for those of you who follow my blog you know that I LOVE portability and I hate going into the ColdFusion Admin. So, this example is done using a URL Class Loader and a local JAR file that is meant for use with this example only. If you want to see how the Url Class Loader works, it is based on the JavaLoader.cfc by Mark Mandel and you can see my code here. The bottom line here is that this can be done even if you don't have access to the ColdFusion Admin or the ability to update ColdFusion's class paths.

Ok, so onto the example. I have a simple excel file that has a few columns and rows of data. This file is known as the Workbook. The workbook contains sheets (those tabs at the bottom of your excel file). This example reads each sheet into its own query object and each of those query objects is stored in a one-dimension array which will represent the workbook as a whole.

Let's read us in some Excel data:

<!---
	Create the uninitialized Workbook class. Remember, this
	method CreateJExcelObject() just uses a Class Loader to
	load the Workbook class from the locally stored JAR file.
	This method automatically prepends "jxl." to all class
	calls, so do NOT include it in this method call.
--->
<cfset objWorkbook = CreateJExcelObject( "Workbook" ) />

<!---
	Get the workbook from the given file. This workbook
	may contain multiple sheets.
--->
<cfset objWorkbook = objWorkbook.GetWorkbook(
	CreateObject( "java", "java.io.File" ).Init(
		ExpandPath( "./test.xls" )
		)
	) />


<!---
	Create an object to keep track of all the sheet / cell
	data. We are, essentially going to store the excell
	sheets in an array and the data in a query.
--->
<cfset arrWorkbook = ArrayNew( 1 ) />

<!---
	Loop over the number of sheets in the workbook. We need to
	get the number of sheets in the workbook. Be careful! The
	GetSheets() method doesn't return a number (like the later
	used GetColumns() or GetRows()); instead, it returns an
	array of actual jxl.Sheet objects.
--->
<cfloop
	index="intSheet"
	from="1"
	to="#ArrayLen( objWorkbook.GetSheets() )#"
	step="1">


	<!---
		Create a blank query for this excel data sheet.
		We could create the number of columns right now,
		but I think it will be easier to add columns as
		we go.
	--->
	<cfset qData = QueryNew( "" ) />

	<!--- Store this data query into the workbook array. --->
	<cfset ArrayAppend( arrWorkbook, qData ) />


	<!---
		Get the current sheet. Remember that since we are
		getting the sheet by index and we are getting it
		through a JAVA method, we need to use zero-based
		indexes (unlike ColdFusion which is one-based).
	--->
	<cfset objSheet = objWorkbook.GetSheet(
		JavaCast( "int", (intSheet - 1) )
		) />


	<!---
		Loop over the columns and rows. As we loop over the
		columns, we are going to add a column to the data
		query and then add the row values. This should allow
		us to easily create the computed names such as
		COLUMN1, COLUMN2, COLUMN3.
	--->
	<cfloop
		index="intColumn"
		from="1"
		to="#objSheet.GetColumns()#"
		step="1">

		<!---
			Add the column to the query. We are going to add
			all the columns as text since I have no idea what
			kind of data there will be. We can always parse
			it into a data type later.
		--->
		<cfset QueryAddColumn(
			qData,
			("COLUMN" & intColumn),
			"CF_SQL_VARCHAR",
			ArrayNew( 1 )
			) />

		<!---
			We need to add rows. However, since we might be
			adding a lot of data, let's resize the query for
			faster processing. We ONLY want to do this the
			FIRST time. Then, the rest of the times, we can
			simply add data to the existing query cells.
		--->
		<cfif (intColumn EQ 1)>

			<!---
				Resize the query based on the number of rows
				in the excel file.
			--->
			<cfset QueryAddRow(
				qData,
				objSheet.GetRows()
				) />

		</cfif>


		<!---
			Now that we have the appropriate column added to the
			query and we have the correct number of rows that we
			will need, let's loop over the rows and move the
			cell data into the data query.
		--->
		<cfloop
			index="intRow"
			from="1"
			to="#objSheet.GetRows()#"
			step="1">

			<!--- Set the query data. --->
			<cfset qData[ "COLUMN#intColumn#" ][ intRow ] = JavaCast(
				"string",
				objSheet.GetCell(
					JavaCast( "int", intColumn - 1 ),
					JavaCast( "int", intRow - 1 )
					).GetContents()
				) />

		</cfloop>

	</cfloop>

</cfloop>


<!---
	When we have finished processing all the cells, use the
	close() method. This frees up any allocated memory used
	when reading spreadsheets and is particularly important
	when reading large spreadsheets.
--->
<cfset objWorkbook.Close() />

If you dump out the arrWorkbook array, you will see that it has three indexes with the appropriate queries:

CFDump - JExcel API Workbook

Now, this is a SLOW process. This has a minimal amount of data and takes 5 to 6 seconds to load. Yeah, I said SLOW. However, it seems to scale nicely. If I add a good number of columns and records, the parse time only jumped to about 9 seconds. It looks like the overhead comes mostly from actually opening the excel file via Java. Reading the data from the Excel file seems to scale nicely. This is based on LIMITED testing!

Anyway, the JExcel API seems like it could be very cool to do a number of things including writing Excel files. I have just begun to research, but I can already see the possibilities.

Want to use code from this post? Check out the license.

Reader Comments

15,674 Comments

Gus,

Thanks for the link. I will check it out. It's funny, though, clicking on your link I read:

"If you're merely reading spreadsheet data, then use the eventmodel api in the org.apache.poi.hssf.eventusermodel package."

That just seems funny. The JExcel stuff is mad simple like "jxl.Workbook". Why is Jakarta all about using crazy long complicated names?? Anyway, I will give it a looking into. Thanks.

4 Comments

WOW, thanks I was looking into purchasing some code that did the exact same thing. I will check this out first.

153 Comments

I can second the POI/HSSF recommendation. I've been using it at work for about 6 months to read and generate native Excel documents. It doesn't seem to have the 5-6 second lag that you mentioned, and it is mostly logical.

fileOut = createObject("java","java.io.FileOutputStream").init(Arguments.Filename);
wb = createObject("java","org.apache.poi.hssf.usermodel.HSSFWorkbook").init();
sheet = wb.createSheet(Arguments.Sheetname);
// Add the column headers
row = sheet.createRow(0);
style = wb.createCellStyle();
style.setBottomBorderColor(createObject("java","org.apache.poi.hssf.util.HSSFColor$BLACK").getIndex());
style.setTopBorderColor(createObject("java","org.apache.poi.hssf.util.HSSFColor$BLACK").getIndex());
style.setBorderBottom(style.BORDERTHIN);
style.setBorderTop(style.BORDER
THIN);
font = wb.createFont();
font.setBoldweight(font.BOLDWEIGHTBOLD);
style.setFillPattern(style.SOLID
FOREGROUND);
style.setFillForegroundColor(createObject("java","org.apache.poi.hssf.util.HSSFColor$LIGHTCORNFLOWERBLUE").getIndex());
style.setFillBackgroundColor(createObject("java","org.apache.poi.hssf.util.HSSFColor$GREY25PERCENT").getIndex());
style.setFont(font);
style.setWrapText(true);

Blah, blah, blah.

15,674 Comments

Rick,

I will take a look. Right now, all I can really see myself doing is reading and writing simple files. It looks like the POI stuff has some great formatting issues. Let me get a feel for both and then I can make an educated choice.

The Jakarta one just has crazy class names that I am having trouble dealing with. I mean:

org.apache.poi.hssf.util.HSSFColor$LIGHTCORNFLOWERBLUE

Is there nothing they could have done to make that easier to type? Granted I have very little Java experience, so I have no idea what the language forces you to do, but this just seems silly.

6 Comments

I'd be curious to see if you still get a lag if you put the jar file in your CF classpath, and invoke it directly, without the classloader.

15,674 Comments

That might be the issue regarding the initial delay. However, once the Java object is loaded, there should be no difference when it comes to reading the actual Excel file itself right?

My co-worker is actually the one who showed me the JExcel stuff (he just started working with it). He uses the mapped path. I will check with him to see about any lag time.

15,674 Comments

Dave,

That looks like a really awesome overview of the functionality. Dynamite post! Once I start looking into the Jakarta solution I will definitely go through your stuff with a fine-toothed comb. Thanks!

15,674 Comments

Mike,

I am working with Excel 2003 I think. However, if I had to work with XML, would it require the end user to export the excel document as an XML before they put it in the system (what ever system is there to handle the Excel data)? If so, it still requires an extra step. If I was gonna do that, why not just go to CSV. The hope here was to allow the client to perform actions without thinking.

But i have to say, I am very curious now about this XML you speak of. I have tried to work with the XML that Word 2003 produces, but it was a NIGHTMARE. Is the 2007 stuff much cleaner?

3 Comments

I am trying to read an excel file with your code and I am getting this error

jxl.Workbook

The error occurred in D:\users\ixClients\payquiq\scribble.cfm: line 107

105 : ("jxl." & ARGUMENTS.Class),
106 : JavaCast( "boolean", false ),
107 : LOCAL.ClassLoader
108 : )
109 : />

any thoughts?

15,674 Comments

Matt,

I was getting that error for a while and was able to fix it. I can't quite remember what it was. I think I was getting that when I was using the Class Loader to load the workbook class. The problem is that it cannot do that since it needs to be able to call the "new" operator.

That is why I switched over to the Java Proxy object that Mark Mandel demonstrated. See this code:

<cfreturn
CreateObject( "java", "coldfusion.runtime.java.JavaProxy" ).Init(
LOCAL.ClassLoader.LoadClass( ("jxl." & ARGUMENTS.Class) )
) />

This uses the Java Proxy to take the class loader and then load the class without initializing it.

6 Comments

Coldfusion uses a ClassLoader when it starts up and that is how CF loads up java so why are you using a CFC to do the job? Why not just use the
same Classloader already installed and in use?

15,674 Comments

Dan,

To use ColdFusion's ClassLoader, I would have to have the JExcel JAR file installed in ColdFusion's class paths. I however, do not have access to such areas of our setup.

6 Comments

All Classloaders work exactly the same. They all come from one root URLClassloader so it does not matter who's class it is. I showed you and other people on CF Talk working examples but somehow the examples did not take. ColdFusion has a specialized object for URLClassloading and it can be used just like any URLClassloader at runtime without any limits. I asked Adobe to add this as a feature for CF 8 as a new tag so all this maybe moot anyway. My spin is to load the jar via cfhttp to the web root and then load via CF bootstrapClassloader so the network usage only takes one hit. If the jar is found locally hit it, if not hit the network. I will post this message and example on my Coldfusion Java Hacks list today.
Go take a look. I just posted java.io.LineNumberReader example too and Google Talk examples. I am sure you have seen my CF Ghost examples too.

15,674 Comments

Dan,

If you have shown me stuff in Java before and they did not take it is only because I am a total Java noob and things don't tend to stick in my head until I fully understand them. I still don't quite understand what you are saying :) but I will definitely look at your examples!

15 Comments

Ben, I'm using JExcel for nearly months. It's a nice tool using low amount of data. Today I found a timeout in Excel generation file for ~9000 rows and ~25 columns. Do you have tests for huge amount of data? JExcel scales for you?

15,674 Comments

@Marco,

I only did this one test with JExcel. After this, I moved onto the the POI library which comes installed with ColdFusion.

I believe in love. I believe in compassion. I believe in human rights. I believe that we can afford to give more of these gifts to the world around us because it costs us nothing to be decent and kind and understanding. And, I want you to know that when you land on this site, you are accepted for who you are, no matter how you identify, what truths you live, or whatever kind of goofy shit makes you feel alive! Rock on with your bad self!
Ben Nadel