I have been doing a lot of experimenting with writing Microsoft Excel files using ColdFusion. I was just recently turned onto POI by Rob Brooks-Bilson and found some great code resources by Dave Ross. However, when it comes to reading in an Excel file using POI and ColdFusion, I was not able to find much information. So, using Dave's POI reference page and some Java examples on OnJava.com I put this little example together.
There are probably lots of pitfalls and stuff that I am not seeing (this is my first read with ColdFusion and POI) so m code makes the following assumptions with this example:
While the assumptions above might seem a bit limiting, I think that they will cover about 95% of the cases that I will have to deal with. So, here is the code that I have come up with:
Launch code in new window » Download code as text file »
Taking this Excel file:
| | | | ||
| | ![]() | | ||
| | | |
... and then CFDump'ing it out, we get:
| | | | ||
| | ![]() | | ||
| | | |
This is pretty exciting! I am really liking this POI library. Rob Brooks-Bilson tells me that the version of POI that ships with ColdFusion is a slightly older one. I can only hope that the upcoming versions of ColdFusion continue to ship with Microsoft Excel solutions.
POI aside, I am still hoping to get a JExcel sample up and running. Mark Mandel has been nice enough to help me out with that, but I just haven't had time to go back and implement his advice. As you probably notice, I have about a 1000 things going on in my head and it just pours out when ever I can get a few minutes here and there.
Download Code Snippet ZIP File
Comments (22) | Post Comment | Ask Ben | Permalink | Other Searches | Print Page
Now if you could implement this as a full blown CFC. I would love you forever. please?? :)
Posted by Derek Perez on Jan 14, 2007 at 4:25 PM
I played around with this a few weeks ago, as well.
I didn't really get that far into it, because of paying work that needed to get done, but I created a small ColdFusion function that takes as an input a path to an excel file and the number of columns to pull in each worksheet.
It returns a Structure of CF Query objects containing the data from the excel spreadsheet. I didn't go into the detail you have with the different cell types, and your work is a bit more elegant, but combining your work with the structure aspect from my function would yield a powerful data import tool.
You can see my [ uncommented :( ] function here: http://ken.auenson.com/examples/func_ExcelToQueryStruct_POI_example.txt
Posted by Ken Auenson, II on Jan 14, 2007 at 5:09 PM
Ben:
Can you clarify that that POI library you are using is not the one that comes with CF? If so, I'm assuming you had to load this library into CF's class path.
For my work sites, I have access to the server. But for my personal site, I'm hosted by a web hosting company that doesn't allow us to add Java packages.
Bruce
Posted by Bruce on Jan 14, 2007 at 7:08 PM
@Ken,
That stuff looks good. I like the way that you are looping over the sheets and returning them in an array. I have some ideas that could perhaps improve up on it (such as returning an array of sheets structures, each that would have keys for things like "Name", "Query", "Index"... or something). I will try to merge our two ideas together and make a wicked awesome love-child.
@Bruce,
As far as I know, this IS the POI package that ships with ColdFusion MX 7. Other people have been working with JExcel, which I know has to be installed. No one here (at my office) knew about POI until Rob pointed it out to me... meaning, we have not installed anything extra as far as I know. When in doubt, try to run this line:
<cfset CreateObject( "java", "java.lang.StringBuffer" ).Init() />
If that can run without error then you know two things:
1. CreateObject() works
2. You can directly create installed Java packages.
If you can do that, then try to cut-n-paste the demo above. I assume that as long as you are running on a recent enough edition of ColdFusion then you should be good to go.
But again, this is my first time ever reading via POI and ColdFusion, so I am still quite wet behind the ears.
Posted by Ben Nadel on Jan 14, 2007 at 8:08 PM
@Bruce:
My function was using the built-in POI library as well.
@Ben:
That sounds like a great idea.
The only piece I am not happy with right now is that I can't seem to see a nice way to determine the number of columns dynamically. Maybe what would be required would be to loop over every row in a given sheet one time before doing the real work to determine the max columns on that sheet. That would iliminate my solution of requiring a column input, and your solution of retrieving the first row's number of columns.
This could turn out to be a really useful tool! I wish I had more time to spend on it. Good luck, and keep us posted!
Posted by Ken Auenson, II on Jan 14, 2007 at 8:23 PM
A long time ago (back in CF5), I remember using COM objects to pull data out of an excel file. I was very tedious and SLOWWWW. Don't know how fast this thing is under load, but the COM objects I used sucked. I think really it had to do with a CF5 bug, but can't really remember. Anywho, the way I got around it was to use COM to basically open up the excel file and then save it to a TAB delimited file. Then I could just use go old CF to read the TAB delimited file and throw it into a query.
Posted by Tony Petruzzi on Jan 14, 2007 at 9:45 PM
Tony,
That's a pretty clever idea regarding the TAB file saving. I have just started working with POI so I cannot testify as to the speed of it... but I hear very good things. Hopefully, I will know more soon.
Posted by Ben Nadel on Jan 15, 2007 at 7:48 AM
Ken,
I was just trying to put some code down on paper, doing a what you suggested (going through the sheet once to find the number of columns) when it dawned on me... who care?!?
This is what occurred to me: what ever part of the application uses the query that comes out of the Excel sheet is going to know the format of the excel sheet. Meaning, they are going to assume column 1 is last name and column 2 is first name (for example). The structure of the Excel is set for the calling app, not dynamic.
Because the structure is set, no one is every going to treat the query structure as dynamic. This makes extra columns irrelevant. If there is an extra 6th column passed to a part that expects 5 column, it doesn't matter because most likely that 6th column will never be reference.
I am going to go an take out the logic I put in for checking the column count as I think it is going to add more overhead and complicate things more than is worth it.
Posted by Ben Nadel on Jan 15, 2007 at 8:49 AM
Ben:
Good point. I didn't actually think of the use cases before writing my last comment, but your right. anyone using this code will already know the structure of the spreadsheet.
Posted by Ken Auenson, II on Jan 15, 2007 at 9:33 AM
Ken,
I was thinking the same exact thing you were. I even played around with the excel, adding and clearing cells vs. "Delete"ing a column altogether. But then as I was writing the code it just didn't seem that useful (and was making my code more complicated than it needed to be).
I am done writing the "Reading" part of a small CFC, POIUtility.cfc. I just want to add the Write (query to excel) and then I will post up. Hopefully I can do this at lunch.
I am borrowing for you ideas and adding to them. More to come soon.
Posted by Ben Nadel on Jan 15, 2007 at 9:52 AM
Let me know when you get the POIUtility.cfc written. I would like to take a look at the code.
Posted by Matthew on Jan 15, 2007 at 4:20 PM
Matt,
It is written. I am just writing up the entry and example. I will try to finish it right after work. If not then, first thing in the morning.
Posted by Ben Nadel on Jan 15, 2007 at 4:34 PM
If I understand this correctly, this is for apache. Is there a way to do it with IIS?
Posted by CJ on Jan 15, 2007 at 4:36 PM
CJ,
This is for ColdFusion. I don't think it has anything to do with the server (but I am NOT a server guy, so this may be misinformed). I am working with Windows Server and IIS and all is going swimingly.
Posted by Ben Nadel on Jan 15, 2007 at 4:37 PM
Thanks Ben,
I will give it a try and see if I can get it to work
Posted by CJ on Jan 15, 2007 at 4:51 PM
@CJ
I use POI on both IIS and Apache at work, so it should work regardless of which web server you choose.
Posted by Matthew on Jan 15, 2007 at 5:17 PM
Ken, Matt,
Just posted some code:
http://www.bennadel.com/index.cfm?dax=blog:474.view
Gotta go home now, I am STARVING!
Posted by Ben Nadel on Jan 15, 2007 at 7:00 PM
Ben, I'll recommend you to know about JExcel Java libraries. This is a great great piece of code. The ntegration with CF is very easy. I'm using JExcel(http://jexcelapi.sourceforge.net/) and I'm very happy with your power. Please look this simple example:
<cfset inicio=GetTickCount()>
<!--- armazenar a biblioteca jxl.jar no diretorio c:\cfusionmx7\lib e restartar o servidor --->
<cfset filepath = "C:\Manufact\200701_Fat.xls">
<cfscript>
//carrega o metodo Java que manuseia o arquivo xls
wfile = createObject("java","java.io.File").init(filepath);
//cria o workbook e carrega a planilha
workbook = createObject("java","jxl.Workbook");
//pega o workbook
wworkbook = createObject("java","jxl.write.WritableWorkbook");
wworkbook = workbook.getWorkbook(wfile);
sheet = createObject("java","jxl.write.WritableSheet");
//pega uma planilha especifica
sheet = wworkbook.getSheet("FaturamentoMaior");
</cfscript>
<cfset rowslen = sheet.getRows()>
<cfset columnslen = sheet.getColumns()>
<!--- captura o texto das colunas da planilha --->
<cfoutput>
<cfloop from="0" to="#columnslen-1#" index="c">
#sheet.getCell(javacast('int',c),0).getContents()#
<br />
</cfloop>
</cfoutput>
<cfoutput>
<cfloop from="0" to="#rowslen-1#" index="q">
#sheet.getCell(0,javacast('int',q)).getContents()# | #sheet.getCell(1,javacast('int',q)).getContents()# | #sheet.getCell(2,javacast('int',q)).getContents()#
<br />
</cfloop>
</cfoutput>
Tempo de execução: <cfdump var="#GetTickCount()-inicio#">ms
Posted by Marco Antonio C Santos on Jan 17, 2007 at 11:03 AM
Marco,
I have fooled around a bit with JExcel. It is very cool but once I found out that POI was already delivered with ColdFusion MX 7 I kind of took a detour. When I get a better footing on POI, I will surely come back and continue to look at JExcel. Thanks for you example above.
Keep on rockin!
Posted by Ben Nadel on Jan 17, 2007 at 1:45 PM
There's a rudimentary google code project called cfjexcel, that is an attempt at making using JExcelAPI easier.
You do need access to createObject, but it's got some basic row-level formula embedding, which users always love. Might be at least a good place for ideas.
Here's a demo of the formula stuff, and some basic IO:
http://coldshen.com/cfjexcel/test.cfm
Posted by denny on Mar 13, 2007 at 1:42 AM
Your code is littered with magic numbers and non-descript variable names, and is absolutely overkilled with comments. Quality code shouldn't need nearly so many comments. It should be self-explanatory.
Posted by AJ on May 22, 2008 at 3:29 PM
@AJ,
I am always looking for ways to improve my code. Could you give me an example of the "magic numbers" and non-descript variables names and maybe some suggestions? Thanks!
Posted by Ben Nadel on May 22, 2008 at 3:41 PM