Initial Thoughts On POI Utility ColdFusion Custom Tag Output
Posted March 13, 2008 at 9:18 AM
My POIUtility.cfc ColdFusion component has been around for a while and has made reading and writing Excel files in my ColdFusion applications pretty easy. It has its limitations, but for simple reads and writes, it has really been good at getting the job done. For a long time, I have been wanting to upgrade the way it works, but have had trouble either making it a priority, finding the time, and/or wrapping my head around the changes that I wanted to make. I think part of the difficulty is that every time I thought about upgrading it, I thought in terms of both reading and writing Microsoft Excel files. This bogged my thoughts down a bit too much. I don't think I can go about it that way.
In order to get things done, I am going to just think about one aspect at a time and then refactor as I need to. To me, writing Microsoft Excel files in ColdFusion is much more powerful than reading them. Reading in is fairly straightforward; yes, there are some date formatting and formula issues, but for the most part, reading cell values into an array or a query is not a problem. Writing an Excel file on the other hand, that takes a bit more finesse. There are numerous formatting issues both from a data masking viewpoint as well as a cell formatting viewpoint. I have experimented with ways of doing this using the current POIUtility.cfc, with things like headercss and rowcss attributes, even with some dynamic evaluation, but these have all fallen short. Even the most simple reports look 100 times more professional when a good bit of formatting is done - this needs to be a priority and it needs to be accomplished (plus, I have some projects coming up that really need good Excel export features).
Another reason that I want to concentrate on the writing of Excel files rather than the reading is due to the issue of program comparability. The underlying POI library that comes with ColdFusion only works with PRE Excel 2007 formats. Well, it's already 2008; I don't know how long people are still going to be depending on Excel 2000/2003 or whatever POI-compatible versions there are out there. Microsoft Excel 2007 has changed their binary data format and 2007 files cannot be read in by the POIUtility.cfc. As such, that road has a finite length (which is shorter than longer); however, I assume that, and maybe this is way off, Excel 2007 will still be backwards compatible for reading old Excel files. Therefore, while the reading of XLS files is on the road to extinction, perhaps writing Excel files via the POI library will still have a reasonable life ahead of it.
For a while, I kicked around the idea of using a ColdFusion XML document to define the output of the Excel file. We are all used to creating table data grids, so I was going to model is on that style of markup. But, after seeing Jason Delmore's basic JExcel custom tags, I feel like ColdFusion custom tags are the route that I want to go. The use of ColdFusion custom tags has the benefit of being able to do document processing in a piece-wise fashion; meaning, that tags are processed as they are executed - you don't have to read an entire XML document into memory and you don't have to parse a potentially massive XML document before work can be done. I feel like the ColdFusion custom tags are going to hold a large speed gain over the XML document.
Here are some initial thoughts on what this could look like:
Launch code in new window » Download code as text file »
- <!---
- Create an excel document and store binary data into
- REQUEST variable.
- --->
- <poi:document name="REQUEST.ExcelData">
-
- <!---
- Define style classes for the workbook. This can be used
- across all the sheets in the workbook.
- --->
- <poi:classes>
- <poi:class
- name="row"
- style="font-size: 11pt ; color: black ;"
- />
-
- <poi:class
- name="header"
- style="font-weight: bold ; border-bottom: solid 2pt black ;"
- />
-
- <poi:class
- name="hotness"
- style="color: red ; font-style: italic ;"
- />
- </poi:classes>
-
- <!--- Define Sheets. --->
- <poi:sheets>
-
- <poi:sheet name="Smokin' Hotties">
-
- <poi:columns>
- <poi:column width="20" align="center" />
- <poi:column width="50" />
- <poi:column width="30" align="center" />
- <poi:column width="100" />
- <poi:column width="30" />
- </poi:columns>
-
- <poi:row class="row header">
- <poi:cell>Rank</poi:cell>
- <poi:cell>Name</poi:cell>
- <poi:cell>Hotness</poi:cell>
- <poi:cell>Best Feature</poi:cell>
- <poi:cell>Last Seen</poi:cell>
- </poi:row>
-
- <poi:row class="row">
- <poi:cell type="string">1</poi:cell>
- <poi:cell
- style="font-weight: bold ;"
- >Christina Cox</poi:cell>
- <poi:cell numberformat="9.9">9.0</poi:cell>
- <poi:cell class="hotness">Lips / Smile</poi:cell>
- <poi:cell
- type="date"
- dateformat="mmm d, yyyy"
- >02/23/20008</poi:cell>
- </poi:row>
-
- <poi:row class="row">
- <poi:cell type="string">2</poi:cell>
- <poi:cell>Winona Ryder</poi:cell>
- <poi:cell numbermask="9.9">8.0</poi:cell>
- <poi:cell class="hotness">Eyes</poi:cell>
- </poi:row>
-
- </poi:sheet>
-
- </poi:sheets>
-
- </poi:document>
I figure I'll start out with a small amount of functionality and then get more details as I go. The document tag should be able to write to a file or to return the Excel binary into a defined variable (one thing that several people complained about was always having to write the file to disk). You know that I love me some CSS and I am trying to bring that into the Excel creation through this idea of classes. I think CSS is fantastic for visual formatting and we can and should be able to use a subset of it for Excel data and cell formatting. In addition to classes (of which, multiple can be assigned to any given cell), there will also be one-off style attributes that access css.
The next biggest problem was the data value formatting. By default, I will let Excel decide the best type of formatting; this can be overridden with a Type attribute and then further formatted with things like NumberFormat and DateFormat.
So, those are my initial thoughts. I am not exactly sure how the POIUtiltiy.cfc will interact with this. Will it be created in the Document tag? Will it need to be passed in? Will a path to the CFC need to be provided? We shall see. And what about helper components? For instance, I probably will use my CSSRule.cfc or something like it to help parse in the CSS. But where will that live? Does this all need to be in the same folder? Will this break site-structure protocols? We shall see.
Download Code Snippet ZIP File
Post Comment | Ask Ben | Print Page
Reader Comments
I like where you are going with this and think custom tags will be much better than an xml based solution.
Just wondering, but will this allow a user to pass in a query (rather than having to loop over rows and output as poi:row class="row"). Being able to do something like <poi:sheet name="Smokin' Hotties" data="myQueryHere"> would probably make things easier, as I know there are quite a few times I'd like to pass in the data and worry just about the formatting rather than the query loop. Perhaps adding something to the cells within class="row_header" (e.g. <poi:cell columnName="rank">Rank</poi:cell>) to differentiate which query columns to use.
Also, how about adding the data formatting to the <poi:column width="20" align="center" /> tag? So it would become something like <poi:column width="20" align="center" type="date" dateformat="mmm d, yyyy" /> This would then allow control of the entire column rather than on the individual rows (and specific cells). If the programmer needed to get more fine tuning, allowing the poi:cell would satisfy that as well. Still allow the header changes (as those would usually be different than the rest of the data cells), but would be easier to manage again.
Anyway, really cool idea, and programmers are always happy when it requires less coding on their part :)
@Gareth,
Excellent suggestions! Thanks.
If you're going to make something that turns a brand of XML into XLS, why not just hook into the existing XSL-FO stuff for this ?
@Tom,
What is "XSL-FO"?
http://www.w3schools.com/xslfo/xslfo_xslt.asp
There exists an XSLT/POI-based transform to turn the generic XML of XSL-FO into native XLS binary files.
I can't find a link to that right now, but I've done exactly that before from Java.
@Tom,
That sounds really interesting. If you come across the link again, please send it my way.
This looks really awesome, Ben!
I think it's particularly clever to spend considerable time up front to craft a usable and pleasant API. We developers often get so excited by code that we dive into the implementation a little too soon.
I think the success of technologies from ColdFusion to jQuery can be attributed to their well-designed APIs.
I'm afraid soon I will have to give up the POI Utility on a project at work. Unfortunately, we have reached the limit since we need access to generating larger excel files. Our client loves excel. :) So we need support for XLSX. Either way my experience working with your POI Utility has been great Ben and as you know I really want to help you out on your next effort on this project.
@Ben
Here it is http://cocoon.apache.org/2.1/userdocs/xls-serializer.html
I imagine you could poke around in there and figure out what it needs to call to turn XSL-FO/GMR into XLS, then do that from CF.
@Tom,
Thanks, I will look into this.
I'll keep you all posted.
We need support for XLSX.
If I add a date field it does not work!!!!!!!!
error: The value '' cannot be converted to a number.
@IRz,
Try casting the date/time field as a VARCHAR field in your query?? Not sure where that error is even coming from.
Nice stuff, if you add the following to the cell.cfm you can also use formulas
inside in the
<cfswitch expression="#ATTRIBUTES.Type#">
about line 299
<cfcase value="formula">
<!--- Set a formula --->
<cfset VARIABLES.Cell.setCellFormula(THISTAG.GeneratedContent )/>
</cfcase>
then you can use the following
<poi:cell type="formula" value="SUM(A4:A8)" />
@Zac,
I tried making this update, but it didn't seem to work. I was using the MAX() function as a test. Maybe I was doing something or that function is not supported.
I just released the tag library with vast performance improvements. I will start trying to play around with the function stuff next.
that's strange....
new idea: being able to give an id to a cell and then reference the cells by id in formula's rather than having to calculate where a cells lies in a dynamic sheet would be a huge timesaver
z
@Zac,
That would definitely be a cool idea. Let me get regular formulas working first and then I will see about making them more relevant to the tags themselves.
@Zac,
I got both the formulas and the aliasing working. I will try to post updated version at lunch. It works basically like this:
<poi:cell value="5" alias="StartSum" />
.....
<poi:cell value="1322" alias="EndSum" />
.....
<poi:cell value="Sum( @StartSum:@EndSum )" type="formula" />
Works pretty good. I guess when I tried the formula the other day, it was just not a supported formula (MAX).
great stuff :)
@Zac,
Thanks for the great advice. It has been released for the CF8 version of the tags:
http://www.bennadel.com/index.cfm?dax=blog:1307.view
Hopefully, it should be minimal effort to make CF7 compatible. Really, its just the REMatch () function. I just need to make a little UDF somewhere for the tags to utilize.
Thank you very much for your great work. It saved me!
Do I understand correctly that the poi custom tag will not be able to read date fields on EXCEL correctly? Even for the Excel 200-2003?
They are all integers on cfdump?
@springgrass,
Those integers are the dates. They are just not formatted. You should be able to format them as needed with DateFormat() once you read them in.
I am using the formula TTEST, when opening the excel it shows #VALUE! stating that the parameters are not all numeric and all cells and cell ranges are numerics. When I go to the frmula, click on it and press enter, it calculates it and the error disappear.
I think that excel is not realizing that the parameters are numbers! What can I do?
Any plans for .xlsx support?
Why do you want .xlsx support ?
.xslx support is needed because a lot of people are (often unintentionally, because it's a default) using that format now. It's pretty critical to be able to read it for that reason.
@Eric,
XLSX support should be automatically built-in IF you upgrade the POI JAR being used; at least, this is what I have been told - I have not tried it personally just yet.
@Ben
Regarding XLSX support, it didn't seem to work by just upgrading the POI Jars. I had to not only remove the POI Jars currently in \ColdFusion8\lib but load all the jars in this zip:
http://poi.apache.org/download.html/poi-bin-3.6-20091214.zip and change the code in POIUtility.cfc lines 826-837 to:
// Load the Workbook factory.
LOCAL.WorkBookFactory = CreateObject(
"java",
"org.apache.poi.ss.usermodel.WorkbookFactory"
).Init();
// Load the Workbook by loading the File into the factory
LOCAL.WorkBook = LOCAL.WorkBookFactory.create(LOCAL.FileInputStream);
But it works!!
Thanks for all you work Ben. I have used this cfc many times over the years.




