Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.
Meanwhile on Twitter
Loading latest tweet...
Ben Nadel at cf.Objective() 2012 (Minneapolis, MN) with:

Initial Thoughts On POI Utility ColdFusion Custom Tag Output

By Ben Nadel on
Tags: ColdFusion

My POIUtility.cfc ColdFusion component has been around for a while and has made reading and writing Excel files in my ColdFusion applications pretty easy. It has its limitations, but for simple reads and writes, it has really been good at getting the job done. For a long time, I have been wanting to upgrade the way it works, but have had trouble either making it a priority, finding the time, and/or wrapping my head around the changes that I wanted to make. I think part of the difficulty is that every time I thought about upgrading it, I thought in terms of both reading and writing Microsoft Excel files. This bogged my thoughts down a bit too much. I don't think I can go about it that way.

In order to get things done, I am going to just think about one aspect at a time and then refactor as I need to. To me, writing Microsoft Excel files in ColdFusion is much more powerful than reading them. Reading in is fairly straightforward; yes, there are some date formatting and formula issues, but for the most part, reading cell values into an array or a query is not a problem. Writing an Excel file on the other hand, that takes a bit more finesse. There are numerous formatting issues both from a data masking viewpoint as well as a cell formatting viewpoint. I have experimented with ways of doing this using the current POIUtility.cfc, with things like headercss and rowcss attributes, even with some dynamic evaluation, but these have all fallen short. Even the most simple reports look 100 times more professional when a good bit of formatting is done - this needs to be a priority and it needs to be accomplished (plus, I have some projects coming up that really need good Excel export features).

Another reason that I want to concentrate on the writing of Excel files rather than the reading is due to the issue of program comparability. The underlying POI library that comes with ColdFusion only works with PRE Excel 2007 formats. Well, it's already 2008; I don't know how long people are still going to be depending on Excel 2000/2003 or whatever POI-compatible versions there are out there. Microsoft Excel 2007 has changed their binary data format and 2007 files cannot be read in by the POIUtility.cfc. As such, that road has a finite length (which is shorter than longer); however, I assume that, and maybe this is way off, Excel 2007 will still be backwards compatible for reading old Excel files. Therefore, while the reading of XLS files is on the road to extinction, perhaps writing Excel files via the POI library will still have a reasonable life ahead of it.

For a while, I kicked around the idea of using a ColdFusion XML document to define the output of the Excel file. We are all used to creating table data grids, so I was going to model is on that style of markup. But, after seeing Jason Delmore's basic JExcel custom tags, I feel like ColdFusion custom tags are the route that I want to go. The use of ColdFusion custom tags has the benefit of being able to do document processing in a piece-wise fashion; meaning, that tags are processed as they are executed - you don't have to read an entire XML document into memory and you don't have to parse a potentially massive XML document before work can be done. I feel like the ColdFusion custom tags are going to hold a large speed gain over the XML document.

Here are some initial thoughts on what this could look like:

  • <!---
  • Create an excel document and store binary data into
  • REQUEST variable.
  • --->
  • <poi:document name="REQUEST.ExcelData">
  •  
  • <!---
  • Define style classes for the workbook. This can be used
  • across all the sheets in the workbook.
  • --->
  • <poi:classes>
  • <poi:class
  • name="row"
  • style="font-size: 11pt ; color: black ;"
  • />
  •  
  • <poi:class
  • name="header"
  • style="font-weight: bold ; border-bottom: solid 2pt black ;"
  • />
  •  
  • <poi:class
  • name="hotness"
  • style="color: red ; font-style: italic ;"
  • />
  • </poi:classes>
  •  
  • <!--- Define Sheets. --->
  • <poi:sheets>
  •  
  • <poi:sheet name="Smokin' Hotties">
  •  
  • <poi:columns>
  • <poi:column width="20" align="center" />
  • <poi:column width="50" />
  • <poi:column width="30" align="center" />
  • <poi:column width="100" />
  • <poi:column width="30" />
  • </poi:columns>
  •  
  • <poi:row class="row header">
  • <poi:cell>Rank</poi:cell>
  • <poi:cell>Name</poi:cell>
  • <poi:cell>Hotness</poi:cell>
  • <poi:cell>Best Feature</poi:cell>
  • <poi:cell>Last Seen</poi:cell>
  • </poi:row>
  •  
  • <poi:row class="row">
  • <poi:cell type="string">1</poi:cell>
  • <poi:cell
  • style="font-weight: bold ;"
  • >Christina Cox</poi:cell>
  • <poi:cell numberformat="9.9">9.0</poi:cell>
  • <poi:cell class="hotness">Lips / Smile</poi:cell>
  • <poi:cell
  • type="date"
  • dateformat="mmm d, yyyy"
  • >02/23/20008</poi:cell>
  • </poi:row>
  •  
  • <poi:row class="row">
  • <poi:cell type="string">2</poi:cell>
  • <poi:cell>Winona Ryder</poi:cell>
  • <poi:cell numbermask="9.9">8.0</poi:cell>
  • <poi:cell class="hotness">Eyes</poi:cell>
  • </poi:row>
  •  
  • </poi:sheet>
  •  
  • </poi:sheets>
  •  
  • </poi:document>

I figure I'll start out with a small amount of functionality and then get more details as I go. The document tag should be able to write to a file or to return the Excel binary into a defined variable (one thing that several people complained about was always having to write the file to disk). You know that I love me some CSS and I am trying to bring that into the Excel creation through this idea of classes. I think CSS is fantastic for visual formatting and we can and should be able to use a subset of it for Excel data and cell formatting. In addition to classes (of which, multiple can be assigned to any given cell), there will also be one-off style attributes that access css.

The next biggest problem was the data value formatting. By default, I will let Excel decide the best type of formatting; this can be overridden with a Type attribute and then further formatted with things like NumberFormat and DateFormat.

So, those are my initial thoughts. I am not exactly sure how the POIUtiltiy.cfc will interact with this. Will it be created in the Document tag? Will it need to be passed in? Will a path to the CFC need to be provided? We shall see. And what about helper components? For instance, I probably will use my CSSRule.cfc or something like it to help parse in the CSS. But where will that live? Does this all need to be in the same folder? Will this break site-structure protocols? We shall see.




Reader Comments

Just wondering, but will this allow a user to pass in a query (rather than having to loop over rows and output as poi:row class="row"). Being able to do something like <poi:sheet name="Smokin' Hotties" data="myQueryHere"> would probably make things easier, as I know there are quite a few times I'd like to pass in the data and worry just about the formatting rather than the query loop. Perhaps adding something to the cells within class="row_header" (e.g. <poi:cell columnName="rank">Rank</poi:cell>) to differentiate which query columns to use.
Also, how about adding the data formatting to the <poi:column width="20" align="center" /> tag? So it would become something like <poi:column width="20" align="center" type="date" dateformat="mmm d, yyyy" /> This would then allow control of the entire column rather than on the individual rows (and specific cells). If the programmer needed to get more fine tuning, allowing the poi:cell would satisfy that as well. Still allow the header changes (as those would usually be different than the rest of the data cells), but would be easier to manage again.

Anyway, really cool idea, and programmers are always happy when it requires less coding on their part :)

This looks really awesome, Ben!

I think it's particularly clever to spend considerable time up front to craft a usable and pleasant API. We developers often get so excited by code that we dive into the implementation a little too soon.

I think the success of technologies from ColdFusion to jQuery can be attributed to their well-designed APIs.

I'm afraid soon I will have to give up the POI Utility on a project at work. Unfortunately, we have reached the limit since we need access to generating larger excel files. Our client loves excel. :) So we need support for XLSX. Either way my experience working with your POI Utility has been great Ben and as you know I really want to help you out on your next effort on this project.

Nice stuff, if you add the following to the cell.cfm you can also use formulas

inside in the
<cfswitch expression="#ATTRIBUTES.Type#">

about line 299

<cfcase value="formula">

<!--- Set a formula --->
<cfset VARIABLES.Cell.setCellFormula(THISTAG.GeneratedContent )/>

</cfcase>

then you can use the following

<poi:cell type="formula" value="SUM(A4:A8)" />

@Zac,

I tried making this update, but it didn't seem to work. I was using the MAX() function as a test. Maybe I was doing something or that function is not supported.

I just released the tag library with vast performance improvements. I will start trying to play around with the function stuff next.

that's strange....

new idea: being able to give an id to a cell and then reference the cells by id in formula's rather than having to calculate where a cells lies in a dynamic sheet would be a huge timesaver

z

@Zac,

That would definitely be a cool idea. Let me get regular formulas working first and then I will see about making them more relevant to the tags themselves.

@Zac,

I got both the formulas and the aliasing working. I will try to post updated version at lunch. It works basically like this:

<poi:cell value="5" alias="StartSum" />
.....
<poi:cell value="1322" alias="EndSum" />
.....
<poi:cell value="Sum( @StartSum:@EndSum )" type="formula" />

Works pretty good. I guess when I tried the formula the other day, it was just not a supported formula (MAX).

Thank you very much for your great work. It saved me!
Do I understand correctly that the poi custom tag will not be able to read date fields on EXCEL correctly? Even for the Excel 200-2003?
They are all integers on cfdump?

@springgrass,

Those integers are the dates. They are just not formatted. You should be able to format them as needed with DateFormat() once you read them in.

I am using the formula TTEST, when opening the excel it shows #VALUE! stating that the parameters are not all numeric and all cells and cell ranges are numerics. When I go to the frmula, click on it and press enter, it calculates it and the error disappear.
I think that excel is not realizing that the parameters are numbers! What can I do?

.xslx support is needed because a lot of people are (often unintentionally, because it's a default) using that format now. It's pretty critical to be able to read it for that reason.

@Eric,

XLSX support should be automatically built-in IF you upgrade the POI JAR being used; at least, this is what I have been told - I have not tried it personally just yet.

@Ben
Regarding XLSX support, it didn't seem to work by just upgrading the POI Jars. I had to not only remove the POI Jars currently in \ColdFusion8\lib but load all the jars in this zip:
http://poi.apache.org/download.html/poi-bin-3.6-20091214.zip and change the code in POIUtility.cfc lines 826-837 to:
// Load the Workbook factory.
LOCAL.WorkBookFactory = CreateObject(
"java",
"org.apache.poi.ss.usermodel.WorkbookFactory"
).Init();
// Load the Workbook by loading the File into the factory
LOCAL.WorkBook = LOCAL.WorkBookFactory.create(LOCAL.FileInputStream);

But it works!!

Thanks for all you work Ben. I have used this cfc many times over the years.

I just had to go through what Jonny Goldsteen did to be able to write xlsx data to my client's database.

The latest update for XLSX support is:

1. recursively extract all JARs from the following:
http://www.apache.org/dyn/closer.cgi/poi/release/bin/poi-bin-3.6-20091214.zip
to Coldfusion8/lib

2. Remove lines 875-885 in POIUtility.cfc.
Replace with:

// Load a Workbook factory.
LOCAL.WorkBookFactory = CreateObject(
"java",
"org.apache.poi.ss.usermodel.WorkbookFactory"
).Init();
// Load the Workbook by loading the File into the factory
LOCAL.WorkBook = LOCAL.WorkBookFactory.create(LOCAL.FileInputStream);

I like the idea of an alternative to XML. The whole xsl-fo language system can get really confusing. There's a decent website that could be a help at http://www.ecrion.com but I still had to have another person put that into language I could understand better.

Ben

The poi custom tags still rule. I have added support for header,footer,password protecting sheets, and unlocking/locking cells on those sheets. I have also added a way to use the javaloader with the custom tags to make this all more portable and easy to upgrade as the poi jars keep upgrading.

. I would be happy To send you the newest code.

Only thing missing now in the custom tag is to create xlsx instead of xls. For all the reasons cited above and the fact that the files are WAY smaller. I am having a hard time doing it so wondered if anyone has done it in the custom tags successfully

Thanks
Matt

@Matt,

That sounds awesome. I'm looking to get all my stuff on GIT one of these days so that this stuff can be much more easily augmented / shared / merged, etc. Fingers crossed that this happens sooner than later.

@Ben

Ohh yeah - and I got your proof of concept working with the progress indicator flushing progress even when in the custom tag for poi! Wrapped your example code around our creation page and with a little bit of tweaking got it working!

Only real change is on the beginning of the loop added logic to break if the thread was terminated as well (due to an error) otherwise it looped till it timed out...

<cfloop condition="(cfthread[ request.threadName ].status neq 'COMPLETED' and cfthread[ request.threadName ].status neq 'TERMINATED')">

Thanks for the proof of concept!

If anyone has figured out how to save xlsx with the poi tag and the newest poi classes (3.7 is current) let us know!

Thanks,
Matt

Matt - could you possibly send me the newest version of the POI code you've outlined above? I need to work with xlsx in the nth degree and any help would be appreciated.

Thanks,
Angela

@Angela

Unfortunately I have not been able to add the xlsx to this tag - i was hoping someone out there figured it out. I did add other good stuff to it like locking cells and sheets and adding print headers and footers, but not the xlsx yet.

@Ben

As many folks noted, and contrary to my earlier post regarding the cfthread output while running the tag, once i started messing with large files things went bonkers. I have taken out the cfthreads and am solving not as elegantly but pretty nicely. Basically doing all of our queries and pre-processing on the page with javascript output as it's happening and then changing status to 'Writing File' and then invoking the custom tag to loop through my collected queries and then spit out via <poi:...> - not perfect but working pretty well for sure!

Thanks,
Matt

@Ben

So one more big post for you on this topic that fuses many questions i've had over the life of the poi custom tags and offers a pretty slick solution that I think you and others may appreciate.

The Task: Turn an existing cf page that creates html tables and other elements that is currently working well into an Excel file using all (eventually) that poi has to offer, while outputting something to the user in the browser window as to the status of the output.

Solution:
Setup a status page with some html and a container for the updated progress and include it on the top of your existing page if you are in file creation mode of your existing page (if excel icon or link was clicked on existing report page).

Wrap your current page in a cfsavecontent with logic along the way, on top of any loops for example, that are not contained in the savecontents to add progress to your innerhtml of your container with javascript.

Most of the time with queries and looping has already happened at this point using the same code that your existing page already has working...the message you output alert the user where you arei n the process.

Now import the poi custom tag and begin your document

Parse the variable from your cfsavecontent of your page with xmlParse, which should give you a structure starting with your top <table> or really any html element you first wrapped in your cfsavecontent variable (eg <div><span>) It seems logical, but valid html is xml.

Parse through your table, tr, and td tags, you can easily parse them and turn them into <poi:sheet>, <poi:row> and <poi:cell> with a few <cfloop>s on the bottom of your page. You can grab from the xmlattributes to turn class into the poi custom tag classes, colspan into colspan, xmltext into the value of the <poi:cell> and so on. Simple to add logic to handle any classes with logic (like do a formula for certain classes).

Since poi tags are so slick this last step happens super quick and teh file is saved - you can output Writing File in your status container right before invoking the custom tag and then when finished change it to say Click to Download to redirect to the page in a mode that streams the file and deletes it.

Sorry for the long post but hopefully it helps someone out there. I already know of several places where it will make creating a beautifully formatted xls file out of an existing page with little work.

Now someone just needs to figure out how to save as xlsx without major overhaul to the custom tags :)

For those having issues w/ getting XSLS to work using Ben's awesome POIUtility.cfc - Tristan's comment got me close, but not all the way. I ran into a bunch of errors, and this is what I was able to use to get it working.

For reference, I'm running CF8 on CentOS.

The follow .jars were all copied to my:

/opt/coldfusion8/lib folder

jsr173_1.0_api.jar
xbean.jar
xbean_xpath.jar
xmlbeams-qname.jar
poi-3.6-20091214.jar

xmlbeams came from (xmlbeans-2.5.0.tgz)

Following that, the modification that Tristan listed:

Remove lines 875-885 in POIUtility.cfc.

Replace with:

// Load a Workbook factory.
LOCAL.WorkBookFactory = CreateObject(
"java",
"org.apache.poi.ss.usermodel.WorkbookFactory"
).Init();
// Load the Workbook by loading the File into the factory
LOCAL.WorkBook = LOCAL.WorkBookFactory.create(LOCAL.FileInputStream);

I also combined this with the suggestion of using Query2CSV

, and made a tiny change so I wouldn't have to specify my header names ahead of time, so that this could be more dynamic:

Change:
<cfset LOCAL.ColumnNames = [] />

To:
<cfset LOCAL.ColumnNames = Query.GetColumnNames() />

Hope this helps someone struggling out a little bit.

Thank you Ben so much for all your hard work!

@Will,

Nice one. Changed the 2 lines in poiutility and can read xlsx like a charm. Thanks for sharing.

Have you messed with trying to create a new xlsx in the poi custom tag document.cfm file at all yet?

Thanks a lot,
Matt

For anyone wanting to read xlsx files or create them. Look for the code snippet in document.cfm where you see the "Create the Excel workbook..." comment. Replace the Workbook and Dataformat object creation with the following snipped.

  • <!---
  • Create the Excel workbook to which we will be writing. Check
  • to see if we are creating a totally new workbook, or if we want
  • to use an existing template.
  • --->
  •  
  • <cfif Len( ATTRIBUTES.Template )>
  •  
  • <!--- Read in existing workbook. --->
  • <cfset VARIABLES.WorkBook = CreateObject( "java", "org.apache.poi.xssf.usermodel.XSSFWorkbook" ).Init(
  • CreateObject( "java", "java.io.FileInputStream" ).Init(
  • JavaCast( "string", ATTRIBUTES.Template )
  • )
  • ) />
  •  
  • <cfelse>
  •  
  • <!--- Create a new workbook. --->
  • <cfset VARIABLES.WorkBook = CreateObject( "java", "org.apache.poi.xssf.usermodel.XSSFWorkbook" ).Init() />
  •  
  • </cfif>
  •  
  • <!---
  • Create a data formatter utility object (we will need this to
  • get the formatting index later on when we set the cell styles).
  • --->
  • <cfset VARIABLES.DataFormat = CreateObject( "java", "org.apache.poi.xssf.usermodel.XSSFDataFormat" ) />

Works like a charm. Enjoy!

@Yonaro

Nice work, this works great. Added logic to my document.cfm that if listlast(attributes.file,".") eq "xls" then use the HSSFWorkbook or if it eq "xlsx" use the XSSFWorkbook. Working solid. N

Needed to fix the addMergedRegion call in cell.cfm with logic based off of the class of the workbook which you can get from:
<cfif variables.DocumentTag.workbook.getClass().getSimpleName() eq "XSSFWorkbook">

Probably another method or two to add logic on (shiftRows I believe) but this works great otherwise for creating .xslx.

Thanks a bunch,
Matt