Ask Ben: Using POI Utility To Move Excel Data Into A Database

Posted February 8, 2007 at 8:16 AM

Tags: Ask Ben, ColdFusion

I am soaking up your work with Excel spreadsheets. I am stuck on this question. I have followed your instructions for creating the multi sheet spreadsheet on meals and I can read the spreadsheet using your POIutility and I can dump out the array. I want to be able to use the data from the spreadsheet and write it into a database. You wrote "all values from the Excel are stored in the resultant ColdFusion queries as CF_SQL_VARCHAR values" - how do I access these?

Getting the data from an Excel using my POI Utility ColdFusion component is just a matter of understanding how the data gets read in from the Excel and how to access it once it is read in. The POI Utility reads each sheet of an Excel workbook into a "Sheet" object. All of these sheet objects are returned in a single array that looks like this (using my previous Meals example):


 
 
 

 
POI Utility Sheet Array  
 
 
 

As you can see each index of the above array contains a single sheet object that looks like this:


 
 
 

 
POI Utility Single Sheet Object  
 
 
 

Within that sheet object, there is the Query object (struct key: Query). This is the query object that contains the actual grid data from the Excel sheet. Now, taking that query object and using it to insert into a database could be done as follows:

 Launch code in new window » Download code as text file »

  • <!--- Create a new instance of the POI utility. --->
  • <cfset objPOIUtility = CreateObject(
  • "component",
  • "POIUtility"
  • ).Init()
  • />
  •  
  • <!---
  • Get the path to our Excel document. Our Excel document
  • workbook contains three sheets with information regarding
  • three meals (Breakfast, lunch, and dinner).
  • --->
  • <cfset strFilePath = ExpandPath( "./meals.xls" ) />
  •  
  • <!---
  • Read the Excel document into an array of Sheet objects.
  • Each sheet object will contain the data in the Excel
  • sheet as well as some other property-type information.
  • We are telling the POI Utility to expect the first row
  • of the Excel document to function as a header row.
  • --->
  • <cfset arrSheets = objPOIUtility.ReadExcel(
  • FilePath = strFilePath,
  • HasHeaderRow = true
  • ) />
  •  
  •  
  • <!---
  • We want to get the data from the Excel document into our
  • local database. The first thing we are going to want to
  • do is loop over each sheet and attach each set of
  • data individually.
  • --->
  • <cfloop
  • index="intSheet"
  • from="1"
  • to="#ArrayLen( arrSheets )#"
  • step="1">
  •  
  • <!---
  • Let's get a pointer to the current sheet object. We
  • could continue to refer to the sheet as an index of
  • the sheets array, but this is more convenient and
  • breaks it up into easier to read code.
  • --->
  • <cfset objSheet = arrSheets[ intSheet ] />
  •  
  • <!---
  • The data from the excel object is stored in a query
  • within this "sheet" object and can be accessed at the
  • key "query." For ease of use and short hand, let's
  • get a pointer to that query.
  • --->
  • <cfset qSheetData = objSheet.Query />
  •  
  • <!---
  • We can treat this query just like any old ColdFusion
  • query because it is just a plain old ColdFusion query.
  • Let's loop over it to get at each row.
  •  
  • NOTE: Since we told the POI Utility to use the first
  • row as a header row, the first row has already been
  • stripped off and returned as part of the sheet object.
  • We will not encounter it in THIS query.
  • --->
  • <cfloop query="qSheetData">
  •  
  • <!---
  • For ease of demonstration, let's get the values
  • out of the query that we want to use for our
  • database insert. This is step that is not required,
  • but helps clarify how things are working.
  •  
  • NOTE: Remember that the POI Utility auto names the
  • columns as it encounters them as COLUMN1, COLUMN2,
  • COLUMN3, .... etc.
  •  
  • We are throwing the second column (quantity) into a
  • Val() method call as we need to get it as a number.
  • The POI Utility reads in everything as a string.
  • --->
  • <cfset strFood = qSheetData.column1 />
  • <cfset flQuantity = Val( qSheetData.column2 ) />
  • <cfset strTastiness = qSheetData.column3 />
  •  
  •  
  • <!---
  • Now that we have the values we are going after, we
  • can insert them into our database.
  • --->
  • <cfquery name="qInsert" datasource="#App.DSN.Source#">
  • INSERT INTO food_diary
  • (
  • meal,
  • food,
  • quantity,
  • tastiness,
  • date_created
  • ) VALUES (
  • <cfqueryparam value="#objSheet.Name#" cfsqltype="CF_SQL_VARCHAR" />,
  • <cfqueryparam value="#strFood#" cfsqltype="CF_SQL_VARCHAR" />,
  • <cfqueryparam value="#flQuantity#" cfsqltype="CF_SQL_FLOAT" />,
  • <cfqueryparam value="#strTastiness#" cfsqltype="CF_SQL_VARCHAR" />,
  • <cfqueryparam value="#Now()#" cfsqltype="CF_SQL_TIMESTAMP" />
  • );
  • </cfquery>
  •  
  • </cfloop>
  •  
  • </cfloop>

At this point, each record of each query of each sheet has been inserted into the local database. I hope that that helps. Please let me know if you have any further questions.

Download Code Snippet ZIP File

Comments (36)  |  Post Comment  |  Ask Ben  |  Permalink  |  Other Searches  |  Print Page




Adobe ColdFusion 8.0.1 Update - Helping Programmers To Be Signifanctly Less Girlie - Download ColdFusion 8 Update 8.0.1 Now.

Reader Comments

Ben,

I have to write something very similar to that today. It has to read a bunch of excel files and import just certain columns/rows into a database. Normally, I would do that in C# but since I've seen this post I'm now wondering if doing it in CF would be faster. Basically, I have to read a directory with about 3500 excel files and import just some data from each one. Any thoughts/suggestions?

Posted by Boyan on Feb 8, 2007 at 9:20 AM


Boyan,

I don't have much experience with scalability and speed of the POI system. However, if you know the exactly cell columns/rows that you need to extract, the POI has a great API for reaching right into the sheets and reading that data.

However, if you have a C# solution that is working and is fast, I guess go with that. But of course, POI has some fun learning :)

Posted by Ben Nadel on Feb 8, 2007 at 9:23 AM


Ben,

all right. I'll probably end up with C# this time but I'll be using your POI library in the future. Keep up the good work!

Posted by Boyan on Feb 8, 2007 at 9:51 AM


Ben,

quick suggestion on your posts that contain code: maybe you can have a a link to generate a file with the code inside the code viewer so people can directly download the code you have posted. If you end up writing something of that sort, send it over (or post it), it would be useful to have for many people.

Posted by Boyan on Feb 8, 2007 at 9:58 AM


Boyan,

At the top of every code snippet there is a link to open the code up in a new window. The new window uses code-coloring (as best it can) and at the bottom there is a text area where you can copy-n-paste the code.

I of course do not advertise that, so unless you have seen it before, there is no way you would know that there is a cut-n-paste box at the bottom :) My bad.

Posted by Ben Nadel on Feb 8, 2007 at 10:55 AM


Ben,

I have seen what you are talking about and I have used it. My suggestion was to put another link next to the "Launch code in a new windows >>" link. When clicking on the new link, instead of opening a new window, pass the code snippet to a cf code that sticks it into a new file and prompts you to download the snippet. You know what I'm saying? Or is it just a stupid idea?

Posted by Boyan on Feb 8, 2007 at 11:24 AM


Boyan, that is stupid only if you think brilliance is stupid! I think that is a wicked awesome idea! Why are you such a badass?

Posted by Ben Nadel on Feb 8, 2007 at 11:27 AM


He, he...you are going a bit over board with the "brilliance" and "badass" stuff :-) Funny enough, I named my PC "badass" since I just upgraded it to Core 2 Duo E6600 with 2GB ram and Radion X1950XT (and the guys at work were making fun of me). Anyway, the idea came to me as part of the CodeSnippets web application I want to develop, so you wouldn't mind sharing what you write, that would be awesome!

Posted by Boyan on Feb 8, 2007 at 11:45 AM


No problemo :)

Posted by Ben Nadel on Feb 8, 2007 at 11:48 AM


Ben,

I'm getting this error when I read your cfc:

Element SHEETDATA.QUERY.RECORDCOUNT is undefined in LOCAL.

What can I be doing wrong?

Thanks for a great component.

Posted by sophek on Feb 8, 2007 at 12:02 PM


Sophek,

Oh man! I know I have seen this error before. I can't remember where. I think someone contacted me about getting the same error. Oh wait, I found it. Check in the comments of this post:

http://www.bennadel.com/index.cfm?dax=blog:484.view

Take a look at the comments that Josh G made. He was getting the same problem. I cannot remember what the solution was, but I think he realized that he was using an outdated version of the POI package or something.

I wish I could find his email back to me. Perhaps if you send me the data file, I can give it a go on my end and see what I can do.

Posted by Ben Nadel on Feb 8, 2007 at 12:12 PM


Ben,

Thanks for the fast reply, I downloaded the latest version 1-23-07 version, and it's still getting that error message.

Thanks
Sophek

Posted by sophek on Feb 8, 2007 at 1:02 PM


Please send me the XLS file you are working with: ben [ at ] bennadel [ dot ] com. Thanks.

Posted by Ben Nadel on Feb 8, 2007 at 1:05 PM


i am getting this error while using ur component "The selected method GetSheetName was not found. "

Posted by nagendra on Apr 24, 2007 at 10:47 AM


Great component! It works perfectly for me! Thank you!

Posted by Gp on May 8, 2007 at 7:13 PM


@GP,

Glad you like it. Here is the most up-to-date version of it:

http://www.bennadel.com/index.cfm?dax=blog:624.view

Posted by Ben Nadel on May 9, 2007 at 7:13 AM


This is great, I am able to run queries now, however, for some reason it is reading from all tabs when I try to output some data, basically I am not able to specify which sheetindex is should be reading from. I may not want all sheets read together.

<cfset arrSheets = objPOIUtility.ReadExcel(
FilePath = strFilePath,
HasHeaderRow = true
) />

When I tried to modify the above code to add SheetIndex = 0

<cfset arrSheets = objPOIUtility.ReadExcel(
FilePath = strFilePath,
HasHeaderRow = true,
SheetIndex = 0
) />

I get an error

Object of type class coldfusion.runtime.Struct cannot be used as an array

What should I do?

Thank you

Posted by Feras Nabulsi on Aug 9, 2007 at 9:52 AM


Does the error give you a line number in the POIUtility.cfc that is breaking? Also, make sure you have the most up-to-date POI Utility.cfc:

http://www.bennadel.com/projects/poi-utility.htm

Posted by Ben Nadel on Aug 10, 2007 at 8:35 AM


Thanks Ben, I figured it out, it was my fault. it is working like a charm. Thank you and good work.

Posted by Feras Nabulsi on Aug 10, 2007 at 8:59 AM


Hey Feras Nabulsi i'm getting the same problem, what did you do to sort it out? also thanks Ben this is awesome!

Posted by barry reid on Jan 25, 2008 at 9:59 AM


Hi,

All I did was download Ben's latest code, and it works fine.

Thanks
Sophek

Posted by Sophek Tounn on Jan 25, 2008 at 5:07 PM


HI, ben when i try to upload the excel, i am getting this error:

7:34:00.000 - Expression Exception - in C:\ColdFusion8\CustomTags\POIUtility.cfc : line 1157

Value must be initialized before use.

07:34:01.001 - Expression Exception - in C:\Inetpub\wwwroot\Alex2\emailtest.cfm : line 40

Object of type class coldfusion.runtime.Struct cannot be used as an array

i am using in the page as:

<cfset arrSheets = objPOI.ReadExcel(
FilePath = "#filename#",
HasHeaderRow = true,
SheetIndex = 0
) />

i am using the updated version of the poi utility.

Posted by Mat on Feb 6, 2008 at 2:37 AM


@Mat,

And line 40 is that ReadExcel() method call? The Struct as array error doesn't make sense in the POI Utility since that is an internal use of an internal variable.

Posted by Ben Nadel on Feb 6, 2008 at 7:09 AM


hey ben Thanks, i modified the code and try to use it very simple like removing the cfloop tag in which it counts the sheets..

<cfset arrSheets = objPOI.ReadExcel(FilePath = "#filename#",HasHeaderRow = true, SheetIndex = 0) />

<cfset qSheetData = arrSheets.Query />
<cfloop query="qSheetData">
<cfset firstname = qSheetData.column1>
<cfset surname = qSheetData.column2>

instead of doing it like this:

<cfloop
index="intSheet"
from="1"
to="#ArrayLen( arrSheets )#"
step="1"> --->
<!--- Get a short hand to the current sheet. --->
<!--- <cfset objSheet = arrSheets[ intSheet ] /> --->
<cfset qSheetData = arrSheets.Query />
<cfloop query="qSheetData">
<cfset firstname = qSheetData.column1>
<cfset surname = qSheetData.column2>

but even removing the cfloop still it generates the error:

intiliazilation erro as above

Posted by Mat on Feb 6, 2008 at 7:21 AM


@Mat,

If you comment out everything BUT the ReadExcel() line, do you still get an error? I am just trying to narrow down where the error is coming from.

Posted by Ben Nadel on Feb 6, 2008 at 7:27 AM


yes i stll get error at line 1157:

Expression Exception - in C:\ColdFusion8\CustomTags\POIUtility.cfc : line 1157
Value must be initialized before use.

Posted by Mat on Feb 6, 2008 at 7:55 AM


@Mat,

Are you still having issues with this or did you get it figured out?

Posted by Ben Nadel on Feb 11, 2008 at 6:34 PM


Like what I've seen so far with your POIutility.cfc...thank you for creating it! My question is: How can you dynamically reference the columns without using EVALUATE?

(c = loop index)

this works:
#evaluate("objSheet.Query.column" & c)#

this errors:
#objSheet.Query["column#c#"]#
(error = Complex object types cannot be converted to simple values.)

this errors:
#objSheet[Query["column#c#"]]#
(error = Variable QUERY is undefined.)

I tried assigning objSheet.Query to a new query and referencing that...no luck.

I'm sure it's something simple I'm overlooking...and I can always fall back to EVALUATE...but I try to avoid it whenever possible.

Posted by JD on Mar 6, 2008 at 10:00 AM


Whoops...figured it out.

Forgot you had to reference the current row of the loop:

(r = current row, c = current column)

#objSheet.Query["column#c#"][r]#

Thanks again.

Posted by JD on Mar 6, 2008 at 10:11 AM


@JD,

Try objSheet[ "column#c#" ][ objSheet.CurrentRow ]

Posted by Ben Nadel on Mar 6, 2008 at 10:11 AM


Sir why is it that xls file exported from crystal report 9 cannot be read by POIUtility. it creates an error Object Instantiation Exception.

Posted by Guest on Mar 7, 2008 at 10:16 PM


Ben, I am running against an Excel spreadsheet I am working with, and I get this error:

Object Instantiation Exception.
An exception occurred when instantiating a Java object. The class must not be an interface or an abstract class. Error: ''.

The error occurred in E:\Web\CFC\POIUtility.cfc: line 871
Called from E:\Web\Index.cfm: line 33
Called from E:\Web\POIUtility.cfc: line 871
Called from E:\Web\Index.cfm: line 33

869 :
870 : // Create a file input stream to the given Excel file.
871 : LOCAL.FileInputStream = CreateObject( "java", "java.io.FileInputStream" ).Init( ARGUMENTS.FilePath );
872 :
873 : // Create the Excel file system object. This object is responsible

My environment is:

Windows Server 2003 64 Bit
ColdFusion 8
Excel spreadsheet is pre 2007

Thanks,

Bruce

Posted by Bruce on Apr 20, 2008 at 3:59 PM


I'm having issues with running queries of the sheet queries. For instance, I have:

<cfset qUsers = arrSheets[ intSheet ].Query >

<!--- User must be unique --->
<cfquery name="qErr1" dbtype="query">
SELECT column1 as email
FROM qUsers
</cfquery>

This gives me an error of:
Unsupported Cast Excpetion: Casting to type NULL is unsupported.

As you can see, column1 is an email address - a string is fine, which I believe the default conversion is.

Help?

Posted by Cat on Apr 25, 2008 at 2:57 PM


I love this, thanks. However, I have noticed a bit of a problem that is driving me crazy. If I do this:

<cfset dataQuery = arrExcel.query>

and then dump it, no problem, query as expected. I can also loop through the query. However I am getting an error when trying to do a query of this query . . .

This simple query fails:

<cfquery name="thisData" dbtype="query">
select column1 from dataQuery
</cfquery>

I am getting an error that it can't convert a value to a number.

The reason I want to do this is because my data columns may be in varying order and am trying to query 'o query the data to put things in back with the correct column names to efficiently insert the data into the db (one query, as opposed to a loop line by line). Here is my complete code . . .

<cfset columnArray = arrExcel.columnnames>
<cfset dataQuery = arrExcel.query>
<cfset columnorderList="">

<!--- just gets my columns back to names using the colum name array --->
<cfloop list="STYLENAME,SELLCOMPNAME,CUTPRICE" index="thisColumn">
<cfloop from="1" to="#arraylen(columnArray)#" index="foo">
<cfif ucase(columnArray[foo]) is thisColumn>
<cfset columnOrderlist=listappend(columnOrderlist," COLUMN#foo# as #foo#")>
</cfif>
</cfloop>
</cfloop>

<!--- build the query --->
<cfset strQuery="select #columnOrderlist# from dataQuery">

<!--- execute --->
<cfquery name="thisData" dbtype="query">
#strQuery#
</cfquery>

This is a simple query but it generates the error above, about the conversion.

Posted by Matt on Sep 4, 2008 at 3:14 PM


Hi Ben

I have downloaded the POI Utility and I am using the ReadExcel utility for importing data from an excel sheet. There is a column in the excel sheet which contains dates. When I import the rows the date values are getting converted in to a number. Is there any code change required from my end?

Kindly help

Thanks

Parag

Posted by Parag Shah on Sep 10, 2008 at 4:16 PM


@Parag,

Reading dates is a funny thing. Really, dates are stored in the Excel file as a number (they are simply formatted to look like dates). ColdFusion should be able to handle these properly, you just have to format them:

#DateFormat( qExcel.date_column )#

I am working on trying improve the reading in of Excel files. I have been concentrating lately on the creating of Excel files, so I have been distracted.

Posted by Ben Nadel on Sep 12, 2008 at 9:39 AM


Post Comment  |  Ask Ben


Home   |   Web Log   |   ColdFusion   |   Projects   |   Resume   |   Job Form   |   Search   |   Contact
Epicenter Consulting - Custom Software Solutions for Business Evolution HostMySite.com - The Leader In ColdFusion Hosting