Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.
Meanwhile on Twitter
Loading latest tweet...
Ben Nadel at Scotch On The Rocks (SOTR) 2011 (Edinburgh) with:

Using MySQL's TO_DAYS() Function To Group Same-Day Events

By Ben Nadel on
Tags: ColdFusion, SQL

The other day, I was creating a reporting and statistics page for a small project. One of the reports that I had to generate required me to group records by Date, outputting the number of records created on each day. The records themselves had a dateCreated field containing a date/time value. Due to the time portion of the date/time value, however, I couldn't group the records directly by their dateCreated value; but, I could group them together based on their calculated TO_DAYS() value. Unfortunately, MySQL and ColdFusion have two very different ways of representing a date as a "day index". As such, a little in-SQL date math had to be performed.

The date/time functionality in ColdFusion is pretty freakin' amazing! And, if you've used ColdFusion for long enough, you've probably come to know and love the fact that ColdFusion dates can be represented as decimal values. Each date can be represented as the number of days that have passed since ColdFusion's "zero date."

MySQL also has the ability to represent a date as the number of days since the "zero date." Unfortunately, MySQL and ColdFusion do not agree on what the zero date should be. To see this discrepancy, take a look at the following code:

  • <!--- Get MySQL's representation of today as an int. --->
  • <cfquery name="today" datasource="testing">
  • SELECT
  • TO_DAYS( #now()# ) AS asInt
  • ;
  • </cfquery>
  •  
  • <cfoutput>
  •  
  • ColdFusion Today: #fix( now() )#<br />
  •  
  • MySql Today: #today.asInt#<br />
  •  
  • </cfoutput>

Here, we are using MySQL's TO_DAYS() to get today's date (as provided by ColdFusion) as an integer. We then compare this to the integer value calculated by ColdFusion's fix() method. This gives us the following output:

ColdFusion Today: 40736
MySql Today: 734695

As you can see, there is a rather large discrepancy between the two systems.

This doesn't mean we can't use the TO_DAYS() function to our advantage; it simply means that we have to take into account the difference between the zero dates of the two systems. And, since we'll be outputting data in a ColdFusion context, it's probably best that these calculations take place within the MySQL database.

In the following demo, we're going to populate a table with 15 days worth of randomized date/time values. Then, we'll group that table based on its calculated TO_DAYS() value and output the number of records "created" on a given day. When we determine the "date index" of each group, we'll have to take into account the offset between the two systems.

  • <!--- First, let's populate our testing table with random dates. --->
  • <cfquery name="insert" datasource="testing">
  •  
  • <!--- Clear out the table. --->
  • TRUNCATE TABLE date_based;
  •  
  • <!--- Let's create 15 days worth of data. --->
  • <cfloop
  • index="dayOffset"
  • from="-14"
  • to="0"
  • step="1">
  •  
  • <!--- For each day, insert a random number of rows. --->
  • <cfloop
  • index="i"
  • from="1"
  • to="#randRange( 1, 20 )#"
  • step="1">
  •  
  • <!---
  • When we insert the record's dateCreated value, notice
  • that we are using the rand() function to define a
  • date/time that appears randomly within the 24-hour
  • period of the given day offset.
  • --->
  • INSERT INTO date_based
  • (
  • dateCreated
  • ) VALUES (
  • <cfqueryparam
  • value="#(fix( now() + dayOffset ) + rand())#"
  • cfsqltype="cf_sql_timestamp"
  • />
  • );
  •  
  • </cfloop>
  •  
  • </cfloop>
  •  
  • </cfquery>
  •  
  •  
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  • <!--- ----------------------------------------------------- --->
  •  
  •  
  • <!--- Get the report items. --->
  • <cfquery name="items" datasource="testing">
  •  
  • <!---
  • First, we need to figure out the difference between this
  • database's (MySQL) "zero date" and ColdFusion's zero date.
  • That way, we can group same-day items together using
  • an index.
  •  
  • Once we get the date offset, we can convert MySQL's day-
  • based dates to ColdFusion's day-based dates using the
  • simple formula:
  •  
  • ColdFusionDayDate = (MySqlDayDate - DateOffset)
  • --->
  • SET @dateOffset = ( TO_DAYS( #now()# ) - #fix( now() )# );
  •  
  •  
  • <!---
  • Now that we have an offset that we can use to conversion
  • to group dates that occurred on the same date as defined
  • by their date-index.
  • --->
  • SELECT
  • <!--- Get the date index for this group. --->
  • (TO_DAYS( d.dateCreated ) - @dateOffset) AS dateIndex,
  •  
  • <!--- Get the count in this group. --->
  • COUNT( * ) AS itemCount
  • FROM
  • date_based d
  • GROUP BY
  • TO_DAYS( d.dateCreated )
  • ORDER BY
  • dateIndex ASC
  • ;
  • </cfquery>
  •  
  •  
  • <!---
  • Now, output the items that occurred on each date. Since we
  • have synced the MySQL date and the ColdFusion date, the dateIndex
  • field can be converted to an actual date without concern.
  • --->
  • <cfoutput query="items">
  •  
  • #dateFormat( items.dateIndex, "mmm dd, yyyy" )#
  • --
  • #items.itemCount# items.<br />
  •  
  • </cfoutput>

The first part of this code is there to populate the database. In the second half of the code, you can see that we create a @dateOffset value which olds the difference in days between the two systems' zero date. Then, as we GROUP BY the TO_DAYS() version of each dateCreated timestamp, we can convert the MySQL value into a ColdFusion value by subtracting the @dateOffset.

Once we do this, we can easily convert the numeric (int-based) date back to a user-friendly date using the dateFormat() function. And, when we run the above code, we get the following output:

Jun 28, 2011 -- 9 items.
Jun 29, 2011 -- 12 items.
Jun 30, 2011 -- 19 items.
Jul 01, 2011 -- 11 items.
Jul 02, 2011 -- 7 items.
Jul 03, 2011 -- 6 items.
Jul 04, 2011 -- 9 items.
Jul 05, 2011 -- 17 items.
Jul 06, 2011 -- 11 items.
Jul 07, 2011 -- 2 items.
Jul 08, 2011 -- 7 items.
Jul 09, 2011 -- 10 items.
Jul 10, 2011 -- 18 items.
Jul 11, 2011 -- 7 items.
Jul 12, 2011 -- 8 items.

Typically, in a database, I try my best to avoid running any functions on column values. This tends to be much slower than just retrieving raw values. But, given the impromptu nature of the reporting page, I didn't have a denormalized dataset ready to go. I am happy to say, however, that even with millions of records, this approach to the reporting was extremely fast (I'm talking milliseconds of execution time).




Reader Comments

Hmm ... a couple of thoughts about this.

Why not use CreateODBCDate( Now() ) to tell MySQL what today's date is?

Also, why not just use Date() in your query? Doesn't that return the date part of a date/time value?

Reply to this Comment

@Dave,

I used to use CreateODBCDate(); but, I found that when working with a direct date object (ie. now()), it didn't seem to be necessary (unless I am misunderstanding your question).

As far as the date() question, I simply didn't think of it :) That said, I would assume that converting the value to an INT would probably be faster than extracting the date portion of the date/time stamp. In the past, I've had some pretty poor performance when it comes to running any kind of functions on columns. So, I think I just assumed going INT would be the fastest.

Good point though, I can't believe I didn't think of that. I probably spend too much time thinking about dates as numbers!

Reply to this Comment

@Ben, you're not missing anything ... that was just the result of throwing guesses at the wall to see what would stick. I don't have MySQL on my laptop, so I couldn't test it for sure. Thinking about it, of course no CreateDate function is going to get today's date into MySQL because CF is the one creating the date. If you wanted to base something on now(), then you'd just use MySQL's NOW() function in your query.

With respect to performance, I'm not sure there's a clear-cut winner. I think it depends heavily on what you're doing with the data. DATETIME and TIMESTAMP fields give you, well, date and time operations ... so if you've got something stored as a date, then it's easy to group results by month or day or year. If all you need is elapsed time, then INT will probably work just fine, given that it's smaller.

The difference in our approach is probably just a difference in our backgrounds. :) I was working in VBA and Access long before I got into CF, and with exposure to Oracle and SQL Server after that, I'm accustomed to looking at DB stuff from that perspective (as you've probably seen in other answers!). I think I prefer storing dates as dates for clarity, but I think it's really just preference. If you prefer working with ints, there's nothing wrong with that! (Well, until you get into elapsed time ...)

Reply to this Comment

@Dave,

Don't get me wrong - I definitely store my dates as dates. I am a fan of the date/time value :) I just think of them in terms of Int's when I have start working with sets of dates, typically in charts and what not.

In the past, when I had to do reporting on tables that had a massive amount of records, I would actually create a somewhat denormalized column set that would, if needbe, include separate columns for date (where time had been removed) if I need to perform grouping.

I'm definitely not all that experienced in reporting. I just add some stuff to get the job done. One time, I was gonna learn all about Star Schema for reporting... but that client got dropped before we had the chance (they brought their dev team in-house after funding came through).

Reply to this Comment

FYI, here's the MSSQL version if you are working with DATETIME date:

FLOOR(CONVERT(FLOAT, dateCreated))

If you haven't modified the default starting datetime in the server settings, you can also do this:

SELECT DATEDIFF(d, '1/1/1900', dateCreated)

If you're working with DATE, you can convert right to INT:

CONVERT(INT, dateCreated)

Reply to this Comment

@Roland:

Actually, for what Ben's really trying to do (just convert the date/time object to a date) in SQL Server you could just do:

convert(char(10), getutcdate(), 120)

Which would convert the date/time object to the string "yyyy-mm-dd".

This will allow you to sort asc/desc in a way to group dates together in the correct order and CF can still use the dateFormat() on the values without any additional processing.

Reply to this Comment

@Dan

That's also a string sort/group vs an integer sort/group if you use that method though (10 byte vs 4 byte). If you're dealing with a decent sized record set, as time-series data tends to be - you'll want them as integers for sorting and grouping, or heavens-forbid comparison.

That, or just use the DATE datatype if it's available and save all the conversion trouble in the first place.

Reply to this Comment

@Roland,

Funky! I also remember it being different in CF and MS SQL. Seems like everyone has their own zero date :)

Reply to this Comment

I just cannot get anything to work for me - using SQL2000. I just want to sort a record set by the date/time field event date but then group by date.
SELECT eventdate as ???????
FROM FieldEventNotes
WHERE region = '#region#'
GROUP BY eventdate???
order eventdate desc
Spent hours on this

Reply to this Comment

Post A Comment

You — Get Out Of My Dreams, Get Into My Comments
Live in the Now
Oops!
Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.