Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.

Passing isArray() Decision Function Does Not Ensure Member Methods In Lucee CFML 5.3.3.62

By Ben Nadel on
Tags: ColdFusion

This post is primarily a note-to-self so I don't make this mistake again. But, the other day, when I was working on the memory-leak detector code for Lucee CFML, I ran into a fun edge-case having to do with Reflection-style programming. In that post, I used Lucee's Decision functions (e.g., isArray(), isStruct(), isBinary()) in order to figure out how to generate a string-based representation of a complex value. What I discovered, once my code hit production, is that passing the isArray() decision function does not ensure that the given value has array member methods in Lucee 5.3.3.62.

To demonstrate this, I can recreate the case that bit me: binary values. A binary value is an array of Bytes. But, it's a native Java Array - not a "ColdFusion Array". As such, it passes the isArray() call, but doesn't expose methods like .len():

<cfscript>

	value = charsetDecode( "hello world", "utf-8" );

	if ( isArray( value ) ) {

		echo( "Length: #value.len()#" );

	}

</cfscript>

The charsetDecode() function converts the given String to its Binary representation (a byte array). And, when we run the above code, we get the following ColdFusion error:

[Screenshot: Array member method error in Lucee CFML 5.3.3.62.]

As you can see, the binary value passed the isArray() decision function; but, didn't provide the Array member-method, .len().

On one hand, this is a surprising behavior since you might expect the decision functions to allow you to make safe decisions about data handling. However, on the other hand, this is not that surprising since a ColdFusion Array is just a type of Array, not a base definition for all Arrays.

Consider the concept of Promises. In JavaScript, we often have to normalize Promises in order to guarantee a set of methods. For example, if we have a Promise that may or may not have been generated by Bluebird, we would have to normalize the value as a Bluebird Promise:

var trustedPromise = Bluebird.resolve( untrustedPromise )

In this case, both untrustedPromise and trustedPromise are Promises. However, they potentially have a different set of public methods. This is because a Bluebird Promise is a type of Promise - not the base class for all Promises.

Similarly, if we have an untrusted array in Lucee CFML, we could always normalize the value using arraySlice():

var trustedArray = arraySlice( untrustedArray, 1 )

Assuming the untrusted array has a non-zero length, this will result in a trustedArray value that guarantees Lucee CFML Array member methods.

Of course, when we can't trust the Array, we could always just fall back to using the more traditional global Array function, arrayLen(). The arrayLen() function is more flexible in the type of values that it can process; and, will happily report the length of a Binary value (byte array).
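As a minimal sketch of that approach, here is the earlier demo with the member method swapped for the global function. It assumes, as described above, that arrayLen() accepts the byte array returned by charsetDecode():

```cfml
<cfscript>

	value = charsetDecode( "hello world", "utf-8" );

	if ( isArray( value ) ) {

		// Use the global function instead of the .len() member method. The
		// global function does not require the value to be a native CFML Array.
		echo( "Length: #arrayLen( value )#" );

	}

</cfscript>
```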

Ultimately, it comes down to trust: how much do you trust the values you are working with? If you wrote the code that generated the values, then the trust is complete and implicit. However, if you're writing some sort of reflection-style code, as I was, then you're consuming values that you didn't create. As such, you can't trust them. And, if you can't trust them, you either have to cast them to values you can trust; or, you have to fall back to using functions that are more flexible.
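To make the "more flexible functions" option concrete, here is a rough sketch of how reflection-style code might measure an untrusted value using only global functions. The getValueSize() helper and its -1 "unknown" result are hypothetical and are not taken from the memory-leak detector code. The sketch also assumes that structCount() and len() are as forgiving as arrayLen() when given values that did not originate in CFML:

```cfml
<cfscript>

	// Hypothetical helper: report the size of an untrusted value without
	// relying on any member methods.
	function getValueSize( required any value ) {

		// Binary values (byte arrays) also pass isArray(), so they are
		// handled by the global arrayLen() function here.
		if ( isArray( value ) ) {

			return( arrayLen( value ) );

		}

		if ( isStruct( value ) ) {

			return( structCount( value ) );

		}

		if ( isSimpleValue( value ) ) {

			return( len( value ) );

		}

		// Assumption: -1 signals a value that this sketch can't measure.
		return( -1 );

	}

	echo( "Size: #getValueSize( charsetDecode( "hello world", "utf-8" ) )#" );

</cfscript>
```

The decision functions still pick the branch; the difference is that each branch only calls a global function, so it never depends on the value being a native CFML type.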

In short, this was not a bug in Lucee CFML - this was a bug in the way I was thinking about the data, its source, and its consumption context. And, hopefully by writing this down, I won't make this mistake again.



Reader Comments

"In short, this was not a bug in Lucee CFML"

I would disagree here :) What you have described has been one of the biggest screw ups in member functions IMO. Adobe and Lucee both fell into this trap and it's driven me crazy the number of times it's bitten me. The issue is, people assume that

VariableOfCertainType.typeSpecificMemberFunction()

is the same as

typeSpecificFunction( VariableOfCertainType )

i.e.

myString.len()

is the same as

len( myString )

Sadly, that's not the case, as you found. The compiler doesn't unpack the member function to use the corresponding BIF in the bytecode; instead, the member functions have been added to the low-level Java objects themselves. The problem is that CFML is loosely typed and, while len() will accept any data type that can be successfully converted to a string, the len() member function ONLY exists on actual specific strings. Give it an integer that came back from a DB query and boom! There is no way to effectively code against this as isSimpleValue() will say "true", which really makes me mad from a poor language design standpoint.

Adobe improved this in 2018 (yay Adobe!) by ensuring string member functions will work on booleans and numbers, but Lucee is still trailing in that area, and both engines won't let you do struct or array member functions unless the object you're dealing with is a real live first class CFML struct or array. That means any Java lib can give you a HashMap or ArrayList and suddenly your code breaks, even though it looks like an array, smells like an array, and talks like an array. Other loosely typed languages like JS don't have an issue like this as they have no corresponding headless functions and they have far fewer possible data types. An object or array in JS is simply an object or array with no such thing as vast numbers of subclasses implementing shared interfaces like Java has.

You can't imagine the hours I have wasted arguing with the engineers on both sides that, based on CFML's loose typing and "convert-on-the-fly-as-necessary" behavior, member functions need to work the same as their headless counterparts, converting the object as necessary so they always work. I haven't been 100% successful in this, however, which unfortunately leaves member functions with a giant asterisk next to them any time you deal with data coming into a UDF that you don't know where it came from. Is that array really a CFML array? Is that struct really a CFML struct? There's no way to tell and if it's not, your code won't work. Makes me so sad :sad-panda:


Oh, I totally forgot to add, my typical insurance against this sort of thing is to run them through a BIF as you suggested which will return a "real" CF object. Annoying, but generally effective so long as you don't need to go N-levels deep.

myStruct = {}.append( myJavaHashMap )
myArray = [].append( myJavaArrayList, 1 )
myString = trim( myJavaDouble )
or
myString = myJavaDouble & ''


@Brad,

I definitely share your frustration. In a perfect world, it would definitely "just work". But, that seems like a massive functionality-gap to overcome. That said, the people who write the Lucee platform know way more than I do; so, what seems like a large technical problem to me may just be more of a philosophical problem to them (as it sounds like it might be from what you are saying).

It's a strange place to be. On the one hand, I've always loved how loosely typed ColdFusion is; but, on the other hand, I also appreciate that it has moved a little closer to stronger types (like null support and better JSON support). I think trying to straddle both those worlds is a sticky situation.

The good news is that, in the vast majority of cases, I do know where the data came from and I am able to use the member methods, which I enjoy much more than the BIFs.
