Five Months Without Hungarian Notation And I'm Loving It
Posted November 19, 2009 at 6:58 PM by Ben Nadel
About five months ago, I expressed a growing concern that my coding methodology at the time was forcing me to create contradictions within my naming conventions. As such, I decided to completely dump my pseudo Hungarian notation for a while and adopt a more traditional, headless-camel-case approach to see how it felt. That was back in June; now, in November, I have to admit that I am really liking it. Consistency is just so freakin' sexy, and adopting headless-camel-case across the board has made things extremely consistent.
I abhor camel casing as much as Hungarian notation, at least insofar as it was used by Microsoft.
Look at some old WINAPI code and you'll see lovely variable names like this:
lpsz means long pointer to a zero-terminated string.
When I was doing C++ programming, I adopted something one of the authors I liked recommended: Use underscores. I know that is verboten in the CF/Java world.
loop_index is so much nicer to me than loopIndex or LoopIndex.
Another annoyance was when coders would shorten variable names by basically removing the vowels. Ugh.
Ben, I prefer the name 'nerdCaps', but if you're calling it headless camel case I fixed your picture...
...maybe I've been playing too much Left4Dead2? :)
I had to click the link. I just had to....
Atta boy. I've been a HUGE fan of headlessCamelCase since I discovered it right at the beginning of my career. It's just so readable and fluid to type. Pretty much the only time I don't use it ( and use underscores instead ) is in certain types of dynamic and generated code where a physical delimiter simplifies parsing and stitching.
nerdCaps FTW fuhSho.
iLoveHeadlessCamelCase = true;
i_dont_like_using_underscores = true;
I_USE_UNDERSCORES_FOR_CONSTANTS = true;
/* Hmmm, still first one looks better. I couldn't read 500 lines full of the 2nd one. But I do use underscores in the html view templates because the underscores makes it much easier to spot in a bowl of spaghetti :) */
@Adam +1000 for your pic! LOL
Love the pic @Adam.
Personally, I prefer headless Camel Case. For the most part I stopped identifying my variables with prefixes (ie. str int boo) when I started with CF.
Yeah, shortening variable names is insane! I have seen developers shorten a variable by removing ONE vowel. Really? Really? Is that necessary?
Ha ha ha ha, classic! Great PS job. Did you use the healing brush?
I used to be a huge fan of prefixing with data types (ie. strName, intCount). But, after a few months of rocking out "name" and "count", it's been really good times.
The only problem I have is when I have variable names that are different only by data type. For example, when I convert a response string to binary for use with CFContent, I start to have to pick names like:
... but in cases like that, I suppose it's one of the few times that the data type IS actually relevant.
> Did you use the healing brush?
Actually, I think that was the torture and disfigure brush. It's hidden in one of the sub-menus.
If Dave Watts were here he'd link to this article and remind us that true Hungarian notation isn't actually just prefixing variable names with their type.
My mentor started me on headlessCamelCase and I've never felt the need to deviate. It Just Works.
Yeah, I remember learning that; although, I remember the actual explanation of Hungarian notation being a bit too conceptual for me. Either way, been liking what I am doing now.
I hear you.
I use the following and love it:
One thing I always try is to CamelCaseBuiltInColdFusionFunctions() so others can tell the difference and don't spend time looking through inherited components just because they forgot a built in function existed.
Where I work we have a rule, no abbreviations for anything and names that have meaning; it is funny how quickly you can get used to functions and variables that GoOnForeverButClearlyIdentifyWhatTheyAreFor. Makes code like a book that almost eliminates the need for comments. At first I hated it but mostly because the rule was laid down by the .NET team, but after a while I got used to it.
Yeah, I always err on the side of longer but more readable variable names. As for the camel casing of CF methods and the headless camel casing of custom items, I get around this by always scoping my custom methods (ex. this.doSomething()).
I've used headless camel case for years for not only ColdFusion variables, but also SQL tables and fields... pretty much everything involving code. I also subscribe to the "don't abbreviate and clearly describe your variables with their names" mentality.
I've always hated Hungarian notation -- what a mess. Variables should be named pretty well so that they already hint to the data type, without having to put goofy abbreviations in front of the variable name. For example, my boolean / bit variables are always named something such as "isActive", "isPublished", etc, where reading them essentially asks a question and makes it instantly obvious that they are true/false boolean variables.
I also enjoy putting "is" or "has" before boolean-style values. It just feels good.
+1 for headless camel case ... been using it for years. Worked for a while with the underscore in file names, but it's just not easy to type. Working on code for several hours a day means always moving the hands up to get the _, and I'd rather not ...
Just like you, Ben, I strongly dislike the tblTableName ... I *know* it's a table already, thanks. And same for Boolean vars: 'active' could be a column holding activity types or something, but isActive always feels 'right' to hold the Boolean.
Ah, the joys of personal preference ;)
I used CamelCase and headlessCamelCase back when I worked on SQL Server - the former for table names, the latter for column names. So I used it in ColdFusion as well just for consistency's sake. But now that I work primarily on Oracle, I find the underscore indispensable (Oracle table- and column names are generally stored as all caps so underscores are necessary), and my variable names are mostly all_lowercase_with_underscores.
I can't stand Hungarian notation at all.
Can you expand on why you use "_" in Oracle?
I ask mostly because SQL is the one place where I feel most emotionally tied to using "_". I cannot explain it more than well, it's the way I learned to do it (not that that has anything to do with right vs. wrong).
I know when I fully adopt headless Camel Case across the board, table names / column names are going to be the one place where it feels the most awkward.
Luckily, with ColdFusion 9's ORM integration, table column names == object properties, which will force me to use standard naming (and not emotionally fall back on "_" usage).
We use "_" in Oracle since all object names (including table names and column names) are all-uppercase by default (unless you wrap double-quotes around them). For example, if I do:
create table FooBar (etc.)
in Oracle, then a table named "FOOBAR" is created. Now, in some cases this might be OK. But when you have table names made up of 3-4 words then the underscores really help. Compare LOREMIPSUMDOLORSIT to LOREM_IPSUM_DOLOR_SIT - one of those is definitely easier to read.
Now, you could create lowercase or CamelCase tables in Oracle if you use quotes:
create table "FooBar" (etc.)
Here a table named "FooBar" is created. The problem with this is that you must always use quotes when referencing the table:
SELECT * FROM "FooBar"
and that can get a bit tiresome IMHO.
SQL Server doesn't have this issue, but I tend to use underscores there as well to ensure portability.
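For anyone who has not run into this, the folding rule described above can be modeled in a few lines. This is only a rough Python sketch of the behavior as described in the comment, not Oracle's actual parser; the function name `fold_identifier` is made up for illustration.

```python
def fold_identifier(name: str) -> str:
    """Model how an identifier ends up stored, per the rule described above:
    a name wrapped in double quotes keeps its case, anything else is
    folded to uppercase. (Hypothetical helper, not an Oracle API.)"""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1]  # quoted: case is preserved, quotes are dropped
    return name.upper()    # unquoted: folded to all caps


if __name__ == "__main__":
    for raw in ['FooBar', '"FooBar"', 'LoremIpsumDolorSit', 'lorem_ipsum_dolor_sit']:
        print(raw, "->", fold_identifier(raw))
```

The last two inputs show the point about readability: the camel-cased name loses its word boundaries once folded, while the underscored one keeps them.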
Really??? That is kind of crazy! I am, of course, referring to Oracle's man-handling of names. I wonder what the philosophy behind that is.
In that context, however, using underscores makes a lot of sense.
There's also the fun that Oracle puts a limit on the length of table and column (and other object) names: 30 characters. Now, I'm not a big fan of giant names to any object, but I do prefer descriptive titles over arcane abbreviations. Especially in mapping tables, you can't always pull off TABLE_A_TO_TABLE_B, if the table names are long, and the same with index names, like FK_TABLE_A_TABLE_B. Not an issue unless you're on Oracle, but can make design for multi-RDBMS platforms a real pain.
Yes, it is a bit crazy. And, of course, if you want to find anything out about a table in Oracle by querying Oracle's system views, you have to put the table name in all-uppercase (since comparisons in SQL queries are case-sensitive)!
The 30-character limit is really absurd in this day and age. Historically I've used the dollar sign ($) for mapping tables to shorten their names, e.g. TABLE_A$TABLE_B. This works on SQL Server as well as Oracle.
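To make the length problem concrete, here is a small Python sketch that builds a mapping-table name and rejects it when it exceeds the 30-character limit quoted above. The helper `mapping_table_name` and the example table names are hypothetical; the limit is taken from the discussion, not checked against any particular Oracle version.

```python
MAX_IDENTIFIER_LENGTH = 30  # limit quoted in the discussion above


def mapping_table_name(table_a: str, table_b: str, separator: str = "$") -> str:
    """Join two table names into a mapping-table name and enforce the limit.
    (Hypothetical helper for illustration only.)"""
    name = f"{table_a}{separator}{table_b}".upper()
    if len(name) > MAX_IDENTIFIER_LENGTH:
        raise ValueError(
            f"{name!r} is {len(name)} characters; the limit is {MAX_IDENTIFIER_LENGTH}"
        )
    return name


if __name__ == "__main__":
    # The "$" separator fits where "_TO_" does not.
    print(mapping_table_name("customer_orders", "product_items"))
    try:
        mapping_table_name("customer_orders", "product_items", separator="_TO_")
    except ValueError as error:
        print("rejected:", error)
```

With these two example names the "$" form comes in at 29 characters, while the "_TO_" form is 32 and gets rejected.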
Is Oracle still a mainstream database system? I don't ask that as a joke - I've never used it. I've really only ever heard of MS SQL and MySQL. Maybe one or two others. I've heard of Oracle, of course, but I rarely talk to people who use it.
Does it have a selling point for a particular set of use cases?
Yes, yes, yes, very widely used! Large enterprises that spent $2-$3 million on Oracle years ago and continue to pay Oracle DBAs to babysit the thing will allow only Oracle to be used going forward ... they have to make that investment look like it's worth it. Gets more difficult when Enterprise SQL Server can be installed for well under $100k ...
Ha ha, gotcha.
Well, Oracle does have some features that SQL Server lacks. For example, although SQL Server seems to have gotten around the silly 8060-byte limit for table rows, pages are still always 8K and extents are 8 pages or 64K, while in Oracle you can specify larger extents to ensure contiguous storage for large objects, meaning fewer chained rows.
I think SQL Server and MySQL tend to have more prominence in the CF world since a hosted solution will probably use one or the other.
Good point re: hosting. I wonder if there are any shared hosting solutions that offer Oracle?
But even in the corporate world, the cost of bringing up a Windows Server box with a $5,000 SQL Server license makes it very attractive. I have also found over the years that SQL Server does a great job of handling all the thread requests that a web application throws at it, while I have yet to meet an Oracle DBA that understood what needed to happen to get Oracle to handle the open/close of many requests per second. I know Oracle syntax and a lot of tricks to help it out, like doing massive queries and then managing all the data processing in my middleware, but the tuning stuff was always a nightmare.
Everything you write is perfectly true, and SQL Server has some nice tools built into it that make administering it a pretty easy task (in most cases). However Oracle does have its advantages even apart from the tired old excuse of making an investment look like it's worth it. I will say that from a developer's point of view SQL Server has made some great strides; the difference between 6.5 and SQL Server 2008 is staggering, while the same can't really be said for the difference between Oracle 7.3 and 11g. Again, this is from a developer's point of view; a DBA might have different ideas (especially since one can more-or-less run SQL Server without a DBA; the same can't be said of Oracle).
I believe there are some shared hosting solutions available for Oracle, but not nearly so many as for SQL Server and MySQL. One solution that is readily available is Oracle XE (eXpress Edition) - if you have a hosted Windows or Linux VPS you could download it for free and install it there.
Absolutely true on the tools. I was lucky enough to get introduced to SQL at version 7 and by the time we were ready for implementation, we were on 2000, with a very nice toolset. I know I definitely dodged a bullet on 6.5 and earlier. All I know on Oracle, is that several clients each running 1 website apiece on CF could rarely get those sites to stop pegging the DB server, while we had SQL boxes running upwards of 50-60 databases for as many or more applications without a hiccup. I know that's not Oracle's fault per se, but it does speak to something about the architectures and their preferences for web connections, or so I've been told by several Oracle DBAs who thought I was nuts to suggest that a web application makes several requests to the database per minute.
Nothing against it, not a MS fanboy, but it just always gave us headaches that Oracle just didn't.
All of which is OT, of course ... sorry, Ben!
That should be "that SQL Server just didn't" ... end of a long day ...
Sorry I'm so late to the game, though this is a very interesting topic. Everyone brings to this table their experiences and conclusions as to what makes the most sense. I'd like to share mine, as they culminate in what is sometimes a struggle with respect to case enforcement in programming environments. Having spent five years apiece in both Oracle and M$ SQL Server shops, I believe anyone having spent time in the former will likely say that underscores are used in their database_object_names. This has the tendency of really screwing with your code style in programming languages if the prevalent style in that language is CamelCase or headlessCamelCase.
One thing that bothers me when attempting any variation of the CamelCase strategy is an environment that does not enforce case, such as ColdFusion or SQL Server. In my experience, this lack of enforcement leads straight toward inconsistency in implementation. Across our many SQL Server database servers, I may see these (or more) variations on a theme:
I'm not kidding. Of course, this speaks to a lack of team cohesion. I agree; however, in long-term scenarios the lack of explicit enforcement has, in my experience, always led to this end. Anyone who has worked in a case-enforced language or environment is bound to find this frustrating at best. With that in mind, early on I developed a lowercase_underscore_strategy across the board with very few notable exceptions. Specifically, in ColdFusion, I stuck with headlessCamelCase for method and argument names.
This was an attempt to follow ColdFusion's methodology. Note that native ColdFusion function names are CamelCase, and I did headless on purpose to signify that a function was custom vs. native. Unfortunately, I find ColdFusion itself to be inconsistent with respect to things like abbreviations. Take for instance these functions: IsXML vs. IsXmlAttribute. Argument capitalization is also inconsistent. While most arguments that are logically two or more words are headlessCamelCase, some are listed in the documentation as all lowercase. The mustunderstand argument for the AddSOAPRequestHeader function is one such example.
The documentation itself proves the case inconsistency point... Maddening! Whatever methodology is employed, I would simply encourage consistency. I find that, almost unequivocally, folks that have dealt with enforced case fall in the camp of solutions that end up enforcing that consistency in some way.
I've seen countless arguments for CamelCase or headlessCamelCase in that it is more readable. I don't understand this, as the human brain is very well trained to detect "words" with_something_like_an_underscore_as_a_delimiter.
WhatAboutSomethingLikeCapitalLettersAsADelimiter? This works okay until you come across single letter words, or a lowercase l followed by an uppercase L. Edge cases? Perhaps, but over the long term, one method is 100% consistent to my eyes while the other is not. This angle almost always ends up like a vi vs. emacs debate, so I should probably stop. ;)
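One way to see the consistency argument mechanically is to split names back into words. The Python sketch below uses a deliberately naive capital-letter splitter (an assumption for illustration, not any standard algorithm) next to a plain underscore split. It says nothing about which style reads better to a human; it only shows where each delimiter is ambiguous to a program.

```python
import re


def split_snake(name: str) -> list[str]:
    """The underscore is an explicit delimiter, so splitting is unambiguous."""
    return name.lower().split("_")


def split_camel(name: str) -> list[str]:
    """Naive splitter: a word is an optional capital followed by lowercase
    letters or digits; a capital with nothing lowercase after it stands alone."""
    return [word.lower() for word in re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]", name)]


if __name__ == "__main__":
    print(split_camel("IsXmlAttribute"))  # splits cleanly into three words
    print(split_camel("IsXML"))           # the acronym falls apart into letters
    print(split_snake("is_xml"))          # same words, explicit delimiter
```

A smarter camel splitter can special-case acronyms, but then everyone has to agree on the acronym convention, which is the IsXML vs. IsXmlAttribute problem again.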
The developer / DBA / performance discussion is interesting. I've taken on both roles in many RDBMS, and the bottom line from my point of view is: Oracle is made for hardcore developers to tweak and tune and SQL Server is made to run out of the box. Consequently, Oracle is far more "tuneable" in my opinion, while SQL Server in the end encourages one to spend more on hardware. Who is right? Both depending on your point of view, level of in-house expertise, budgets, etc.
The only serious recommendation I can make is this: Ensure that whatever methodology is employed has some built-in way of enforcing consistency across the board.
When going through the CFWheels documentation, they define camelCase vs. PascalCase.
When I first started programming, it was in QBasic, which was not case-sensitive. I taught myself (poorly, mind you) and I didn't use Hungarian Notation of any sort. It got frustrating because I had limited clues to the type of each variable.
I went to college to learn software development and learned about Systems Hungarian. This was helpful for someone who had a history of problems recognizing variable types by looking at the names.
Having just read the article on Apps Hungarian vs. Systems Hungarian, I'm definitely going to start using Apps Hungarian in the future. Having a consistent notation is like having consistent icons: as long as you have a map somewhere, it's a lot more efficient than spelling out the meaning. You don't need to spell out "Save" when you can give the person the image of a disk, "Print" when you can give the person the image of a printer, or "Email" when you can give the person the image of an envelope. Similarly, you can use "hs" for HTML-safe string instead of spelling out "HtmlSafe" and "rs" for user-provided strings instead of spelling out "RequestString".
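As a rough illustration of the "hs" and "rs" prefixes mentioned above, here is a short Python sketch. The variable and function names are invented for the example, and the prefixes are written snake-style to suit Python; the only library call is the standard `html.escape`.

```python
import html


def hs_from_rs(rs_value: str) -> str:
    """Convert a raw request string (rs) into an HTML-safe string (hs)."""
    return html.escape(rs_value)


def render_greeting(hs_name: str) -> str:
    """Only accepts values that are already HTML-safe, as the prefix signals."""
    return f"<p>Hello, {hs_name}!</p>"


if __name__ == "__main__":
    rs_name = "<b>Ben</b>"          # rs: came straight from the user
    hs_name = hs_from_rs(rs_name)   # hs: safe to write into a page
    print(render_greeting(hs_name))
    # render_greeting(rs_name) would look wrong at a glance: rs passed where hs is expected.
```

The prefix carries the kind of value (safe vs. unsafe), not the data type; both variables are plain strings.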
As for Hungarian Notation of database objects, I found a middle ground. I won't prefix most tables, but I will prefix a view (since there are limitations on those) and I will prefix the intermediary table of a many-to-many relationship. I might have a table called "People", a lookup called "lkup_Parents_Children" (always using the plural), and a view called "view_Oldest_Child". I use the underscores because a MySQL installation on Windows will convert all table names to lowercase. (Each table definition is stored in a FRM file and Windows filenames are not case-sensitive.)
As you may or may not be able to tell from my code, I like a good, descriptive name ... and white space :D Joking aside, though, I think there are times where some sort of prefix / suffix does go a long way to help clarify what something is. Taking the database as an example, if a table has *just* foreign keys in it, I will typically name it with the relevant names and "_jn" for "join".
So, if I had the table "user" and the table "friend" and I had to join them, I would create:
Of course, if I had to "upgrade" the type of table to more of an entity table, then I would rename it with something more appropriate. For example, if I wanted to not just model users and friend, but rather "friendship", I would create a different table:
friendship: id, userID, friendID, startedAt, endedAt.
And, while I haven't used Views in a while, I used to prefix them with "v" as "v_user" ... or was it a suffix... hmm, I can't remember :)