Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.

SQL Server Text Matching Is Case INSENSITIVE

By Ben Nadel on
Tags: SQL

I am officially retarded. I can't believe that I didn't know that the text equals operator "=" in SQL Server queries was case insensitive. I am not sure where I got this in my head, but I always assumed that "=" was case sensitive for text comparison and LIKE was case INsensitive. But in reality, these three SQL statements all return the same result:

    SELECT
        id
    FROM
        address
    WHERE
        street1 = '123 Street St'

    SELECT
        id
    FROM
        address
    WHERE
        street1 = '123 STREET ST'

    SELECT
        id
    FROM
        address
    WHERE
        street1 = '123 StReEt sT'

Ugggg! To think of the times I have used LOWER() on a field to search for case insensitive text matches. Thank God I don't deal with too many queries that need to do case insensitive searches (at least the fallout of my ignorance will not wreak too much havoc).
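
A quick way to check this without touching the address table is to compare string literals directly. This is only a sketch: literals pick up the current database's default collation, so on a default (case insensitive) install both columns should report a match, while a case sensitive database would report no match. The column aliases are made up for illustration.

```sql
-- Compare two literals that differ only in case.
-- Under a case insensitive collation, both "=" and LIKE ignore case,
-- which is why wrapping a column in LOWER() is redundant there.
SELECT
    CASE WHEN '123 Street St' = '123 STREET ST'
         THEN 'match' ELSE 'no match' END AS equals_result,
    CASE WHEN '123 Street St' LIKE '123 STREET%'
         THEN 'match' ELSE 'no match' END AS like_result;
```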

I just hate making mistakes like that or finding such glaring knowledge gaps :(




Reader Comments

If I recall correctly, when installing SQL Server 2000 you can decide whether your install is case sensitive. I can't remember if this is per server or per database, though.
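
For what it's worth, collation exists at more than one level: the server has a default, each database has its own default, and each character column carries its own collation. A minimal sketch for inspecting all three, assuming the address table from the post lives in the current database:

```sql
-- Server-level default collation (chosen at install time).
SELECT SERVERPROPERTY('Collation') AS server_collation;

-- Default collation of the database you are connected to.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS database_collation;

-- Collation of each character column in one table
-- ('address' is the table name used in the post).
SELECT COLUMN_NAME, COLLATION_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'address';
```

In the collation name, "_CI_" means case insensitive and "_CS_" means case sensitive.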

@Bob,

I will look into that. I have heard of using COLLATE in order to search with extended ASCII characters (I think) but I have never actually used it.

@Tony,

The stars were finally in the right alignment... I have arrived.

@Kola,

Thanks for the heads up. I will talk to my boss about it as he administers the database server installations. And, when I create a new database, I just use the default settings - I have never seen what kinds of things can be set.

I thought all databases were this way, except for PostgreSQL. I've read that one of the big things people don't like about PostgreSQL is that SQL queries are case sensitive by default. Someone correct me if I'm wrong.

By default SQL Server 2000 (and 2005) installs with a case insensitive collation. It's the least problematic choice anyway, but also the most restrictive in terms of dev options like case sensitivity, alphabetic ordering, etc.

So I guess when you installed your RDBMS you left it with its default options. The chances of converting ONLY your db to a case-sensitive collation are high; the opposite is almost impossible though (from a case-sensitive db collation to a case-insensitive db collation).
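
A rough sketch of what changing collation looks like in T-SQL. The database name MyDatabase is hypothetical, and the VARCHAR(100) type for street1 is an assumption, since the post never shows the table definition. Changing the database collation only sets the default for columns created afterwards; existing columns keep their collation until each one is altered. Either statement can fail when indexes or constraints depend on the column, so treat this as a starting point and back up first.

```sql
-- Change the database default collation (affects new columns only).
-- "MyDatabase" is a placeholder name.
ALTER DATABASE MyDatabase COLLATE SQL_Latin1_General_CP1_CS_AS;

-- Change one existing column to a case sensitive collation.
-- The data type must be restated; VARCHAR(100) is assumed here.
ALTER TABLE address
    ALTER COLUMN street1 VARCHAR(100)
    COLLATE SQL_Latin1_General_CP1_CS_AS NULL;
```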

A good solution (in my opinion) is to just download SQL Server 2000 Books Online

http://www.microsoft.com/downloads/details.aspx?FamilyID=A6F79CB1-A420-445F-8A4B-BD77A7DA194B&displaylang=en

and read about the installation options. Or Google it; there are plenty of ideas.

The best advice though is to simply switch to SQL Server 2005 (if you have the option). It's much less problematic and has tons of options to fix such issues without the need to reinstall the server or play around with the system tables, etc. Before that though, don't forget to back up your databases, right :-)

Oracle is case sensitive too. Remember, the LIKE clause is for string comparison. It does not deal with case directly, as it is designed to find a string pattern. In fact, LIKE should be case sensitive. Honestly, I feel that all RDBMS should process everything with case sensitivity. This can help you find poorly formatted or erroneous data. jmtc.

You don't need to reinstall to do a case sensitive search. Here is an example:

SELECT Col1, Col2, Col3
FROM MyTable
WHERE Col3 COLLATE SQL_Latin1_General_CP1_CS_AS LIKE '%foo%'

Here you make the comparison on Col3 case sensitive. You can also use the "=" operator instead of LIKE.
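
The effect of COLLATE can be seen in isolation with two literals, as in this small sketch (the column aliases are illustrative). Applying COLLATE to one side of the comparison makes that collation explicit, and it overrides the database default for that expression only.

```sql
-- Same comparison, two explicit collations.
-- CS = case sensitive, CI = case insensitive.
SELECT
    CASE WHEN 'foo' = 'FOO' COLLATE SQL_Latin1_General_CP1_CS_AS
         THEN 'match' ELSE 'no match' END AS cs_result,  -- no match
    CASE WHEN 'foo' = 'FOO' COLLATE SQL_Latin1_General_CP1_CI_AS
         THEN 'match' ELSE 'no match' END AS ci_result;  -- match
```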

Hi,
COLLATE is the keyword that lets you make a comparison case sensitive.
Syntax: COLLATE (Collation Value)
EX: COLLATE SQL_Latin1_General_CP1_CS_AS
You should place it inside the comparison statement, right after the column:
EX: SELECT * FROM USERS
WHERE USER_NAME COLLATE SQL_Latin1_General_CP1_CS_AS = 'Your String'....

Cheers,
Rajendra Prasad panchati.

@Rajendra,

Collation is something that I've seen before, but never really understood. I should take some time to learn more about it.

Ben, this has worked for me:

SELECT * FROM Report_Lender_View_Field
WHERE (CAST(name_of_col AS VARBINARY(10)) = CAST('searchstring' AS VARBINARY(10)))

As developers, we might not have access to change the collation settings.

Cheers,
Ashok
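
One thing to watch with the VARBINARY approach, shown here as a sketch with made-up literals: CAST silently truncates to the declared length, so VARBINARY(10) compares only the first 10 bytes. Picking a length at least as large as the column avoids that.

```sql
SELECT
    -- Different case gives different bytes, so no match.
    CASE WHEN CAST('Airport' AS VARBINARY(100)) = CAST('airport' AS VARBINARY(100))
         THEN 'match' ELSE 'no match' END AS different_case,
    -- Both values are cut to their first 10 bytes ('searchstri'),
    -- so they compare as equal even though the full strings differ.
    CASE WHEN CAST('searchstring_A' AS VARBINARY(10)) = CAST('searchstring_B' AS VARBINARY(10))
         THEN 'match' ELSE 'no match' END AS truncated_to_10;
```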

@Jagadish, @Ashok,

I've never understood the varbinary data type. I've used it before when I find it in an example; but, I don't fully get what it is. Based on the name, I understand that it is a variable-length binary data value; but, I am not sure why converting things to varbinary has "unexpected" benefits (as far as I see).

Casting to varbinary will convert the string into binary data. Since 'A' and 'a' are different characters, the cast will result in different binary data, hence a case sensitive comparison.
Not really unexpected behavior.
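
To make that concrete, this one-liner shows the bytes behind the two letters (the aliases are illustrative):

```sql
-- 'A' and 'a' are stored as different byte values,
-- so a byte-by-byte comparison can never treat them as equal.
SELECT
    CAST('A' AS VARBINARY(1)) AS upper_a,  -- 0x41
    CAST('a' AS VARBINARY(1)) AS lower_a;  -- 0x61
```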

@Pascal,

Ah, I think I had it backwards in my mind. I thought people were casting to varbinary to do case INsensitive comparison. Yeah, sensitive comparison makes more sense. Sorry for the confusion.

Hi,

One small question here. You have mentioned usage of:
(a) VARBINARY
(b) COLLATE SQL_Latin1_General_CP1_CS_AS

Could you also please state - Of these two models, which is better and why?

Thanks

@Manoj,

I'm afraid that question is beyond my understanding. From what I know, a varbinary is basically a string value stored as a byte-array rather than a character array.... or something like that. Those data types are outside my experience. Sorry.

The collation determines the rules used to compare different strings (case sensitivity, ordering of accents).

Varbinary treats the data as raw bytes, and the comparison is done on byte value.

If you are using strings, you should go with collation as it offers you control over different aspects of the ordering. By converting to varbinary you will only order your strings by byte value and might end up with 'ç' way after 'z' when that is not your intention.
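
The ordering difference can be sketched with a small table variable. The table name, column name, and sample words are all made up for illustration, and NVARCHAR with N'' literals is used so the accented character does not depend on the database code page.

```sql
DECLARE @words TABLE (word NVARCHAR(20));
INSERT INTO @words (word) VALUES (N'zebra');
INSERT INTO @words (word) VALUES (N'çava');
INSERT INTO @words (word) VALUES (N'cote');

-- Linguistic ordering: the collation sorts 'ç' next to 'c'.
SELECT word
FROM @words
ORDER BY word COLLATE Latin1_General_CS_AS;

-- Byte ordering: 'ç' has a higher byte value than 'z',
-- so it sorts after 'zebra'.
SELECT word
FROM @words
ORDER BY CAST(word AS VARBINARY(40));
```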

Excellent information here. Ben, I enjoyed reading your admission of being retarded even though it's far from true, and I was even more perplexed because my queries were actually case sensitive... 'Airport', 'airport', and 'AIRPORT' produced results varying from all to some to none. The CF docs state that query of queries is case sensitive. So... I stumbled upon a way to force case sensitivity: query the database for a master result set (case insensitive by default), then run a query of queries against that master (case sensitive by default).