NULLS IN THE DATABASE!!!

Coding Horror One of my biggest bugaboos is nulls in a database where null is NOT a legitimate value.  Nulls do have their place; for example, a null is a legitimate value for a column that contains optional foreign key.  When the value is null, the record is not associated with any rows in the related table.

There are other places where nulls are definitely NOT a legitimate value.  Bit flags have two states: 1 or 0.  Null is not one of those states.  Let me say it again-- bit flag = 1 or 0, not 1, 0 or null.

Allowing this kind of faggotry into your database can have unintended consequences; like something that just happened to me yesterday on a client's installation.

Long story short--client couldn't see stuff they were supposed to see. There was something in my access control checking queries (yes, unfortunately, queries) that was returning a false positive when testing to see if a user was blocked.

Breaking down my multiple queries, I tested each in turn to see what result I was getting. I narrowed it down to one particular subquery that was returning less rows than expected. Logically it looked fine. Narrowing it down further I saw that one particular check wasn't behaving as I expected:

... OR ( Controlled <> 1 AND ...

My gate-keeper check on Controlled was returning false more often than expected. A quick peak at the data and I saw nulls. Floating in the table like scraps of shit float in the toilet after a co-worker fails to flush fully. Nulls. In the fucking database.

Now, normally, if I was to ask you, "Is null not equal to one?", you would answer, "What, are you stupid?" Normally, you would be. But if you were on the ANSI board deciding on the SQL standard in '92, you'd be all like, "I dunno! Durrrr!"

According to ANSI SQL-92 (where, I don't know--I looked for a few minutes then fell asleep), comparing anything with a null value using equals or not equals must return false.

This adds a third state to boolean logic. This state is called "Who the Fuck Knows". Does A = B? The answer may be True, False, or Who the Fuck Knows. In ANSI SQL-92, the logic table for comparisons looks like this (both A and B are bits):

A B A <> B
1 1 False
1 0 True
1 null Who the Fuck Knows
0 1 True
0 0 False
0 null Who the Fuck Knows
null 1 Who the Fuck Knows
null 0 Who the Fuck Knows
null null Who the Fuck Knows

Microsoft has some switches that control this behavior but they are considered harmful due to issues you may face with moving your data layer to another database system.

So, beware of nulls in the database. Not only can they screw with your casting in your logic layer, they can also screw up your queries in your database layer. And, remember kids, if somebody asks you if two things aren't equal, say "Who the Fuck Knows?", 'cause one of 'em might turn out to be null!

Programming Post by: McGurk at 01:57 PM | 18  Replies | Reply
Kick this post:

In a certain Table That Shall Reamin Nameless, the original designer (not me!) utilized this kind of behavior for sales records: Empty field=no sales, Number=(sales-returns). Which means "0" isn't the same thing as "blank."

The fun starts here: If you touch the field with an editing function, the engine will leave a number behind. (So don't do that!) It also means the table has to be completely rebuilt at the end of the year, when the current year data gets archived.

1 Posted by Old Grouch on April 19, 2007 03:12 PM (24ezj)

replace "Empty field=no sales" with "Empty field=no transactions"

2 Posted by Old Grouch on April 19, 2007 03:15 PM (24ezj)

Utterly brilliant.

3 Posted by McGurk on April 19, 2007 03:25 PM (Ri74D)

Amazingly I have seen this type of thing before. WHAT THE MOTHERFUCK?!?

If modifing the database default values and removing all of the nulls in the field is not an option due to  other code specifically looking for nulls in the field (as it was either originally designed (or not designed depending on how you look at it) to allow nulls there very easily could be code that fires when nulls in the field are found) , a work around you could use would be the ISNULL(field) = T or F (not pretty I know).  This would return the correct value and correct your query without having to make any major database mods that could break who knows what else.  In my expierence, if these type things are not considered and mapped out in the design phase, making database modifications after the product has been released can hold many surprises that could blow the scope of your current bug fix out of the water.

...or ((ISNULL(Controlled) = False and Controlled <> 1) And ...)  

this will return you all of the records where Controlled is = 0.  Of course this does make this portion of the query non-sargable (can't use an index).

In redesigning my "seat of the pants" database for my next major release I followed a lot of tips from http://www.sql-server-performance.com/datatypes.asp

Foreign Keys really are the only place that should allow Nulls.

4 Posted by NotMcGurk on April 20, 2007 11:14 AM (Ri74D)

Testing.WHAT THE MOTHERFUCK?!?

Testing 1 2 3

5 Posted by NotMcGurk on April 20, 2007 11:18 AM (Ri74D)

6 Posted by NotMcGurk on April 20, 2007 11:19 AM (Ri74D)

k.  I got it now.  This is what my previous comment was supposed to look like. I am still learning this BB stuff so bare with me. Huh?

Amazingly I have seen this type of thing before. WHAT THE MOTHERFUCK?!?

If modifing the database default values and removing all of the nulls in the field is not an option due to  other code specifically looking for nulls in the field (as it was either originally designed (or not designed depending on how you look at it) to allow nulls there very easily could be code that fires when nulls in the field are found) , a work around you could use would be the ISNULL(field) = T or F (not pretty I know).  This would return the correct value and correct your query without having to make any major database mods that could break who knows what else.  In my expierence, if these type things are not considered and mapped out in the design phase, making database modifications after the product has been released can hold many surprises that could blow the scope of your current bug fix out of the water.

...or ((ISNULL(Controlled) = False and Controlled <> 1) And ...)  

this will return you all of the records where Controlled is = 0.  Of course this does make this portion of the query non-sargable (can't use an index).

In redesigning my "seat of the pants" database for my next major release I followed a lot of tips from Here

Foreign Keys really are the only place that should allow Nulls.

 

7 Posted by NotMcGurk on April 20, 2007 11:23 AM (Ri74D)

comparison with null is always false.. who the fuck doesn't know that?

8 Posted by Durage on April 20, 2007 01:51 PM (/Be2r)

Null isn't difficult to deal with - it is simply the absence of a value.  Sure some people use them incorrectly but they are an essential part of database design.

A bit field can logically be nullable - imagine a scenario where you have a 'Accepted' bit field in a database.

True means it is accepted
False means it is rejected
Null indicates a decision has not been made

Your query was wrong, plain and simple.

If you wanted records where Controlled was True then you should have specified that insead of writing "Not false" and being surprised at getting nulls.

[)amien

9 Posted by Damien Guard on April 20, 2007 02:17 PM (yWZ4L)

Yep, I can't deny the query was wrong. Definitely wrong according to ANSI SQL-92. The fact still remains that if you ask anyone "Is one not equal to nothing?" And the logical answer is yes. Like Paul Anka said, "That's the way it fucking is".

I believe that SQL is different than any other programming language in this respect. The following:

object o=null;
bool isNotEqual = o != 1;

returns what any normal person would expect--true.  Is this not reasonable?  Of course not.  In fact, if this would return false all the time, there'd be a whole shitload more null reference exceptions floating out there.

Is how SQL handles null comparisons a good thing or a bad thing? I believe that it is a bad thing. If you don't, then eat a big bowl of dicks. Nyah.

For Damien:  The null check is in there as a gateway to prevent subqueries from executing.  I didn't design the tragedy that is the database; I'm just trying to get out of it what I need in the fastest time possible.

For Durage:  Thank God the world is populated with ultrasmart people like you who know everything there is to know.  You probably even know what I'm thinking about doing to your moms.  Aaah, what sweet, sweet thoughts I'm having...  Excuse me, must take a shower now.

10 Posted by McGurk on April 20, 2007 06:10 PM (Ri74D)

Just thinking about it... Its kind of ironic.  The SQL-92 group decided that nulls should hold no meaning or value, yet the people who support their decision do so because they believe nulls do hold meaning and that they are a legitimate value.  I guess I also agree--nulls do have meaning.  And pretending they don't is a bad idea.  Bad, ANSI, bad.

11 Posted by McGurk on April 20, 2007 06:53 PM (Ri74D)

Annoying yes, but important to logical consistency.  To understand why, look at trivalent logic.  A taste:

True OR NULL -> not "who the f knows" but True... the value of the second operand doesnn't matter if first is True

False AND NULL -> not "who the f knows" but False... again, second operand doesn't matter.

Incidentally, THE DEVIL'S PROGRAMMING LANGUAGE implements the above logic and is big on trivalence.  Not so with the newer incarnation.

While this is tangential to the point you raised, I am pretty sure that any move towards acting like NULL has a value can lead to some logical consistenency errors in the formal system.  You'd have to find a real logician to explain it any better.

12 Posted by Chase Saunders on April 20, 2007 09:08 PM (7uy1T)

Lol v b 6 is vb6

13 Posted by McGurk on April 21, 2007 01:07 PM (8GzzQ)

Damn, it is case sensitive!  vb6 is THE DEVIL'S PROGRAMMING LANGUAGE

14 Posted by McGurk on April 21, 2007 01:07 PM (8GzzQ)

When I was designing my own programming language back in the good old days (2002), I had four-state logic: True, false, don't know, and don't care.

Don't know was the same as null in SQL.  Comparisons where one side is don't know return don't know - except for certain cases where both sides are logical expressions.  An if test on don't know always fails.

Don't care is the reverse; again, comparisons return don't care, but if tests succeed.

The truth tables get freaky, even if they are perfectly orthogonal. True and don't care is true.  False and don't care is false.  True and don't know is don't know.  False and don't know is false.  False or don't care is don't care.  Don't know or don't care is don't care; don't know and don't care is don't know...

15 Posted by Pixy Misa on April 23, 2007 05:39 PM (PiXy!)

It makes more sense after you realize the difference between null in .net and null in a datamodel.  In .net, null indicates the abscence of a value.  An order object with a null customer object is an order without a customer.  In sql, a null value indicates an unknown value (not the abscence of a value).  An order record with a null CustomerID has an unknown CustomerID, not the abscence of a CustomerID.  The difference is subtle and most people miss it until they get nailed by it.

This is the same reason why in c# we use:

if (order.Customer == null) { ... } // no value

and in sql we use

select * from orders where customerid IS NULL -- is unknown

16 Posted by John on April 24, 2007 02:48 AM (zPxIb)

oh, and don't blame the sql guys.  this is the fault of your local math geek and his relational algebra..

17 Posted by John on April 24, 2007 03:00 AM (zPxIb)

Gah.  I think its a conspiracy to keep SQL guys employed.  If it wasn't for gotchas like this, nobody'd hire those uber-normalizing freaks.

18 Posted by McGurk on April 24, 2007 05:57 PM (Ri74D)

Hide Comments | Add Comment

Press butan, recieve imagelet. Hover for preview. Imagelets are pasted at the end of your comment. Think ahead.


Post is locked.