22 February 2011

Nuances of NULLity (2026)

In the 21 February quiz, we asked:

Players wrote with the following concerns and my responses:

1. "I think there was a slight mistake in this quiz...it is hard to see the diff between '' and ' ', but this makes big difference in result. see SQL below. and results. That's why I didn't choose 3rd answer correctly. select replace ('abc#def#', '#', '') from dual; Result: abcdef select replace ('abc#def#', '#', ' ') from dual"

SF: Well, I don't think this was a mistake, per se, but I can see how it might have been difficult to tell. I am going to add to my list of "cautions" when writing and reviewing quizzes that we should avoid relying on the appearance of white space in our quizzes.

2. "It was asked to replace all occurrences of # with NULL. I wasn't aware that an empty string ('') was the same as NULL, While I understand what the spec says, (NULL or empty string will replace all occurrences, is it really equivalent to say it is setting it to NULL?"

SF: ah, yes, the old "NULL vs. zero-length string" issue. There are many viewpoints on this (see this thread for such a discussion, including a fine contribution by Justin Cave, winner of 3rd place in most recent playoff). The bottom line is that Oracle treats these as the same today and very likely always will.

[Side note: the link to Justin's name is the first time I have taken advantage of our public profile URL in a blog. :-) I hope you are doing the same, on your own blogs or websites. Give everyone an opportunity to learn about your PL/SQL career and your activity on the Challenge!]

3. "I would like to share with you that I just got (maybe too) suspicious because of the evil word 'NULL' and your formulation of the question as 'Which of the following uses of REPLACE will replace all the # symbols with NULL?' How can you tell if a certain character is being replaced by a null? All we can we see is the output. So why not ask: which of these usages would lead to 'abcabc' being printed. Or: which of these usages would lead to removing all of the '#' occurrences from the input string. Well never mind. Other developers probably do understand the question." and similarly "On today's quiz, I assume "replace all # characters with NULL" means "remove all # characters from the string, leaving the other characters." But this wasn't completely clear to me."

SF: Yes, I think your formulation is better - and I also need to be more careful about referring to NULL as opposed to "the NULL value" - but I don't believe that any changes in scoring is required.

4. Finally, a reviewer and player both noted that I could have (and felt that I should have) included another choice:
BEGIN DBMS_OUTPUT.put_line (replace ('abc#def#', '#'));
That is, test your knowledge of the impact of relying on the default value for the third argument.

SF: Ah well, that would have been good to do, you are right. I will create another quiz on this particular nuance of REPLACE!

Your thoughts, dear players?

9 comments:

  1. Oracle doesn't always treat them the same, for example in PL/SQL you can assign a NULL to a BLOB but not a ''.

    ReplyDelete
  2. Re #1: I see no problem here. You provide the question as an image, so the appearance of white space should be the same on all browsers and devices. Unless you manage to OCR and copy the text into some IDE...

    Re#2: The link to Justion's profile works perfectly in IE8, but not in Firefox 3.6, where it's somehow redirected to the main page of the PL/SQL Challenge.

    Regards, Urs

    ReplyDelete
  3. I'm not convinced. As a programmer I always have to test for null values as they are different from empty space. And the lesson I'm getting from this exercise is that you cannot insert a null value into a string. Which makes sense since null cannot be compared to or add to anything in oracle.

    ReplyDelete
  4. Problem with link to Justin's page should be fixed now.

    ReplyDelete
  5. It seems to me that internal to PL/SQL the values NULL and '' aren't identical but the only place where I have recognized the difference was in a project where I called an external dll on a windows server. I was a little bit lazy and don't passed an indicator for NULL for a parameter. When I first tested it all was ok until I passed a value of NULL to the parameter (it was a string): it crashes with an access violation (access to address $0). But when I passed an empty string to it, all works fine.

    So it looks to me that internally an assignment of an empty string to a variable initializes a pointer to an memoray area where the first byte is #0 (empty pchar string) and if you don't initialize a variable it points to address $0. But perhaps can Bryn clarify this point.

    Also astonishing is the description of the VARCHAR (without the "2") datatype - in the 8.1.7 SQL Reference (September 2000):

    "The VARCHAR datatype is currently synonymous with the VARCHAR2 datatype.
    Oracle recommends that you use VARCHAR2 rather than VARCHAR. In the future,
    VARCHAR might be defined as a separate datatype used for variable-length
    character strings compared with different comparison semantics."

    In the 11g R2 SQL Reference (October 2009):

    "Do not use the VARCHAR data type. Use the VARCHAR2 data type instead. Although the
    VARCHAR data type is currently synonymous with VARCHAR2, the VARCHAR data type
    is scheduled to be redefined as a separate data type used for variable-length character
    strings compared with different comparison semantics."

    Perhaps this will be the type where NULL and an empty string won't be the same (perhaps in 2018?).

    The 11g reference also says:

    "Note: Oracle Database currently treats a character value with a
    length of zero as null. However, this may not continue to be true in
    future releases, and Oracle recommends that you do not treat empty
    strings the same as nulls."

    But how can we treat them different if we can't see if they are different (and a comparison to '' don't works).

    ReplyDelete
  6. Hello All,
    Very interesting philosophy about NULLS !

    There clearly exists an inconsistency between
    the way that PL/SQL treats an empty CHAR(1)
    string and SQL-s way.

    PL/SQL not only treats the CHAR(1) variable
    having an empty string assigned to it as NOT NULL, but it also pads it with one blank,
    in conformity with the CHAR padding semantics.

    Along the lines of Justin's sample:

    DECLARE
    empty_char CHAR(1) := ''; -- empty string here !
    BEGIN
    IF empty_char = ' ' -- one blank here !
    THEN
    DBMS_OUTPUT.put_line('empty_char EQUALS one blank');
    ELSE
    DBMS_OUTPUT.put_line('empty_char DOES NOT EQUAL one blank');
    END IF;
    END;
    /
    empty_char EQUALS one blank

    PL/SQL procedure successfully completed.


    But:

    BEGIN
    IF '' = ' ' -- compare an empty string with one blank
    THEN
    DBMS_OUTPUT.put_line('empty string EQUALS one blank');
    ELSE
    DBMS_OUTPUT.put_line('empty string DOES NOT EQUAL one blank');
    END IF;
    END;
    /
    empty string DOES NOT EQUAL one blank

    PL/SQL procedure successfully completed.

    or even simply:

    BEGIN
    IF '' IS NULL THEN -- empty string here !
    DBMS_OUTPUT.PUT_LINE('IS NULL');
    ELSE
    DBMS_OUTPUT.PUT_LINE('IS NOT NULL');
    END IF;
    END;
    /
    IS NULL

    PL/SQL procedure successfully completed.

    So, even inside the PL/SQL layer, the behavior is not completely consistent.

    When assigning the empty string to a CHAR(1)
    variable, it exhibits a CHAR semantics behavior and is padded with 1 blank.
    But when using it directly, like in the last
    "IF" tests above, it behaves like a VARCHAR2,
    though, as per the PL/SQL documentation, string literals are considered as CHAR datatypes
    and obey CHAR comparison semantics (ignoring trailing blanks), even at the SQL layer.

    As by our experience, we have lots of problems
    in our applications in places where (due to unclear historical reasons) many database columns were declared as CHAR type, instead of VARCHAR2 ... lots of trimming, padding and other "performance killers" ...

    So, my personal opinion is that, at least in the database, VARCHAR2 should ALWAYS be preferred
    upon CHAR datatype (and the NULL issue is one of them), except maybe for some very special cases.

    Best Regards,
    Iudith Mentzel

    ReplyDelete
  7. Another one difference in SQL:

    SQL> select dump(''||'a') char_value, dump(null||'a') varchar2_value from dual;

    CHAR_VALUE VARCHAR2_VALUE
    ---------------- ---------------
    Typ=96 Len=1: 97 Typ=1 Len=1: 97

    ReplyDelete
  8. @Iudith

    Your first testcase describes the behaviour of the CHAR datatype(fixed size, padded with spaces until the specified size) and not the empty string behaviour.

    ReplyDelete
  9. If you do a
    create view vnull as select '' c from dual;
    followed by a DESC VNULL, you can see that a '' is a CHAR(0), rather than a CHAR(1).

    ReplyDelete