09 September 2011

Mysteries of Implicit Conversions (7254)

The 8 August quiz asked you to determine how many conversions Oracle performs for us implicitly when the following block is executed:
DECLARE
   l_timestamp   TIMESTAMP := '2011-08-01';
   l_increment   PLS_INTEGER := SUBSTR ('10x10=100', 4, '2');
BEGIN
   DBMS_OUTPUT.put_line (
      TO_DATE (l_timestamp) + (l_increment || 0));
END;
/
We scored 8 as the correct answer. Two players wrote with the following observations:

1. The strange thing here is the fact that over the different versions, Oracle has changed the overloadings of PUT_LINE in the DBMS_OUTPUT package. Specifically, in Oracle 7.3.4.0.0 there were 3 overloads, for VARCHAR2, NUMBER and DATE. This was even explicitly mentioned in the following: http://www.en8848.com.cn/reilly%20books/books/oracle/advprog/ch02_08.htm In Oracle 8.1.7.4.1, there remained only two overloads, for VARCHAR2 and NUMBER. In Oracle 10g and higher there remained only a single overload, for VARCHAR2. Why did Oracle do this ? Just for introducing an additional implicit conversion ? I even remember a version where using DBMS_OUTPUT.put_line( a_number ) used to fail, and you had to use an explicit DBMS_OUTPUT.put_line( TO_CHAR(a_number) ). In the older versions, when the overloads exist, the argument passed was indeed stored in the DBMS_OUTPUT buffer in its native data type, and only converted to a string when retrieved for display (ex. by SQL*PLUS ). This one is probably the one that many players have missed ... In fact, we use this "bad practice" day by day in the quiz, DBMS_OUTPUT.put_line is the top star of each pl/sql block ... Does anybody think that we should change this ? I would hardly believe it ... 

By the way, enabling the PLSQL_WARNINGS do NOT produce any warnings for any of these 8 implicit conversions, strange by itself ... ( tested in Oracle 11.1.0.7.0 ). 

Anyway, this quiz was a too high price payed for a small issue ... Like in the last playoff, 0 or 7 versus the correct 8 were scored the same :( :( ... which is not completely right in my opinion, because here recognizing each implicit conversion by itself has its own value as "piece of knowledge", and cannot/should not be completely cancelled ... Specifically for such "total" quizzes, better than making them single choice is to allow for different choices with different "weights" (and scoring), that would reflect knowledge more accurately.

2. I think there may be two additional implicit conversions that are taking place, revolving around the Date + Number in the dbms_output.put_line statement. Although it is often said that dates are internally stored in a numeric format, it is not the same format as a numeric datatype (See reference to Concepts manual). In addition, although the (+) operation can be used with date variables, the actual (+) operation only takes place with numeric datatypes or any datatype that can be implicitly converted to a numeric datatype (see reference SQL Reference manual). Thus, to perform the operation Date + Number, Oracle is internally converting the Date to a number, performing the addition operation, and then converting the result back to a Date. This implicit conversion is different from the others referenced in the question, as there is no (to_xxx) operation that can explicitly perform this operation; it is all done internally to perform (+) operations with Date datatypes. Thus, the total number of implicit conversions for the block of code is 10. 

From 10.2 Concepts Manual Chapter 26 "Oracle stores numeric data in variable-length format. Each value is stored in scientific notation, with 1 byte used to store the exponent and up to 20 bytes to store the mantissa. The resulting value is limited to 38 digits of precision." "Date data is stored in fixed-length fields of seven bytes each, corresponding to century, year, month, day, hour, minute, and second." From 10.2 SQL Reference Chapter 4, page 3. "You can use an arithmetic operator with one or two arguments to negate, add, subtract, multiply, and divide numeric values. Some of these operators are also used in datetime and interval arithmetic. The arguments to the operator must resolve to numeric datatypes or to any datatype that can be implicitly converted to a numeric datatype."


Your thoughts?

9 comments:

  1. Hello All,

    One additional "philosophical" question:

    When we use TO_DATE(l_timestamp), this is by the theory considered an explicit conversion,
    though, doing it by relying on the current default DATE format of the session,
    without specifying explicitly a value for the format argument makes it in fact no different
    from the first assignment, l_timestamp := '2011-08-01'; , which is considered as implicit conversion.

    The only difference between the two relies in the fact that the target data type is once specified by the TO_DATE function, and once by the defined TIMESTAMP data type of variable l_timestamp.

    So, why in fact, at least in the counting context of this quiz, is one conversion considered explicit and the other one implicit ?

    Maybe all "one-way-only" conversions are in fact always explicit, even if NOT CODED
    explicitly, with a conversion function ?

    I mean, when assigning a string value to a number variable, the conversion can only be from a string to a number, and no other way.
    This is different from the case when in a comparison or other operation involving a string and a number Oracle decides to convert one way or the other, and the result probably depends on this decision.
    Here the decision in which way to convert is the implicit one, not the conversion itself.

    Just some thoughts around the issue ...

    Thanks & Best Regards,
    Iudith

    ReplyDelete
  2. At first, I did find the quiz a little bit nasty, but a closer look reveiled it to have no hidden agenda. At least I thought so. The Date + Number concern is interesting. I have always accepted the DATE data type as something numeric and the documentation quote (under 2.) says: "The arguments to the operator must resolve to numeric datatypes or to any datatype that can be implicitly converted to a numeric datatype."

    Sysdate + ‘1’ would be an obvious candidate for implicit conversion, but sysdate +1 is (to me) simple syntax. Or, one could say that the conversion is internal to Oracle. How would you differentiate between implicit and explicit? There is only the one way. To me, Date + Number are both numbers and the internal mechanics cannot be categorized as any kind of conversion.

    Maybe the best explanation is offered by the player himself. There is no explicit counterpart to the perceived implicit conversion. In that case, how can we even talk about a conversion?

    Surely something is happening, but it is internal, works in one internal way only, and is neither explicit nor implicit. You cannot affect that internal behavior and that means no conversion. At least not in context of what is addressed in the quiz.

    The internal DATE format is not available to us in a way that we could use the raw data, unless reading the value bit by bit. And, Oracle does not support implicit conversion of raw data.

    Bottom line. I don’t think implicit conversion is going on between date + number. How to prove it? I have no idea.

    This is hard. But, how often do (most of us) dive into details in a way measurable with the Challenge? Quizzes and questions are one thing and not easy to manufacture, but the (very) close attention by players is what really makes the Challenge rock :-)
    Mike (humble reviewer)

    ReplyDelete
  3. Just to add a factoid into the discussion.
    There are two date datatypes. There's a 'seven-byte' one for storage, but an eight byte one used for calculated/derived/'in flight' dates.

    select dump(created) dump_cre, dump(sysdate) dump_sys, dump(created+1) dump_cre1 from all_objects where rownum=1;

    DUMP_CRE DUMP_SYS DUMP_CRE1
    ----------------------------------- ----------------------------------- -----------------------------------
    Typ=12 Len=7: 120,111,8,27,9,21,48 Typ=13 Len=8: 219,7,9,10,9,31,55,0 Typ=13 Len=8: 219,7,8,28,8,20,47,0

    ReplyDelete
  4. You can always tell whether any piece of code will involve an implicit conversion by examining the data type of the argument vs. the data type of the parameter value. You don't need to dive into the internal implementation of PL/SQL.

    For example, with TO_DATE(l_timestamp) we have the TO_DATE function which according to the documentation only accepts a character type (e.g. CHAR, VARCHAR2, etc) as its first parameter (http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions183.htm#SQLRF06132). However, l_timestamp is declared as a TIMESTAMP. By definition, this means that this will require 1 implicit conversion.

    Another example is the following expression:

    TO_DATE(x) + (l_increment || 0)

    This is calling the "+" operator, with a DATE as the first argument, and a string as the second argument (the result of the || operator). Since the first argument is a date, Oracle will implicitly convert the second argument to a number (http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/sql_elements001.htm#SQLRF00208) which will be interpreted as a count of days (or parts thereof).

    ReplyDelete
  5. date + number has no conversions. The proof is in package STANDARD:

    function '+' (LEFT DATE, RIGHT NUMBER) return DATE;

    ReplyDelete
  6. Hello All,

    Related to the TO_DATE(l_timestamp) issue, mentioned by Jeffrey Kemp:

    The TIMESTAMP data type is in fact nothing more than an "extended DATE", with the fractional seconds added.
    I think that the main (or maybe single) reason for which these two are still different data types is the historical one, because the DATE data type exists from the first Oracle version and should be supported "as is" in all the subsequent versions.
    It simply did not contain a fractional second because, probably, in those good old times no one thought of this as something very important ... and maybe indeed it is not, at least for one's every day life ...

    This backward compatibility is probably also the reason for which the TO_DATE function accepts only a string (CHAR,VARCHAR2) argument, and NOT a TIMESTAMP argument,
    I think Oracle simply chose not make this natural overload addition to the implementation.

    Put in another way, TO_DATE(l_timestamp) should have simply behaved as a truncation operation,
    truncating the fractional seconds from the l_timestamp and NOT going at all through the steps of converting l_timestamp first to a string, which it "is not", and then back to a DATE, which in fact "it is".

    It looks like, strictly speaking, there is some "amount of relativity" in the way in which
    the notion of explicit or implicit conversion is defined, accepted, or perceived.

    Thanks & Best Regards,
    Iudith

    ReplyDelete
  7. "TO_DATE(l_timestamp) should have simply behaved as a truncation operation"
    Iudith, to stop wrong speculations I ask you what do you say on to_date(l_date), where l_date has obviously date datatype?

    ReplyDelete
    Replies
    1. This comment has been removed by a blog administrator.

      Delete
  8. Hello Vitaliy,

    On the same line, of course, TO_DATE(l_date) should have simply reproduced the argument passed "as is", without any conversion.
    In fact, all the conversion functions should have behaves as an identity function ( f(x)=x ) if passed an argument of the same datatype as their target datatype, with no conversion at all, neither implicit nor explicit.

    Best Regards,
    Iudith

    ReplyDelete