23 December 2010

Is a CLOB a string? A question raised about the 22 December quiz (1803)

The 22 December quiz tested your knowledge of the capabilities of both EXECUTE IMMEDIATE and DBMS_SQL to parse very long strings. We scored the following statement as incorrect:

"You cannot execute a string of more than 32,767 bytes as a dynamic SQL statement."

Oracle offers various mechanisms, especially in Oracle11g to bypass this limitation (the maximum size, that is, of a VARCHAR2 variable or literal).

One player emailed the following concern:

One of the choices for today's quiz looks somewhat ambiguous, namely the choice that says: "You cannot execute a string of more than 32,767 bytes as a dynamic SQL statement". If we take this literally as it is worded, then we could object that there is no such thing at all as a "string of more than 32,767 bytes", because this is the maximum length allowed to a VARCHAR2 string. If so, then this choice is supposed to be marked as correct. On the other hand, if string means "anything that is character data", including a CLOB, then this choice (though apparently in need to be marked as incorrect) falls over 2 other choices, namely: a) the choice that says "You can use a subprogram of the DBMS_SQL package to parse a SQL statement whose length exceeds 32767 bytes ( by the way, the wording of this choice ellegantly avoids speaking of a "string whose length exceeds 32767, but speaks of a STATEMENT whose lengths exceeds 32767, thus avoiding the above problem). and b) the choice that says "In Oracle11g you can pass a CLOB to EXECUTE IMMEDIATE. These 2 choices in fact do cover in entirety the 2 cases of using a statement contents as a CLOB. I think that the wording of the choice that says "You can use a subprogram of the DBMS_SQL package to parse a SQL statement whose length exceeds 32767 bytes" is excellent, covering (even without specifying it explicitly) ALL the cases by which a statemnent longer than 32767 can be made "to fit into" the requirements of DBMS_SQL, be it by using a CLOB or by "breaking" the statement's contents into elements of a DBMS_SQL.VARCHAR2A array. So, in my opinion, the problematic choice recommends itself to be rescored due to the ambiguous wording.

The SQL Language Reference says the following about CLOBs: "The CLOB data type stores single-byte and multibyte character data." CLOBs are, therefore, character data, which is another general term for string. VARCHAR2s are also character data, with a maximum length of 32767. But I believe that it is correct to interpret string as a general category of data including CHAR, VARCHAR2, NCHAR, NVARCHAR, CLOB and NCLOB (any others?). Others also feel this way; see:  http://www.orafaq.com/forum/t/98334/2/.

You are correct that two other choices covered the correct descriptions of ways to parse very long strings. I do not see, however, what bearing that fact has on the other choice, which is incorrect.

So I do not see the need to rescore. What do you think?

17 comments:

  1. I don't see the wording has a bearing on the answer. I agree that 'strings' can be represented by many datatypes in many languages.

    When reading the question then options, I did think that "I bet someone will comment/have issue on this"

    With all due respect to that commenter, and some of the nitpicks previous to this that you have posted here - I think sometimes we cross a line as to how deeply we should read into phrases and terminology used.

    I think common sense should ultimately prevail, and in some parts of the world, it doesn't seem all that common ;-)

    ReplyDelete
  2. I don't see the need to rescore generally. However if a player marked that a CLOB could be used with Execute Immediate in 11g and that a string longer than 32,767 cannot be executed as dynamic SQL, that player has probably interpreted the word string to exclude CLOBs. In that case an individual rescore COULD be justified.

    Of course they may have just picked out options at random, or be edging their bets.

    As an example of latter, say you have three mutually exclusive choices which you believe are equally likely to be right. If you choose randomly then one third of the time you'll get 100%, but most of the time you'll get 33%. If however you decide not to choose any of them, then you'll always get 66%. As such, sometimes you may actually chose a set of answers which are actually inconsistent in order to minimize incorrect choices rather than maximize correct choices.

    ReplyDelete
  3. Steven,

    I had that choice right in the quiz, but after re-reading I have my doubts...

    Th option was: "You cannot execute a string of more than 32,767 bytes as a dynamic SQL statement". Because 11g is not mentioned in this option, I assume it's 10g.
    And indeed you cannot parse a STRING of more than 32,767 bytes(a string to me is 1 variable of a character type: char, varchar2, clob,...).
    But you can parse a STATEMENT of more than 32,767 bytes by using the overloading procedure.
    And that's a difference.

    So, that player has a point...

    Gert

    ReplyDelete
  4. The minimum version for a quiz is NOT specified at an option level. Instead, it is "declared" above the question text and applies to the entire quiz. I do not think that is basis for an objection.

    ReplyDelete
  5. OK, that's my mistake in re-reading the solution and not looking back at the "minimum oracle version". So I looked it up in the 10g documentation. Sorry again...

    And that's probably why I had the answer right in the quiz ;-)

    ReplyDelete
  6. Hello All,
    Whether rescoring or not, I'd like to clarify some points:
    1. Once there was a separate choice referring to a CLOB, and terminology was indeed looked at as an issue in many previous blogs, then,
    for the sake of correctness, at least within the frame of this quiz, STRING seems to be something different than a CLOB.
    In fact, a STRING cannot be longer than 32767 bytes, so that choice itself, IN THE WAY IT WAS WORDED, makes no sense.

    2. Regarding the version issue, it is NOT relevant for the wording of that choice,
    but, however, in spite the fact that the overall version for the quiz was specified as
    being 11g in the quiz header, one of the choices explicitly repeated that it specifically refers to 11g, not clear why.

    3. Regarding the issue of chossing answers randomly, I find it at least weird that somebody even mentions it here ...
    I also wanted to raise some objections to this issue already in the past:
    There are quizes for which it is clear from the question itself that only one single choice can be correct, specifically those quizes that ask "What will be seen on my screen when running the following code".
    So, if such a quiz has 4 choices, then a wrong choice always ensures 50% correctness, which is somewhat unfair, regardless of the fact that the wrong choice was marked randomly or deliberately (by the way, I could never understand how can a 75% correctness be achieved at all on such a question, except by marking more than one choice as correct, which is even less logical ...).
    Such quizes are "all or nothing" quizes, so,
    in my opinion, the correcness of such a quiz
    should only be 100% or 0%, but this issue maybe raises a larger problem, that might suggest reconsidering the whole scoring theory.
    Anyway, such questions are essentially different from the current one, there terminology DOES NOT play any role at all,
    the code in the question is clear and the correct answer is NOT arguable at all, which is a completely different case from the problematic choice in the current quiz.

    Best Regards,
    Iudith

    ReplyDelete
  7. Had string appeared in all uppercase, then I would agree that "You cannot execute a string of more than 32,767 bytes as a dynamic SQL statement" since STRING is a PL/SQL defined subtype of VARCHAR2 (see Oracle Database PL/SQL Language Reference 11g Release 2, Chapter 3, "PL/SQL Data Types"); however, the use of lowercase letters indicates that string refers to the general concept of sequences of characters instead of a specific type. Within Oracle's documentation, the term string is used to reference any of the data types CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB (e.g., REPLACE in Oracle Database SQL Language Reference 11g Release 2, Chapter 5, "Functions").

    One of the issues appears to be an attempt by some to interpret the set of potential responses in aggregate instead of independently. The Quiz Advice clearly states that each candidate response should be considered on its merits without regard to any of the other choices.

    ReplyDelete
  8. I don't see why "string" could reasonably be taken to mean anything more or less than a "string of characters", and anyone working with Oracle should be familiar enough with the range of datatypes available to know that VARCHAR2 is not the only string datatype.

    A "string" is not the name of any data type in Oracle (with good reason), so to suggest that it has any limit at all is incorrect.

    The fact is, strings are not limited to 32767 characters - only VARCHARs (when used in PL/SQL) have this limit, and it is inaccurate to conflate the terms "string" and "VARCHAR2".

    The question was worded unambiguously, in my opinion - in fact, it gave obvious clues that it might not necessarily be talking about VARCHARs (it even mentions the CLOB data type!).

    On another note, I sort-of agree with ludith's comments regarding the "all or nothing" type quizes - where it is obvious that the answers are mutually exclusive. Thankfully these are rare - most of the quizes have answers that are not so obviously exclusive.

    ReplyDelete
  9. You can execute a string which is longer than 32767 Bytes with EXECUTE IMMEDIATE. It's a long time ago I have tested it (I think it was in 10g) but you can pass the concatenation of more then one string to it where each string can habe a maximum length of 32767, like:
    EXECUTE IMMEDIATE string1 || string2;

    ReplyDelete
  10. Regarding quizzes with only one correct answer: I have been thinking about this, as well. I have been reluctant in the past to do this, as I felt that I was giving away too much information by letting people know there is just one right answer. But I would say that this concern is outweighed by the downside of a player getting so many points for checking "No answers are correct." I will look into implementing this for 1.9.

    Thanks, SF

    ReplyDelete
  11. Quizzes with one right answer: Do you think that a player should be told upfront that only one answer can be right or when they press Submit should we tell them that only one choice can be selected? Should we still offer "No choices are correct" as one of the options?

    ReplyDelete
  12. Hello Steven,
    No, I don't think by any means that a player should be told that there is only one correct answer and, also, the "No correct choices" should remain as an option.

    Even for quizes having one single correct choice, they can have different "flavors".
    What I mean is that, for example, a quiz asking "What will be the output of the following" has a different flavor than another quiz, that just happens to have one single correct choice simply because other choices
    maybe refer to a feature not yet implemented in a specific version, but which theoretically might be correct in a "richer" or future implementation.
    What I feel is that some method of scoring would be welcome that distinguishes between the two types.
    One possible idea could be to score differently the wrong choices that a player DID NOT choose, depending on whether he did or did not choose correctly the right choice(s).

    I still think that, maybe even prior to this aspect, something should be attempted to lower the wording ambiguities in the "text type" quizes, even by formulating a longer and more explicit text, simply because they give place to interpretations that might easily punish somebody for having chosen one possible interpretation rather than the other, and thus,
    missing the real purpose of the quiz, which is checking the PL/SQL knowledge.
    Each time when such a quiz came up, I am sure that the fairness of the scoring finally had to suffer.
    I think that everybody will agree that the type of frustration is completely different when one fails "to guess" the right (or rather say intended) interpretation for an ambiguous choice when he/she does posess the complete knowledge, versus the case when somebody really does not know the right answer for a question.

    Maybe a nice idea for an upcoming version would be to offer the players a possibility of "ranking" the quizes (say, on a scale of 1 to 5), which will give a good image of what type of quizes are preferred.
    And, of course, to publish a ranking list of the best quizes in the quarter ... just in parallel with the players ranking :).

    Though the quarter is not yet over,
    I have already crowned my unofficial winner at this category !

    Towards the end of this year,
    Thanks Steven a lot for this really challenging daily adventure from which we learn a lot and which makes our professional lives more beautiful and exciting !

    To you and to all the players,
    Happy Holidays and Best Wishes for the coming year, with lots of PL/SQL fun !

    Iudith Mentzel

    ReplyDelete
  13. Iudith,

    As of 1.8 (current version of website), you can in fact provide an overall rating of 1-5 for a quiz (both current and past).

    We will publish these results as of 1.9.

    Cheers, SF

    ReplyDelete
  14. Hello Steven,
    Thanks a lot, I just remarked it now on the
    Quiz Survey page, probably because the rating stars were dimmed ...

    Enjoy a nice weekend !
    Iudith

    ReplyDelete
  15. Why is it a problem for a person who chose "none of the above" getting %75 percent on a question of four choices where only one of them is correct? They did choose the correct answer for 3 of four choices. Unless the answers are clearly mutually exclusive of each other and have no unrelated implications, I don't understand why a person choosing none of the above deserves less than full credit for the "unselectables" that were left properly unselected.

    ReplyDelete
  16. I do think the overall quiz experience and scoring will be improved if I score differently those quizzes for which there is clearly only one correct answer (mutually exclusive results).

    But I would definitely notify the player that the quiz has just one right answer (including "none of the above") because the scoring will be different.

    I expect to put this in place with 1.9.

    SF

    ReplyDelete
  17. Hello Steven,
    I don't think that the player should be notified that there is only one right answer,
    because for questions where this is not obvious, it will give out too much information.

    Maybe the different scoring should be applied for those quizes that, by the question itself,
    regardless of the content, can only have one single answer, like
    "What will be the output when executing the following block ?".
    Or, rather the other idea, that of ALWAYS (for any type of question), instead of scoring the wrong choices NOT chosen as "100%" correct answers, to somehow give them a "weight" based on whether (or in which percent) the correct choices WERE chosen.
    This will tend to minimize the "bonus" for random choices that happen to be correct.
    Of course, a precise algorithm needs to be worked out for this.
    Maybe, for a period (say, a quarter), to keep a parallel (and hidden) scoring based on such a modified method, in addition to the regular scoring, and at the end compare the results of the 2 methods and analyze in detail
    the cases that show big differences, which may give a picture of how such a scoring may work.
    Or, maybe to do the same for a past quarter.

    Thanks & Best Regards,
    Iudith

    ReplyDelete