26 October 2010

25 October Quiz Requires Correction (1561)

First, my apologies for the lateness of this correction. I flew from Chicago to Brussels last night for the OPP/APEXposed conference. Just catching up with all things PL/SQL Challenge now (13:30 Tuesday). So: this quiz tested your (our) knowledge of how the ASCII function handles strings with more than one character. Namely: it ignores all characters after the first and "does its thing" with just that first character. I therefore scored as correct the following choice: "The ASCII code of just the first character is returned." I even wrote some code to test it. My reviewers checked it over. We all liked it. But one player, Ryan, objected. He claimed the choice is not correct, because: "The quiz assumptions make it clear that you are NOT using ASCII, since they state that 'the database character set is an 8-bit character set.' ASCII is a 7-bit character set. It is sometimes encoded as 8-bit, but even then the character set itself is still 7-bit. You cannot guarantee you'll get the ASCII value out of the function when the database is using an 8-bit character set, as the first character of the supplied string could easily be a character which does not have an ASCII representation. If you added an assumption to the question that either that the supplied string either does not start with (almost gives away the correct answer) or does not contain any non-ASCII characters, then you could get away with the answer as written. Or you could just state that the database was using a 7-bit ASCII character set." Ah, Ryan, you are so right. And I find myself now saying: "If only I had checked the Oracle documentation first." Because it describes the ASCII function as follows: "ASCII returns the decimal representation in the database character set of the first character of char. char can be of datatype CHAR, VARCHAR2, NCHAR, or NVARCHAR2. The value returned is of datatype NUMBER. If your database character set is 7-bit ASCII, then this function returns an ASCII value. If your database character set is EBCDIC Code, then this function returns an EBCDIC value. There is no corresponding EBCDIC character function." In other words, if I had only phrased my choice as: "The decimal representation of just the first character is returned." Then everything would have been fine. That was careless of me (well, and I must admit, ignorant). I have now added another item to my checklist for reviewing quizzes: "Check the Oracle documentation. It may not be 100% correct, but it certainly is an important place to start." In addition to all that, one of my reviewers noted that our assumption states "the database character set is an 8-bit character set; the national character set is AL16UTF16." But that character set is not 8-bit, it is 16-bit. So we have a problem there as well. The 10.2 Character Set Support documentation states: "Oracle uses UTF-8 (8-bit) encoding by way of three database character sets, two for ASCII-based platforms (UTF8 and AL32UTF8) and one for EBCDIC platforms (UTFE). If you prefer to implement Unicode support incrementally, you can store Unicode data in either the UTF-16 or UTF-8 encoding form, in the national character set, for the SQL NCHAR datatypes (NCHAR, NVARCHAR2, and NCLOB)." If anyone's eyes are glazing over at all of this, I can sure sympathize! Character sets, Unicode, globalization, localization: these are well outside my comfort zone. Why can't we all just speak (and write) English? :-) I will take the following actions in light of this mistake: 1. Change the text of the choices to use "decimal representation" instead of "ASCII code". 2. Give everyone credit for a correct answer on that choice (and change your choice accordingly). 3. Rescore and recalculate rankings. 4. Change the assumptions so that the assumed character set is AL32UTF8. And as Ryan was the only player to report this issue, he wins his choice of an O'Reilly Media ebook. Thanks for the education, Ryan! Cheers, StevenSo: t

2 comments:

  1. Steven,

    You are quite correct: single- and multi-byte character sets and encodings, fonts, and cross-domain compatibility can be a difficult niche to master as far as tech skills go. In a former employment, one of my duties was being in charge of all the label-printing software, hardware, and configurations for a fairly large factory and distribution warehouse which had to print labels and documents in many different languages such as Spanish, French, Chinese, and others from various environments with various hardware based on destination and customer requirements. I seem to remember that there was a fairly steep learning curve involved.


    I decided that I am going to ask for your (et. al) ebook "Oracle PL/SQL Programming, Fifth Edition", since the Third Edition of this book already lives on my desk, and is used often enough to make the idea of returning it to the bookshelf seem silly, and I am already waiting for management to order my copy of "Oracle PL/SQL Best Practices, Second Edition", which I requested on Sept. 10 of this year (Insert same-old joke here about the speed of government procurement).


    Keep up the great work with the PL/SQL Challenge!


    -Ryan

    ReplyDelete
  2. I am honored. I hope you like the new and expanded content in areas of performance and code management. And of course 11g features.

    ReplyDelete