17 July 2011

Column headers "fair game" for choices? (4180)

The SQL quiz on PIVOT for the week of 9 July asked which of the choices produce this desired output:
DEPARTMENT_ID       LESS       MORE
------------- ---------- ----------
          100          2          0
          200          2          1
And we scored a choice as incorrect simply because the column headers would not match what is shown above. Specifically, the headers would be:
DEPARTMENT_ID     'LESS'     'MORE'
A player raised a concern about this as follows:

I just answered today's SQL quiz a few seconds ago, and it looks to me, as far as I can remember, that something is missing from the general assumptions for the SQL quizzes, namely, a clear statement regarding the column headers. It would be very sad if those that perfectly master the PIVOT clause (including the alias usage for the column headers in this context), and have answered correctly today's quiz would be finally penalized because of eventually not accepting the quoted string as a correct column header, while the query result is the correct one, including the ordering. While from the previous quizzes we already learnt "silently" that the ORDER BY does matter, that is, a correct solution is supposed to produce the same ordering as presented in the sample shown in the quiz (ordering being always an intrinsic part of the SQL), the column headers issue is a little bit more "volatile", as, what we see it is not always strictly a part of the SQL statement itself ( for example, columns headings can be set by the client tool used, like the COLUMN statement in SQL*PLUS, a.s.o. ). In my opinion, the correctness of a query result set should not be made dependent on the column headers, though others might think otherwise. Regardless of the very specific case of the PIVOT clause, and maybe exactly because of it, and since our competition is still at the beginning, maybe a clarification on this issue in the assumptions would be most welcome and would remove ambiguities.

The author of the quiz offers this response:

We could perhaps make the assumptions a little bit clearer – but the assumptions should never become a multipage description that no one will read J.

I can understand a wish for the assumptions to make everything 110% unambiguous – but the danger is if we keep making the assumptions more and more detailed and larger and larger, then practically every tiny objection could argue that it should be part of the assumptions, and then the assumptions would become a document as large as the SQL documentation J. The more we add to assumptions, the less a player will feel that he/she has to use “common sense”…

The reason I included the “un-aliased” answer in this PIVOT quiz (and scored it false) was to enhance awareness of proper column naming. If for example this SQL statement were to be used in some client environment where the client code references columns of the resultset by name, then you would get errors if you didn’t alias. Example: IF (Resultset("LESS") > 1) {do something} would fail if the column was not aliased. So actually the correctness IMHO does depend on the headers as well as the correct data – if you are a “back-end” developer, how you name columns in your output will matter to the front-end developer who has to use your result sets in his or her code.

I would argue against rescoring this particular quiz – it was very much on purpose I provided a choice that failed only in the headers.

My (Steven) feeling about this: generally, we should not make choice correctness dependent on issues that are irrelevant to an understanding of the technology or how it plays out in our day to day experience. In this case, however, Kim argues persuasively that it was quite relevant and something for developers to be aware of.

What do you think?

4 comments:

  1. Hello Steven, Kim, All,
    So, it looks like Steven somehow agrees with me that correctness should not depend on issues not relevant for understanding the technology.

    Regarding the Assumptions, I think that this is the "least common denominator" to which we can refer back in case of any controversial issue.

    Especially for the case of SQL much more that for the PL/SQL, I think that a short sentence
    regarding a "common agreement" that column headers ARE ALWAYS PART OF THE SOLUTION
    would not represent such a big over-addition to the assumptions .

    The other alternative would be to clearly emphasize this in the quiz itself, which can become annoying if this is really going to be always the case ...

    For following Steven's line of avoiding irrelevant issues, maybe it could have been much better not to have this choice in the quiz, along with the other one that did have the aliases in place.

    If it comes to database backend programming,
    I just happened that exactly a few days before this quiz I wrote exactly a PIVOT query and, as always, used my own aliases, rather than relying on the default column names ...

    I really don't want to argue too much, but things should be made completely clear, to avoid any ambiguities, misunderstandings, and, why not to say it ? ... also not few frustration in this specific case ...

    Thanks & Best Regards,
    Iudith

    ReplyDelete
  2. I agree with Kim’s argument that column names are important when the results are destined for use by additional code; however, there is absolutely no indication in the quiz that the result of the query is intended to be used by any additional code. The story line for this quiz is that the boss wants to know by department how many employees fall into the two categories. Whether or not the column names include quotes is of little significance when the output is solely intended for human consumption. Within the context of the story line, the results of the disputed option are probably good enough.

    Because I evaluated the choices based on the story line, I chose the un-aliased response. Despite this, I do not advocate changing the scoring nor do I think the assumptions should be amended. The quiz did not ask which queries would answer the boss’s question; it asked which would produce the desired output. The specification is sufficiently precise.

    ReplyDelete
  3. Thanks for your comments, Iudith and jhall. I appreciate the feedback as I am still learning how to formulate un-ambigous quizzes (I have not made so many quizzes as Steven, yet ;-)

    Actually I think my preference would be not to change the assumptions, so the normal case would be that correctness does not depend on things not specifically relevant to the quiz.

    In retrospect I think I would have worded the question in a way that would have made it clear that headers were significant. I could easily have put in the storyline, that I was to create a view for our intranet report developer and the view were to have such and such specs.

    Hopefully I'll keep improving over time :-)

    ReplyDelete
  4. I think the question was fine as is and no rescoring needed. The unaliased output does not look like what was requested. It looks "similar" but not the same. If "similar" was good enough, then there are probably dozens of quizzes that would need rescored.

    I thought the two answers in question were sort of a "gimme" because they were identical except for the aliasing which, to me, acted as a hint that the aliases were important. I had originally accepted the unaliased answer but when I read the aliased one it clicked that I had forgotten this feature and then corrected my response before submitting.

    ReplyDelete