02 November 2011

"Average" function not correct due to lack of rounding? (9397)

The 1 November quiz tested your knowledge of the ability to select from a nested table using the TABLE operator and, in this case, apply the AVG function to the contents of the collection. The question asked in the quiz was:

Which of the choices implements a function named plch_avg that calculates the average of all elements in a collection of the above type so that the following block displays "3.5" after execution

We compared the TABLE operator approach to algorithms one might write themselves to compute the average. One of these, marked as correct, is:
CREATE OR REPLACE FUNCTION plch_avg (numbers_in IN plch_numbers_t)
   RETURN NUMBER
IS
   l_index   PLS_INTEGER := numbers_in.FIRST;
   l_average     NUMBER := 0;
BEGIN
   WHILE l_index IS NOT NULL
   LOOP
      l_average := l_average + numbers_in (l_index)/numbers_in.count;
      l_index := numbers_in.NEXT (l_index);
   END LOOP;

   RETURN l_average;
END;
Several players objected and believe that this choice should be marked as incorrect, since it will not always produce the expected result. For example, if I pass a nested table of 9 rows (values 1 through 9), this function returns:
5.00000000000000000000000000000000000001
This is certainly true, and if we asked you to identify algorithms that would work under all circumstances, we would be wrong to mark this as correct. We would also be wrong to mark choice 8476 as correct, since an empty collection would cause the function to fail with a divide-by-zero exception.

But we asked you to identify choices so that the following block displays "3.5". It certainly does that, and so we believe that this choice should be scored as correct.

Your thoughts?

6 comments:

  1. Hello Steven, All,

    While strictly mathematically this choice is correct, math is still superior to any computer :) :) and this is the (maybe only) positive message !

    Regarding the choice itself, its mere inclusion in the quiz somehow "did injustice" to those who
    marked as correct the TABLE function choice and as incorrect this one, and may it be for whatever reasons that may extend from "bad practices" up to "performance" (that is,
    to avoid several useless division operations)
    or even the rounding problem described here, versus those who marked these two choices
    the opposite way ...

    They scored the same, but the first ones were more correct versus the main topic of the quiz
    and the "Quiz Lesson" and all the explanations given, that clearly favor the usage of the TABLE function.

    So, maybe we can think of a possibility to create "weighted choices", that is, to divide the total achievable points for a quiz not necessarily evenly among the different choices, but giving a higher weight to the recommendable (or best practice) choices.

    Just an idea that seems to "offer itself" as a better solution for several quizzes up to now,
    maybe to be considered in the future.

    Thanks & Best Regards,
    Iudith Mentzel

    ReplyDelete
  2. As a reviewer I was a bit cautious about that choice, but then realized I was thinking too far... Questions and choices are (usually) provided in a very narrow context and we should leave it at that. The idea is to present features, and sometimes quirks, of the PL/SQL language. The idea is not to provide full-blown, tested and certified production solutions.

    It is Math-101 that you should not compute an average as presented here, but that is besides the point. With the simple integers presented, who would anticipate a deviation from a straight 3.5? I'll admit that I was a little surprised to see a deviation with the range from 1 thru 9, and probably even more so when seeing the result from shifting the numbers a little: plch_numbers_t (9,2,3,4,5,6,7,8,1); returns just 5.

    Should we have added a ROUND to kill any potential too many decimals? Could have, but... honestly... Was there a reason to believe that such simple numbers could create an issue? Don't forget, the PL/SQL Challenge is not out to trick players, rather the idea is to increase awareness of PL/SQL features.

    I think the biggest problem here is that we all defaulted to that basic training; Don't compute averages this way! Is it always bad? No, just don't do it! Do we know when it is a problem? Well, most times we would need to try it out.

    I think the choice is scored correctly. It does produce the desired result and there is really no reason, when looking at it, to believe that it would not.

    Mike

    ReplyDelete
  3. I agree with the scoring of this quiz, which is consistent with the scoring of similar quizzes in the past (ones where the algorithm would not work for all situations, but does work for the question given).

    ReplyDelete
  4. I did not type it in and test it, but I had that choice marked incorrect. My reasoning was the following:

    "There is already one choice showing the bad way of doing your own average calculation. This choice is just an even worse implementation of that choice. Why would the author include that choice if not for a reason? Ah, rounding problems! Yes, I bet those data will cause a rounding error. I'll mark it as incorrect."

    Yes, I know one of the premises of the quiz is that each choice is completely independent. But it is difficult not to try to second-guess the author ;-)

    I believe the scoring of the quiz is correct - I must just teach myself not to think in "over-complicated" ways ;-)

    I do not believe we need to complicate quiz scoring by weighting algorithms. My experience is, that the more you try to create an algorithm that has to be meticulously fair in all possible and impossible situations - the more impossible it gets to understand the scoring algorithm and that opens up for constant arguing. (Witness the income tax system in many countries :-)

    (I could even argue that the present scoring system and wildcards and forgiveness for un-played days is too complex, but we have learned its quirks by now. Just don't change scoring algorithm every now and again over each tiny issue that pops up - consistency is the best :-)

    ReplyDelete
  5. Strictly spoken is is not about rounding problems, but about Numerical Methods. Dividing is a calculation which introduces numerical problems, the same as subtraction. Small errors or deviations in presentation can have a big influence on dividing and subtraction. Adding and multiplication introduces less errors.
    So in practice, try to avoid dividing and subtraction as much as possible, because the are ill behaved and easily introduces very big errors. Try the following 1-(3/3). Could go anywhere depending on computer language and implimention.

    ReplyDelete
  6. Hi, S. I was one who did not mark a answer where was a lot of computing with the floating point - and yes, my reason was a "there is a few operation with real numbers and there "must" be a some trouble with periodis number in binary format". If was or was not right i do not know, but I believe that your marking of what is right as best as you can ;). I just say, yes, there was probably people who thinked about rounding and I was one of them too ;)

    ReplyDelete