13 January 2011

Constant Instead of Literal a Best Practice? (1862)

In the 11 January quiz, we asked:

Which of the following statements correctly describe a way to improve the performance, readability or maintainability of this block of code?
BEGIN
   FOR month_index IN 1 .. 12
   LOOP
      UPDATE monthly_sales
         SET pct_of_sales = 100
       WHERE company_id = 10006 
         AND month_number = month_index;
   END LOOP;
END;
We scored as correct the following choice:

"Replace all hard-coded literal values with named constants or function calls."

Several players did not agree. I will offer one such comment and then open it up for discussion:

While it's generally a good idea to replace magic numbers with sensibly named constants, in the case of month names I feel the month number itself is a very well readable and commonly used name (as in "Quiz for 2011-01-11 Tuesday"). And did you know that even in German we have two names for the first month, namely "Januar" and "JÃnner"? But why is this "AND month_number BETWEEN 1 AND 12" clause there anyway? Unless someone invents more months (e. g. "Tricember") all months in any real world table may be assumed to range between 1 and 12 unless they are null. So I would probably change the statement like UPDATE monthly_sales SET pct_of_sales = percent_in WHERE company_id = company_id_in Yet another point is binding: If you introduce constants, they will be bound where they are used in SQL statements. The literals won't. Binding constants is probably not as good an idea as binding variables. Well these are just my 2 cents.

Ah - one other thing: another person objected to scoring this choice as correct because he feels it would WORSEN performance (I will leave it to the player to post his comments here). Even if that were true, the question uses the word "or" not "and" - so as long as replacement of literals with constants satisfies improved readability or maintainability, it does NOT have to improve performance.

So, dear players, what do you think?

17 comments:

  1. Hello Steven,

    I totally agree with the first commenter: this should be a single SQL statement. Suggesting code to send 12 update statements instead of one is not helping performance, readability and maintainability.

    When thinking about replacing the literals with constants you should be aware of the performance implications. It's a trade off: sharing execution plans and giving less information about to the optimizer versus not sharing and full information to the optimizer.

    Regards,
    Rob.

    ReplyDelete
  2. > Even if that were true, the question uses the word "or" not "and"

    You are right. Thanks for do not letting me to forget that questions and choices (another “or”) can be noisy, very noisy. In next time I’ll count all conjunction, prepositions and adverbs. The PL/SQL is really boring enough. Keep it in same way. Linguistic game is interesting too and can collect more players.

    ReplyDelete
  3. > he feels it would WORSEN performance (I will leave it to the player to post his comments here)

    Is “would WORSEN” exactly what I say about the whole choice? Let third party decide:
    [quote]
    1) Replacing literals with function call don’t improve but worsen performance (via context switches).
    2) Replacing literals with named constants in SQL from PL/SQL means replacing with bind variable. I'm not sure when it can improve performance. But sometimes it can lead to worse performance because of CBO fails to choose optimal plan for that value.
    Good developer can intentionally replace named constant with literal to drastically improve performance (it’s not a rule but it takes place).
    So this choice cannot be marked as correct.
    P.S. I usually use named constants almost everywhere.
    [/quote]

    ReplyDelete
  4. "Using PL/SQL Control Structures: Controlling Loop Execution (LOOP, EXIT and CONTINUE)" is a proper name for topic of that quiz :)

    ReplyDelete
  5. Hello Steven,

    you've already got my point on it :). I can just repeat the mantra: 'If you can do it in simple SQL, do it'. All other option are unacceptable

    Best regards,
    Serge

    ReplyDelete
  6. Can we really measure readability?

    Anyway, to me, "For Month_index in 1..12" is more readable than "For Month_index in c_january .. c_december" because:

    1. Less typing, Less to read, easy to understand
    2. I don't have to page-up to read the declaration and page-down to read the code (Assuming declare -- end; does not fit in a page, which is more likely to happen with more typing)
    3. "For Month_index" implies that "1..12" are "Month Indexes", so I do not get your point that ""1" and "12" can mean many things, not necessarily the numbers of months"
    4. For more 'readability' if a Japanese coder writes "for month_index in c_ichigatsu .. c_junigatsu", then non-Japanese reviewer will surely have problem to understand. "1..12" avoids that issue.

    ReplyDelete
  7. Hi Steven,

    I get the feeling that some people are too much looking into what the code itself is about and not concentrating on the scope of the question itself. Whether these values are representing months or something else does not matter to me, just that, it is a more readable code when using named constants.
    Of course most of us would change the code so that the loop is gone completely, that is given, but in context of the question this particular choice of using named constants or fuctions is correct.
    Who knows what is kept in the month_number column. Someone might be using this table also for whole year or quarters and therefore it might contain other numbers than 1 to 12. Therfore the best solution is to have "between 1 and 12" in the where clause.

    Regards,
    Ingimundur K. Guðmundsson

    ReplyDelete
  8. It would be a good idea to use l_start_month and l_stop_month variable instead of c_january and c_december in some cases. So it's not a clear explanation for this choice, but the choice is still true in sense of maintainability.

    ReplyDelete
  9. We have a financial year of April to March, so using constants could lead to code like "AND month BETWEEN c_April AND c_March"? So using constants in this environment could lead to code that is not clear at all.
    But I agree with Igimundur that we are too much looking into code. But statements like using constants is always good practice are not true in all cases. And this example is a bad case.
    Using constants here is not KISS (Keep It Simple & Stupid)

    ReplyDelete
  10. you can "overdesign" a database design, you can "overdocument" a program ... maybe you can also exaggerate when replacing hard-coded literal values with named constants

    ReplyDelete
  11. Assume the month numbers go beyond 12. I'd argue that filtering on month number implies that some rows are being excluded. Maybe months 1-12 are last year's figures and 13 onwards are the current year.

    Then it would improve readability to replace them with constants, such as c_prev_year_start_month and c_prev_year_end_month. If the explanation used those constants, it would be more obvious that the numbers 1 and 12 can be mis-interpreted.

    I'd be very hesitant to use a constant like c_december though. Does it have the value 12, or 'DECEMBER' or 'December' or 'Dec' ? If a constant name is the constant value with the prefix "c_" (eg c_20111231 or c_yes) then it has gone too far. If it also has the format (eg c_december_upcase) then it has definitely gone too far.

    I wouldn't recommend replacing all literals with constants, but I would recommend replacing any literal with a constant where it improves maintainability or readability without harming performance.

    PS. Don't assume the first month is 1. C programmers have a habit of starting to count from 0 and so should never be asked to split the bill at a restaurant.

    ReplyDelete
  12. Yes, in most cases it is a good idea to use named constants - but not in all cases. In the quiz code their usage would hinder both readability and (likely) performance. So corresponding choice shall be deemed incorrect.

    ReplyDelete
  13. In my opinion this is bad design of the relation tables. Why would you choose to design a table like this above? Such a design results in such discussions about readability ;)

    Use proper datatypes (DATE for date values and not NUMBER) and things are clearer.

    ReplyDelete
  14. Hi,

    I think it was me with worse performance statement :) You are right you asked "or" and not "and", so from that point of view the answer is correct. I would have choosen it if you would not have added the function calls to the answer. So I had to choose between readability / maintainability and performance (context switches).

    ReplyDelete
  15. Hi,

    "You are right you asked "or"" - in my understanding it should be a limited "or" - improve at least one without the scarifying another.

    Regards,
    Aleksandr

    ReplyDelete
  16. It's a storm in a glass. There is a B O S S .

    ReplyDelete
  17. To conclude: many excellent points were raised that do not argue against the improvement of using constants in place of literals, but instead offer better names for constants or guidance for how and when they should be used.

    I am going to change the question text a little to use "better" names for the constants. Otherwise, no change.

    ReplyDelete