05 July 2011

TOO_MANY_ROWS and indeterminate behavior (3574)

The 4 July quiz tested your knowledge of the impact of querying more than one row with a SELECT INTO. The TOO_MANY_ROWS exception is raised, but the question then is: what if anything will the variable(s) in the INTO clause of the SELECT statement be set to?

As we state in the answer:

According to Oracle documentation, the state of a variable that you SELECT INTO will be undefined if the statement raises TOO_MANY_ROWS....In practice, however, it seems that the data from the first row returned by the query is assigned to the variable(s) in the INTO clause; only when the second row is fetched is the TOO_MANY_ROWS exception raised.

We have had several quizzes that delve into such "indeterminate" behavior and generally have taken the position that if the actual, day-to-day experience does not seem to be indeterminate (in this case, the variable(s) are populated with the data from the first row fetched), then it is a legitimate topic and answer for a quiz.

Several players raised objections (of course!). One person objected to making a quiz involving undocumented behavior like this an "only one choice correct" quiz. That is too harsh a penalty for getting it "wrong" (which in this case would mean following what the documentation says). I agree with this. We are definitely going to change the question type to "more than one choice might be correct" and rescore. But we will likely also correct the scoring on some of the other choices.

Before I do that, though, I invite players who submitted concerns and code examples to post them on this blog. They've made some very interesting discoveries!

23 comments:

  1. I didn't know how Oracle would handle this and obviously it could not be checked in the documentation. Therefore the only way of finding the behaviour is to test with pl/sql which according to the site is classed as cheating in the quiz rules [or am I wrong? It actually states that copying to clipboard & pasting into your pl/sql environment is cheating. Maybe it's OK to type in similar code? It's a slightly grey area.]

    ReplyDelete
  2. Hello All,

    I will just repost the same as I have posted on the site's feedback page, with a small addition:

    Regarding some ambiguity that the result of this quiz might raise:

    I checked it and found the following:

    - For Oracle10gR2 and Oracle11gR1, the correct answer is:
    A:Jones
    B:Jones

    - For Oracle7.3.4 and 8.1.7.4 the correct answer is:
    A:
    B:

    The Oracle PL/SQL Documentation for both 10gR2 and 11gR1 says the following:

    "If a SELECT INTO statement returns more than one row, PL/SQL raises the predefined
    exception TOO_MANY_ROWS and %ROWCOUNT yields 1, not the actual number of rows
    that satisfy the query."

    but in another place also says the following:

    "By default, a SELECT INTO statement must return only one row. Otherwise, PL/SQL
    raises the predefined exception TOO_MANY_ROWS and the values of the variables in the
    INTO clause are undefined."

    It is interesting to remark that this second phrase was removed from the 11gR2 documentation,
    probably in an attempt to make the documentation conform with the effectively experienced behavior.

    In my opinion, setting SQL%ROWCOUNT to 1 and the INTO variables to undefined could be misleading,
    so the behavior we see from 10g onwards looks more consistent.

    But, anyway, the bottom line should be that this exception should be always handled
    as appropriate for the application.
    =============================================
    and I would just add this:

    It looks like the 11gR2 documentation is signalling that the behavior we observe now starting with 10g is probably NOT going to change.

    The behavior is consistent also with what we see when performing a SELECT BULK COLLECT into a VARRAY variable, when the select returns more rows than the VARRAY's defined size.
    When the defined size is exceeded, an error is raised, BUT the VARRAY remains populated with the first "n" rows of the query.

    It is up to the developer to decide whether this is what he wants at all or whether this should be treated as an error.
    If Oracle were treating always this case as an error and NOT populating the INTO variables at all, then we should probably have no choice but to use an explicit cursor for cases where we know a priori that there might be several result rows, but we definitely are interested in the first one, usually by a specific ordering.

    In summary, I think that the behavior we see in the last versions is legitimate and consistent.

    If it comes to rescoring this quiz, I think that the maximum that may be legitimate to do is to also accept as correct that choice that was correct in Oracle's previous versions (the 2 nulls), but by no means to accept as correct
    any other possibility.

    Oh God, we only started this quarter and are again at this point ...

    Thanks & Best Regards,
    Iudith Mentzel

    ReplyDelete
  3. While testing answers to this quiz, I noticed that the behaviour seems to depend upon the type of the out parameters, as well as the version!

    It seems that integers don't have values assigned when too_many_rows is caught, but varchar2s do:

    SQL> create table t (x integer, y varchar2(10));

    Table created.

    SQL> insert into t values (1, 'test1');

    1 row created.

    SQL> insert into t values (2, 'test2');

    1 row created.

    SQL> commit;

    Commit complete.

    SQL>
    SQL> create or replace procedure p (oint out integer, ochar out varchar2) is
    2 begin
    3 select x, y into oint, ochar from t order by x;
    4 exception
    5 when TOO_MANY_ROWS then
    6 dbms_output.put_line('int:'|| oint || ' varchar:' || ochar);
    7 end;
    8 /

    Procedure created.

    SQL> sho err
    No errors.
    SQL> set serveroutput on
    SQL> declare
    2 i integer;
    3 v varchar2(10);
    4 begin
    5 p(i, v);
    6 dbms_output.put_line('int:'|| i || ' varchar:' || v);
    7 end;
    8 /
    int: varchar:test1
    int: varchar:test1

    PL/SQL procedure successfully completed.

    I've tried this out on 10.2.0.3 and 11.2.0.1, so I'm reasonably confident that this is a general effect and not the result of a specific patchset. I'm interested to see if other players can reproduce this.

    I'm not sure how other datatypes are affected at the moment...

    ReplyDelete
  4. Thanks everyone for your keen eyes!

    It certainly has been very educational proposing a quiz like this - we all benefit from many more eyes on Oracle, and how it behaves under different conditions.

    I agree the quiz should have been a multi-choice, and that the "A: B:" option should be rescored as correct, while not penalizing players who didn't mark it as correct.

    In that sense it wasn't a good quiz, because two of the offered answers turn out to be both "right" and "wrong" (perhaps the answers should be merged into one answer: "the result may be A:Jones B:Jones, or it may be A: B:"; or perhaps the correct answer should simply read "the result is indeterminate").

    ReplyDelete
  5. Just to add to Iudith's comment, I had the opportunity to check on a 9.2.0.8 instance and that returned nulls. It looks like the behaviour changed for 10g. Of course the quiz is aimed at 10g/11g so that doesn't necessitate a rescore in itself.

    ReplyDelete
  6. PS. The 11.2.0.2 documents still states "After a FETCH or SELECT statement raises an exception, the values of the define variables after that statement are undefined."

    http://download.oracle.com/docs/cd/E11882_01/appdev.112/e17126/static.htm#sthref532

    ReplyDelete
  7. Perhaps one of the criteria used to evaluate quiz submissions should be that a proficient PL/SQL programmer could ascertain the code’s behavior without having to execute it. Quizzes involving undocumented behavior would be unlikely to pass such a test.

    Many of us have probably encountered the "too many rows fetched" exception. The usual response would be to correct the defective code instead of exploring the side-effects associated with the unintended exception. These side-effects may reveal interesting details about Oracle's implementation decisions; however, such details are matters of trivia unrelated to the proper use of the PL/SQL programming language. As others have already reported, Oracle’s documentation states that the content of SELECT…INTO variables is undefined following an exception. Undefined does not imply inconsistent, nor is consistency sufficient to infer that a behavior is specified. Retaining the value from the first row fetched may simply be the most efficient course of action.

    ReplyDelete
  8. It's a tough call - I can't remember what decision was made for similar quizzes where the documentation didn't match functionality.

    I do actually remember working with Jeff when we encountered this issue - and if I recall it had a relatively serious bearing on the result of our software. We had to actively account for this seemingly simple behaviour, regardless of what the documentation said.

    I know you can't pick and choose which rules to follow, but this is a fairly basic flaw that's easily encountered - it's not like it's some really obscure bug that one person in a million would find.

    I guess marking both options as correct would be fair - but a final thought - "undefined" sounds a little ambiguous, it's not "null", "empty" or "cleared".

    Hmm...

    ReplyDelete
  9. Hello All,
    Chris Saxon's comment looks really interesting ...
    so I checked it in 11gR1 and I can confirm it is the same, BUT ...
    If we perform the same test using a NUMBER column instead of INTEGER,
    we see again the same behavior as for VARCHAR2:


    create table t2 (x number, y varchar2(10))
    /

    insert into t2 values (1, 'test1')
    /

    insert into t2 values (2, 'test2')
    /

    create or replace procedure p (onum out number, ochar out varchar2)
    is
    begin
    select x, y into onum, ochar from t2 order by x;
    exception
    when TOO_MANY_ROWS then
    dbms_output.put_line('number:'|| onum || ' varchar:' || ochar);
    end;
    /

    declare
    i number;
    v varchar2(10);
    begin
    p(i, v);
    dbms_output.put_line('number:'|| i || ' varchar:' || v);
    end;
    /
    number:1 varchar:test1
    number:1 varchar:test1

    PL/SQL procedure successfully completed.


    Far back in one of the previous versions of Oracle ( I think it was even Oracle V5 !!! )
    we always used such SELECTS with ORDER BY in SQL*FORMS applications, when even PL/SQL was not yet existing,
    and they always returned the first row, and no error !
    When we migrated to a later version and PL/SQL, we had to convert those constructs to explicit cursors,
    because of this changed nehavior ...

    Now I really think that it's time for Oracle to come out with a clear and unambiguous statement
    regarding this issue.
    I don't believe that the reason of not stating it clearly resides in any tendency of maybe to change this
    behavior (again ?!?) in a future version.
    I think that raising the TOO_MANY_ROWS and at the same time populating the INTO variables with the
    first row's values is fairly reasonable and can be adopted as official, clearly documented
    and unambiguous behavior.
    Of course, bugs like we saw above with the INTEGER type should be corrected.

    Regarding Gary's comment above about the 11gR2 documentation, it indeed mentions the case of
    a SELECT or FETCH raising an exception, but DOES NOT mention explicitely the TOO_MANY_ROWS,
    as the previous version's documentations DID ( I performed a search after TOO_MANY_ROWS ).


    Thanks & Best Regards,
    Iudith Mentzel

    ReplyDelete
  10. This behaviour is described in Oracle Support Document "TOO_MANY_ROWS exception initialising the variable after upgrade to 8i/9i from 7.3.4" (Doc ID 258112.1). And there is workaround to eliminate this behaviour ("Declare the variable and column with different datatypes").

    "Undefine" meant, means and will mean "you can't rely on it's value". It is a very simple rule to do not fall into a trap of changed behaviour.

    ReplyDelete
  11. By stating that the content of SELECT…INTO variables are undefined when an exception is raised by the SQL statement, Oracle has unambiguously declared that any observed behavior related to those variables should not be relied upon. Whether or not INTEGER, NUMBER, or VARCHAR2 types have their content handled in the same manner is irrelevant, any content meets Oracle’s stated specification when an exception has occurred.

    Encountering a "too many rows fetched" exception indicates a defect in the application code or in the data model (for example, a missing unique key constraint). Attempting to use what appears to be consistent behavior despite clear warnings that such behavior is not guaranteed does not correct the code. Swapping one defect for another is not a solution, even when the revised code seems to behave as desired.

    Other’s have tested and reported that the observed behavior is affected by a variable’s type. What other factors affect the behavior? We have no reasonable way of knowing. Three of the candidate answers for the July 4 quiz are definitely wrong, the other two options may be considered equally correct (or incorrect):

    A:
    B:

    and

    A: Jones
    B: Jones.

    "None of the answers are correct," is probably the most correct response since there is no guarantee that either of these will be displayed.

    As I expressed in an earlier post, quizzes should not require execution of the code to ascertain behavior. I didn’t have convenient access to a database instance when I took this quiz and thus had to guess at the answer. I selected the choice with NULLs based on the principle applied to failed transactions: none of the intermediate changes are kept (I have also observed in other cases that variables are often assigned NULL when Oracle’s documentation says the result is undefined). A rescoring, giving full credit to all who did not select any of the absolutely wrong answers, seems to be in order.

    ReplyDelete
  12. Hello All,

    I checked and can confirm for Oracle8i the behavior described in the Oracle support document indicated by Vitaliy, but this does not seem true for VARCHAR2 columns.

    The original example from the quiz used %TYPE to declare the target variable (so, same type and size) however, running it in Oracle8i left the variable NULL after the SELECT.

    Maybe it does however explain at least partially the strange behavior seen above with the INTEGER variable, even for Oracle11g, because, as we know, the database INTEGER type is "not the same" as the PL/SQL INTEGER type.

    Of course we don't discuss here best practices, on that issue I think that we all have completely agreed, however, I think that the opinion of somebody from Oracle on this issue would be most welcome, at least to satisfy our curiosity and hear their pros and cons about one behavior or the other.

    Best Regards,
    Iudith Mentzel

    ReplyDelete
  13. Hello All,
    Just one more addition regarding the above case for Oracle8i:

    For VARCHAR2, the behavior is different when selecting into a local variable or directly into the OUT argument of a procedure.

    For a local variable, it is always populated with the first row, regardless of whether it has the same or different size as the column,
    while for an OUT variable, it remains NULL:

    -- same type/size
    declare
    dummy varchar2(10); -- same type as emp.ename ..VARCHAR2(10)
    begin
    dbms_output.put_line('Body: dummy='||dummy);
    select ename into dummy from emp;

    exception when too_many_rows then
    dbms_output.put_line('Exception: dummy='||dummy);
    dbms_output.put_line(sqlerrm);
    end;
    /
    Body: dummy=
    Exception: dummy=SMITH
    ORA-01422: exact fetch returns more than requested number of rows

    PL/SQL procedure successfully completed.


    -- different type(size)
    declare
    dummy varchar2(20); -- different type than emp.ename ..VARCHAR2(10)
    begin
    dbms_output.put_line('Body: dummy='||dummy);
    select ename into dummy from emp;

    exception when too_many_rows then
    dbms_output.put_line('Exception: dummy='||dummy);
    dbms_output.put_line(sqlerrm);
    end;
    /

    Body: dummy=
    Exception: dummy=SMITH
    ORA-01422: exact fetch returns more than requested number of rows

    PL/SQL procedure successfully completed.

    -- anchored type
    declare
    dummy emp.ename%type; -- same type as emp.ename ..VARCHAR2(10)
    begin
    dbms_output.put_line('Body: dummy='||dummy);
    select ename into dummy from emp;

    exception when too_many_rows then
    dbms_output.put_line('Exception: dummy='||dummy);
    dbms_output.put_line(sqlerrm);
    end;
    /

    Body: dummy=
    Exception: dummy=SMITH
    ORA-01422: exact fetch returns more than requested number of rows

    PL/SQL procedure successfully completed.


    -- OUT variable, instead of local variable
    declare
    v_result emp.ename%type;

    procedure p ( dummy OUT emp.ename%type)
    is
    begin
    dbms_output.put_line('Body: dummy='||dummy);
    select ename into dummy from emp;

    exception when too_many_rows then
    dbms_output.put_line('Exception: dummy='||dummy);
    dbms_output.put_line(sqlerrm);
    end;
    begin
    p (v_result);
    dbms_output.put_line('Result: v_result='||v_result);
    end;
    /
    Body: dummy=
    Exception: dummy=
    ORA-01422: exact fetch returns more than requested number of rows
    Result: v_result=

    PL/SQL procedure successfully completed.

    As we dig deeper, things become stranger and stranger ... so I think we can all agree that Oracle10g and higher are at least consistent on this aspect...

    Best Regards,
    Iudith Mentzel

    ReplyDelete
  14. That "Declare the variable and column with different datatypes" factoid sounds interesting. Logically when the data is fetched it needs to be put somewhere. If the column is character and the PL/SQL variable is numeric, then the PL/SQL engine has to put it 'somewhere' where it can check whether the value can be converted successfully. But if the column and variable correspond it can bypass that check and write the value direct to that bit of memory. Correspondence will need to check data length too (maybe why OUT parameters are odd, since they don't have a length).

    In 10.2.0.4, the following script shows the different contents of the result variables depending on the whether the destination variable is guaranteed to be big enough to take the column value.

    set serveroutput on
    declare
    cursor c_1 is
    select privilege from session_privs;
    v_rec c_1%rowtype;
    v_t session_privs.privilege%type;
    v_c10 varchar2(10);
    v_c40 varchar2(40);
    v_c90 varchar2(90);
    begin
    begin
    select privilege into v_rec.privilege from session_privs;
    exception
    when too_many_rows then
    dbms_output.put_line('1:'||v_rec.privilege);
    end;
    --
    begin
    select privilege into v_t from session_privs;
    exception
    when too_many_rows then
    dbms_output.put_line('2:'||v_t);
    end;
    --
    begin
    select privilege into v_c10 from session_privs;
    exception
    when too_many_rows then
    dbms_output.put_line('3:'||v_c10);
    end;
    --
    begin
    select privilege into v_c40 from session_privs;
    exception
    when too_many_rows then
    dbms_output.put_line('4:'||v_c40);
    end;
    --
    begin
    select privilege into v_c90 from session_privs;
    exception
    when too_many_rows then
    dbms_output.put_line('5:'||v_c90);
    end;
    --
    end;
    /

    Is there anyone NOT learning something from this quiz ?

    ReplyDelete
  15. I've updated the explanation on the quiz to the following (feedback is welcome):

    "According to the documentation http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14261/selectinto_statement.htm#sthref3054 the state of a variable that you SELECT INTO will be undefined if the statement raises TOO_MANY_ROWS.

    "In practice, however, it seems that - sometimes - the data from the first row returned by the query is assigned to the variable(s) in the INTO clause; only when the second row is fetched is the TOO_MANY_ROWS exception raised.

    "We can tell because in our exception handler we output the value of the variable. Since the exception is handled, the procedure completes without error and the OUT parameter is assigned as well.

    "See http://plsql-challenge.blogspot.com/2011/07/toomanyrows-and-indeterminate-behavior.html for descriptions of different behaviour in different versions of the database, and under differing conditions (e.g. with different data types used). It's important to note that there is *no right way* to use the variables in a TOO_MANY_ROWS exception - if you need the values from the first row returned from a query, use a suitable predicate (e.g. make sure your predicate is on a uniquely constrained set of columns) - don't use the TOO_MANY_ROWS exception to get them."

    ReplyDelete
  16. Gary, the only thing that should be learnt from the quiz is to do not rely on undefined values.

    ReplyDelete
  17. Undefined does not imply inconsistent, - it is not so.
    Undefined means it may be anything, we provide no guarantee what it may be, we provide no guarantee of a repeatability.

    From my point of view the quiz should be not simply re-scored but taken off-record altogether.

    ReplyDelete
  18. Excellent discussion; as Gary says, "Is there anyone NOT learning something from this quiz?"

    Here is what I will do with this quiz:

    1. Change the "B:" choice to "Results are undefined" and give everyone credit for that. 98% of players got this right anyway; it was not a very interesting alternative.

    2. Change explanation on A:B: to note that in earlier versions of Oracle these values were, as a rule, returned. But that behavior changed in 10g to A:Jones B:Jones.

    3. We will continue to score the first choice (A:Jones B:Jones) as correct, which is consist with the approach we have taken at the PL/SQL Challenge to reflect not only theoretically correct behavior but people will experience in their day to day work. But we will also make clear that it cannot be relied upon.

    Changes in score and ranking will be seen tomorrow.

    ReplyDelete
  19. Al0, rather than refuting my statement that "undefined does not imply inconsistent," your argument affirms it. As you observe, there is no guarantee of repeatability. Conversely, there is no guarantee of variability; thus, no implication of inconsistency.

    Given the nature of programming, consistency is more likely than inconsistency. The absence of requirements means that the developers were not constrained as to the handling of returned values in the case of an exception; however, at some point design decisions were made and implemented. Unless deficiencies are found or specifications changed, it is unlikely that the code will be modified in future releases.

    ReplyDelete
  20. Hello Steven, All,
    I see that the above changes were already applied to the quiz choices and answers, but the choice
    "B:" that was modified as per your point 1) above
    is NOT scored now as correct for everybody.

    I think that that choice was not controversial at all, and should not have been modified in any direction.
    That choice has only one row of output, and regardless of the content of that row, it can never be correct, the output could only contain 2 rows.

    I think that the only change that may be done
    is the one suggested in a post above by jhall62,
    namely,
    "A rescoring, giving full credit to all who did not select any of the absolutely wrong answers,
    seems to be in order.".
    ( the absolutely wrong answers being those numbered 5905,5906,5907 ).

    Thanks & Best Regards,
    Iudith Mentzel


    After all this discussion, only the 2 choices
    "A: B:" and "A:Jones B:Jones" were ever correct at any point in time along Oracle's history ...

    As a general principle, in case of ANY controversy, I think that the text itself of any choice should NEVER be modified after that quiz was taken.
    All that can be modified is the explanation and/or scoring of a choice.
    As such, the "B:" choice should remain as is
    and be considered incorrect, as most of the players in fact have chosen, without any modification.

    Thanks & Best Regards,
    Iudith Mentzel

    ReplyDelete
  21. Iudith, I appreciate your suggestion about general principle "the text itself of any choice should NEVER be modified after that quiz was taken." and generally I try to stick with that. However, the initial taking of the quiz is not the end of this quiz. You will soon be able to take the quizzes again as practice quizzes, managers can put them into their own
    tests, etc.

    So the priority for me is to make sure that each question is valid for FUTURE usage. Sometimes very minor changes are needed, but sometimes (and this is one of them) more edits are needed.

    I also do not plan to give credit to the "A:B:" choice. That is also "always wrong" based on the assumptions of the quiz (10 and higher).

    ReplyDelete
  22. Hello Steven, All,

    Ah, yes, ... the big picture that we usually forget in our permanent run after the "absolute truth" ...

    I think that Jeff was correct by putting a link to this blog thread into the Overall Answer,
    what this thread contains is, indeed, the best tutorial possible about this issue :) :) :).

    As Gary said, EVERYBODY has learned from it, I strongly hope that Oracle authors also will be notified
    of it and take a cutting decision ... may that be our 2 penny to the Product documentation improvement.

    Also, I think that it could be a very good practice to automatically add a link to the blog thread
    on the Quiz Summary page each time when such an interesting thread develops, like this one.
    and, of course, keep this material online for the "future generations of learners and players :) :) :)

    Thanks a lot & Best Regards,
    Iudith

    ReplyDelete
  23. Iudith, we already do what you suggest (unless I forget to post a news item). You should see a "Quiz Commentary" section when there is a blog discussion for that quiz.

    ReplyDelete