18 April 2011

Infinite files? REALLY? (2205)

The 15 April quiz regarding infinite loops scored the following choice as incorrect:
PROCEDURE read_file (file_in IN UTL_FILE.file_type)
IS
   l_line   VARCHAR2 (32767);
BEGIN
   LOOP
      UTL_FILE.get_line (file_in, l_line);
   END LOOP;
END;
explaining: "This simple loop contains no EXIT, RAISE or RETURN statements, so at first glance one might think it will never terminate. This loop will not, however, go on and on and on. The reason is that there is no such thing as an infinitely large file (even if there is no limit to a file size, each and every file that is opened will have a finite number of lines in it). And when UTL_FILE.GET_LINE reads past the end of a file, it raises the NO_DATA_FOUND exception."

A player wrote with the following objection:

"The answer with UTL_FILE.get_line is not completely correct. At least the answer is stating "The reason is that there is no such thing as an infinitely large file ..." which is not correct. When I was answering it, pseudo files like /dev/urandom came to my mind, which make the loop really indefinite. If you are running on Win, you can check following wiki site: http://en.wikipedia.org/wiki//dev/random."

Big news to me. Has anyone had experience with this? Would anyone like to test out this hypothesis?

10 comments:

  1. Barring unexpected problems such as memory leaks, bugs in the underlying OS, or hardware failure, I'd expect special files like dev/random will continue to provide lines for as long as the Oracle session cares to read from it.

    Obviously it is impossible to prove that any given process will run forever - after all, the universe will end well before an infinite time has passed - so the only reasonable thing to say is that we "expect" it to run forever, as long as nothing external to the code causes it to stop (such as the operator killing the session or pulling the plug on the server).

    It is more reasonable to say that under "normal" conditions (i.e. when reading an ordinary file), this code will terminate eventually.

    ReplyDelete
  2. The player’s objection to the explanation is valid: it is possible to have a file that is unbounded in size. Jeff Kemp mentioned one case, the use of special files. Another case would be the use of pipes to connect an output of one process to an input of another.

    The potential for a file to be unbounded in size does not; however, change the correctness of the choice given the provided definition of infinite loop. There is a condition for terminating the loop (the end of data in the file), this condition can be met (although when it will is indeterminate), and the loop is not being restarted.

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. If you were reviewing code from a security point of view, then I think you would say that an attacker COULD abuse the code by providing a pointer to a file that wouldn't end.
    I'd be interested in what UTL_FILE.FGETATTR returned as the file length for such a file.

    ReplyDelete
  5. > I'd be interested in what UTL_FILE.FGETATTR returned as the file length for such a file.

    I tried. fGetAttr returned "Tricky. I'll have to think about it." I'm still waiting for the final answer :)

    ReplyDelete
  6. Hi guys,

    I have tested it and seems to be working as expected.

    My test code:
    DECLARE
    l_file UTL_FILE.file_type;
    l_buf VARCHAR2 (32767);
    l_counter PLS_INTEGER := 0;
    BEGIN
    l_file := UTL_FILE.fopen ('MYDIR', 'urandom', 'r', 32767);
    LOOP
    UTL_FILE.get_line (l_file, l_buf);
    l_counter := l_counter + 1;
    EXIT WHEN l_counter >= 1000000;
    END LOOP;
    UTL_FILE.fclose (l_file);
    DBMS_OUTPUT.put_line (TO_CHAR (l_counter));
    EXCEPTION
    WHEN OTHERS
    THEN
    IF UTL_FILE.is_open (l_file)
    THEN
    UTL_FILE.fclose (l_file);
    END IF;
    RAISE;
    END;

    Finished successfully after 1M loops.
    The exit clause is used just because I don't want to have it infinite. ;)

    Michal Cvan

    ReplyDelete
  7. I must admit to feeling a little bit torn on this topic, regarding whether or not to change the score for the point raised by Michal.

    I am supposed to avoid "trick" questions....and this, I would say, is a "trick" answer. This "device" pointed out to us by Michal is not a file, but then I do not specify in my choice that the "thing" pointed to by the "valid file handle" is actually a file!

    In addition, I am impressed to see that Michal actually knew at the time he took the quiz that you could define a device like this - or at least he answered that this choice could cause an infinite loop.

    So a correction must be made! I will give everyone credit for a correct answer on this choice, change the text to make clear that we are talking about FILES and not arbitrary kinds of devices that you can attach to a file handle, and add some explanation on this point to the answer for that choice.

    Thanks, Michal, for bringing this to our attention. You will receive a free O'Reilly ebook as your prize.

    By the way, this correction comes after about 45 days of mistake-free playing on the PL/SQL Challenge.

    I am very proud of and thankful to my reviewers, and the players who have submitted so many excellent quizzes, for the very high quality of our quizzes through this period.

    Cheers, Steven

    ReplyDelete
  8. Feature request for the next version: a "Days since last correction" count, much like the "Days since last accident" signs at construction sites :)

    /tongue-in-cheek

    ReplyDelete
  9. About the "trick" answer from Michal. I think it is wrong to consider it a trick answer. Such things can happen in real life and can lead to unexpected problems. I once had an application which creates a file with the codename of the customer and only [A-Z][0-9] allowed. And then there was this customer with the name 'COM4'. Which is a device name (it still is). So if you call something a file, it suddenly could be something else. Important lesson here.

    I'm glad that Michal got a price for making this statement.

    ReplyDelete
  10. Jeff: "Days since last correction" - I have thought of doing that myself. I think it would be "cute" - but I hate to draw attention to mistakes. :-) I have added it to the ER list.

    Wim and "trick": I should have clarified that by "trick" I meant that it was a test case that didn't really have anything to do with the point of the question. It was, however, a perfect learning opportunity - and thanks to Michal for that!

    ReplyDelete