tag:blogger.com,1999:blog-8677649049588007585.post7467755018938301743..comments2023-06-18T16:15:22.432+01:00Comments on PL/SQL Challenge: 26 May 2010 quiz: UTL_FILE and newline character: were we wrong?(323)Steven Feuersteinhttp://www.blogger.com/profile/16619706770920320550noreply@blogger.comBlogger14125tag:blogger.com,1999:blog-8677649049588007585.post-66580460075395134072012-10-11T08:59:49.386+01:002012-10-11T08:59:49.386+01:00You know what's even worse than the automatic ...You know what's even worse than the automatic CHR(10) translation? The fact that it happens not only at the end of a line but whenever the character occurs!<br /><br />Let's assume the following VARCHAR2 in a variable vText:<br /><br />"This a two-line<br />text"<br /><br />As in, you have a LF/CHR(10) stored in the varchar2 between "line" and "text". If you write this to a file like this:<br /><br />UTL_FILE.PUT_LINE(OutputFile, vText);<br /><br />Then you end up with the CHR(10) WITHIN the varchar being replaced with a CHR(13) CHR(10). Needless to say, this is very destructive if the file you were writing was a script that manipulates the database again. E.g. if you create an INSERT statement in the file like this, you will end up with faulty data in the newly created line. You might even exceed the size limit of the VARCHAR2 if the original one was close to it and had multiple linebreaks. This is, imho, a bug. The replacement of CHR(10) should only ever happen at the end of a line.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-177618554133077542010-06-11T09:40:42.587+01:002010-06-11T09:40:42.587+01:00I'm also not happy with using a non-document /...I'm also not happy with using a non-document /hidden feature for the automatic CHR(10) translation in a program, and I'm sure that anyone that new the distinction between /r and /n would also think that chr(10) was not the solution unless they had make that tracking of that "feature".<br />Using /n would be the best pratice in that case and not relying in an automatic chr(10) translation that was a surprise to everybody in the challenge (including the creators)Filipe Silvahttps://www.blogger.com/profile/07846537897685950997noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-50799218918033538032010-05-31T10:47:10.857+01:002010-05-31T10:47:10.857+01:00In relation to the quiz question from 2010-04-30 a...In relation to the quiz question from 2010-04-30 about the order of trigger execution order:<br /><br />"It seems that answer 'you cannot tell what order of the trigger is execution' is the correct one BUT IT IS NOT TRUE. Triggers are fired in the order that depends on their OID (order is reversed). So you actually can tell what will be the order. That way answer 'There is no way to know for sure' is not correct. Oracle will execute the trigger with highest OID as last - so it seems like answer 'triggers are executed in the order in which they are created' could be correct as well (because the OID is sequential) however the order is reversed - thus it is not correct. I have given the answer that you can never know (that is what Oracle documentation mentioned somewhere) although I don't really feel that the answer is correct."<br /><br />This is a cite from my mail. I can perform a set of experiments like you have presented - but is there any point to do it?<br />Note that I have described this "semi-documented Oracle feature" and given an answer that I felt that is expected. It is not about how it works and I am not interested in re-calculation of the points. I am only interested in clarification of the answers and not saying that "this is the correct answer because we can find a situation when it will be true although".Unknownhttps://www.blogger.com/profile/16279008641149475715noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-30450770354344451352010-05-31T07:23:24.409+01:002010-05-31T07:23:24.409+01:00You are correct about the behavior of fprintf on W...You are correct about the behavior of fprintf on Windows platforms. Most of my C programming experience has been on Unix/Linux and embedded systems and I had forgotten about Windows’ disparate handling of text and binary mode files (oh, the dangers of inference from inadequate sampling).<br /><br />The “\n” used in the demonstration code is not a control character; it is an escape sequence that generates a platform-specific newline control character. IEEE Std. 1003.1 states:<br /><br />“In addition to the escape sequences shown in the Base Definitions volume of IEEE Std 1003.1-2001 Chapter 5, File Format Notation ( '\\', '\a', '\b', '\f', '\n', '\r', '\t', '\v' ), "\ddd", where ddd is a one, two, or three-digit octal number, shall be written as a byte with the numeric value specified by the octal number.” (http://www.opengroup.org/onlinepubs/009695399/utilities/printf.html)<br /><br />On ASCII platforms the “\n” and “\012” escape sequences evaluate to the same byte value (decimal 10). On EBCDIC platforms the “\n” and “\045” escape sequences each evaluate to the same byte value (decimal 37). As shown in the demonstration code, on Windows platforms, when writing to a file that has been opened in text mode, newlines are translated to CR-LF pairs. This occurs regardless of whether “\n” or “\012” is used to embed the newline character (it should be noted that such behavior conflicts with the IEEE specification cited above). Passing CHR(10) to UTL_FILE.PUT is the equivalent of passing “\012” to the C I/O library utilities; it is not the equivalent of “\n” on all platforms.<br /><br />I failed to select two of the correct answers because I read too quickly and failed to notice that they used PUTF instead of PUT. I did not misread the candidate answer that used CHR(10) but answered based on the knowledge that Oracle 10gR2 is available for EBCDIC platforms and that CHR(10) is not a newline on such platforms. The interpretation of control codes is platform specific and is not properly part of PL/SQL. The documentation for UTL_FILE does not specify the control character values but instead states that PUT_LINE, “\n” with PUTF, or NEWLINE with PUT should be used to end a line.<br /><br />I recognize the difficulties with creating questions and candidate answers that are free of ambiguity and unstated assumptions and greatly appreciate the openness with which challenges have been accepted. Since neither the quiz administrators nor the participants seem to have access to non-ASCII platforms, I suggest that the declared assumptions be amended to exclude such platforms.jhall62https://www.blogger.com/profile/10339038131928463003noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-28771408778604185312010-05-31T07:15:14.676+01:002010-05-31T07:15:14.676+01:00I was one of the persons that issued both the issu...I was one of the persons that issued both the issues. And I found the reasoning very clear. One of the reaons I prefer the PUTLINE or the \n. But the CHR(10) translation to two characters was a surprise.<br />As said before, no rescoring needed. This discussion is far more useful.Wim de Langehttps://www.blogger.com/profile/05505341375827859005noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-66563226326029893562010-05-30T21:27:35.626+01:002010-05-30T21:27:35.626+01:00I'm afraid that fprintf does translate some co...I'm afraid that fprintf does translate some control characters. If the file is opened in text mode \n is translated to the platform-specific line terminator(s).<br /><br />Example: Run this under Windows:<br /><br />#include <br /><br />int main() {<br /> FILE *file;<br /><br /> file = fopen("c:\\temp\\quiz.txt", "wt");<br /> if (file) {<br /> fprintf(file, "First\n");<br /> fprintf(file, "Second\r\n");<br /> fclose(file);<br /> }<br /><br /> file = fopen("c:\\temp\\quiz.bin", "wb");<br /> if (file) {<br /> fprintf(file, "First\n");<br /> fprintf(file, "Second\r\n");<br /> fclose(file);<br /> }<br />}<br /><br />and you'll the the following contents:<br /><br />quiz.txt:<br /><br />0D31:0100 46 69 72 73 74 0D 0A 53-65 63 6F 6E 64 0D 0D 0A First..Second...<br /><br />quiz.bin:<br /><br />0D31:0100 46 69 72 73 74 0A 53 65-63 6F 6E 64 0D 0A First.Second..<br /><br />This is how C has always behaved.<br /><br />I'm not to happy about relying on semi-documented Oracle features either but the point here is that if you do the obvious (CHR(13) + CHR(10) with UTL_FILE.PUT) you get the wrong result.Finn Ellebaek Nielsenhttps://www.blogger.com/profile/03504981000801261183noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-87593861104684410352010-05-28T18:35:22.673+01:002010-05-28T18:35:22.673+01:00As stated in the Oracle PL/SQL Packages and Types ...As stated in the Oracle PL/SQL Packages and Types Reference, the UTL_FILE.PUTF procedure "works like a limited printf()" and thus strings such as "\n" will produce the appropriate platform-specific line terminator. However, the examples given in the original post clearly show that UTL_FILE is not relying solely on C's fprint or fprintf to process the passed strings. When using fprintf the "\n" in the format string may result in a multi-character line terminator depending on the platform; however, both fprint and fprintf pass control characters (e.g., LF (0x0A) and CR (0x0D)) unchanged to the output. The first example demonstrates that UTL_FILE.PUT is doing additional processing of at least some control characters. This is outside of the specified behavior of the C library routines and is not specified in the Oracle UTL_FILE documentation. Given this, there is no basis for assuming that character 0x0A ("linefeed" in ASCII, "repeat" in EBCDIC) will be changed into a line terminator by UTL_FILE on an EBCDIC platform (there are, however, many reasons to hope that it would not).<br /><br />I'm not particularly concerned about having the quiz scores adjusted, but do want to warn others against relying on undocumented behavior. As adderek noted, unspecified behavior should not be relied upon -- it may change.<br /><br />On a side note, I'm glad this forum has been established. As Johannes posted, the opportunity to acquire new knowledge is far more important than a few quiz points. The issues we discuss may inspire each of us to improve the quality of our code.jhall62https://www.blogger.com/profile/10339038131928463003noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-47383130207865726752010-05-28T15:46:31.031+01:002010-05-28T15:46:31.031+01:00According to the Oracle SQL Reference, "Use o...According to the Oracle SQL Reference, "Use of the CHR function (either with or without the optional USING NCHAR_CS clause) results in code that is not portable between ASCII- and EBCDIC-based machine architectures."<br /><br />The explanations/examples defending the quiz's answers do not address behavior on EBCDIC platforms; only ASCII platforms were used. While I don't have access to Oracle on an EBCDIC platform (and thus can't prove my theory), it seems unreasonable to expect that UTL_FILE would translate CHR(10) (which is RPT, the "repeat" control character in EBCDIC) into a line termination in such an environment.<br /><br />Unexpected behavior caused by the use of literal control characters is one of many issues that I had to deal with many years ago when I wrote code for both EBCDIC and ASCII platforms.jhall62https://www.blogger.com/profile/10339038131928463003noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-89868577764863793372010-05-28T15:43:13.267+01:002010-05-28T15:43:13.267+01:00As we wrote UTL_FILE relies on the behavior of C&#...As we wrote UTL_FILE relies on the behavior of C's fopen and fprintf. I'm fairly confident that this will ensure the correct line terminator(s) on any platform that Oracle runs on. This is not going to change as we've had UTL_FILE provided by Oracle for quite a few years now.<br /><br />I don't have access to a database instance running on an EBCDIC-based platform. It would be interesting if any players could test the behavior there.<br /><br />Any takers?Finn Ellebaek Nielsenhttps://www.blogger.com/profile/03504981000801261183noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-78777321093023206552010-05-28T10:31:13.717+01:002010-05-28T10:31:13.717+01:00You said: "UTL_FILE.PUT puts text into the fi...You said: "UTL_FILE.PUT puts text into the file buffer, but does not append a newline character.<br /><br />OK, but I can always provide the newline character myself, which is what I did in this block. Oracle will ensure that CHR(10) is translated to the platform-specific line terminator(s)."<br /><br />And I will reply:<br />"You cannot depend on the behavior described above. Even if you observe that it worked like that, you cannot count on that."<br />"Unless you can prove that Oracle 10g Release 2 with all combinations of platforms and patch sets/patches ensures the newline character will be translated to a platform specific end of line character, you can't rely on it.<br />Similarly, many Oracle developers for years relied on GROUP BY also doing an ORDER BY, but this was changed with Oracle 10g Release 1."<br /><br />Please look at http://www.oracle.com/technology/oramag/oracle/06-mar/o26asktom.html for similar response about trigger ordering.<br />Note: This reply is rephrased reply which I have received from Steven Feuerstein and later Finn Ellebaek Nielsen at Saturday, 2010-05-01Unknownhttps://www.blogger.com/profile/16279008641149475715noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-47967115163415416142010-05-28T09:18:04.206+01:002010-05-28T09:18:04.206+01:00Steven, thanks for the explanation. That's a l...Steven, thanks for the explanation. That's a lessons learned for me ;)Davidhttps://www.blogger.com/profile/14511056501986373205noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-82176227714497361122010-05-28T04:27:43.309+01:002010-05-28T04:27:43.309+01:00Thanks for the detailed response. I did not get a ...Thanks for the detailed response. I did not get a higher score but learning something new is worth much much more :-)Johannes de Jonghttps://www.blogger.com/profile/14332647815367577750noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-34306042149676299912010-05-27T22:32:01.113+01:002010-05-27T22:32:01.113+01:00The following is directly from the SQL Reference f...The following is directly from the SQL Reference for 10gR2:<br /><br />"Use of the CHR function (either with or without the optional USING NCHAR_CS clause) results in code that is not portable between ASCII- and EBCDIC-based machine architectures."<br /><br />As you stated, the Oracle documentation does not describe some of the behavior of UTL_FILE very well; it is even less informative about the implications of running on an EBCDIC based platform.jhall62https://www.blogger.com/profile/10339038131928463003noreply@blogger.comtag:blogger.com,1999:blog-8677649049588007585.post-47153293838596032222010-05-27T21:22:52.972+01:002010-05-27T21:22:52.972+01:00All of the examples above are based on platforms w...All of the examples above are based on platforms with ASCII control characters, <br />none of them address platforms with EBCDIC control characters (e.g., Oracle 10gR2 on IBM z/OS using the UTFE character set). I don't have access to Oracle on z/OS, so I will concede that passing "\n" through UTL_FILE.PUT will likely produce the prescribed effect (i.e., a line termination); however, it seems unreasonable to expect that UTL_FILE should transalate CHR(10) (i.e., RPT, the "repeat" control character in EBCDIC) into a line termination on an EBCDIC platform.<br /><br />It has been many years since I last used EBCDIC, but I distinctly remember the pains caused by working with both ASCII and EBCDIC character sets (e.g., different control characters, different sort orders when relying on the binary representation of characters, lazy character range checking, etc.).jhall62https://www.blogger.com/profile/10339038131928463003noreply@blogger.com