On Nov 13, 2:26 pm, Pete Becker < XXXX@XXXXX.COM > wrote:
And when writing to a file, the embedded locale does code
translation, regardless of the mode or whether you use inserters
or unformatted output functions. (A name which I detest. All
output has a format. If you use write() carelessly, you may end
up not knowing the format, but it does have a format.)
There's a lot more to it than that. On some systems, different
file types might be used, and it may not be possible to open in
text mode a stream written in binary, and vice versa. On other
systems (including all Unix and Unix-like systems), there's
absolutely no difference: the two modes are indistinguishable.
Again, it's more subtle than that. According to the (C)
standard: "A text stream is an ordered sequence of characters
composed into lines, each line consisting of zero or more
characters plus a terminating new-line character." A text stream
is only guaranteed to work if the data written consists only of
printing characters, '\t' and '\n', no '\n' is immediately
preceded by a space character, and the last character is a '\n'.
You cannot simply write arbitrary binary data to a text stream.
And what is physically on the disk may be different from what
you would see if you dumped the buffer from memory. (At least
one implementation of C for an IBM mainframe used ASCII
internally, and translated to EBCDIC when it output to disk.
Lines in text files were mapped to fixed length records on the
disk, with each line space padded to the record
length---trailing spaces were stripped on input---and no
characters whatever for the '\n'. The standard was carefully
designed to allow such implementations.)
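To make those constraints concrete, here is a rough sketch of a check for data that stays within what the standard guarantees for a text stream. The function name is_portable_text is hypothetical, treating empty input as acceptable is my own choice, and std::isprint is assumed to be running in the default "C" locale.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true if `data` consists only of printing
// characters, '\t' and '\n', has no space immediately before a '\n',
// and ends with '\n'.
bool is_portable_text(const std::string& data)
{
    if (data.empty()) {
        return true;            // zero lines (an assumption of this sketch)
    }
    if (data.back() != '\n') {
        return false;           // the last line must be terminated
    }
    for (std::string::size_type i = 0; i != data.size(); ++i) {
        unsigned char ch = static_cast<unsigned char>(data[i]);
        if (ch == '\n') {
            if (i != 0 && data[i - 1] == ' ') {
                return false;   // a space before '\n' may not survive
            }
        } else if (ch != '\t' && !std::isprint(ch)) {
            return false;       // e.g. '\0', '\r' or 0x1A
        }
    }
    return true;
}
```

Anything this rejects, such as "hello \n" or a string with an embedded '\0', may still happen to round trip on a given system, but nothing in the standard promises that it will.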
In practice: Unix treats text mode as binary. (Unix also
requires an encoding derived from ASCII, with '\n' being
represented as 0x0A, '\r' as 0x0D, '\t' as 0x09, etc. This is a
Posix requirement, however, not a C/C++ one.) Windows maps '\n'
to the two byte sequence 0x0D, 0x0A, and recognizes 0x1A as an
end of file on input.
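One way to see the difference on a given machine is to write through a text-mode stream and read the file back in binary. This is only a sketch: bytes_on_disk is a hypothetical name, the path is whatever scratch file you give it, and the global locale is assumed to be the default "C" locale, so no code translation gets in the way.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes `text` in text mode, then returns the
// bytes that actually ended up in the file.
std::string bytes_on_disk(const std::string& path, const std::string& text)
{
    {
        std::ofstream out(path);              // text mode is the default
        out << text;
    }                                         // closed and flushed here
    std::ifstream in(path, std::ios::binary); // read back untranslated
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling it with "a\n" should return two bytes under Unix and three under Windows; on a record-oriented system of the kind described above, the result could look quite different.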
A binary stream just dumps whatever you give it to the output
(modulo code translation by the embedded locale). On the other
hand, it's not required to have a reliable end of file: you may
read more than you wrote (but the additional bytes are
guaranteed to be 0). This is a sop to some older OS's (CP/M, but
I think some DEC OS's as well), which maintained the file size
as a number of sectors, not bytes.
Not on my machines. It's been 0x0A on every system I've seen.
(If the machine used EBCDIC internally, it would probably be
0x15.) But most accurately, it's '\n'. Everywhere. If you're
writing text, you use it to indicate the end of the line, and
the system does the rest. If you're writing in binary mode, you
don't use it, directly.
In theory, at least. In practice, as I said, I've never seen an
actual implementation where it wasn't 0x0A.
James Kanze (GABI Software) email: XXXX@XXXXX.COM
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34