iostreams: Does imbue() need to be called before open()?

Post by Chuck_McDe » Tue, 14 Oct 2003 15:50:15


Someone told me that when using standard iostreams, if you want to use
imbue(), you have to call imbue() before opening the iostream to a file.

Is this true? This would be a really *** restriction:

1) you couldn't change locales part way through writing to/reading from a
stream.
2) you couldn't construct an fstream from a FILE * and then call imbue(),
since constructing from a FILE * means the file is already open.

My guess is that it is legal to call imbue() at any time. What's the real
answer?


---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.yqcomputer.com/ ]

Post by juerge » Wed, 15 Oct 2003 02:48:11


No, but ...

The but, to the best of my knowledge, is that code conversions can be
state-dependent and if your iostream is buffered it's possible you're
trying to imbue a new locale in the middle of a state-dependent
conversion.

So it is best to imbue either before doing anything else or after
having flushed the stream's buffer.

Hoping not to be totally wrong 8-}
Juergen

--
\ Real name : Juergen Heinzl \ no flames /
\ EMail Private : XXXX@XXXXX.COM \ send money instead /


Post by kanz » Thu, 16 Oct 2003 06:06:45

XXXX@XXXXX.COM ("Chuck McDevitt") wrote in message
news:<2Ehib.745634$Ho3.185470@sccrnsc03>...



It's not quite that bad, but...


There are more than a few nasty restrictions, linked with the use of the
codecvt facet in the filebuf. Basically, if rdbuf() != 0,
basic_ios::imbue calls rdbuf()->pubimbue, which forwards to the
streambuf's virtual imbue. In the case of a filebuf, there is a
precondition: "If the file is not positioned at its beginning and
the encoding of the current locale as determined by
a_codecvt.encoding() is state-dependent then that facet is the same as
the corresponding facet of loc."

This means that if you are using a non-state-dependent encoding, you can
change facets at will. If the file was once imbued with a
state-dependent facet, however, you must take care to ensure that any
imbued locale contains the same codecvt facet; something along the lines
of the following would be safe:

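    // Facet type a narrow-character filebuf uses for code conversion.
    typedef std::codecvt< char, char, std::char_traits< char >::state_type >
                        Cvt ;
    // Copy desiredLocale, but keep the codecvt facet the filebuf already
    // holds. (locale's facet constructor wants a Facet*, not the reference
    // that use_facet returns, so locale::combine is used instead.)
    stream.imbue(
        desiredLocale.combine< Cvt >( stream.rdbuf()->getloc() ) ) ;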

Alternatively, you can avoid touching the locale of the filebuf by using
something like the following:

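    // Assumes source is a std::istream& (or std::ios&): on an fstream
    // object the one-argument rdbuf( sb ) is hidden.
    std::streambuf* save = source.rdbuf() ;
    source.rdbuf( NULL ) ;       // detach, so imbue cannot reach the filebuf
    source.imbue( newLocale ) ;  // changes formatting/parsing only
    source.rdbuf( save ) ;       // reattach; note that this calls clear()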

This solution actually appeals to me as a means of managing the code
conversion in the filebuf separately from the locale used for
formatting/parsing.
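
A runnable sketch of this detach-and-reattach approach, under a few assumptions: the names CommaDecimal, imbue_formatting_only and format_with_comma are invented for the example, the stream's exceptions() mask is the default (empty), and an ostringstream stands in for the file stream so the example has no external dependencies. Because basic_ios::rdbuf(sb) calls clear(), the sketch saves and restores the state flags.

```cpp
#include <ios>
#include <locale>
#include <sstream>
#include <streambuf>
#include <string>

// Illustrative facet: use ',' as the decimal point.
struct CommaDecimal : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

// Hypothetical helper: change the locale used for formatting and parsing
// without ever calling imbue on the attached streambuf.
// Assumes stream.exceptions() is empty, since detaching sets badbit.
void imbue_formatting_only(std::ios& stream, const std::locale& loc) {
    const std::ios::iostate state = stream.rdstate();
    std::streambuf* saved = stream.rdbuf();
    stream.rdbuf(nullptr);   // detached: badbit is set for the moment
    stream.imbue(loc);       // only the ios_base part sees the new locale
    stream.rdbuf(saved);     // reattach; rdbuf(sb) calls clear()
    stream.clear(state);     // put the earlier state flags back
}

// Usage: format a value with the new locale while the buffer's own
// locale stays untouched.
std::string format_with_comma(double value) {
    std::ostringstream out;
    imbue_formatting_only(out, std::locale(std::locale::classic(), new CommaDecimal));
    out << value;
    return out.str();
}
```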


You can't in general. If the goal is to change the way e.g. numbers or
dates are parsed, see above. If the goal is to change the way the
stream is encoded, you can only do this if you have been using an
encoding without state previously.

For file types like HTML2 or XML, where you have to read into the file
in order to determine the encoding, you must start with plain ASCII (or
some other single byte encoding, like ISO 8859-1), and use that, hoping
for the best, until you can actually determine the encoding.


You can't currently construct an fstream from a FILE* in a strictly
standard fstream; the ability to do so is an extension. So you must
check what the implementation says concerning the extension. Typically,
I would expect the same rules as after an open, i.e. those above.


Your guess is wrong.

James Kanze GABI Software mailto: XXXX@XXXXX.COM
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


Post by rmaddo » Thu, 16 Oct 2003 11:05:26


The real answer is that while it is legal to call imbue() at any time,
doing so can be quite unsafe. If you look at any of the examples in the
Standard (e.g., 22.2.8/4/5) you will note that each stream is imbued
with a locale PRIOR to any input/output, which should be safe.

Also, if you have access to the Langer and Kreft text "Standard C++
IOStreams and Locales", which should definitely be on your bookshelf,
a note in 2.1.4 headed "A Note on Proper Imbuing" points out some of
the hazards of changing a stream's locale during ongoing input/output.

Bottom line, as long as the call to imbue precedes any input/output
you should be good.
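
A minimal sketch of that ordering with a file stream: imbue first, then open, then write. The facet CommaDecimal, the function write_then_read and the idea of passing a scratch file path are assumptions made for the example; a custom numpunct facet is used so the result does not depend on which named locales are installed.

```cpp
#include <fstream>
#include <locale>
#include <string>

// Illustrative facet: use ',' as the decimal point.
struct CommaDecimal : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

// Writes 2.5 to `path` with the locale imbued before the file is opened,
// then reads the line back as plain text and returns it.
std::string write_then_read(const std::string& path) {
    {
        std::ofstream out;
        out.imbue(std::locale(std::locale::classic(), new CommaDecimal));
        out.open(path.c_str());   // opened only after the imbue
        out << 2.5 << '\n';
    }                             // file is flushed and closed here
    std::ifstream in(path.c_str());
    std::string line;
    std::getline(in, line);
    return line;                  // "2,5" if the file could be written
}
```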

I believe this to be good advice to the best of my knowledge, but if
anyone else out there has better advice, please do jump right in. :-)

Randy.


Post by kanz » Fri, 17 Oct 2003 17:47:38


The real answer is that std::basic_ios::imbue calls rdbuf()->pubimbue
(which forwards to the streambuf's virtual imbue), and that the standard
places some very concrete preconditions on filebuf::imbue. Violating a
precondition is NOT legal, and results in undefined behavior.

On the other hand, you really have to know where to look to find this
information. (Dietmar Kuehl pointed it out to me -- before that, I
pretty much thought that it was legal too.) You really shouldn't have
to search under filebuf to find a precondition for basic_ios (which
isn't always one, because if you know that your basic_ios is really a
stringstream, there's no problem).
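
For contrast, a short sketch of the stringstream case, where no code conversion is involved and the locale can be switched between two insertions. CommaDecimal and switch_locale_midstream are names invented for this example.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Illustrative facet: use ',' as the decimal point.
struct CommaDecimal : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

// Formats the same value twice, changing the locale in between.
std::string switch_locale_midstream() {
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out << 1.5 << ' ';
    out.imbue(std::locale(std::locale::classic(), new CommaDecimal));
    out << 1.5;
    return out.str();   // "1.5 1,5"
}
```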

James Kanze GABI Software mailto: XXXX@XXXXX.COM
Conseils en informatique orientée objet/ http://www.yqcomputer.com/
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16


Post by rmaddo » Sat, 18 Oct 2003 13:11:59


James,

Thanks for the clarification and direction. Following your pointer I
ended up in the Standard at 27.8.1.4/17-19 and found text quite
similar to the warnings given by Langer and Kreft. Good to know where
in the Standard those warnings came from. As noted there, it seems
that you should be OK in the case of non-state-dependent encoding, or,
as you note, with a stringstream. Is that accurate advice?

Also, if you would be so kind, could you please provide any insight
into the note in para. 19 about possibly requiring reconversion of
previously converted characters or reconstruction of the original file
contents? I'm not sure what that means, but it sounds scary.

In any case, it seems clear that changing the locale may drastically
alter how input/output data is converted, which in turn will certainly
have an impact on the calling code. One instance in which this seems
useful, and even necessary, is the case of an XML parser that reads an
initial header in the file to determine the encoding of the rest of
the file and then switches to that encoding.

Randy.


Post by kanz » Mon, 20 Oct 2003 11:15:07

XXXX@XXXXX.COM (Randy Maddox) wrote in message
news:< XXXX@XXXXX.COM >...





For the standard streambufs, yes. Adding, of course, strstreambuf to the
list of safe types. The problem is really that the standard gives no
specification of what the contract is -- a user-supplied streambuf
can do anything it wishes. And when all you have is an istream& or
an ostream&, you really have to suppose that you might be dealing with a
user-defined streambuf.

In practice, I generally suppose that the only streambuf which actually
uses the locale is a filebuf, and suppose thus that its guarantees
hold. Unless, of course, I know the actual type of the streambuf -- in
such cases, you can count on the actual behavior of the streambuf.


I think that it is only meant to be the rationale behind the
pre-conditions. I don't really understand it either, but if you have
problems with previously converted characters, it can only mean that
you've already read some.

Another thing that isn't clear is the relationship between the locale
and seeking. I would guess that the only case where imbue would be
allowed is after a seek to the beginning, but it would be nice if the
standard said so.


In the case of XML, it would seem that the information concerning the
code set can only appear after very limited header information, which
can only contain a limited number of characters and has a simple
structure. So you read ASCII (probably locale "C"), check the first few
characters, and rewind and shift to EBCDIC if necessary. Then you
finish reading up to the codeset information, and set the codeset. No
problem, since both ASCII and EBCDIC are single byte character codes,
without state information.

The case of HTML2 is somewhat more complex -- the codeset information is
nested deep in the <head> structure. And other elements, which may
precede it, may contain just about any character. The procedure for XML
should work, but anything with non-ASCII characters preceding the
codeset information will have been lost. In real life, I rather suspect
that I would read everything through the </head> tag into a string, more
or less byte by byte, and then work on the string. That way, once I
knew the codeset, I could easily start over. Of course, this means that
my code will be doing the translation (at least in the header), rather
than the filebuf, but I don't really see any other solution -- even if
it were guaranteed that I could call imbue after a rewind, this wouldn't
help if I were reading from a socket (not unlikely in the case of HTML),
or some other source which doesn't support rewind.

James Kanze GABI Software mailto: XXXX@XXXXX.COM
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16
