<html> <head> <h1> Notes on the ctype implementation. </h1> </head>
<I> prepared by Benjamin Kosnik (bkoz@redhat.com) on August 30, 2000 </I>
<p>
<h2> 1. Abstract </h2>
<p> Woe is me. </p>
<p>
<h2> 2. What the standard says </h2>
<p>
<h2> 3. Problems with "C" ctype : global locales, termination. </h2>
<p> For the required specialization codecvt&lt;wchar_t, char, mbstate_t&gt;, conversions are made between the internal character set (always UCS4 on GNU/Linux) and whatever the currently selected locale for the LC_CTYPE category implements.
<p>
<h2> 4. Design </h2>
The two required specializations are implemented as follows:
<p> <code> ctype&lt;char&gt; </code>
<p> This is a simple specialization. Implementing it was a piece of cake.
<p> <code> ctype&lt;wchar_t&gt; </code>
<p> This specialization, by specifying all the template parameters, pretty much ties the hands of implementors. As such, the implementation is straightforward, involving mbsrtowcs for conversions from char to wchar_t and wcsrtombs for conversions from wchar_t to char.
<p> Neither of these two required specializations deals with Unicode characters. As such, libstdc++-v3 implements
<p>
<h2> 5. Examples </h2>
<pre> typedef ctype&lt;char&gt; cctype; </pre>
More information can be found in the following testcases:
<ul>
<li> testsuite/22_locale/ctype_char_members.cc
<li> testsuite/22_locale/ctype_wchar_t_members.cc
</ul>
<p>
<h2> 6. Unresolved Issues </h2>
<ul>
<li> how to deal with the global locale issue?
<li> how to deal with different types than char, wchar_t?
<li> codecvt/ctype overlap: narrow/widen
<li> mask typedef in codecvt_base, argument types in codecvt. what is known about this type?
<li> why mask* argument in codecvt?
<li> can this be made (more) generic? is there a simple way to straighten out the configure-time mess that is a by-product of this class?
<li> get the ctype&lt;wchar_t&gt;::mask stuff under control. Need to make some kind of static table, and not do a lookup every time somebody hits the do_is...
functions. Too bad we can't just redefine mask for ctype&lt;wchar_t&gt;
<li> rename abstract base class. See if just smash-overriding is a better approach. Clarify, add sanity to naming.
</ul>
<p>
<h2> 7. Acknowledgments </h2>
Ulrich Drepper for patient answering of late-night questions, skeletal examples, and C language expertise.
<p>
<h2> 8. Bibliography / Referenced Documents </h2>
Drepper, Ulrich, GNU libc (glibc) 2.2 manual. In particular, Chapters "6. Character Set Handling" and "7. Locales and Internationalization"
<p> Drepper, Ulrich, Numerous, late-night email correspondence
<p> ISO/IEC 14882:1998 Programming languages - C++
<p> ISO/IEC 9899:1999 Programming languages - C
<p> Langer, Angelika and Klaus Kreft, Standard C++ IOStreams and Locales, Advanced Programmer's Guide and Reference, Addison Wesley Longman, Inc. 2000
<p> Stroustrup, Bjarne, Appendix D, The C++ Programming Language, Special Edition, Addison Wesley, Inc. 2000
<p> System Interface Definitions, Issue 6 (IEEE Std. 1003.1-200x) The Open Group/The Institute of Electrical and Electronics Engineers, Inc. http://www.opennc.org/austin/docreg.html