C language standard C99 TC (1113411), page 72
Otherwise, zero is returned and the contents of the array are indeterminate.

7.24.6 Extended multibyte/wide character conversion utilities

1 The header <wchar.h> declares an extended set of functions useful for conversion between multibyte characters and wide characters.

2 Most of the following functions (those that are listed as "restartable", 7.24.6.3 and 7.24.6.4) take as a last argument a pointer to an object of type mbstate_t that is used to describe the current conversion state from a particular multibyte character sequence to a wide character sequence (or the reverse) under the rules of a particular setting for the LC_CTYPE category of the current locale.

3 The initial conversion state corresponds, for a conversion in either direction, to the beginning of a new multibyte character in the initial shift state.
A zero-valued mbstate_t object is (at least) one way to describe an initial conversion state. A zero-valued mbstate_t object can be used to initiate conversion involving any multibyte character sequence, in any LC_CTYPE category setting. If an mbstate_t object has been altered by any of the functions described in this subclause, and is then used with a different multibyte character sequence, or in the other conversion direction, or with a different LC_CTYPE category setting than on earlier function calls, the behavior is undefined. 299)

4 On entry, each function takes the described conversion state (either internal or pointed to by an argument) as current.
The conversion state described by the pointed-to object is altered as needed to track the shift state, and the position within a multibyte character, for the associated multibyte character sequence.

299) Thus, a particular mbstate_t object can be used, for example, with both the mbrtowc and mbsrtowcs functions as long as they are used to step sequentially through the same multibyte character string.

WG14/N1256 Committee Draft - September 7, 2007 ISO/IEC 9899:TC3

7.24.6.1 Single-byte/wide character conversion functions

7.24.6.1.1 The btowc function

Synopsis
1       #include <stdio.h>
        #include <wchar.h>
        wint_t btowc(int c);

Description
2 The btowc function determines whether c constitutes a valid single-byte character in the initial shift state.

Returns
3 The btowc function returns WEOF if c has the value EOF or if (unsigned char)c does not constitute a valid single-byte character in the initial shift state.
Otherwise, it returns the wide character representation of that character.

7.24.6.1.2 The wctob function

Synopsis
1       #include <stdio.h>
        #include <wchar.h>
        int wctob(wint_t c);

Description
2 The wctob function determines whether c corresponds to a member of the extended character set whose multibyte character representation is a single byte when in the initial shift state.

Returns
3 The wctob function returns EOF if c does not correspond to a multibyte character with length one in the initial shift state. Otherwise, it returns the single-byte representation of that character as an unsigned char converted to an int.

7.24.6.2 Conversion state functions

7.24.6.2.1 The mbsinit function

Synopsis
1       #include <wchar.h>
        int mbsinit(const mbstate_t *ps);

Description
2 If ps is not a null pointer, the mbsinit function determines whether the pointed-to mbstate_t object describes an initial conversion state.

Returns
3 The mbsinit function returns nonzero if ps is a null pointer or if the pointed-to object describes an initial conversion state; otherwise, it returns zero.

7.24.6.3 Restartable multibyte/wide character conversion functions

1 These functions differ from the corresponding multibyte character functions of 7.20.7 (mblen, mbtowc, and wctomb) in that they have an extra parameter, ps, of type pointer to mbstate_t that points to an object that can completely describe the current conversion state of the associated multibyte character sequence.
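
The behavior described in 7.24.6.1 and 7.24.6.2 can be illustrated with a minimal sketch, which is not part of the standard text. The helper names single_byte_round_trips and zeroed_state_is_initial are hypothetical; only btowc, wctob, mbsinit, WEOF and mbstate_t come from <wchar.h>. The sketch assumes the program runs in the default "C" locale.

```c
#include <stdio.h>   /* EOF */
#include <string.h>  /* memset */
#include <wchar.h>   /* btowc, wctob, mbsinit, mbstate_t, WEOF */

/* Hypothetical helper: returns 1 if the byte value c survives a
   btowc/wctob round trip in the current locale, 0 otherwise. */
int single_byte_round_trips(int c)
{
    wint_t wc = btowc(c);
    if (wc == WEOF)
        return 0;   /* c is EOF or not a valid single-byte character */
    return wctob(wc) == (int)(unsigned char)c;
}

/* Hypothetical helper: returns nonzero if mbsinit reports a
   zero-valued mbstate_t object as an initial conversion state. */
int zeroed_state_is_initial(void)
{
    mbstate_t st;
    memset(&st, 0, sizeof st);   /* zero-valued object, per 7.24.6 paragraph 3 */
    return mbsinit(&st) != 0;
}
```

A caller might use single_byte_round_trips to decide whether a byte can be handled without the restartable functions; bytes that fail the check need mbrtowc.
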
If ps is a null pointer, each function uses its own internal mbstate_t object instead, which is initialized at program startup to the initial conversion state. The implementation behaves as if no library function calls these functions with a null pointer for ps.

2 Also unlike their corresponding functions, the return value does not represent whether the encoding is state-dependent.

7.24.6.3.1 The mbrlen function

Synopsis
1       #include <wchar.h>
        size_t mbrlen(const char * restrict s,
             size_t n,
             mbstate_t * restrict ps);

Description
2 The mbrlen function is equivalent to the call:
        mbrtowc(NULL, s, n, ps != NULL ? ps : &internal)
where internal is the mbstate_t object for the mbrlen function, except that the expression designated by ps is evaluated only once.

Returns
3 The mbrlen function returns a value between zero and n, inclusive, (size_t)(-2), or (size_t)(-1).

Forward references: the mbrtowc function (7.24.6.3.2).

7.24.6.3.2 The mbrtowc function

Synopsis
1       #include <wchar.h>
        size_t mbrtowc(wchar_t * restrict pwc,
             const char * restrict s,
             size_t n,
             mbstate_t * restrict ps);

Description
2 If s is a null pointer, the mbrtowc function is equivalent to the call:
        mbrtowc(NULL, "", 1, ps)
In this case, the values of the parameters pwc and n are ignored.

3 If s is not a null pointer, the mbrtowc function inspects at most n bytes beginning with the byte pointed to by s to determine the number of bytes needed to complete the next multibyte character (including any shift sequences).
If the function determines that the next multibyte character is complete and valid, it determines the value of the corresponding wide character and then, if pwc is not a null pointer, stores that value in the object pointed to by pwc. If the corresponding wide character is the null wide character, the resulting state described is the initial conversion state.

Returns
4 The mbrtowc function returns the first of the following that applies (given the current conversion state):

0  if the next n or fewer bytes complete the multibyte character that corresponds to the null wide character (which is the value stored).

between 1 and n inclusive  if the next n or fewer bytes complete a valid multibyte character (which is the value stored); the value returned is the number of bytes that complete the multibyte character.

(size_t)(-2)  if the next n bytes contribute to an incomplete (but potentially valid) multibyte character, and all n bytes have been processed (no value is stored). 300)

(size_t)(-1)  if an encoding error occurs, in which case the next n or fewer bytes do not contribute to a complete and valid multibyte character (no value is stored); the value of the macro EILSEQ is stored in errno, and the conversion state is unspecified.

300) When n has at least the value of the MB_CUR_MAX macro, this case can only occur if s points at a sequence of redundant shift sequences (for implementations with state-dependent encodings).

7.24.6.3.3 The wcrtomb function

Synopsis
1       #include <wchar.h>
        size_t wcrtomb(char * restrict s,
             wchar_t wc,
             mbstate_t * restrict ps);

Description
2 If s is a null pointer, the wcrtomb function is equivalent to the call
        wcrtomb(buf, L'\0', ps)
where buf is an internal buffer.

3 If s is not a null pointer, the wcrtomb function determines the number of bytes needed to represent the multibyte character that corresponds to the wide character given by wc (including any shift sequences), and stores the multibyte character representation in the array whose first element is pointed to by s.
At most MB_CUR_MAX bytes are stored. If wc is a null wide character, a null byte is stored, preceded by any shift sequence needed to restore the initial shift state; the resulting state described is the initial conversion state.

Returns
4 The wcrtomb function returns the number of bytes stored in the array object (including any shift sequences). When wc is not a valid wide character, an encoding error occurs: the function stores the value of the macro EILSEQ in errno and returns (size_t)(-1); the conversion state is unspecified.

7.24.6.4 Restartable multibyte/wide string conversion functions

1 These functions differ from the corresponding multibyte string functions of 7.20.8 (mbstowcs and wcstombs) in that they have an extra parameter, ps, of type pointer to mbstate_t that points to an object that can completely describe the current conversion state of the associated multibyte character sequence.
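
The return-value rules given for mbrtowc and wcrtomb in 7.24.6.3 can be exercised with a minimal sketch, which is not part of the standard text. The helper names count_mb_chars and wide_char_round_trips are hypothetical, and the choice to report an incomplete sequence the same way as an encoding error is an assumption made to keep the example short.

```c
#include <limits.h>  /* MB_LEN_MAX */
#include <string.h>  /* memset, strlen */
#include <wchar.h>   /* mbrtowc, wcrtomb, mbstate_t */

/* Hypothetical helper: counts the multibyte characters in the
   null-terminated string s. Returns (size_t)(-1) if mbrtowc reports
   an encoding error or an incomplete character. */
size_t count_mb_chars(const char *s)
{
    mbstate_t st;
    size_t remaining = strlen(s) + 1;   /* include the terminating null */
    size_t count = 0;

    memset(&st, 0, sizeof st);          /* zero-valued: initial conversion state */
    for (;;) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, s, remaining, &st);
        if (r == 0)
            return count;               /* null wide character reached */
        if (r == (size_t)(-1) || r == (size_t)(-2))
            return (size_t)(-1);        /* invalid or incomplete sequence */
        s += r;                         /* r bytes completed one character */
        remaining -= r;
        ++count;
    }
}

/* Hypothetical helper: converts wc to its multibyte form with wcrtomb
   and back with mbrtowc. Returns 1 if the same value is recovered. */
int wide_char_round_trips(wchar_t wc)
{
    char buf[MB_LEN_MAX];               /* MB_CUR_MAX never exceeds MB_LEN_MAX */
    mbstate_t out, in;
    wchar_t back;
    size_t n, r;

    memset(&out, 0, sizeof out);        /* separate state per direction */
    memset(&in, 0, sizeof in);
    n = wcrtomb(buf, wc, &out);
    if (n == (size_t)(-1))
        return 0;                       /* wc is not a valid wide character */
    r = mbrtowc(&back, buf, n, &in);
    if (r == (size_t)(-1) || r == (size_t)(-2))
        return 0;
    return back == wc;
}
```

Two mbstate_t objects are used in wide_char_round_trips because 7.24.6 paragraph 3 makes it undefined to reuse an altered state object in the other conversion direction.
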
If ps is a null pointer, each function uses its own internal mbstate_t object instead, which is initialized at program startup to the initial conversion state. The implementation behaves as if no library function calls these functions with a null pointer for ps.

2 Also unlike their corresponding functions, the conversion source parameter, src, has a pointer-to-pointer type. When the function is storing the results of conversions (that is, when dst is not a null pointer), the pointer object pointed to by this parameter is updated to reflect the amount of the source processed by that invocation.

7.24.6.4.1 The mbsrtowcs function

Synopsis
1       #include <wchar.h>
        size_t mbsrtowcs(wchar_t * restrict dst,
             const char ** restrict src,
             size_t len,
             mbstate_t * restrict ps);

Description
2 The mbsrtowcs function converts a sequence of multibyte characters that begins in the conversion state described by the object pointed to by ps, from the array indirectly pointed to by src into a sequence of corresponding wide characters.
If dst is not a null pointer, the converted characters are stored into the array pointed to by dst. Conversion continues up to and including a terminating null character, which is also stored. Conversion stops earlier in two cases: when a sequence of bytes is encountered that does not form a valid multibyte character, or (if dst is not a null pointer) when len wide characters have been stored into the array pointed to by dst. 301) Each conversion takes place as if by a call to the mbrtowc function.

3 If dst is not a null pointer, the pointer object pointed to by src is assigned either a null pointer (if conversion stopped due to reaching a terminating null character) or the address just past the last multibyte character converted (if any).
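
The way mbsrtowcs updates the pointer object pointed to by src can be seen in a minimal sketch, which is not part of the standard text. The helper name convert_string and its complete output parameter are hypothetical. The return-value convention used here (the number of characters converted, excluding the terminating null, or (size_t)(-1) on an encoding error) comes from the Returns paragraph of 7.24.6.4.1, which lies beyond this excerpt.

```c
#include <string.h>  /* memset */
#include <wchar.h>   /* mbsrtowcs, mbstate_t */

/* Hypothetical helper: converts the multibyte string mbs into dst,
   which has room for cap wide characters. Returns the value of
   mbsrtowcs. Sets *complete to 1 if conversion stopped at the
   terminating null character, 0 otherwise. */
size_t convert_string(wchar_t *dst, size_t cap, const char *mbs, int *complete)
{
    mbstate_t st;
    const char *src = mbs;      /* mbsrtowcs updates this pointer */
    size_t n;

    memset(&st, 0, sizeof st);  /* zero-valued: initial conversion state */
    n = mbsrtowcs(dst, &src, cap, &st);

    /* A null src means the terminating null character was reached
       and stored; otherwise src points just past the last multibyte
       character converted. */
    *complete = (n != (size_t)(-1) && src == NULL);
    return n;
}
```

When *complete is 0 and no error occurred, the destination was filled before the source was exhausted, so dst is not null-terminated and the caller would need a larger buffer or a second call.
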