C language standard C99 TC (1113411), page 16
A wide character constant is the same, except prefixed by the letter L. With a few exceptions detailed later, the elements of the sequence are any members of the source character set; they are mapped in an implementation-defined manner to members of the execution character set.

3 The single-quote ', the double-quote ", the question-mark ?, the backslash \, and arbitrary integer values are representable according to the following table of escape sequences:

        single quote '          \'
        double quote "          \"
        question mark ?         \?
        backslash \             \\
        octal character         \octal digits
        hexadecimal character   \x hexadecimal digits

4 The double-quote " and question-mark ? are representable either by themselves or by the escape sequences \" and \?, respectively, but the single-quote ' and the backslash \ shall be represented, respectively, by the escape sequences \' and \\.

5 The octal digits that follow the backslash in an octal escape sequence are taken to be part of the construction of a single character for an integer character constant or of a single wide character for a wide character constant.
The numerical value of the octal integer so formed specifies the value of the desired character or wide character.

6 The hexadecimal digits that follow the backslash and the letter x in a hexadecimal escape sequence are taken to be part of the construction of a single character for an integer character constant or of a single wide character for a wide character constant. The numerical value of the hexadecimal integer so formed specifies the value of the desired character or wide character.

7 Each octal or hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence.

8 In addition, characters not in the basic character set are representable by universal character names and certain nongraphic characters are representable by escape sequences consisting of the backslash \ followed by a lowercase letter: \a, \b, \f, \n, \r, \t, and \v.65)

65) The semantics of these characters were discussed in 5.2.2.
If any other character follows a backslash, the result is not a token and a diagnostic is required. See "future language directions" (6.11.4).

Language §6.4.4.4, WG14/N1256, Committee Draft, September 7, 2007, ISO/IEC 9899:TC3

Constraints

9 The value of an octal or hexadecimal escape sequence shall be in the range of representable values for the type unsigned char for an integer character constant, or the unsigned type corresponding to wchar_t for a wide character constant.

Semantics

10 An integer character constant has type int.
The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

11 A wide character constant has type wchar_t, an integer type defined in the <stddef.h> header.
The value of a wide character constant containing a single multibyte character that maps to a member of the extended execution character set is the wide character corresponding to that multibyte character, as defined by the mbtowc function, with an implementation-defined current locale. The value of a wide character constant containing more than one multibyte character, or containing a multibyte character or escape sequence not represented in the extended execution character set, is implementation-defined.

12 EXAMPLE 1 The construction '\0' is commonly used to represent the null character.

13 EXAMPLE 2 Consider implementations that use two's-complement representation for integers and eight bits for objects that have type char.
In an implementation in which type char has the same range of values as signed char, the integer character constant '\xFF' has the value −1; if type char has the same range of values as unsigned char, the character constant '\xFF' has the value +255.

14 EXAMPLE 3 Even if eight bits are used for objects that have type char, the construction '\x123' specifies an integer character constant containing only one character, since a hexadecimal escape sequence is terminated only by a non-hexadecimal character. To specify an integer character constant containing the two characters whose values are '\x12' and '3', the construction '\0223' may be used, since an octal escape sequence is terminated after three octal digits. (The value of this two-character integer character constant is implementation-defined.)

15 EXAMPLE 4 Even if 12 or more bits are used for objects that have type wchar_t, the construction L'\1234' specifies the implementation-defined value that results from the combination of the values 0123 and '4'.

Forward references: common definitions <stddef.h> (7.17), the mbtowc function (7.20.7.2).

6.4.5 String literals

Syntax

1       string-literal:
                " s-char-sequence(opt) "
                L" s-char-sequence(opt) "
        s-char-sequence:
                s-char
                s-char-sequence s-char
        s-char:
                any member of the source character set except
                    the double-quote ", backslash \, or new-line character
                escape-sequence

Description

2 A character string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in "xyz".
A wide string literal is the same, except prefixed by the letter L.

3 The same considerations apply to each element of the sequence in a character string literal or a wide string literal as if it were in an integer character constant or a wide character constant, except that the single-quote ' is representable either by itself or by the escape sequence \', but the double-quote " shall be represented by the escape sequence \".

Semantics

4 In translation phase 6, the multibyte character sequences specified by any sequence of adjacent character and wide string literal tokens are concatenated into a single multibyte character sequence.
If any of the tokens are wide string literal tokens, the resulting multibyte character sequence is treated as a wide string literal; otherwise, it is treated as a character string literal.

5 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals.66) The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence.
For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters corresponding to the multibyte character sequence, as defined by the mbstowcs function with an implementation-defined current locale.

66) A character string literal need not be a string (see 7.1.1), because a null character may be embedded in it by a \0 escape sequence.
The value of a string literal containing a multibyte character or escape sequence not represented in the execution character set is implementation-defined.

6 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

7 EXAMPLE This pair of adjacent character string literals

        "\x12" "3"

produces a single character string literal containing the two characters whose values are '\x12' and '3', because escape sequences are converted into single members of the execution character set just prior to adjacent string literal concatenation.

Forward references: common definitions <stddef.h> (7.17), the mbstowcs function (7.20.8.1).

6.4.6 Punctuators

Syntax

1       punctuator: one of
                [ ] ( ) { } .
                ->
                ++ -- & * + - ~ !
                / % << >> < > <= >= == != ^ | && ||
                ? : ; ...
                = *= /= %= += -= <<= >>= &= ^= |=
                , # ##
                <: :> <% %> %: %:%:

Semantics

2 A punctuator is a symbol that has independent syntactic and semantic significance. Depending on context, it may specify an operation to be performed (which in turn may yield a value or a function designator, produce a side effect, or some combination thereof) in which case it is known as an operator (other forms of operator also exist in some contexts). An operand is an entity on which an operator acts.

3 In all aspects of the language, the six tokens67)

        <: :> <% %> %: %:%:

behave, respectively, the same as the six tokens

        [ ] { } # ##

except for their spelling.68)

Forward references: expressions (6.5), declarations (6.7), preprocessing directives (6.10), statements (6.8).

6.4.7 Header names

Syntax

1       header-name:
                < h-char-sequence >
                " q-char-sequence "
        h-char-sequence:
                h-char
                h-char-sequence h-char
        h-char:
                any member of the source character set except
                    the new-line character and >
        q-char-sequence:
                q-char
                q-char-sequence q-char
        q-char:
                any member of the source character set except
                    the new-line character and "

Semantics

2 The sequences in both forms of header names are mapped in an implementation-defined manner to headers or external source file names as specified in 6.10.2.

3 If the characters ', \, ", //, or /* occur in the sequence between the < and > delimiters, the behavior is undefined.
Similarly, if the characters ', \, //, or /* occur in the sequence between the " delimiters, the behavior is undefined.69) Header name preprocessing tokens are recognized only within #include preprocessing directives and in implementation-defined locations within #pragma directives.70)

67) These tokens are sometimes called "digraphs".
68) Thus [ and <: behave differently when "stringized" (see 6.10.3.2), but can otherwise be freely interchanged.

4 EXAMPLE The following sequence of characters:

        0x3<1/a.h>1e2
        #include <1/a.h>
        #define const.member@$

forms the following sequence of preprocessing tokens (with each individual preprocessing token delimited by a { on the left and a } on the right).

        {0x3}{<}{1}{/}{a}{.}{h}{>}{1e2}
        {#}{include} {<1/a.h>}
        {#}{define} {const}{.}{member}{@}{$}

Forward references: source file inclusion (6.10.2).

6.4.8 Preprocessing numbers

Syntax

1       pp-number:
                digit
                .