C++11 Standard (1119564), page 11
The categories of preprocessing token are: header names, identifiers, preprocessing numbers, character literals (including user-defined character literals), string literals (including user-defined string literals), preprocessing operators and punctuators, and single non-white-space characters that do not lexically match the other preprocessing token categories. If a ' or a " character matches the last category, the behavior is undefined. Preprocessing tokens can be separated by white space; this consists of comments (2.8), or white-space characters (space, horizontal tab, new-line, vertical tab, and form-feed), or both. As described in Clause 16, in certain circumstances during translation phase 4, white space (or the absence thereof) serves as more than preprocessing token separation. White space can appear within a preprocessing token only as part of a header name or between the quotation characters in a character literal or string literal.

3 If the input stream has been parsed into preprocessing tokens up to a given character:

— If the next character begins a sequence of characters that could be the prefix and initial double quote of a raw string literal, such as R", the next preprocessing token shall be a raw string literal. Between the initial and final double quote characters of the raw string, any transformations performed in phases 1 and 2 (trigraphs, universal-character-names, and line splicing) are reverted; this reversion shall apply before any d-char, r-char, or delimiting parenthesis is identified. The raw string literal is defined as the shortest sequence of characters that matches the raw-string pattern

    encoding-prefix_opt R raw-string

— Otherwise, if the next three characters are <:: and the subsequent character is neither : nor >, the < is treated as a preprocessor token by itself and not as the first character of the alternative token <:.

— Otherwise, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token, even if that would cause further lexical analysis to fail.

[ Example:
    #define R "x"
    const char* s = R"y";    // ill-formed raw string, not "x" "y"
— end example ]

4 [ Example: The program fragment 1Ex is parsed as a preprocessing number token (one that is not a valid floating or integer literal token), even though a parse as the pair of preprocessing tokens 1 and Ex might produce a valid expression (for example, if Ex were a macro defined as +1).
Similarly, the program fragment 1E1 is parsed as a preprocessing number (one that is a valid floating literal token), whether or not E is a macro name. — end example ]

ISO/IEC 14882:2011(E)    © ISO/IEC 2011 – All rights reserved

5 [ Example: The program fragment x+++++y is parsed as x ++ ++ + y, which, if x and y have integral types, violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression. — end example ]

2.6 Alternative tokens [lex.digraph]

1 Alternative token representations are provided for some operators and punctuators.16)

2 In all respects of the language, each alternative token behaves the same, respectively, as its primary token, except for its spelling.17) The set of alternative tokens is defined in Table 2.

Table 2 — Alternative tokens

    Alternative  Primary    Alternative  Primary    Alternative  Primary
    <%           {          and          &&         and_eq       &=
    %>           }          bitor        |          or_eq        |=
    <:           [          or           ||         xor_eq       ^=
    :>           ]          xor          ^          not          !
    %:           #          compl        ~          not_eq       !=
    %:%:         ##         bitand       &

2.7 Tokens [lex.token]

token:
    identifier
    keyword
    literal
    operator
    punctuator

1 There are five kinds of tokens: identifiers, keywords, literals,18) operators, and other separators. Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments (collectively, “white space”), as described below, are ignored except as they serve to separate tokens.
[ Note: Some white space is required to separate otherwise adjacent identifiers, keywords, numeric literals, and alternative tokens containing alphabetic characters. — end note ]

2.8 Comments [lex.comment]

1 The characters /* start a comment, which terminates with the characters */. These comments do not nest. The characters // start a comment, which terminates with the next new-line character. If there is a form-feed or a vertical-tab character in such a comment, only white-space characters shall appear between it and the new-line that terminates the comment; no diagnostic is required. [ Note: The comment characters //, /*, and */ have no special meaning within a // comment and are treated just like other characters. Similarly, the comment characters // and /* have no special meaning within a /* comment. — end note ]

16) These include “digraphs” and additional reserved words. The term “digraph” (token consisting of two characters) is not perfectly descriptive, since one of the alternative preprocessing-tokens is %:%: and of course several primary tokens contain two characters. Nonetheless, those alternative tokens that aren’t lexical keywords are colloquially known as “digraphs”.
17) Thus the “stringized” values (16.3.2) of [ and <: will be different, maintaining the source spelling, but the tokens can otherwise be freely interchanged.
18) Literals include strings and character and numeric literals.

2.9 Header names [lex.header]

header-name:
    < h-char-sequence >
    " q-char-sequence "
h-char-sequence:
    h-char
    h-char-sequence h-char
h-char:
    any member of the source character set except new-line and >
q-char-sequence:
    q-char
    q-char-sequence q-char
q-char:
    any member of the source character set except new-line and "

1 Header name preprocessing tokens shall only appear within a #include preprocessing directive (16.2).
The sequences in both forms of header-names are mapped in an implementation-defined manner to headers or to external source file names as specified in 16.2.

2 The appearance of either of the characters ' or \ or of either of the character sequences /* or // in a q-char-sequence or an h-char-sequence is conditionally supported with implementation-defined semantics, as is the appearance of the character " in an h-char-sequence.19)

19) Thus, a sequence of characters that resembles an escape sequence might result in an error, be interpreted as the character corresponding to the escape sequence, or have a completely different meaning, depending on the implementation.

2.10 Preprocessing numbers [lex.ppnumber]

pp-number:
    digit
    . digit
    pp-number digit
    pp-number identifier-nondigit
    pp-number e sign
    pp-number E sign
    pp-number .

1 Preprocessing number tokens lexically include all integral literal tokens (2.14.2) and all floating literal tokens (2.14.4).

2 A preprocessing number does not have a type or a value; it acquires both after a successful conversion to an integral literal token or a floating literal token.

2.11 Identifiers [lex.name]

identifier:
    identifier-nondigit
    identifier identifier-nondigit
    identifier digit
identifier-nondigit:
    nondigit
    universal-character-name
    other implementation-defined characters
nondigit: one of
    a b c d e f g h i j k l m
    n o p q r s t u v w x y z
    A B C D E F G H I J K L M
    N O P Q R S T U V W X Y Z _
digit: one of
    0 1 2 3 4 5 6 7 8 9

1 An identifier is an arbitrarily long sequence of letters and digits.
Each universal-character-name in an identifier shall designate a character whose encoding in ISO 10646 falls into one of the ranges specified in E.1. The initial element shall not be a universal-character-name designating a character whose encoding falls into one of the ranges specified in E.2. Upper- and lower-case letters are different. All characters are significant.20)

2 The identifiers in Table 3 have a special meaning when appearing in a certain context. When referred to in the grammar, these identifiers are used explicitly rather than using the identifier grammar production. Any ambiguity as to whether a given identifier has a special meaning is resolved to interpret the token as a regular identifier.

Table 3 — Identifiers with special meaning

    override    final

3 In addition, some identifiers are reserved for use by C++ implementations and standard libraries (17.6.4.3.2) and shall not be used otherwise; no diagnostic is required.

2.12 Keywords [lex.key]

1 The identifiers shown in Table 4 are reserved for use as keywords (that is, they are unconditionally treated as keywords in phase 7) except in an attribute-token (7.6.1) [ Note: The export keyword is unused but is reserved for future use.