B. Stroustrup - The C++ Programming Language (794319), page 57
This loop of calls must be broken somehow. A declaration

    double expr(bool);

before the definition of prim() will do nicely.

10.2.2 Input

Reading input is often the messiest part of a program. To communicate with a person, the program must cope with that person’s whims, conventions, and seemingly random errors.
Trying to force the person to behave in a manner more suitable for the machine is often (rightly) considered offensive.

The task of a low-level input routine is to read characters and compose higher-level tokens from them. These tokens are then the units of input for higher-level routines. Here, low-level input is done by ts.get(). Writing a low-level input routine need not be an everyday task.
Many systems provide standard functions for this.

First we need to see the complete definition of Token_stream:

    class Token_stream {
    public:
        Token_stream(istream& s) : ip{&s}, owns{false} { }
        Token_stream(istream* p) : ip{p}, owns{true} { }

        ~Token_stream() { close(); }

        Token get();          // read and return next token
        Token& current();     // most recently read token

        void set_input(istream& s) { close(); ip = &s; owns = false; }
        void set_input(istream* p) { close(); ip = p; owns = true; }

    private:
        void close() { if (owns) delete ip; }

        istream* ip;              // pointer to an input stream
        bool owns;                // does the Token_stream own the istream?
        Token ct {Kind::end};     // current token
    };

We initialize a Token_stream with an input stream (§4.3.2, Chapter 38) from which it gets its characters.
The Token_stream implements the convention that it owns (and eventually deletes; §3.2.1.2, §11.2) an istream passed as a pointer, but not an istream passed as a reference. This may be a bit elaborate for this simple program, but it is a useful and general technique for classes that hold a pointer to a resource requiring destruction.

A Token_stream holds three values: a pointer to its input stream (ip), a Boolean (owns) indicating ownership of the input stream, and the current token (ct).

I gave ct a default value because it seemed sloppy not to. People should not call current() before get(), but if they do, they get a well-defined Token.
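The ownership convention can be tried out in isolation. The following is a minimal sketch, not the book's class: Stream_holder and Tracked_stream are hypothetical names introduced here, the deleted copy operations are an added assumption (to rule out a double delete), and Tracked_stream exists only to make destruction observable.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for Token_stream that keeps only the ownership logic.
class Stream_holder {
public:
    explicit Stream_holder(std::istream& s) : ip{&s}, owns{false} { }  // borrowed: never deleted
    explicit Stream_holder(std::istream* p) : ip{p}, owns{true} { }    // owned: deleted by close()
    ~Stream_holder() { close(); }

    // Assumption added for this sketch: copying an owner would delete twice, so forbid it.
    Stream_holder(const Stream_holder&) = delete;
    Stream_holder& operator=(const Stream_holder&) = delete;

    void set_input(std::istream& s) { close(); ip = &s; owns = false; }
    void set_input(std::istream* p) { close(); ip = p; owns = true; }

    std::istream& stream() { assert(ip); return *ip; }

private:
    void close() { if (owns) delete ip; }

    std::istream* ip;   // pointer to an input stream
    bool owns;          // does the holder own the stream?
};

// Hypothetical helper: an istringstream that records its own destruction in a flag.
struct Tracked_stream : std::istringstream {
    Tracked_stream(const std::string& s, bool& flag)
        : std::istringstream{s}, destroyed{flag} { }
    ~Tracked_stream() { destroyed = true; }
    bool& destroyed;
};
```

A holder built from `new Tracked_stream{...}` deletes that stream in its destructor, whereas a holder built from a local stream object leaves the object alone; the caller remains responsible for it.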
I chose Kind::end as the initial value for ct so that a program that misuses current() will not get a value that wasn't on the input stream.

I present Token_stream::get() in two stages. First, I provide a deceptively simple version that imposes a burden on the user. Next, I modify it into a slightly less elegant, but much easier to use, version. The idea for get() is to read a character, use that character to decide what kind of token needs to be composed, read more characters when needed, and then return a Token representing the characters read.

The initial statements read the first non-whitespace character from *ip (the stream pointed to by ip) into ch and check that the read operation succeeded:

    Token Token_stream::get()
    {
        char ch = 0;
        *ip>>ch;

        switch (ch) {
        case 0:
            return ct={Kind::end};    // assign and return

By default, operator >> skips whitespace (that is, spaces, tabs, newlines, etc.) and leaves the value of ch unchanged if the input operation failed.
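This behavior of >> on a char is easy to check on its own. The sketch below assumes a std::istringstream in place of *ip; first_char() and demo() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Return the first non-whitespace character of text, or 0 if there is none.
// Mirrors the opening statements of get(): ch starts at 0 and stays 0 when >> fails.
char first_char(const std::string& text)
{
    std::istringstream in{text};
    char ch = 0;
    in >> ch;       // skips leading whitespace; leaves ch untouched on failure
    return ch;
}

// Usage: leading whitespace is skipped, and exhausted input leaves the 0 in place.
void demo()
{
    assert(first_char("  \t\n +1") == '+');
    assert(first_char("   \n") == 0);
    assert(first_char("") == 0);
}
```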
Consequently, ch==0 indicates end-of-input.

Assignment is an operator, and the result of the assignment is the value of the variable assigned to. This allows me to assign the value Kind::end to ct and return it in the same statement. Having a single statement rather than two is useful in maintenance. If the assignment and the return became separated in the code, a programmer might update the one and forget to update the other.

Note also how the {}-list notation (§3.2.1.3, §11.3) is used on the right-hand side of an assignment. That is, it is an expression. I could have written that return-statement as:

    ct.kind = Kind::end;    // assign
    return ct;              // return

However, I think that assigning a complete object {Kind::end} is clearer than dealing with individual members of ct.
The {Kind::end} is equivalent to {Kind::end,0,0}. That’s good if we care about the last two members of the Token and not so good if we are worried about performance. Neither is the case here, but in general dealing with complete objects is clearer and less error-prone than manipulating data members individually. The cases below give examples of the other strategy.

Consider some of the cases separately before considering the complete function. The expression terminator, ';', the parentheses, and the operators are handled simply by returning their values:

        case ';':    // end of expression; print
        case '*':
        case '/':
        case '+':
        case '-':
        case '(':
        case ')':
        case '=':
            return ct={static_cast<Kind>(ch)};

The static_cast (§11.5.2) is needed because there is no implicit conversion from char to Kind (§8.4.1); only some characters correspond to Kind values, so we have to "certify" that in this case ch does.

Numbers are handled like this:

        case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9':
        case '.':
            ip->putback(ch);            // put the first digit (or .) back into the input stream
            *ip >> ct.number_value;     // read the number into ct
            ct.kind=Kind::number;
            return ct;

Stacking case labels horizontally rather than vertically is generally not a good idea because this arrangement is harder to read.
However, having one line for each digit is tedious. Because operator >> is already defined for reading floating-point values into a double, the code is trivial. First the initial character (a digit or a dot) is put back into the input stream (*ip). Then, the floating-point value can be read into ct.number_value.

If the token is not the end of input, an operator, a punctuation character, or a number, it must be a name. A name is handled similarly to a number:

        default:    // name, name =, or error
            if (isalpha(ch)) {
                ip->putback(ch);            // put the first character back into the input stream
                *ip>>ct.string_value;       // read the string into ct
                ct.kind=Kind::name;
                return ct;
            }

Finally, we may simply have an error.
The simple-minded, but reasonably effective, way to deal with an error is to call an error() function and then return a print token if error() returns:

            error("bad token");
            return ct={Kind::print};

The standard-library function isalpha() (§36.2.1) is used to avoid listing every character as a separate case label. Operator >> applied to a string (in this case, string_value) reads until it hits whitespace.
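The name case can be exercised separately to see where >> stops. This is a sketch under assumptions: read_name() and demo() are hypothetical helpers, a std::istringstream stands in for *ip, and the cast to unsigned char before isalpha() is a defensive addition that the text does not show.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Mirror the name case of get(): read the first non-whitespace character,
// and if it is a letter, put it back and let >> read the whole word.
// Returns an empty string if the input does not start with a letter.
std::string read_name(const std::string& text)
{
    std::istringstream in{text};
    char ch = 0;
    in >> ch;
    if (!std::isalpha(static_cast<unsigned char>(ch))) return "";

    in.putback(ch);     // put the first character back into the stream
    std::string name;
    in >> name;         // reads up to the next whitespace, not up to the next operator
    return name;
}

// Usage: without a separating space, the operator and number become part of the name.
void demo()
{
    assert(read_name("x=7") == "x=7");
    assert(read_name("x =7") == "x");
    assert(read_name("7") == "");
}
```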
Consequently, a user must terminate a name by a space before an operator using the name as an operand. This is less than ideal, so we will return to this problem in §10.2.3.

Here, finally, is the complete input function:

    Token Token_stream::get()
    {
        char ch = 0;
        *ip>>ch;

        switch (ch) {
        case 0:
            return ct={Kind::end};    // assign and return
        case ';':    // end of expression; print
        case '*':
        case '/':
        case '+':
        case '-':
        case '(':
        case ')':
        case '=':
            return ct={static_cast<Kind>(ch)};
        case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9':
        case '.':
            ip->putback(ch);            // put the first digit (or .) back into the input stream
            *ip >> ct.number_value;     // read number into ct
            ct.kind=Kind::number;
            return ct;
        default:    // name, name =, or error
            if (isalpha(ch)) {
                ip->putback(ch);            // put the first character back into the input stream
                *ip>>ct.string_value;       // read string into ct
                ct.kind=Kind::name;
                return ct;
            }
            error("bad token");
            return ct={Kind::print};
        }
    }

The conversion of an operator to its Token value is trivial because the kind of an operator was defined as the integer value of the operator (§10.2.1).

10.2.3 Low-Level Input

Using the calculator as defined so far reveals a few inconveniences.
It is tedious to remember to add a semicolon after an expression in order to get its value printed, and having a name terminated by whitespace only is a real nuisance. For example, x=7 is an identifier, rather than the identifier x followed by the operator = and the number 7. To get what we (usually) want, we would have to add whitespace after x: x =7. Both problems are solved by replacing the type-oriented default input operations in get() with code that reads individual characters.

First, we’ll make a newline equivalent to the semicolon used to mark the end-of-expression:

    Token Token_stream::get()
    {
        char ch;

        do {    // skip whitespace except '\n'
            if (!ip->get(ch)) return ct={Kind::end};
        } while (ch!='\n' && isspace(ch));

        switch (ch) {
        case ';':
        case '\n':
            return ct={Kind::print};

Here, I use a do-statement; it is equivalent to a while-statement except that the controlled statement is always executed at least once.
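The skipping loop can be pulled out and tested on its own. The following is a sketch only: next_significant() and demo() are hypothetical names, a std::istringstream stands in for *ip, and the cast to unsigned char before isspace() is a defensive addition not present in the text.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>

// Read the next character that is either '\n' or not whitespace.
// Returns false when the stream is exhausted, the case in which get() returns Kind::end.
bool next_significant(std::istream& in, char& ch)
{
    do {    // the body runs at least once, so ch is always read before it is tested
        if (!in.get(ch)) return false;
    } while (ch != '\n' && std::isspace(static_cast<unsigned char>(ch)));
    return true;
}

// Usage: spaces and tabs are skipped, but the newline is delivered to the caller.
void demo()
{
    std::istringstream in{"  \t1 \n ;"};
    char ch = 0;
    assert(next_significant(in, ch) && ch == '1');
    assert(next_significant(in, ch) && ch == '\n');
    assert(next_significant(in, ch) && ch == ';');
    assert(!next_significant(in, ch));
}
```

Unlike >>, istream::get(char&) does not skip whitespace, which is why the loop has to do the skipping itself and can choose to stop at '\n'.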