Главная » Все файлы » Просмотр файлов из архивов » PDF-файлы » A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition)

A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition), страница 12

Описание файла

PDF-файл из архива "A.W. Appel, J. Palsberg - Modern Compiler Implementation in Java (Second Edition)", который расположен в категории "книги и методические указания". Всё это находится в предмете "конструирование компиляторов" из седьмого семестра, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .

Просмотр PDF-файла онлайн

Текст 12 страницы из PDF

Because Ycan produce the empty string - and therefore X can produce the empty string - we find thatFIRST(X Y Z) must include FIRST(Z). Therefore, in computing FIRST sets, we must keeptrack of which symbols can produce the empty string; we say such symbols are nullable. Andwe must keep track of what might follow a nullable symbol.GRAMMAR 3.12••Z→dZ→XYZ••Y→Y→c••X→YX→aWith respect to a particular grammar, given a string γ of terminals and nonterminals,•nullable(X) is true if X can derive the empty string.•FIRST(γ) is the set of terminals that can begin strings derived from γ.•FOLLOW(X) is the set of terminals that can immediately follow X. That is, t ∈FOLLOW(X) if there is any derivation containing Xt. This can occur if the derivationcontains X Y Zt where Y and Z both derive ∊.A precise definition of FIRST, FOLLOW, and nullable is that they are the smallest sets forwhich these properties hold:For each terminal symbol Z, FIRST[Z] = {Z}.Algorithm 3.13 for computing FIRST, FOLLOW, and nullable just follows from these facts;we simply replace each equation with an assignment statement, and iterate.ALGORITHM 3.13: Iterative computation of FIRST, FOLLOW, and nullable.Algorithm to compute FIRST, FOLLOW, and nullable.Initialize FIRST and FOLLOW to all empty sets, and nullable to all false.for each terminal symbol ZFIRST[Z] ← {Z}repeat49for each production X → Y1Y2 ...

Ykif Y1 ... Yk are all nullable (or if k = 0)then nullable[X] ← truefor each i from 1 to k, each j from i + 1 to kif Y1 ... Yi-1 are all nullable (or if i = 1)then FIRST[X] ← FIRST[X] ∪ FIRST[Yi ]if Yi+1 ... Yk are all nullable (or if i = k)then FOLLOW[Yi] ← FOLLOW[Yi] ∪ FOLLOW[X]if Yi+1 ... Yj -1 are all nullable (or if i + 1 = j)then FOLLOW[Yi] ← FOLLOW[Yi] ∪ FIRST[Yj]until FIRST, FOLLOW, and nullable did not change in this iteration.Of course, to make this algorithm efficient it helps to examine the productions in the rightorder; see Section 17.4. Also, the three relations need not be computed simultaneously;nullable can be computed by itself, then FIRST, then FOLLOW.This is not the first time that a group of equations on sets has become the algorithm forcalculating those sets; recall the algorithm on page 28 for computing ∊-closure.

Nor will it bethe last time; the technique of iteration to a fixed point is applicable in dataflow analysis foroptimization, in the back end of a compiler.We can apply this algorithm to Grammar 3.12. Initially, we have:In the first iteration, we find that a ∈ FIRST[X], Y is nullable, c ∈ FIRST[Y], d ∈ FIRST[Z],d ∈ FOLLOW[X], c ∈ FOLLOW[X], d ∈ FOLLOW[Y]. Thus:In the second iteration, we find that X is nullable, c ∈ FIRST[X], {a; c} ⊆ FIRST[Z], {a, c,d} ⊆ FOLLOW[X], {a, c, d} ⊆ FOLLOW[Y]. Thus:The third iteration finds no new information, and the algorithm terminates.It is useful to generalize the FIRST relation to strings of symbols:FIRST(Xγ) = FIRST[X]if not nullable[X]FIRST(Xγ) = FIRST[X] ∪ FIRST(γ) if nullable[X]and similarly, we say that a string γ is nullable if each symbol in γ is nullable.50CONSTRUCTING A PREDICTIVE PARSERConsider a recursive-descent parser.

The parsing function for some nonterminal X has aclause for each X production; it must choose one of these clauses based on the next token T ofthe input. If we can choose the right production for each (X, T), then we can write therecursive-descent parser. All the information we need can be encoded as a two-dimensionaltable of productions, indexed by nonterminals X and terminals T. Thisiscalleda predictiveparsing table.To construct this table, enter production X → γ in row X, column T of the table for each T ∈FIRST(γ).

Also, if γ is nullable, enter the production in row X, column T for each T ∈FOLLOW[X].Figure 3.14 shows the predictive parser for Grammar 3.12. But some of the entries containmore than one production! The presence of duplicate entries means that predictive parsingwill not work on Grammar 3.12.Figure 3.14: Predictive parsing table for Grammar 3.12.If we examine the grammar more closely, we find that it is ambiguous. The sentence d hasmany parse trees, including:An ambiguous grammar will always lead to duplicate entries in a predictive parsing table. Ifwe need to use the language of Grammar 3.12 as a programming language, we will need tofind an unambiguous grammar.Grammars whose predictive parsing tables contain no duplicate entries are called LL(1). Thisstands for left-to-right parse, leftmost-derivation, 1-symbol lookahead. Clearly a recursivedescent (predictive) parser examines the input left-to-right in one pass (some parsingalgorithms do not, but these are generally not useful for compilers).

The order in which apredictive parser expands nonterminals into right-hand sides (that is, the recursive-descentparser calls functions corresponding to nonterminals) is just the order in which a leftmostderivation expands nonterminals. And a recursive-descent parser does its job just by lookingat the next token of the input, never looking more than one token ahead.51We can generalize the notion of FIRST sets to describe the first k tokens of a string, and tomake an LL(k) parsing table whose rows are the nonterminals and columns are everysequence of k terminals.

This is rarely done (because the tables are so large), but sometimeswhen you write a recursive-descent parser by hand you need to look more than one tokenahead.Grammars parsable with LL(2) parsing tables are called LL(2) grammars, and similarly forLL(3), etc. Every LL(1) grammar is an LL(2) grammar, and so on.

No ambiguous grammar isLL(k)forany k.ELIMINATING LEFT RECURSIONSuppose we want to build a predictive parser for Grammar 3.10. The two productionsare certain to cause duplicate entries in the LL(1) parsing table, since any token in FIRST(T)will also be in FIRST(E + T). The problem is that E appears as the first right-hand-sidesymbol in an E-production; this is called left recursion. Grammars with left recursion cannotbe LL(1).To eliminate left recursion, we will rewrite using right recursion.

We introduce a newnonterminal E′, and writeThis derives the same set of strings (on T and +) as the original two productions, but nowthere is no left recursion.In general, whenever we have productions X → Xγ and X → α, where α does not start with X,we know that this derives strings of the form αγ*, an α followed by zero or more γ.

So wecan rewrite the regular expression using right recursion:Applying this transformation to Grammar 3.10, we obtain Grammar 3.15.GRAMMAR 3.15••S→E$•••E → T E′•E′ → + T E′••E′ →E′ →− T E′•T → F T′52•T′ →* F T′•F → id•T′ → / F T′•F → num•T′ →•F → (E)To build a predictive parser, first we compute nullable, FIRST, and FOLLOW (Table 3.16).The predictive parser for Grammar 3.15 is shown in Table 3.17.Table 3.16: Nullable, FIRST, and FOLLOW for Grammar 3.15.Table 3.17: Predictive parsing table for Grammar 3.15. We omit the columns for num, /, and , as they are similar to others in the table.LEFT FACTORINGWe have seen that left recursion interferes with predictive parsing, and that it can beeliminated.

A similar problem occurs when two productions for the same nonterminal startwith the same symbols. For example:In such a case, we can left factor the grammar - that is, take the allowable endings (else S and∊) and make a new nonterminal X to stand for them:53The resulting productions will not pose a problem for a predictive parser. Although thegrammar is still ambiguous - the parsing table has two entries for the same slot - we canresolve the ambiguity by using the else S action.ERROR RECOVERYArmed with a predictive parsing table, it is easy to write a recursive-descent parser. Here is arepresentative fragment of a parser for Grammar 3.15:void T() {switch (tok) {case ID:case NUM:case LPAREN: F(); Tprime(); break;default: error!}}void Tprime() {switch (tok) {case PLUS: break;case TIMES: eat(TIMES); F(); Tprime(); break;case EOF: break;case RPAREN: break;default: error!}}A blank entry in row T, column x of the LL(1) parsing table indicates that the parsing functionT() does not expect to see token x - this will be a syntax error.

Свежие статьи
Популярно сейчас