Cooper_Engineering_a_Compiler(Second Edition) (1157546), страница 14

Файл №1157546 Cooper_Engineering_a_Compiler(Second Edition) (Rice) 14 страницаCooper_Engineering_a_Compiler(Second Edition) (1157546) страница 142019-09-182019-09-18СтудИзба

Rice

Просмтор этого файла доступен только зарегистрированным пользователям. Но у нас супер быстрая регистрация: достаточно только электронной почты!

Регистрация/авторизация

Текст из файла (страница 14)

Toaccomplish this aggregation and classification, the scanner applies a set ofrules that describe the lexical structure of the input programming language,sometimes called its microsyntax. The microsyntax of a programming language specifies how to group characters into words and, conversely, how toseparate words that run together. (In the context of scanning, we considerpunctuation marks and other symbols as words.)Western languages, such as English, have simple microsyntax. Adjacentalphabetic letters are grouped together, left to right, to form a word.A blank space terminates a word, as do most nonalphabetic symbols. (Theword-building algorithm can treat a hyphen in the midst of a word asif it were an alphabetic character.) Once a group of characters has beenaggregated together to form a potential word, the word-building algorithmcan determine its validity with a dictionary lookup.2.2 Recognizing Words 27Most programming languages have equally simple microsyntax.

Charactersare aggregated into words. In most languages, blanks and punctuation marksterminate a word. For example, Algol and its descendants define an identifieras a single alphabetic character followed by zero or more alphanumeric characters. The identifier ends with the first nonalphanumeric character.

Thus,fee and f1e are valid identifiers, but 12fum is not. Notice that the set ofvalid words is specified by rules rather than by enumeration in a dictionary.In a typical programming language, some words, called keywords or reserved words, match the rule for an identifier but have special meanings. Bothwhile and static are keywords in both C and Java. Keywords (and punctuation marks) form their own syntactic categories. Even though staticmatches the rule for an identifier, the scanner in a C or Java compiler wouldundoubtedly classify it into a category that has only one element, the keyword static. To recognize keywords, the scanner can either use dictionarylookup or encode the keywords directly into its microsyntax rules.The simple lexical structure of programming languages lends itself to efficient scanners.

The compiler writer starts from a specification of the language’s microsyntax. She either encodes the microsyntax into a notationaccepted by a scanner generator, which then constructs an executable scanner, or she uses that specification to build a hand-crafted scanner. Bothgenerated and hand-crafted scanners can be implemented to require justO(1) time per character, so they run in time proportional to the number ofcharacters in the input stream.2.2 RECOGNIZING WORDSThe simplest explanation of an algorithm to recognize words is often acharacter-by-character formulation. The structure of the code can providesome insight into the underlying problem.

Consider the problem of recognizing the keyword new. Assuming the presence of a routine NextChar thatreturns the next character, the code might look like the fragment shown inFigure 2.1. The code tests for n followed by e followed by w. At each step,failure to match the appropriate character causes the code to reject the stringand “try something else.” If the sole purpose of the program was to recognize the word new, then it should print an error message or return failure.Because scanners rarely recognize only one word, we will leave this “errorpath” deliberately vague at this point.The code fragment performs one test per character.

We can represent thecode fragment using the simple transition diagram shown to the right of thecode. The transition diagram represents a recognizer. Each circle representsan abstract state in the computation. Each state is labelled for convenience.Keyworda word that is reserved for a particular syntacticpurpose and, thus, cannot be used as an identifier28 CHAPTER 2 Scannersc ← NextChar();if (c = ‘n’)?then begin;s0n?c ← NextChar();if (c = ‘e’)then begin;s1e?c ← NextChar();if (c = ‘w’)then report success;s2w?else try something else;end;selse try something else;3end;else try something else;n FIGURE 2.1 Code Fragment to Recognize "new".siThe initial state, or start state, is s0 .

We will always label the start state ass0 . State s3 is an accepting state; the recognizer reaches s3 only when theinput is new. Accepting states are drawn with double circles, as shown inthe margin. The arrows represent transitions from state to state based on theinput character. If the recognizer starts in s0 and reads the characters n, e,and w, the transitions take us to s3 .

What happens on any other input, suchas n, o, and t? The n takes the recognizer to s1 . The o does not match theedge leaving s1 , so the input word is not new. In the code, cases that do notmatch new try something else. In the recognizer, we can think of this actionas a transition to an error state. When we draw the transition diagram of arecognizer, we usually omit transitions to the error state.

Each state has atransition to the error state on each unspecified input.Using this same approach to build a recognizer for while would produce thefollowing transition diagram:s0ws1hs2is3ls4es5If it starts in s0 and reaches s5 , it has identified the word while. Thecorresponding code fragment would involve five nested if-then-elseconstructs.To recognize multiple words, we can create multiple edges that leave a givenstate.

(In the code, we would begin to elaborate the do something else paths.)2.2 Recognizing Words 29One recognizer for both new and not might bew ss2 3e3n- s0 - s1QoQst ss4 5The recognizer uses a common test for n that takes it from s0 to s1 ,nedenoted s0 → s1 . If the next character is e, it takes the transition s1 → s2 .oIf, instead, the next character is o, it makes the move s1 → s4 . Finally, a wwtin s2 , causes the transition s2 → s3 , while a t in s4 produces s4 → s5 . States3 indicates that the input was new while s5 indicates that it was not. Therecognizer takes one transition per input character.We can combine the recognizer for new or not with the one for while bymerging their initial states and relabeling all the states.w ss2 3e 3n s- s0 1QoQsJwt ss4 5JJJ^Jh si sl se ss6 78910State s0 has transitions for n and w. The recognizer has three accepting states,s3 , s5 , and s10 .

If any state encounters an input character that does not matchone of its transitions, the recognizer moves to an error state.2.2.1 A Formalism for RecognizersTransition diagrams serve as abstractions of the code that would be requiredto implement them. They can also be viewed as formal mathematical objects, called finite automata, that specify recognizers. Formally, a finiteautomaton (fa) is a five-tuple (S, 6, δ, s0 , S A ), wherennS is the finite set of states in the recognizer, along with an error state se .6 is the finite alphabet used by the recognizer.

Typically, 6 is the unionof the edge labels in the transition diagram.Finite automatona formalism for recognizers that has a finite set ofstates, an alphabet, a transition function, a startstate, and one or more accepting states30 CHAPTER 2 Scannersnnnδ(s, c) is the recognizer’s transition function. It maps each state s ∈ Sand each character c ∈ 6 into some next state. In state si with inputccharacter c, the fa takes the transition si → δ(si , c).s0 ∈ S is the designated start state.S A is the set of accepting states, S A ⊆ S.

Each state in S A appears as adouble circle in the transition diagram.As an example, we can cast the fa for new or not or while in the formalismas follows:S = {s0 , s1 , s2 , s3 , s4 , s5 , s6 , s7 , s8 , s9 , s10 , se }6 = {e, h, i, l, n, o, t, w}( nwes0 → s1 , s0 → s6 , s1 → s2 ,δ=this4 → s5 , s6 → s7 , s7 → s8 ,ows1 → s4 ,s2 → s3 ,s8 → s9 ,s9 → s10l)es0 = s0S A = {s3 , s5 , s10 }For all other combinations of state si and input character c, we defineδ(si , c) = se , where se is the designated error state.

This quintuple is equivalent to the transition diagram; given one, we can easily re-create the other.The transition diagram is a picture of the corresponding fa.An fa accepts a string x if and only if, starting in s0 , the sequence of characters in the string takes the fa through a series of transitions that leavesit in an accepting state when the entire string has been consumed. Thiscorresponds to our intuition for the transition diagram. For the string new,neour example recognizer runs through the transitions s0 → s1 , s1 → s2 , andws2 → s3 .

Характеристики

Тип файла

PDF-файл

Размер

8,27 Mb

Материал

Rice

Тип материала

Другое

Предмет

Конструирование компиляторов

Высшее учебное заведение

МГУ им. Ломоносова

Список файлов учебной работы

rice.rar

Rice

COMP 512 Advanced Compiler Construction_files

2007Shield.jpg

Другое

08Taxonomy.pptx

978-0-262-31709-2-ch014.pdf

peephole.pptx

scheduling.pptx

Купер и Торчсон - перевод

Рисунки

Рис1-1.jpg

12 Instruction Scheduling(Engl).doc

12 Instruction Scheduling(Rusl).doc

12a Instruction Scheduling (Terminology)E.doc

12a Instruction Scheduling (Terminology)R.doc

12b Instruction Scheduling (Side-bars)E.doc

12b Instruction Scheduling (Side-bars)R.doc

Cooper_Engineering_a_Compiler(Second Edition).pdf

COMP 512 Advanced Compiler Construction.htm

Rice University Houston COMP 412- Topics in Compiler Construction.doc

Rice University Houston COMP 512 Advanced Compiler Construction.doc

Прочти меня!!!.txt

Поделитесь ссылкой:

Ставлю 10/10
Все нравится, очень удобный сайт, помогает в учебе. Кроме этого, можно заработать самому, выставляя готовые учебные материалы на продажу здесь. Рейтинги и отзывы на преподавателей очень помогают сориентироваться в начале нового семестра. Спасибо за такую функцию. Ставлю максимальную оценку.

Лучшая платформа для успешной сдачи сессии
Познакомился со СтудИзбой благодаря своему другу, очень нравится интерфейс, количество доступных файлов, цена, в общем, все прекрасно. Даже сам продаю какие-то свои работы.

Студизба ван лав ❤
Очень офигенный сайт для студентов. Много полезных учебных материалов. Пользуюсь студизбой с октября 2021 года. Серьёзных нареканий нет. Хотелось бы, что бы ввели подписочную модель и сделали материалы дешевле 300 рублей в рамках подписки бесплатными.

Отличный сайт
Лично меня всё устраивает - и покупка, и продажа; и цены, и возможность предпросмотра куска файла, и обилие бесплатных файлов (в подборках по авторам, читай, ВУЗам и факультетам). Есть определённые баги, но всё решаемо, да и администраторы реагируют в течение суток.

Маленький отзыв о большом помощнике!
Студизба спасает в те моменты, когда сроки горят, а работ накопилось достаточно. Довольно удобный сайт с простой навигацией и огромным количеством материалов.

Студ. Изба как крупнейший сборник работ для студентов
Тут дофига бывает всего полезного. Печально, что бывают предметы по которым даже одного бесплатного решения нет, но это скорее вопрос к студентам. В остальном всё здорово.

Спасательный островок
Если уже не успеваешь разобраться или застрял на каком-то задание поможет тебе быстро и недорого решить твою проблему.

Всё и так отлично
Всё очень удобно. Особенно круто, что есть система бонусов и можно выводить остатки денег. Очень много качественных бесплатных файлов.

Отзыв о системе "Студизба"
Отличная платформа для распространения работ, востребованных студентами. Хорошо налаженная и качественная работа сайта, огромная база заданий и аудитория.

Отличный помощник
Отличный сайт с кучей полезных файлов, позволяющий найти много методичек / учебников / отзывов о вузах и преподователях.

Отлично помогает студентам в любой момент для решения трудных и незамедлительных задач
Хотелось бы больше конкретной информации о преподавателях. А так в принципе хороший сайт, всегда им пользуюсь и ни разу не было желания прекратить. Хороший сайт для помощи студентам, удобный и приятный интерфейс. Из недостатков можно выделить только отсутствия небольшого количества файлов.

Спасибо за шикарный сайт
Великолепный сайт на котором студент за не большие деньги может найти помощь с дз, проектами курсовыми, лабораторными, а также узнать отзывы на преподавателей и бесплатно скачать пособия.