Cooper_Engineering_a_Compiler(Second Edition) (1157546), страница 23

Файл №1157546 Cooper_Engineering_a_Compiler(Second Edition) (Rice) 23 страницаCooper_Engineering_a_Compiler(Second Edition) (1157546) страница 232019-09-182019-09-18СтудИзба

Rice

Просмтор этого файла доступен только зарегистрированным пользователям. Но у нас супер быстрая регистрация: достаточно только электронной почты!

Регистрация/авторизация

Текст из файла (страница 23)

Forthe reference δ(state,cat), the code must compute@δ0 + (state × number of columns in δ + cat) × wwhere @δ0 is a constant related to the starting address of δ in memory andw is the number of bytes per element of δ. Again, the scanner must issue aload operation to retrieve the data stored at this address.Thus, the table-driven scanner performs two address computations and twoload operations for each character that it processes. The speed improvementsin a direct-coded scanner come from reducing this overhead.Replacing the Table-Driven Scanner’s While LoopRather than represent the current dfa state and the transition diagram explicitly, a direct-coded scanner has a specialized code fragment to implementeach state.

It transfers control directly from state-fragment to state-fragmentto emulate the actions of the dfa. Figure 2.16 shows a direct-coded scannersinit : lexeme ← ‘‘ ’’;clear stack;s2 :push(bad);goto s0 ;s0 :if state ∈ SAthen clear stack;push(state);NextChar(char);if ‘0’ ≤ char ≤ ‘9’then goto s2 ;lexeme ← lexeme + char;if state ∈ SAthen clear stack;push(state);if (char = ‘r’)then goto s1 ;else goto sout ;s1 :NextChar(char);lexeme ← lexeme + char;if state ∈ SAthen clear stack;push(state);if (‘0’ ≤ char ≤ ’9’)then goto s2 ;else goto sout ;n FIGURE 2.16 A Direct-Coded Scanner for r[0...9]+ .NextChar(char);lexeme ← lexeme + char;else goto soutsout :while (state ∈/ SA andstate 6= bad) dostate ← pop();truncate lexeme;RollBack();end;if state ∈ SAthen return Type[state];else return invalid;68 CHAPTER 2 Scannersfor r [0. .

. 9]+ ; it is equivalent to the table-driven scanner shown earlier inFigure 2.14.Consider the code for state s1 . It reads a character, concatenates it onto thecurrent word, and advances the character counter. If char is a digit, it jumpsto state s2 . Otherwise, it jumps to state sout . The code requires no complicated address calculations. The code refers to a tiny set of values that can bekept in registers. The other states have equally simple implementations.The code in Figure 2.16 uses the same mechanism as the table-driven scanner to track accepting states and to roll back to them after an overrun.Because the code represents a specific dfa, we could specialize it further.

Inparticular, since the dfa has just one accepting state, the stack is unneededand the transitions to sout from s0 and s1 can be replaced with reportfailure. In a dfa where some transition leads from an accepting state to anonaccepting state, the more general mechanism is needed.A scanner generator can directly emit code similar to that shown inFigure 2.16. Each state has a couple of standard assignments, followed bybranching logic that implements the transitions out of the state. Unlike thetable-driven scanner, the code changes for each set of res.

Since that codeis generated directly from the res, the difference should not matter to thecompiler writer.Code in the style of Figure 2.16 is often calledspaghetti code in honor of its tangled controlflow.Of course, the generated code violates many of the precepts of structuredprogramming. While small examples may be comprehensible, the code fora complex set of regular expressions may be difficult for a human to follow. Again, since the code is generated, humans should not need to reador debug it. The additional speed obtained from direct coding makes it anattractive option, particularly since it entails no extra work for the compilerwriter. Any extra work is pushed into the implementation of the scannergenerator.Classifying CharactersThe continuing example, r [0.

. . 9]+ , divides the alphabet of input charactersinto just four classes. An r falls in class Register. The digits 0, 1, 2, 3, 4, 5, 6,7, 8, and 9 fall in class Digit, the special character returned when NextCharexhausts its input falls in class EndOfFile, and anything else falls in classOther.Collating sequencethe "sorting order" of the characters in analphabet, determined by the integers assignedeach characterThe scanner can easily and efficiently classify a given character, as shownin Figure 2.16. State s0 uses a direct test on ‘r’ to determine if char isin Register. Because all the other classes have equivalent actions in thedfa, the scanner need not perform further tests. States s1 and s2 classify2.5 Implementing Scanners 69char into either Digit or anything else.

They capitalize on the fact that thedigits 0 through 9 occupy adjacent positions in the ascii collating sequence,corresponding to the integers 48 to 57.In a scanner where character classification is more involved, the translationtable approach used in the table-driven scanner may be less expensive thandirectly testing characters. In particular, if a class contains multiple characters that do not occupy adjacent slots in the collating sequence, a tablelookup may be more efficient than direct testing.

For example, a class thatcontained the arithmetic operators +, -, *, \, and ˆ (43, 45, 42, 48, and94 in the ascii sequence) would require a moderately long series of comparisons. Using a translation table, such as CharCat in the table-drivenexample, might be faster than the comparisons if the translation table staysin the processor’s primary cache.2.5.3 Hand-Coded ScannersGenerated scanners, whether table-driven or direct-coded, use a small, constant amount of time per character. Despite this fact, many compilers usehand-coded scanners. In an informal survey of commercial compiler groups,we found that a surprisingly large fraction used hand-coded scanners.Similarly, many of the popular open-source compilers rely on hand-codedscanners. For example, the flex scanner generator was ostensibly built tosupport the gcc project, but gcc 4.0 uses hand-coded scanners in several ofits front ends.The direct-coded scanner reduced the overhead of simulating the dfa; thehand-coded scanner can reduce the overhead of the interfaces between thescanner and the rest of the system.

In particular, a careful implementationcan improve the mechanisms used to read and manipulate characters oninput and the operations needed to produce a copy of the actual lexeme onoutput.Buffering the Input StreamWhile character-by-character i/o leads to clean algorithmic formulations, theoverhead of a procedure call per character is significant relative to the costof simulating the dfa in either a table-driven or a direct-coded scanner.

Toreduce the i/o cost per character, the compiler writer can use buffered i/o,where each read operation returns a longer string of characters, or buffer,and the scanner then indexes through the buffer. The scanner maintains apointer into the buffer. Responsibility for keeping the buffer filled and tracking the current location in the buffer falls to NextChar. These operations can70 CHAPTER 2 Scannersbe performed inline; they are often encoded in a macro to avoid clutteringthe code with pointer dereferences and increments.The cost of reading a full buffer of characters has two components, a largefixed overhead and a small per-character cost.

A buffer and pointer schemeamortizes the fixed costs of the read over many single-character fetches.Making the buffer larger reduces the number of times that the scanner incursthis cost and reduces the per-character overhead.Using a buffer and pointer also leads to a simple and efficient implementation of the RollBack operation that occurs at the end of both the generatedscanners. To roll the input back, the scanner can simply decrement the inputpointer. This scheme works as long as the scanner does not decrement thepointer beyond the start of the buffer.

At that point, however, the scannerneeds access to the prior contents of the buffer.Double bufferingA scheme that uses two input buffers in a modulofashion to provide bounded roll back is oftencalled double buffering.In practice, the compiler writer can bound the roll-back distance that a scanner will need. With bounded roll back, the scanner can simply use twoadjacent buffers and increment the pointer in a modulo fashion, as shownbelow:Buffer 00Buffer 1n-1 n6InputPointer2n-1To read a character, the scanner increments the pointer, modulo 2n andreturns the character at that location. To roll back a character, the programdecrements the input pointer, modulo 2n.

It must also manage the contentsof the buffer, reading additional characters from the input stream as needed.Both NextChar and RollBack have simple, efficient implementations, asshown in Figure 2.17. Each execution of NextChar loads a character, increments the Input pointer, and tests whether or not to fill the buffer. Every ncharacters, it fills the buffer. The code is small enough to be included inline,perhaps generated from a macro. This scheme amortizes the cost of fillingthe buffer over n characters. By choosing a reasonable size for n, such as2048, 4096, or more, the compiler writer can keep the i/o overhead low.Rollback is even less expensive. It performs a test to ensure that thebuffer contents are valid and then decrements the input pointer. Again, theimplementation is sufficiently simple to be expanded inline.

(If we usedthis implementation of NextChar and RollBack in the generated scanners,RollBack would need to truncate the final character away from lexeme.)2.5 Implementing Scanners 71Char ← Buffer[Input];Input ← 0;Input ← (Input + 1) mod 2n;Fence ← 0;if (Input mod n = 0)then begin;fill Buffer[0 : n];Initializationfill Buffer[Input : Input + n - 1];Fence ← (Input + n) mod 2n;end;return Char;Implementing NextCharif (Input = Fence)then signal roll back error;Input ← (Input - 1) mod 2n;Implementing RollBackn FIGURE 2.17 Implementing NextChar and RollBack.As a natural consequence of using finite buffers, RollBack has a limited history in the input stream.

Характеристики

Тип файла

PDF-файл

Размер

8,27 Mb

Материал

Rice

Тип материала

Другое

Предмет

Конструирование компиляторов

Высшее учебное заведение

МГУ им. Ломоносова

Список файлов учебной работы

rice.rar

Rice

COMP 512 Advanced Compiler Construction_files

2007Shield.jpg

Другое

08Taxonomy.pptx

978-0-262-31709-2-ch014.pdf

peephole.pptx

scheduling.pptx

Купер и Торчсон - перевод

Рисунки

Рис1-1.jpg

12 Instruction Scheduling(Engl).doc

12 Instruction Scheduling(Rusl).doc

12a Instruction Scheduling (Terminology)E.doc

12a Instruction Scheduling (Terminology)R.doc

12b Instruction Scheduling (Side-bars)E.doc

12b Instruction Scheduling (Side-bars)R.doc

Cooper_Engineering_a_Compiler(Second Edition).pdf

COMP 512 Advanced Compiler Construction.htm

Rice University Houston COMP 412- Topics in Compiler Construction.doc

Rice University Houston COMP 512 Advanced Compiler Construction.doc

Прочти меня!!!.txt

Поделитесь ссылкой:

Ставлю 10/10
Все нравится, очень удобный сайт, помогает в учебе. Кроме этого, можно заработать самому, выставляя готовые учебные материалы на продажу здесь. Рейтинги и отзывы на преподавателей очень помогают сориентироваться в начале нового семестра. Спасибо за такую функцию. Ставлю максимальную оценку.

Лучшая платформа для успешной сдачи сессии
Познакомился со СтудИзбой благодаря своему другу, очень нравится интерфейс, количество доступных файлов, цена, в общем, все прекрасно. Даже сам продаю какие-то свои работы.

Студизба ван лав ❤
Очень офигенный сайт для студентов. Много полезных учебных материалов. Пользуюсь студизбой с октября 2021 года. Серьёзных нареканий нет. Хотелось бы, что бы ввели подписочную модель и сделали материалы дешевле 300 рублей в рамках подписки бесплатными.

Отличный сайт
Лично меня всё устраивает - и покупка, и продажа; и цены, и возможность предпросмотра куска файла, и обилие бесплатных файлов (в подборках по авторам, читай, ВУЗам и факультетам). Есть определённые баги, но всё решаемо, да и администраторы реагируют в течение суток.

Маленький отзыв о большом помощнике!
Студизба спасает в те моменты, когда сроки горят, а работ накопилось достаточно. Довольно удобный сайт с простой навигацией и огромным количеством материалов.

Студ. Изба как крупнейший сборник работ для студентов
Тут дофига бывает всего полезного. Печально, что бывают предметы по которым даже одного бесплатного решения нет, но это скорее вопрос к студентам. В остальном всё здорово.

Спасательный островок
Если уже не успеваешь разобраться или застрял на каком-то задание поможет тебе быстро и недорого решить твою проблему.

Всё и так отлично
Всё очень удобно. Особенно круто, что есть система бонусов и можно выводить остатки денег. Очень много качественных бесплатных файлов.

Отзыв о системе "Студизба"
Отличная платформа для распространения работ, востребованных студентами. Хорошо налаженная и качественная работа сайта, огромная база заданий и аудитория.

Отличный помощник
Отличный сайт с кучей полезных файлов, позволяющий найти много методичек / учебников / отзывов о вузах и преподователях.

Отлично помогает студентам в любой момент для решения трудных и незамедлительных задач
Хотелось бы больше конкретной информации о преподавателях. А так в принципе хороший сайт, всегда им пользуюсь и ни разу не было желания прекратить. Хороший сайт для помощи студентам, удобный и приятный интерфейс. Из недостатков можно выделить только отсутствия небольшого количества файлов.

Спасибо за шикарный сайт
Великолепный сайт на котором студент за не большие деньги может найти помощь с дз, проектами курсовыми, лабораторными, а также узнать отзывы на преподавателей и бесплатно скачать пособия.