Главная » Все файлы » Просмотр файлов из архивов » PDF-файлы » A.W. Appel - Modern Compiler Implementation in C

A.W. Appel - Modern Compiler Implementation in C, страница 3

Описание файла

PDF-файл из архива "A.W. Appel - Modern Compiler Implementation in C", который расположен в категории "статьи". Всё это находится в предмете "конструирование компиляторов" из седьмого семестра, которые можно найти в файловом архиве МГУ им. Ломоносова. Не смотря на прямую связь этого архива с МГУ им. Ломоносова, его также можно найти и в других разделах. .

Просмотр PDF-файла онлайн

Текст 3 страницы из PDF

DATA STRUCTURES FOR TREE LANGUAGESfunctions.8. Each module (header file) shall have a prefix unique to that module (example,A_ in Figure 1.5).9. Typedef names (after the prefix) shall start with lower-case letters; constructorfunctions (after the prefix) with uppercase; enumeration atoms (after the prefix)with lowercase; and union variants (which have no prefix) with lowercase.Modularity principles for C programs.

A compiler can be a big program;careful attention to modules and interfaces prevents chaos. We will use theseprinciples in writing a compiler in C:1. Each phase or module of the compiler belongs in its own “.c” file, which willhave a corresponding “.h” file.2. Each module shall have a prefix unique to that module. All global names(structure and union fields are not global names) exported by the module shallstart with the prefix. Then the human reader of a file will not have to lookoutside that file to determine where a name comes from.3. All functions shall have prototypes, and the C compiler shall be told to warnabout uses of functions without prototypes.4. We will #include "util.h" in each file:/* util.h */#include <assert.h>typedef char *string;string String(char *);typedef char bool;#define TRUE 1#define FALSE 0void *checked_malloc(int);The inclusion of assert.h encourages the liberal use of assertions by the Cprogrammer.5.

The string type means a heap-allocated string that will not be modified afterits initial creation. The String function builds a heap-allocated string froma C-style character pointer (just like the standard C library function strdup).Functions that take strings as arguments assume that the contents will neverchange.6. C’s malloc function returns NULL if there is no memory left. The Tigercompiler will not have sophisticated memory management to deal with thisproblem.

Instead, it will never call malloc directly, but call only our ownfunction, checked_malloc, which guarantees never to return NULL:11CHAPTER ONE. INTRODUCTIONvoid *checked_malloc(int len) {void *p = malloc(len);assert(p);return p;}7. We will never call free. Of course, a production-quality compiler must freeits unused data in order to avoid wasting memory. The best way to do thisis to use an automatic garbage collector, as described in Chapter 13 (seeparticularly conservative collection on page 280). Without a garbage collector,the programmer must carefully free(p) when the structure p is about tobecome inaccessible – not too late, or the pointer p will be lost, but not toosoon, or else still-useful data may be freed (and then overwritten).

In orderto be able to concentrate more on compiling techniques than on memorydeallocation techniques, we can simply neglect to do any freeing.PROGRAMSTRAIGHT-LINE PROGRAM INTERPRETERImplement a simple program analyzer and interpreter for the straight-lineprogramming language. This exercise serves as an introduction to environments (symbol tables mapping variable-names to information about the variables); to abstract syntax (data structures representing the phrase structureof programs); to recursion over tree data structures, useful in many partsof a compiler; and to a functional style of programming without assignmentstatements.It also serves as a “warm-up” exercise in C programming.

Programmersexperienced in other languages but new to C should be able to do this exercise,but will need supplementary material (such as textbooks) on C.Programs to be interpreted are already parsed into abstract syntax, as described by the data types in Program 1.5.However, we do not wish to worry about parsing the language, so we writethis program by applying data constructors:A_stm prog =A_CompoundStm(A_AssignStm("a",A_OpExp(A_NumExp(5), A_plus, A_NumExp(3))),A_CompoundStm(A_AssignStm("b",A_EseqExp(A_PrintStm(A_PairExpList(A_IdExp("a"),A_LastExpList(A_OpExp(A_IdExp("a"), A_minus,A_NumExp(1))))),A_OpExp(A_NumExp(10), A_times, A_IdExp("a")))),A_PrintStm(A_LastExpList(A_IdExp("b")))));12PROGRAMMING EXERCISEFiles with the data type declarations for the trees, and this sample program,are available in the directory $TIGER/chap1.Writing interpreters without side effects (that is, assignment statements thatupdate variables and data structures) is a good introduction to denotationalsemantics and attribute grammars, which are methods for describing whatprogramming languages do.

It’s often a useful technique in writing compilers,too; compilers are also in the business of saying what programming languagesdo.Therefore, in implementing these programs, never assign a new value toany variable or structure-field except when it is initialized.

For local variables,use the initializing form of declaration (for example, int i=j+3;) and foreach kind of struct, make a “constructor” function that allocates it andinitializes all the fields, similar to the A_CompoundStm example on page 9.1. Write a function int maxargs(A_stm) that tells the maximum numberof arguments of any print statement within any subexpression of a givenstatement. For example, maxargs(prog) is 2.2. Write a function void interp(A_stm) that “interprets” a program in thislanguage. To write in a “functional programming” style – in which you neveruse an assignment statement – initialize each local variable as you declare it.For part 1, remember that print statements can contain expressions thatcontain other print statements.For part 2, make two mutually recursive functions interpStm and interpExp. Represent a “table,” mapping identifiers to the integer values assignedto them, as a list of id × int pairs.typedef struct table *Table_;Table_ {string id; int value; Table_ tail};Table_ Table(string id, int value, struct table *tail) {Table_ t = malloc(sizeof(*t));t->id=id; t->value=value; t->tail=tail;return t;}Then interpStm is declared asTable_ interpStm(A_stm s, Table_ t)taking a table t1 as argument and producing the new table t2 that’s just liket1 except that some identifiers map to different integers as a result of thestatement.13CHAPTER ONE.

INTRODUCTIONFor example, the table t1 that maps a to 3 and maps c to 4, which we write{a 7→ 3, c 7→ 4} in mathematical notation, could be represented as the linkedlist a 3.c 4Now, let the table t2 be just like t1 , except that it maps c to 7 instead of 4.Mathematically, we could write,t2 = update(t1 , c, 7)where the update function returns a new table {a 7→ 3, c 7→ 7}.On the computer, we could implement t2 by putting a new cell at the head ofthe linked list: c 7as long as we assume thata 3c 4the first occurrence of c in the list takes precedence over any later occurrence.Therefore, the update function is easy to implement; and the corresponding lookup functionint lookup(Table_ t, string key)just searches down the linked list.Interpreting expressions is more complicated than interpreting statements,because expressions return integer values and have side effects.

We wishto simulate the straight-line programming language’s assignment statementswithout doing any side effects in the interpreter itself. (The print statementswill be accomplished by interpreter side effects, however.) The solution is todeclare interpExp asstruct IntAndTable {int i; Table_ t;};struct IntAndTable interpExp(A_exp e, Table_ t) · · ·The result of interpreting an expression e1 with table t1 is an integer value iand a new table t2 . When interpreting an expression with two subexpressions(such as an OpExp), the table t2 resulting from the first subexpression can beused in processing the second subexpression.EXERCISES1.114This simple program implements persistent functional binary search trees, sothat if tree2=insert(x,tree1), then tree1 is still available for lookupseven while tree2 can be used.EXERCISEStypedef struct tree *T_tree;struct tree {T_tree left; String key; T_tree right;};T_tree Tree(T_tree l, String k, T_tree r) {T_tree t = checked_malloc(sizeof(*t));t->left=l; t->key=k; T->right=r;return t;}T_tree insert(String key, T_tree t) {if (t==NULL) return Tree(NULL, key, NULL)else if (key < t->key)return Tree(insert(key,t->left),t->key,t->right);else if (key > t.key)return Tree(t->left,t->key,insert(key,t->right));else return Tree(t->left,key,t->right);}a.

Implement a member function that returns true if the item is found, elsefalse.b. Extend the program to include not just membership, but the mapping ofkeys to bindings:T_tree insert(String key, void *binding, T_tree t);void * lookup(String key, T_tree t);c. These trees are not balanced; demonstrate the behavior on the followingtwo sequences of insertions:(a) t s p i p f b s t(b) a b c d e f g h i*d. Research balanced search trees in Sedgewick [1988] and recommenda balanced-tree data structure for functional symbol tables.

(Hint: topreserve a functional style, the algorithm should be one that rebalanceson insertion but not on lookup.)15.

Свежие статьи
Популярно сейчас