Cooper, Engineering a Compiler (Second Edition)
Some languages forbid passing expressions as actual parameters to call-by-reference formal parameters.

Inside the callee, each reference to a call-by-reference formal parameter needs an extra level of indirection. Call by reference differs from call by value in two critical ways.
First, any redefinition of a reference formal parameter is reflected in the corresponding actual parameter. Second, any reference formal parameter might be bound to a variable that is accessible by another name inside the callee. When this happens, we say that the names are aliases, since they refer to the same storage location. Aliasing can create counterintuitive behavior.

Call by reference: a convention where the compiler passes an address for the formal parameter to the callee. If the actual parameter is a variable (rather than an expression), then changing the formal’s value also changes the actual’s value.

CHAPTER 6 The Procedure Abstraction

Consider the earlier example, rewritten in PL/I, which uses call-by-reference parameter binding.

    fee: procedure (x,y)
            returns fixed binary;
        declare x, y fixed binary;
        x = 2 * x;
        y = x + y;
        return y;
    end fee;

    c = fee(2,3);
    a = 2;
    b = 3;
    c = fee(a,b);
    a = 2;
    b = 3;
    c = fee(a,a);

Alias: When two names can refer to the same location, they are said to be aliases. In the example, the third call creates an alias between x and y inside fee.

With call-by-reference parameter binding, the example produces different results.
The first call is straightforward. The second call redefines both a and b; those changes would be visible in the caller. The third call causes x and y to refer to the same location, and thus, the same value. This alias changes fee’s behavior.
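The same behavior can be reproduced in C, which passes parameters by value, by passing addresses explicitly. The sketch below is an illustration rather than a translation of the PL/I code: fee_ref and fee_val are hypothetical names, and the pointer formals stand in for call-by-reference binding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C rendering of fee with call-by-reference formals.
   The caller passes addresses, so every use of x or y inside the
   callee goes through one extra level of indirection. */
int fee_ref(int *x, int *y) {
    assert(x != NULL && y != NULL);
    *x = 2 * *x;     /* the redefinition is visible in the caller's actual */
    *y = *x + *y;    /* if x and y are aliases, both reads see the doubled value */
    return *y;
}

/* Call-by-value version for comparison: it works on private copies. */
int fee_val(int x, int y) {
    x = 2 * x;
    y = x + y;
    return y;
}
```

Calling fee_ref(&a, &b) corresponds to the second call in the example, and fee_ref(&a, &a) corresponds to the third call, in which both formals name the same location.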
The first assignment gives a the value 4. The second assignment then gives a the value 8, and fee returns 8, where fee(2,2) would return 6.

    Call by           a            b        Return
    Reference      in   out     in   out    Value
    fee(2,3)       -    -       -    -        7
    fee(a,b)       2    4       3    7        7
    fee(a,a)       2    8       3    3        8

Space for Parameters

The size of the representation for a parameter has an impact on the cost of procedure calls. Scalar values, such as variables and pointers, are stored in registers or in the parameter area of the callee’s AR. With call-by-value parameters, the actual value is stored; with call-by-reference parameters, the address of the parameter is stored.
In either case, the cost per parameter is small.

Large values, such as arrays, records, or structures, pose a problem for call by value. If the language requires that large values be copied, the overhead of copying them into the callee’s parameter area will add significant cost to the procedure call. (In this case, the programmer may want to model call by reference and pass a pointer to the object rather than the object.) Some languages allow the implementation to pass such objects by reference. Others include provisions that let the programmer specify that passing a particular parameter by reference is acceptable; for example, the const attribute in C assures the compiler that a parameter with the attribute is not modified.

6.4 Communicating Values Between Procedures

6.4.2 Returning Values

To return a value from a function, the compiler must set aside space for the returned value.
Because the return value, by definition, is used after the callee terminates, it needs storage outside the callee’s AR. If the compiler writer can ensure that the return value is of small fixed size, then it can store the value either in the caller’s AR or in a designated register.

With call-by-value parameters, linkage conventions often designate the register reserved for the first parameter as the register to hold the return value.

All of our pictures of the AR have included a slot for a returned value.
To use this slot, the caller allocates space for the returned value in its own AR, and stores a pointer to that space in the return slot of its own AR. The callee can load the pointer from the caller’s return-value slot (using the copy of the caller’s ARP that it has in the callee’s AR). It can use the pointer to access the storage set aside in the caller’s AR for the returned value. As long as both caller and callee agree about the size of the returned value, this works.

If the caller cannot know the size of the returned value, the callee may need to allocate space for it, presumably on the heap. In this case, the callee allocates the space, stores the returned value there, and stores the pointer in the return-value slot of the caller’s AR.
On return, the caller can access the return value using the pointer that it finds in its return-value slot. The caller must free the space allocated by the callee.

If the return value is small (the size of the return-value slot or less), then the compiler can eliminate the indirection. For a small return value, the callee can store the value directly into the return-value slot of the caller’s AR. The caller can then use the value directly from its AR.
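The three return-value strategies have rough source-level analogues in C. The sketch below is only an analogy under stated assumptions: the Matrix type and the functions make_identity, repeat_char, and square are hypothetical, and an explicit pointer parameter stands in for the pointer that the text keeps in the return-value slot.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size result, too large for a single register. */
typedef struct { int rows, cols; double cells[16]; } Matrix;

/* 1. Size known to both sides: the caller owns the storage and passes
      a pointer to it; the callee writes the result through the pointer. */
void make_identity(Matrix *result, int n) {
    assert(result != NULL && n >= 0 && n <= 4);
    result->rows = n;
    result->cols = n;
    for (int i = 0; i < 16; i++) result->cells[i] = 0.0;
    for (int i = 0; i < n; i++) result->cells[i * n + i] = 1.0;
}

/* 2. Size known only to the callee: the callee allocates heap space and
      hands back a pointer; the caller is responsible for calling free. */
char *repeat_char(char c, size_t n) {
    char *buf = malloc(n + 1);
    if (buf == NULL) return NULL;   /* allocation failed */
    memset(buf, c, n);
    buf[n] = '\0';
    return buf;
}

/* 3. Small result: the value itself is returned, with no indirection. */
int square(int v) {
    return v * v;
}
```

A caller of make_identity declares a Matrix and passes its address; a caller of repeat_char must free the returned buffer; a caller of square simply uses the value, which is the case with the indirection removed.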
This improvement requires, of course, that the compiler handle the value in the same way in both the caller and the callee. Fortunately, type signatures for procedures can ensure that both compilations have the requisite information.

6.4.3 Establishing Addressability

As part of the linkage convention, the compiler must ensure that each procedure can generate an address for each variable that it needs to reference. In an ALL (Algol-like language), a procedure can refer to global variables, local variables, and any variable declared in a surrounding lexical scope. In general, the address calculation consists of two portions: finding the base address of the appropriate data area for the scope that contains the value, and finding the correct offset within that data area. The problem of finding base addresses divides into two cases: data areas with static base addresses and those whose address cannot be known until runtime.

Data area: The region in memory that holds the data for a specific scope is called its data area.

Base address: The address of the start of a data area is often called a base address.

Variables with Static Base Addresses

Compilers typically arrange for global data areas and static data areas to have static base addresses.
The strategy to generate an address for such a variable is simple: compute the data area’s base address into a register and add its offset to the base address. The compiler’s IR will typically include address modes to represent this calculation; for example, in ILOC, loadAI represents a “register + immediate offset” mode and loadAO represents a “register + register” mode.

To generate the runtime address of a static base address, the compiler attaches a symbolic, assembly-level label to the data area. Depending on the target machine’s instruction set, that label might be used in a load immediate operation or it might be used to initialize a known location, in which case it can be moved into a register with a standard load operation.

Name mangling: The process of constructing a unique string from a source-language name is called name mangling. If &fee. is too long for an immediate load, the compiler may need to use multiple operations to load the address.

The compiler constructs the label for a base address by mangling the name. Typically, it adds a prefix, a suffix, or both to the original name, using characters that are legal in the assembly code but not in the source language. For example, mangling the global variable name fee might produce the label &fee.; the label is then attached to an assembly-language pseudo-operation that reserves space for fee. To move the address into a register, the compiler might emit an operation such as loadI &fee. ⇒ r_i. Subsequent operations can then use r_i to access the memory location for fee. The label becomes a relocatable symbol for the assembler and the loader, which convert it into a runtime virtual address.

Global variables may be labelled individually or in larger groups. In FORTRAN, for example, the language collects global variables into common blocks. A typical FORTRAN compiler establishes one label for each common block. It assigns an offset to each variable in each common block and generates load and store operations relative to the common block’s label. If the data area is larger than the offset allowed in a “register + offset” operation, it may be advantageous to have multiple labels for parts of the data area.

Similarly, the compiler may combine all the static variables in a single scope into one data area.
This reduces the likelihood of an unexpected naming conflict; such conflicts are discovered during linking or loading and can be confusing to the programmer. To avoid such conflicts, the compiler can base the label on a globally visible name associated with the scope. This strategy decreases the number of base addresses in use at any time, reducing demand for registers.
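The two pieces of this scheme, a mangled label for the data area and a "base + offset" address for each variable in it, can be sketched in C. This is a minimal illustration under stated assumptions, not a description of any particular compiler: struct StaticArea, scope_area, mangle, and address_of are hypothetical names, the struct stands in for one scope's combined static data area, and the & prefix and . suffix follow the fee example in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical data area that combines one scope's static variables.
   The field offsets play the role of compiler-assigned offsets. */
struct StaticArea {
    int    count;
    double total;
    char   flag;
};

/* One base address serves every variable in the area. */
struct StaticArea scope_area;

/* Build a label such as "&fee." from a source-level name.
   Returns 1 on success and 0 if the label does not fit in out. */
int mangle(const char *name, char *out, size_t out_size) {
    int n = snprintf(out, out_size, "&%s.", name);
    return n >= 0 && (size_t)n < out_size;
}

/* Address calculation: base address of the data area plus the
   variable's offset within that area. */
void *address_of(void *base, size_t offset) {
    return (char *)base + offset;
}
```

For example, address_of(&scope_area, offsetof(struct StaticArea, total)) yields the address of total, using the same base that would serve count and flag.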