Nash - Scientific Computing with PCs
They are also a first line of attack in finding errors. Given that we all tend to read what we think is present rather than what is actually written on the page, it is usually necessary to mark the listing to find errors. Lines and arrows can be used to show program flow, brackets can be used to mark loops, and a small example can be hand-worked beside each statement. Data files can be annotated to note which program statements will work on which parts of the data.

9: DEBUGGING

It is helpful if listing tools "print" a program source code file in a way that renders it easy for humans to comprehend the program content and structure. That is, we want to split multi-statement lines, ensure that there are blanks between keywords and variables, indent loops, lay out comments or remarks clearly, and allow subroutines to be separated, with pagination and titling (under the user’s control), for neatness of presentation.
We believe that it is also useful if the program can list sets of files in a batch without the need for operator attention, add time/date stamps of the file and the time of listing, cope with differing page and font sizes, number the lines on the listing, provide a cross-reference of variables and sub-programs to these line numbers along with target destination line numbers for any transfers of control, and finally permit this listing to be output to a file for inclusion in documents.

Satisfying these desiderata is a time-consuming and arduous task.
Many details must be kept in mind, such as the line and page size, and dialect differences in programming languages. Some source programs may be SAVEd in a tokenized form where common keywords such as PRINT are replaced with a single special character. We have found, especially in our earlier work in BASIC where variables are global, that listing tools are among the more heavily used programs in our repertoire. For other programming languages the availability of proper sub-program structures allows us to segment and localize variable names.
Still, clean or "pretty" listings do help in the detection and correction of errors, for example, unbalanced comment delimiters. If new language elements are introduced, for example, the object-oriented facilities in Turbo Pascal versions 5.5 and later, then listing program(s) will require alteration.

While commercial program listing tools exist, we have found it useful to have access to the source code of the listing tool itself. On several occasions we have modified such programs to adjust for different printers or to send output to a file. Our BASIC cross-reference lister XREF.BAS (Section 6.2) can be obtained via electronic mail; send requests to jcnash@acadvm1.uottawa.ca. We also mentioned the PASCAL program TXREF (Roberts, 1986).
For FORTRAN code, POLISH (Dorrenbacher et al., 1979) was written at the University of Colorado by computer science students under the direction of Lloyd Fosdick.

9.4 Sub-tests

When building a large program ourselves or trying to dissect one created by another programmer, it is obviously useful to know that each segment of the program is performing exactly as intended. The key word here is "exactly". It is difficult to find and correct errors when the program "mostly" or "almost always" does what we want it to do.

In divide-and-conquer fashion, we can often break up a program into small segments of code and test each part separately before assembling or reassembling the entire program.
In FORTRAN and other languages with facilities for separate compilation of sub-programs, this is a natural consequence of the language constructs. In interpreted languages with global variable names, such as BASIC, APL or some special-purpose packages like Stata or MATLAB, we must be more careful about variable and array names, but the process is the same.

There are two ways to be sure a program is correct: prove it is correct, or test all the possibilities. Proofs of program correctness are important in theoretical computer science and in certain applications demanding the security that a program will always do what it should do. The proof process encompasses a careful examination of all the possible execution paths of the program no matter what the input. As such, it corresponds to testing all the possibilities.
Most users, ourselves included, are too eager to get on with the job of running the program to think of all the possible and bizarre combinations of input data that may at some point be presented to our program code. However, in the event that an error does occur, one may have to carry out such a test. The following questions may help in building a test data set and adjusting the program code.
• What inputs are acceptable to this program or program segment? The allowed possible values of all variables used by the program must be specified.
If necessary, the program should test to see if the inputs are acceptable and provide for reporting of prohibited input data.
• Is there a unique (or at least well-defined) result for each valid set of input data? If not, is there a suitable error exit from the program?
• Can sets of data be prepared that easily show the processes in the two preceding points?
• Can every path through the program segment be traced to show that it performs the desired functions?

Copyright © 1984, 1994 J C & M M Nash
SCIENTIFIC COMPUTING WITH PCs
Nash Information Services Inc., 1975 Bel Air Drive, Ottawa, ON K2C 0X1 Canada
Copy for: Dr. Dobb’s Journal

As an example, consider the code in Figure 6.2.1 to calculate the cube root of a number by Newton’s iteration. The process is designed to find a solution or root of

$$f(x) = x^3 - b = 0 \tag{9.4.1}$$

where $b$ is the number for which a cube root is desired. The Newton iteration uses

$$x_{\mathrm{new}} = x_{\mathrm{old}} - f(x_{\mathrm{old}}) / f'(x_{\mathrm{old}}) \tag{9.4.2}$$

where $f'(x)$ is the first derivative of $f(x)$, that is,

$$f'(x) = 3x^2 \tag{9.4.3}$$

Note that $x = 0$ causes a zero-divide unless we are careful. Our test data set should include zero and near-zero numbers to examine the behavior of the program in this region. It should also include negative numbers so the correct sign of cube roots can be verified.
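The iteration (9.4.2) and the test inputs just described can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code of Figure 6.2.1: the function name `cube_root`, the tolerance, the iteration limit and the power-of-two starting guess are all choices made here for the example.

```python
import math


def cube_root(b, rel_tol=1e-14, max_iter=100):
    """Cube root of b by Newton's iteration on f(x) = x**3 - b."""
    if not math.isfinite(b):
        raise ValueError(f"prohibited input: {b!r} is not a finite number")
    if b == 0.0:
        return 0.0  # the root is known; this also avoids dividing by zero
    a = abs(b)  # iterate on |b| and restore the sign at the end
    # Assumed starting guess: a power of two within a factor of two of the
    # root, so x is never zero and x * x cannot overflow.
    _, exponent = math.frexp(a)
    x = math.ldexp(1.0, exponent // 3)
    for _ in range(max_iter):
        # x - (x**3 - a) / (3 * x**2), rearranged so x**3 is never formed
        x_new = (2.0 * x + a / (x * x)) / 3.0
        if abs(x_new - x) <= rel_tol * abs(x_new):
            return math.copysign(x_new, b)
        x = x_new
    raise ArithmeticError("Newton iteration did not converge")


# Test data: zero, near-zero, negative, ordinary, and very large inputs.
for b in (0.0, 1e-300, -8.0, 27.0, 1e300):
    print(b, cube_root(b))
```

Working with the magnitude of the input and restoring the sign afterwards is one simple way to get the correct sign for negative numbers; other designs are possible.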
Very large numbers in the data set test upper limits on sizes of inputs that may be accepted.

9.5 Test and Example Data

When the parts of a program are put together, the ensemble may still not perform correctly even if each subunit meets its specifications precisely. This may be a result of inconsistent design specifications for different parts of the program, but is more likely a consequence of their being incomplete.
Users are often the first to discover such errors.

Whatever the deficiency, be it in design or coding or even some failure of the machine or programming language to perform as expected, a form of test is needed to ensure the program is working properly. Very few programs are complemented by a set of test data that fully exercises the program. A proper test data set is difficult to devise for most programs. It should:
• Contain data for normal cases for which exact or very precise results are known;
• Contain data sets that are extreme in that they force the program or the method to the limit of its capabilities;
• Contain data sets that ensure every control path within the program is executed at least once;
• Contain data sets that generate error messages from the program.

Each control switch (e.g., IF statement) in a program gives two control directions, so $n$ switches in sequence can give as many as $2^n$ paths, and the number of possible paths will likely be too large to test.
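As a small illustration, the four kinds of test data listed above might be organized as a table that is run automatically. The routine under test, `scaled_norm`, is a hypothetical example invented for this sketch, as are the case labels and the tolerance; it has few enough branches that every path can be covered.

```python
import math


def scaled_norm(a, b):
    """sqrt(a*a + b*b), scaled so the squares cannot overflow (hypothetical)."""
    if not (math.isfinite(a) and math.isfinite(b)):
        raise ValueError("inputs must be finite numbers")  # error trap
    big, small = max(abs(a), abs(b)), min(abs(a), abs(b))
    if big == 0.0:
        return 0.0  # zero branch
    ratio = small / big
    return big * math.sqrt(1.0 + ratio * ratio)  # general branch


# Each entry: (kind of test, arguments, expected result).
VALUE_CASES = [
    ("normal", (3.0, 4.0), 5.0),
    ("normal", (5.0, 12.0), 13.0),
    ("extreme", (3e200, 4e200), 5e200),     # naive a*a would overflow
    ("extreme", (3e-200, 4e-200), 5e-200),  # naive a*a would underflow
    ("path", (0.0, 0.0), 0.0),              # exercises the zero branch
]
# Inputs that must trigger the error trap.
ERROR_CASES = [(math.nan, 1.0), (1.0, math.inf)]


def run_tests():
    """Return descriptions of the failed cases (an empty list if all pass)."""
    failures = []
    for kind, args, expected in VALUE_CASES:
        result = scaled_norm(*args)
        if not math.isclose(result, expected, rel_tol=1e-12, abs_tol=0.0):
            failures.append(f"{kind} {args}: got {result}, expected {expected}")
    for args in ERROR_CASES:
        try:
            scaled_norm(*args)
        except ValueError:
            continue
        failures.append(f"error {args}: no error was reported")
    return failures


print(run_tests() or "all test cases passed")
```

Keeping the cases in data rather than in code makes it easy to add a new case whenever a user reports a failure.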
It is rare that we can ensure that every path is tested. However, it is not difficult to think of data sets that test each line of code at least once, thus ensuring that all the error traps we have built into the program work.

Programs to be employed by users of diverse sophistication must be robust in handling input that may be totally inappropriate. That is, the program should not lose control when given the wrong type of information but should allow for some manner of recovery. This robustness is frequently left out of scientific programs, because the programmer must look after functions that are usually in the operating system or programming language.
For example, on typewriters a letter I or lower-case letter L might be used for numeral 1 (one), or letter O (oh) for digit 0 (zero). Such letters as input for numbers will usually cause programs to fail. We devised a program to accept "O" and "l" as digits and to accept commas in numbers, but the code is nearly two pages long even for such "obvious" cases. In a batch processing environment, simple failure of the program with a proper error message may be acceptable.
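The tolerant numeric input described above is much shorter in a modern language. The following Python sketch is not the program mentioned in the text; the names `parse_number` and `ask_for_number` are hypothetical, and it assumes that commas are thousands separators and that the letters O, o, l and I never legitimately appear in a number.

```python
import math

# Letters often typed in place of digits (a habit from typewriters).
_LOOKALIKES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})


def parse_number(text):
    """Convert user-typed text to a float, forgiving common slips."""
    # Assumption: commas are thousands separators, not decimal points.
    cleaned = text.strip().replace(",", "").translate(_LOOKALIKES)
    try:
        value = float(cleaned)
    except ValueError:
        raise ValueError(f"cannot read {text!r} as a number") from None
    if not math.isfinite(value):
        raise ValueError(f"{text!r} is not a finite number")
    return value


def ask_for_number(prompt, read=input, max_tries=3):
    """Ask until a usable number is entered, so one slip is not fatal."""
    for _ in range(max_tries):
        try:
            return parse_number(read(prompt))
        except ValueError as err:
            print(f"{err}; please re-enter.")
    raise ValueError("no valid number was entered")


print(parse_number("lO,OOO"))   # letters for digits, comma separator
print(parse_number("1,234.5"))
```

The `read` parameter defaults to the built-in `input`, but accepting it as an argument lets the re-entry loop be tested with canned replies.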
For interactive computing, robustness is more important since an error near the end of a long input phase is very frustrating to the user. A robust program will detect errors and allow for correction or reentry of the data.

Programmers can help users avoid errors initiated by incorrect input. Programs can be set up so that a carriage return is a suitable response to most requests for input. Programmers using "sensible defaults" should be careful how default choices are established for file names. Using the name of a file that already exists may overwrite that file, destroying data.

Choosing defaults, command names and command structures is a time-consuming task, requiring many hours of effort and many more of testing and refinement.
It is complicated by hypertext-style interfaces (e.g., Nash J C 1990b, Nash J C 1993b).

The use of menus or selection boxes is also helpful to users, but can be overdone. We recommend it for permitting user selection of one file from several or for "checkoff" of options.
A mouse or other pointing device could be used, but programming with character position cursor controls or even numerical or letter selections is easier and almost as effective. The simpler approach also lends itself easily to saving screen output via a console image file as discussed in Section 9.6.

When a filename has to be supplied to a program, it is helpful if the program displays a list of available files (of an appropriate type or naming convention). This helps us supply the right data to programs. Ideally, users should be able to check the contents of files, disks or disk directories from within a program, or be able to escape to the operating system to do this, then resume program execution. Techniques exist on most computers for such "shelling", but they are not standardized across operating systems, so, if used, they reduce the portability of programs.

A final function of example data is to show the uses of a program.
Such examples also allow a user to edit his/her own data in the place of the sample data, rather than working from instructions. In our experience, this is much more efficient than a prescription of what to do in user documentation. We like example data to be accompanied by some form of the results to be expected. Few scientific programs we have encountered do a very good job of providing the output.

9.6 Scripts and Documentation

While not yet common practice, we believe it should be possible to put comments in input files. Nash J C (1991a) describes a program where data file lines beginning with an exclamation mark (!) are simply displayed on the screen and copied to the console image file. A better choice would be to allow comments anywhere in the input file, {for example, delimited by curly braces as in Pascal code and this phrase}.
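Both conventions are easy to support when reading a data file. The Python sketch below is an illustration only, not the program of Nash J C (1991a): the function name `data_lines` and its `echo` parameter are hypothetical, and brace comments are assumed to open and close on the same line.

```python
import re

# Assumption: brace comments open and close on the same line.
_BRACE_COMMENT = re.compile(r"\{[^{}]*\}")


def data_lines(lines, echo=print):
    """Yield the data in `lines`, with comments removed.

    Lines starting with '!' are passed to `echo` (for example, to write them
    to the screen and a console image file) and are not treated as data.
    """
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("!"):
            echo(line[1:].strip())
            continue
        # Replace each comment with a blank so neighbouring items stay apart.
        line = _BRACE_COMMENT.sub(" ", line).strip()
        if line:
            yield line


sample = [
    "! Cube root test data",
    "27.0 {exact root is 3}",
    "{a comment on a line of its own}",
    "-8.0 {negative input} 0.0",
]
for line in data_lines(sample):
    print(line.split())
```

Because `data_lines` accepts any iterable of strings, an open file object can be passed in place of the `sample` list.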