Nash - Scientific Computing with PCs
18.6 Compiler and Arithmetic Time Differences

New versions of programming tools are often advertised as "better" and "faster". Let us see if newer versions of the popular Turbo Pascal compilers give better runtime performance. Ignoring some annoyances of obtaining reliable timings, we simply graph the execution times for the Cholesky decomposition against an index for the compiler (Figure 18.6.1). From this graph the improvements in performance from version to version are quite clear, except for the slight slowdown caused by the system overhead of running Turbo Pascal for Windows (TPW).

For other programming languages, we noted some improvement in execution time for Microsoft Fortran between versions 3.2 and 4.1, but no appreciable difference between 4.1 and 5.1.
We also noted that the program compiled with version 5.1 failed on the PC with a NEC V20 processor. The fastest execution times were observed with the Silicon Valley Systems compiler. However, this comes at a price: the program code is large and will only run on Intel 80386 and later processors.

We deliberately avoid details in this section, as there are many fine points that would require lengthy discussion. Again, we urge that timings be run for the tasks and systems that will be used regularly.

Figure 18.6.1  Minimum execution time on any of the three Cholesky variants for four versions of the Turbo Pascal compiler.

18.7 Processor and Configuration Time Differences

We conclude this study by considering how processor type and the presence of a numeric co-processor affect the speed of our computations.

The first difficulty we encounter is that of timing programs without the co-processor if it is installed (see Section 8.6).
Special system settings to suppress the co-processor did not work as we expected for several of the compilers used. Environment variables that can be SET to suppress co-processor use have varying names and control values; they may not be indexed in manuals. To avoid spurious comparisons, we have removed results from consideration if the ratio of timings without and with co-processor was less than 1.1.

The second difficulty is that the arithmetic used with and without the co-processor may not be equivalent. The IEEE 754 (IEEE, 1985) binary floating-point standard has not been followed by all compiler writers, though recent emulations in software seem quite reasonable.
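A quick way to see whether two floating-point arithmetics are even superficially equivalent is to probe basic properties such as the machine epsilon, the gap between 1.0 and the next representable number. A classic probe, sketched here in Python for illustration:

```python
def machine_epsilon():
    """Smallest power of two eps for which 1.0 + eps differs from 1.0;
    arithmetics that disagree on this value are not equivalent."""
    eps = 1.0
    while 1.0 + eps / 2.0 != 1.0:
        eps /= 2.0
    return eps

# Under IEEE 754 double precision this yields 2**-52.
```

Running such a probe under each arithmetic configuration (co-processor active, software emulation, different compilers) quickly exposes gross differences in precision or rounding.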
Early compilers made no pretence to equivalence, sometimes sacrificing good features of IEEE arithmetic for execution speed. We do not pursue the possibility that co-processors may be incorrectly used.

A third issue is that there are now several manufacturers of co-processors. Some of these (for example Cyrix and ULSI) claim pin-for-pin compatibility with the Intel 80x87 chips along with higher speed. Others, such as the Weitek co-processor, use a different physical and logical interface. In either case, timing results and their comparisons may be affected.

Despite these matters, we see from Table 18.7.1 that a co-processor can be highly effective in speeding up execution.
Moreover, we gain the added advantage of a documented standard for the arithmetic used. Given the relatively low price of the co-processors, they are very cost-effective if any serious numerical work is required, and we recommend their acquisition and use. On the negative side, we remind readers of a program failure observed on our 80286 PC related to co-processor use (Section 8.3).

Table 18.7.1  Summary results of the average ratio of the times for Cholesky decomposition over the three algorithms and five matrix sizes for different compilers without and with numeric co-processor active.

machine type    number of compilers    mean ratio    StDev ratio    Min ratio    Max ratio
80386           5                      8.38          2.93           4.47         11.17
80286           5                      3.75          0.505          3.14         4.22
NEC V20         7                      8.55          3.13           6.11         12.97

Copyright © 1984, 1994 J C & M M Nash, Scientific Computing with PCs, Nash Information Services Inc., 1975 Bel Air Drive, Ottawa, ON K2C 0X1 Canada. Copy for: Dr. Dobb's Journal.
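Both the screening rule (discard ratios below 1.1) and the summary quantities of Table 18.7.1 are easy to reproduce for one's own measurements. A Python sketch with invented timing pairs; the compiler names and numbers are illustrative, not data from the study:

```python
import statistics

# Invented (time without co-processor, time with co-processor) pairs,
# in seconds, one per compiler on a single machine.
timings = {
    "compiler_a": (48.2, 5.9),
    "compiler_b": (12.4, 12.1),   # suppression setting apparently ignored
    "compiler_c": (33.0, 8.7),
    "compiler_d": (59.5, 6.1),
}

# Screen out cases where the co-processor was probably still active:
# a without/with ratio below 1.1 suggests the "without" timing is
# not genuine.
ratios = []
for name, (t_without, t_with) in timings.items():
    r = t_without / t_with
    if r >= 1.1:
        ratios.append(r)

summary = {
    "n": len(ratios),
    "mean": statistics.mean(ratios),
    "stdev": statistics.stdev(ratios),   # sample standard deviation
    "min": min(ratios),
    "max": max(ratios),
}
```

The dictionary then holds the same row shape as Table 18.7.1: count of compilers kept, mean, standard deviation, minimum and maximum of the speed-up ratios.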
Chapter 19
Case study: graphical tools for data analysis

19.1  Purpose of the study
19.2  Example data for analysis
19.3  Display characteristics
19.4  Choices of graphical tools
19.5  Displaying level and variability
19.6  Displaying relative measurements
19.7  Displaying patterns and outliers

In this chapter we present an example of data analysis using graphs. This is typical of the type of exercise that researchers, engineers and statisticians carry out. The particular situation we have chosen is the performance analysis of three program codes for solving nonlinear function minimization problems.
This is a task close to the topic of this book, but we do not distinguish our own field from others — the subject may be different, but the tools and methods used are the same.

19.1 Purpose of the Study

The focus of the analysis, which we do not carry to completion here, is the understanding of the differences in the performance of the programs over a set of test problems. We will attempt to address the following issues:

• What information should we display in any graph or set of graphs?
• How should we draw graphs — the choice, placement and scale of graphical features for good effect?
• Which tools should be used for preparing displays easily?

There is rarely a "best" approach or tool for preparing graphical displays, whose effectiveness for data analysis depends on human perception and is colored by personal tastes and preferences.
Nevertheless, this case study presents some general ideas we believe are widely applicable.

More detail on the development of statistical graphics can be found in the excellent books by Tufte (1983) and Cleveland (1985). We recommend these to anyone whose job involves the preparation of graphs to be used in data analysis and presentation.

19.2 Example Data for Analysis

The data we wish to analyze consists of a set of performance information for three nonlinear function minimization programs as published by Nash S G and Nocedal (1991). The published paper contains a large amount of numerical data.
We shall use only a part of this, namely the data for large test problems in their Tables 5 and 11. The programs tested were three nonlinear function minimization codes, written in FORTRAN, and tested on an Encore Multimax computer. The three codes are:

• Truncated Newton method (TN);
• Limited memory Broyden-Fletcher-Goldfarb-Shanno method (LBFGS);
• Polak-Ribière Conjugate Gradient method (CG).

The measures of performance for the codes are:

• The execution time in seconds;
• The number of iterations, that is, major cycles within the codes. These iterations may have vastly different amounts of computational work, and the Nash S G and Nocedal paper addresses this in some detail.
• The number of function and gradient evaluations. This is a major cost in all function minimization methods.
For these programs, the function and gradient are evaluated together, which is often efficient since the function and gradient usually have common components. However, some methods may take advantage of situations where the function is very much "cheaper" to evaluate than the gradient or vice-versa, and separate counts for function and gradient evaluations would then be appropriate (Nash J C and Walker-Smith, 1987).

The data we shall study is made up of the following elements:

• The problem size n, that is, the number of parameters in a nonlinear function that are varied to seek a minimum of the function.
• For each program, TN, LBFGS, and CG: the execution time, the number of iterations, and the number of function/gradient evaluations until termination (i.e., convergence to a minimum, a failure, or more function/gradient evaluations than a preset limit).

Very similar sets of data are commonly used in performance evaluations. Another example is found in the paper by J C Nash and S G Nash (1988).

19.3 Display Characteristics

Presenting data graphically is more than simply issuing a command to a computer program to "plot the data".
If graphs are to have a useful function in clarifying the information that is partially hidden in a collection of data, then we need to decide what aspects of the data we wish to reveal. This section considers some aspects of drawing graphs.

First, we want to reveal magnitudes to illustrate the level of performance of the computer programs we are trying to understand. In statistics, we would be seeking measures of location. Such graphs are used almost everywhere. They are the substance of most "presentation graphics" or "business graphics" packages.
While important, they are not the only form of display needed to help understanding of the information at hand.

Variability is as important as level. In the case of program performance, we would like to know if a program can be expected to take approximately equal times to compute the minima of each of several different functions of the same order (n). Similarly, we want to know if different initial starting points produce widely different execution times for the same test function.

One way to see variability would be to compute a summary statistic such as the standard deviation.
(This is one measure of dispersion to statisticians.) While we could plot the value of the standard deviation for each program using a plot such as a bar graph, this may hide details we want to see. All but one of a set of test functions of order n may have the same execution time, but the remaining problem takes 100 times as long. A set of fairly well spaced timings could have the same value of the standard deviation — clearly a different situation. To display such information about the distribution of execution times (or other performance measure), we can use:

• Histograms, which are frequency bar charts;
• Box-and-whisker plots, abbreviated as boxplots, which summarize distributional information in a way that permits several distributions to be easily compared;
• Dotplots (MINITAB) or one-way scatter plots (Stata), which display each observation as a dot or vertical bar along a single line.

Figure 19.2.1  Example data from Nash S G and Nocedal (1991) as entered in a Quattro worksheet. [The worksheet lists, for each test problem (Pnum, n, pname), the iterations (It), function/gradient evaluations (f-g) and execution time for each of the CG, L-BFGS and TN codes; the numeric detail is not reproduced here.]
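The remark above that very different distributions can share one standard deviation is easy to demonstrate. A Python sketch with invented timings: one set has a single extreme value, the other is evenly spaced but rescaled to exactly the same standard deviation.

```python
import statistics

def rescale(data, new_mean, new_sd):
    """Linearly rescale data to a chosen mean and (population)
    standard deviation, preserving its shape."""
    m = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return [new_mean + (x - m) * new_sd / sd for x in data]

# Five invented timings where one problem is 100 times slower ...
outlier = [1.0, 1.0, 1.0, 1.0, 100.0]
target_sd = statistics.pstdev(outlier)          # about 39.6

# ... and five evenly spaced timings rescaled to the SAME standard
# deviation (a mean of 60 keeps all the "timings" positive).
spread = rescale([1.0, 2.0, 3.0, 4.0, 5.0], new_mean=60.0, new_sd=target_sd)

# Identical dispersion by this one measure, yet very different shapes:
# a bar of standard deviations could not tell these two sets apart,
# while a histogram, boxplot or dotplot would.
```

This is precisely why the displays listed above, which show the whole distribution rather than one summary number, earn their place in a performance study.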