Nash - Scientific Computing with PCs (523165), страница 52
Текст из файла (страница 52)
We haveremoved most of the commentary from the source code to avoid hiding the true underlying sizes. Wehave also removed right hand blanks, but made no effort to minimize program line indentation, as thiswould render the codes less readable. In any event, readers will see that the programs are relatively"small". The source programs were compiled and linked, and the table shows a summary of the executableprogram sizes.
Occasionally a single compiler allowed several variants of the executable program to becreated by various settings of compiler or linker options.Table 18.4.1 also shows a summary of the size of compressed or DIETed executables (Section 5.5). We haveuse the compressor DIET (by Teddy Matsumoto) that replaces the executable program with its compressedequivalent. Out of the 20 executables listed here, two compiled with the Silicon Valley Systems compilercould not be compressed by DIET.
The Turbo Pascal for Windows program would not run aftercompression. (The latter program must be run in the Microsoft Windows environment, so anticipates adifferent program loader.)Table 18.4.1Source and executable program sizes for the Cholesky timing tests.a) Source codesCHOLTEST.BASCHOLTEST.CCHOLTEST.FORCHOLTEST.PASbytes3625517340645694lines157256182251b) Executable codes -- summary statisticsMeanStDev.MinMax--------------------------------------Size (bytes) 46501.35 30805.212240114072DIETed32847.829925.297349114072ratio.67569.1658055.3536118: EFFICIENCY STUDY — CHOLESKY DECOMPOSITIONSFigure 18.4.1155Boxplot showing variability in executable program sizes for the Cholesky tests.To give a better idea of the program size variability, we can also use boxplots of the full and compressedexecutable sizes.
These are presented in Figure 18.4.1. From the summary table and the graph, we see:•That executable program sizes vary largely;•That compression can save space, which is useful if we are trying to pack a set of files onto as fewdistribution diskettes as possible.The variability of the executable program sizes is in part due to the manner in which differentprogramming environments store the arrays for the matrices A or L. Note that if we were to store a squarematrix of order 50 using 8 bytes per element, we would need 20,000 bytes of storage.
Even the lowertriangle requires 10,200 bytes. If the executable program includes explicit space for the array(s), then thefile will be correspondingly larger. Since this space is simply a clean blackboard for working on, we couldfind the memory required at run time. This allows smaller executable files, though at the small risk of aprogram load failure if there is not enough memory available for the arrays.Another source of "extra" program size is the mechanism used by different compilers for attaching(linking) the commonly needed pieces of program code to perform arithmetic, input-output, or specialfunctions. This run-time support can be linked as a lump, or we can select just the needed pieces. Clearly,the "lump" method makes for larger program sizes, but the compiler/linker is easier to prepare. We notethat Microsoft BASIC compilers sometimes use a variant of the "lump" method.
They allow compiled codeto be stored without having the runtime support included. At load time, a special runtime module isloaded along with the compiled code.A final note concerns the possibility of not compiling the programs, but interpreting them directly or fromsome intermediate form. This was the most popular way of executing BASIC programs on early PCs, andwe store only the source code that is translated directly.
The main drawback is that it is much moredifficult to automate the process of finding accelerations of the program execution. For reasons of speed,compiled programs are usually preferred, especially if the task is large or must be carried out often.156Copyright © 1984, 1994 J C & M M NashNash Information Services Inc., 1975 Bel Air Drive, Ottawa, ON K2C 0X1 CanadaSCIENTIFIC COMPUTING WITH PCsCopy for:Dr. Dobb’s Journal18.5 Time Differences from Program AlternativesThe simplest mechanism for comparing the program times is to take ratios, since this avoids issues ofunderlying machine speed.
We shall use the Nash variants of the program as the basis for comparison,and compute the ratios Price/Nash and Healy/Nash. (These have been made into variables ponn andhonn in our data analysis.) Figures 18.5.1 and 18.5.2 show boxplots (Tukey, 1977) of these variables forrecent and older timings respectively, with a line drawn at 1.0 for comparison.The boxplots seem to indicate that there is some average advantage for the Price variants in the morerecent timings. This is supported by the summary statistics for the ratio variables ponn and honn in Table18.5.1.Figure 18.5.1Boxplots comparing relative performance of Cholesky algorithm variants in timingsperformed in 1992.Figure 18.5.2Boxplots comparing relative performance of Cholesky algorithm variants in timingsperformed between 1975 and 1983.18: EFFICIENCY STUDY — CHOLESKY DECOMPOSITIONSTable 18.5.1157Summary statistics for ratios of Price and Healy Cholesky variant timings on Molermatrices of orders 10, 20, 30, 40, and 50 to the Nash variant.Var.
| ObsMeanStDevMin Max Median-----+-----------------------------------------ponn | 539.9526.1979802.2.97honn | 5391.0212.1767402.21.02We can also look at the direct timings on the Nash, Price and Healy variants (Table 18.5.2), but will restrictour attention to PC "386". This table suggests that, for "386" at least, the Nash variant of the Choleskyalgorithm enjoys a small advantage for the larger matrices. It also has a lower variability as measured byboth the standard deviation and the range (maximum - minimum).
However, the existence of a time ofzero for matrices of order 10 shows that our "clock" is at the limits of its resolution — it does not "tick"fast enough.The more recent ratio results indicate that the Price variant, compiler by compiler, generally does a littlebetter than the Nash variant. The Healy variant generally does a little worse than the Nash variant on thisbasis. However, for some compilers or interpreters, the Nash variant is much better (or the others aremuch worse), so that the total time taken by the Nash variant, hence its average, is smallest. The lowervariability of the Nash variant supports this; it suggests we see fewer really "bad" results.Table 18.5.2Summary statistics for Cholesky timings on a 33 MHz 80386DX Tower PC with 80387co-processor. There were 35 different timings for different compilers or compiler settings.
Timesare for 10 repetitions of the matrix decomposition of the Moler matrix of order listed.matrixorder1020304050|MeanStDev.MinMax---+--------------------------------------nash |.2322857.43560702.26price|.2154286.424515902.25healy|.2391429.477323202.47nash | 1.3591432.551194.1112.81price| 1.342.749773.114.67healy| 1.4622.919178.115.06nash | 4.1711437.750355.2838.67price| 4.2051438.664946.2846.08healy| 4.4797149.014118.2846.43nash | 9.41228617.49057.6287.22price| 9.60571419.82196.56 105.51healy| 10.1548620.4451.58 105.11nash | 17.8485733.096451.14 165.11price| 18.4048638.0141.1202.12healy| 19.2968638.862751.16 199.59Remember, however, that the same PC is decomposing the same five matrices using different methodsof running the programs.
Yet all the Choleski variants have exceedingly large variations in performance.The Max times are over 100 times bigger than the Min times in Table 18.5.2. We will try to see if we canmake any sense of this data.The relative advantage of the Nash variant in the older timings could be explained by a quirk of someinterpreted BASICs. First, execution times were difficult to measure reliably on some of these now-defunctsystems. Second, we observed for some BASICs, through simple experiments that added comments to aprogram code, that GOTO or GOSUB statements in a program were apparently carried out using aforward line-by-line search through the source code to find the target line number.
The Price and Healycodes in BASIC used jumps to lines earlier than the current line. Thus the GOTO search must passthrough almost the entire program code to find the right line, typically a few lines ahead of the currentstatement. The Nash code, while using some extra arithmetic, is faster.158Copyright © 1984, 1994 J C & M M NashSCIENTIFIC COMPUTING WITH PCsNash Information Services Inc., 1975 Bel Air Drive, Ottawa, ON K2C 0X1 CanadaCopy for:Dr.