Nash - Scientific Computing with PCs (523165), page 55
More remarkably, VG is available at no charge from the National Institute of Standards and Technology (NIST) using the FTP protocol on academic computer networks. Versions exist for several FORTRAN compilers.

Of the above packages, VG is the only one we have used to prepare a graph. We have received and run demonstrations of the other two, but have not used them in a practical situation. Frankly, we regard the programming of graphical displays as a very heavy task. If there are tools available (such as Stata or QUATTRO) that allow us to develop the displays we need, then we will choose these easier-to-use tools.

If something special is needed, one could consider True BASIC, which offers add-on packages such as a Statistics Graphics Toolkit (which includes boxplots) and a 3D Graphics Toolkit.
True BASIC programs are easy to write. The language is nonstandard, though it is available on several platforms, in particular MS-DOS, Macintosh and some UNIX workstations.

19.5 Displaying Level and Variability

In Figure 19.5.1 we have used Stata to display both variability and level via boxplots and one-way plots of execution time.
Note that we have stacked the displays to allow some comparison of the execution times of the three programs. We note a wide variation in scale. There are very few points at the right-hand side of each one-way scatter plot; most points are scrunched together at the left-hand side. Taking logarithms of the data (Figure 19.5.2) brings out more detail.

Figure 19.5.1 Level and variability: natural scale

Figure 19.5.2 Level and variability: logarithmic scale

19: GRAPHICAL TOOLS FOR DATA ANALYSIS
Copyright © 1984, 1994 J C & M M Nash
SCIENTIFIC COMPUTING WITH PCs
Nash Information Services Inc., 1975 Bel Air Drive, Ottawa, ON K2C 0X1 Canada
Copy for: Dr. Dobb’s Journal

Figure 19.5.2 has a reasonable scale for comparing the relative levels and variabilities of the execution times of the programs.
However, a direct comparison of the times for given problems is not possible because we cannot label the points. To overcome this we plot the times for one program versus those for another and use the problem name (the letters A through T that we added to the QUATTRO worksheet during data entry) as the point label. Stata offers the possibility of adding boxplots and one-way scatterplots along the axes of the graph, so we get a more direct visualization of the variability by selecting this option. Interestingly, we need not transform the data to logarithms before plotting this display, since Stata offers a "log scale" option for XY plots, though not for the simpler one-dimensional plots. Figure 19.5.3 shows the output.
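As a rough illustration of why a logarithmic scale brings out more detail in displays of level and variability, the sketch below computes the five numbers a boxplot draws (minimum, quartiles, maximum) on both the natural and the log10 scale. The timings are invented for illustration and are not data from the study; the function name `five_number_summary` is likewise only an assumption of this sketch.

```python
import math
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the quantities a boxplot displays."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Hypothetical execution times in seconds (illustrative only).
times = [0.5, 0.8, 1.2, 2.0, 3.5, 6.0, 40.0, 250.0]

natural = five_number_summary(times)
logged = five_number_summary([math.log10(t) for t in times])

for label, summary in (("natural", natural), ("log10", logged)):
    print(label, ["%.3f" % v for v in summary])
```

On the natural scale the two largest values stretch the axis so far that the box collapses against the left edge; after taking logarithms the quartiles occupy a sizeable share of the range.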
We note in passing that similar plots can be prepared for iteration count and function/gradient count data.

Figure 19.5.3 Level and variability: enhanced XY plot in log scale

19.6 Displaying Relative Measurements

The purpose of the Nash S G and Nocedal study was to try to understand which algorithms (using Fortran program implementations as proxies) were better in dealing with different types of problems. We are trying to find the winners and losers in a set of computer program races. This calls for some form of relative measurement rather than the plotting of the actual numbers as above.

One useful tool (used in Nash J C and Nash S G, 1988) has been a graph of ratio-to-best information. That is, we find the best outcome for a given case (problem) over all the competing methods (programs).
We then compute the ratio of the performance of each method to the best performance. For timings, the ratio of the "winner" to the "best" will naturally be 1, while all other methods will have larger ratios. With measures of performance that increase with "better" performance, we may want to use reciprocals or otherwise alter our approach. However, the general idea stays the same. Watch out for zero divides caused by missing values!

A spreadsheet can easily develop the "best" and "ratio-to-best" information. This can be plotted using simple bar graphs. Figure 19.6.1 is an example.
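The ratio-to-best calculation described above could also be sketched outside a spreadsheet. The following is a minimal sketch, assuming that smaller values are better (as for timings) and that a missing measurement is recorded as `None`. The program names `prog1` and `prog2` and all the numbers are hypothetical; only the missing CG timing for problem F reflects the text.

```python
def ratio_to_best(results):
    """Convert {method: value or None} for one problem into ratios to the best.

    Smaller values are assumed to be better. Missing (None) or non-positive
    entries are excluded, so they can neither be chosen as the "best" nor
    cause a division by zero; their ratio is reported as None.
    """
    valid = {m: v for m, v in results.items() if v is not None and v > 0}
    if not valid:
        return {m: None for m in results}
    best = min(valid.values())
    return {m: (valid[m] / best if m in valid else None) for m in results}


# Hypothetical timings in seconds, one row per test problem.
timings = {
    "A": {"prog1": 2.0, "prog2": 1.0, "CG": 4.0},
    "F": {"prog1": 3.0, "prog2": 6.0, "CG": None},  # no CG timing for F
}

ratios = {problem: ratio_to_best(row) for problem, row in timings.items()}
for problem, row in ratios.items():
    print(problem, row)
```

For a measure where larger is better, one option is to pass reciprocals of the values, which keeps the winner at a ratio of 1.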
For each case (problem) we would like to flag the "winner" more clearly. In an attempt to bring out the best case, we reverse the sign of the ratio that was 1.0, and use -2 as the quantity to plot so that the "winner" stands out. Unfortunately, we found it quite easy to make an error in this process. We recommend checking several random graph points to see that they are in the correct position.

The prevalence of any one program as "winner" is highlighted by the different shadings, but could be made more prominent by use of color.
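Because the flagging step is easy to get wrong by hand, it may help to automate both the substitution and the check. The sketch below assumes the ratios for one problem are held in a dictionary. The names `flag_winners` and `flags_consistent`, the program names and the numbers are all hypothetical.

```python
WINNER_FLAG = -2.0  # value plotted in place of the winner's ratio of 1.0


def flag_winners(ratios, flag=WINNER_FLAG):
    """Replace every ratio equal to 1.0 (a winner, or a tie) by the flag."""
    return {m: (flag if r == 1.0 else r) for m, r in ratios.items()}


def flags_consistent(ratios, plotted, flag=WINNER_FLAG):
    """Check that each plotted value is the flag for winners, else the ratio."""
    for method, ratio in ratios.items():
        expected = flag if ratio == 1.0 else ratio
        if plotted.get(method) != expected:
            return False
    return True


# Hypothetical ratio-to-best values for one problem.
ratios = {"prog1": 2.0, "prog2": 1.0, "CG": 4.0}
plotted = flag_winners(ratios)
print(plotted, flags_consistent(ratios, plotted))
```

A check of this kind complements, rather than replaces, looking at a few points on the finished graph.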
We have manually colored printouts from conventional printers. Color printing would, of course, be useful, but at the time of writing is still relatively expensive to reproduce. Figure 19.6.1 was produced with Lotus 1-2-3 v.2.2. Similar displays can clearly be used to show "losers" by obvious changes in the computation of the ratios and "flag" values.

An alternative graph that displays the relative performance of the method is called an "area plot" by QUATTRO (DOS version). We present an example as Figure 19.6.2, where again we use the ratio-to-best figures for timings.
Note that a narrow band is "better" here. It is more difficult to flag the "winner" or "loser" in each case, which is an unsatisfactory aspect of this type of plot. Still, the use of shaded areas allows the human eye to "integrate" the overall performance, with a larger area being poorer. An alternative would adjust the ratios so that larger implies "better".
Different choices of shadings could alter our perceptions of the relative performance of the programs.

We note that while QUATTRO makes this plot easy to generate, it is not, to our knowledge, a common feature of graphics packages. A common difficulty that arises when selecting graphics software is that we really want to use two different types of graphs that are each only available in separate packages. As in this case study, we must then move data around and spend time and effort learning more than one package.

Figure 19.6.1 Relative performance: ratio-to-best bar chart

Figure 19.6.2 Relative performance: ratio-to-best area plot

19.7 Displaying Patterns and Outliers

Much of the work of researchers is devoted to understanding patterns that lead to principles explaining phenomena. Therefore tools that help us to discover or "see" such patterns are important in aiding the process of developing an understanding of phenomena.

One such tool is the three-dimensional plot, abbreviated 3D plot. Figure 19.7.1 displays simultaneously the execution times using all three of the program codes for all the test problems except the one labelled F (which has no timing for the CG program code).

Readers will note that Figure 19.7.1 is not clear, with a vertical line on the right-hand side of the plot and "jaggy" diagonal lines.
The Student Edition of EXECUSTAT we used to prepare this graph does not include facilities for exporting graphics into our word processing software (WordPerfect 5.1). Figure 19.7.1 was created as follows:

1. A screen capture program (we used SNAP) was run. This intercepts "Print Screen" commands.
2. EXECUSTAT is started and the desired plot displayed on the screen. Adjustments are made to the horizontal and vertical display "angles".
3. The Print Screen command is issued. SNAP intervenes and presents a dialog box that allows us to name a file to hold the graphic image and to set the file format. (We chose monochrome PCX format.)
4. After leaving EXECUSTAT, we load our word processor and import the graph. EXECUSTAT leaves some control information on the screen and this is captured along with the graph. Some of this extraneous material may be moved out of sight by adjusting the graph size in the "window" for our figure.

Screen capture is inferior to facilities to transfer clear, scalable graphic images to other software. Nevertheless, it is the type of mechanism we must sometimes use to save useful graphs or other screen displays.

Figure 19.7.1 Three-dimensional point plot (EXECUSTAT)

As a tool for interactively finding patterns, the rotation of 3D plots around both horizontal and vertical axes is useful.
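To make the idea of rotating a 3D plot concrete, here is a minimal sketch of the underlying arithmetic: each point is rotated about the vertical axis, then about the horizontal axis, and the depth coordinate is dropped to give a screen position. This is a generic orthographic projection, not a description of how any of the packages named here work internally. The function name, the angle conventions and the sample points are assumptions made for illustration.

```python
import math


def rotate_and_project(points, yaw_deg, pitch_deg):
    """Rotate (x, y, z) points and project them onto the screen plane.

    yaw_deg rotates about the vertical (y) axis, pitch_deg about the
    horizontal (x) axis. The result is a list of (x, y) screen coordinates
    obtained by discarding depth (orthographic projection).
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    cos_yaw, sin_yaw = math.cos(yaw), math.sin(yaw)
    cos_pitch, sin_pitch = math.cos(pitch), math.sin(pitch)
    projected = []
    for x, y, z in points:
        # Rotation about the vertical axis mixes x and z.
        x1 = cos_yaw * x + sin_yaw * z
        z1 = -sin_yaw * x + cos_yaw * z
        # Rotation about the horizontal axis mixes y and z.
        y2 = cos_pitch * y - sin_pitch * z1
        projected.append((x1, y2))
    return projected


# Hypothetical points; stepping the angles redraws the cloud from new views.
cloud = [(1.0, 2.0, 3.0), (0.0, 0.0, 1.0), (2.0, 1.0, 0.5)]
for yaw in (0, 30, 60):
    print(yaw, rotate_and_project(cloud, yaw, 15))
```

Redrawing the projected points for a sequence of small angle steps gives the spinning effect; patterns that are hidden in one view may separate in another.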
An excellent implementation of such rotation is the statistical package Data Desk for the Macintosh, written by Paul Velleman of Cornell University. While other packages (e.g., EXECUSTAT and SYSTAT) use keyboard or mouse commands to rotate the point cloud image about the horizontal or vertical axes, Data Desk uses the idea that the point cloud is like a globe or ball floating in a dish. With an icon that looks like a human hand and is manipulated by the Macintosh mouse, one can "click" anywhere on the image and "roll" it around in any direction. One can even "click", roll and let go, leaving the image to rotate around whatever axis the motion implied.
This is masterful human factors engineering.

Even with the less elegant horizontal/vertical rotations, point cloud spinning can bring to life patterns in data. We will illustrate this with an example outside the present study and consider a plot of some data for automobiles, namely number of cylinders, displacement and price. These are plotted in Figure 19.7.2, the images of which were captured as above.
Nevertheless, the idea comes through clearly.

While 3D plots are useful for seeing patterns, some details influence their effectiveness in revealing data features:

• If the points displayed are too small we cannot see them; if too large, they clutter the display and hide other points.