A. Wood - Software Reliability Growth Models (798489), page 4
Test time may be artificially accumulated if a non-repaired defect prevents other defects from being found. Defect repair introduces new defects. The new defects are less likely to be discovered by test since the retest for the repaired code is not usually as comprehensive as the original testing. New code is frequently introduced throughout the entire test period, both defect repair and new features.
This is accounted for in parameter estimation since actual defect discoveries are used, but it may change the shape of the curve, i.e., make it less concave. The multi-stage model, discussed in Section 2.4, is an attempt to account for new code introduction. Defects are reported by many groups because of parallel testing activity. If we add the test time for those groups, we have the problem of equivalency between an hour of QA test time and an hour of test time from a group that is testing a different product. This can be accommodated by restricting defects to those discovered by QA, but that eliminates important data. This problem means that defects do not correlate perfectly with test time. This is certainly not true for calendar time or test cases, as discussed earlier.
For execution time, "corner" tests sometimes are more likely to find defects, so those tests create more stress on a per-hour basis. When there is a section of code that has not been as thoroughly tested as other code, e.g., a product that is under schedule pressure, tests of that code will usually find more defects.
Many tests are rerun to ensure defect repair has been done properly, and these reruns should be less likely to find new defects. However, as long as test sequences are reasonably consistent from release to release, this can be accounted for if necessary from lessons learned on previous releases. Customers run so many different configurations and applications that it is difficult to define an appropriate operational profile. In some cases, the sheer size and transaction volume of the production system makes the operational environment impractical to replicate.
The tests contained in the QA test library test basic functionality and operation, error recovery, and specific areas with which we have had problems in the past. Additional tests are continually being added, but the code also learns the old tests, i.e., the defects that the old tests would have uncovered have been repaired. Our experience is that this is reasonable except when there is a section of code that has not been as thoroughly tested as other code, e.g., a product behind schedule that was not thoroughly unit tested. Tests run against this section of code may find a disproportionate share of defects.
[Musa,87, p. 242] has a detailed discussion of the independence assumption.

Table 2-2. Software Reliability Model Assumptions

It is difficult to determine how the violation of the model assumptions will affect the models. For instance, introducing new functionality may make the curve less concave, but test reruns could make it more concave. Removing defects discovered by other groups comes closer to satisfying the model assumptions but makes the model less useful because we are not including all the data (which may also make the results less statistically valid). In general, small violations probably get lost in the noise while significant violations may force us to revise the models, e.g., see the discussion of Release 4 test hours at the end of Section 3.1.
Given the uncertainties about the effects of violating model assumptions, the best strategy is to try the models and see what works best for a particular style of software development and test.

Multi-Stage Models

One of the assumptions made by all the models is that the set of code being tested is unchanged throughout the test period. Clearly, defect repair invalidates that assumption, but it is assumed that the effects of defect repair are minimal so that the model is still a good approximation.
If a significant amount of new code is added during the test period, there is a technique that allows us to translate the data to account for the increased code change. Theoretically, the problem is that adding a significant amount of changed code should increase the defect detection rate. Therefore, the overall curve will look something like Figure 2-3, where D1 defects are found in T1 time prior to the addition of the new code and an additional D2 − D1 defects are found in T2 − T1 time after that code addition. The problem is to translate the data to a model μ(t) that would have been obtained if the new code had been part of the software at the beginning of the test. This translation is discussed in Chapter 15 of [Musa,87].
Let μ1(t) model the defect data prior to the addition of the new code, and let μ2(t) model the defect data after that code addition. The model μ(t) is created by appropriately modifying the failure times from μ1(t) and μ2(t). This section describes how to perform the translation assuming μ(t), μ1(t), and μ2(t) are all G-O models. In theory, this technique could be applied to any of the models in Table 2-1, including the S-shaped models.

[Figure 2-3. Two Stage Model Transformation — cumulative number of defects vs. test time]

Assume that model μ1(t) applies to the time period 0–T1 and that model μ2(t) applies to the time period T1–T2, as shown in Figure 2-3. The first step in the translation is to determine the parameters of the models μ1(t) and μ2(t) to get μ1(t) = a1(1 − e^(−b1 t)) and μ2(t) = a2(1 − e^(−b2 t)).
The calculations for μ1(t) are the standard techniques described in Section 2.3 using the data in the time period 0–T1. The calculations for μ2(t) are also the standard techniques, assuming that the test started at time T1 and produced D2 − D1 defects.
In other words, subtract D1 from the cumulative defects and subtract T1 from the cumulative time when calculating μ2(t). The next step is to calculate the translated time for the defects observed prior to the insertion of the new code. The time for each defect is translated according to the following equation (equation 15.18 in [Musa,87]):

τ_i = (−1/b2) ln{1 − (a1/a2)(1 − e^(−b1 t_i))}

Next, calculate the translated time for the defects observed after the insertion of the new code.
Start by calculating the expected amount of time it would have taken to observe D1 defects if the new code had been part of the original code released at the start of the test.² This time τ is calculated from D1 = a2(1 − e^(−b2 τ)), or τ = (−1/b2) ln{1 − D1/a2}. This time should be shorter than T1 because the failure rate would have been higher at the start of test if there were more defects at the start of test. Then all failure times from the T1–T2 time period are translated by subtracting T1 − τ, i.e., τ_i = t_i − (T1 − τ). This essentially translates the defect times in the T1–T2 time period to the left, meaning that we would have expected to find more defects earlier if there were more to find at the beginning of the test. Finally, we use the standard techniques from Section 2.3 to determine the parameters a and b in μ(t) = a(1 − e^(−bt)), where the defect times are adjusted as described previously.
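As a concrete illustration, the translation steps above can be sketched in a few lines of Python. This is a minimal sketch, not code from the paper: the function name and the sample parameter values are illustrative, and the fitted parameters a1, b1, a2, b2 are assumed to be already available from the Section 2.3 estimation techniques (with a1 < a2, so the logarithm argument stays positive).

```python
import math

def translate_failure_times(times1, times2, a1, b1, a2, b2, T1):
    """Translate two-stage failure times onto the single time scale of a
    combined G-O model (equation 15.18 in [Musa,87]).

    times1 -- failure times observed before the code addition (0..T1)
    times2 -- failure times observed after the code addition (T1..T2)
    """
    # Stage 1: tau_i = (-1/b2) ln{1 - (a1/a2)(1 - e^(-b1 t_i))}
    stage1 = [(-1.0 / b2) * math.log(1.0 - (a1 / a2) * (1.0 - math.exp(-b1 * t)))
              for t in times1]
    # Expected time tau to observe D1 defects had the new code been present
    # from the start: D1 = a2(1 - e^(-b2 tau))
    D1 = len(times1)  # or mu1(T1); the two are essentially identical
    tau = (-1.0 / b2) * math.log(1.0 - D1 / a2)
    # Stage 2: shift every failure time left by T1 - tau
    stage2 = [t - (T1 - tau) for t in times2]
    return stage1 + stage2
```

The combined list of translated times can then be fed back into the Section 2.3 techniques to fit the single model μ(t) = a(1 − e^(−bt)).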
The adjustments made to the failure times provide the failure times that would theoretically have been observed if the new code had been released at the beginning of the test rather than part of the way through the test.

2.3 Parameter Estimation

A software reliability model is a function such as those shown in Table 2-1. Fitting this function to the data means estimating its parameters from the data. One approach to estimating parameters is to input the data directly into equations for the parameters.
The most common method for this direct parameter estimation is the maximum likelihood technique described in Section 2.3.1. A second approach is fitting the curve described by the function to the data and estimating the parameters from the best fit to the curve. The most common method for this indirect parameter estimation is the least squares technique. The classical least squares technique is described in Section 2.3.2, and an alternative least squares technique is described in Section 2.3.3.
The alternative least squares technique was used most often since it provided the best results. A comparison of the results obtained by using each of these techniques is described in Section 3.6.

² Actually, this step is slightly more complicated. D1 is replaced by the expected number of defects observed in T1 according to model μ1(t), i.e., D1 is replaced by μ1(T1). In our experience, D1 and μ1(T1) are essentially identical.

2.3.1 Maximum Likelihood

The maximum likelihood technique consists of solving a set of simultaneous equations for parameter values.
The equations define parameter values that maximize the likelihood that the observed data came from a distribution with those parameter values. Maximum likelihood estimation satisfies a number of important statistical conditions for an optimal estimator and is generally considered to be the best statistical estimator for large sample sizes. Unfortunately, the set of simultaneous equations it defines is very complex and usually has to be solved numerically.
For a general discussion of maximum likelihood theory and equation derivation, see [Mood,74] and [Musa,87]. Here, we only show the equations that must be solved to provide parameter estimates and confidence intervals for the Goel-Okumoto (G-O) model. The expected number of defects for the G-O model is

μ(t) = a(1 − e^(−bt)), where

a = expected total number of defects in the code
b = shape factor = the rate at which the failure rate decreases.

From Equation (12.117) of [Musa,87], the parameter b can be estimated by solving:

(1)  Σ_{i=1..W} (f_i − f_{i−1})(t_i e^(−b t_i) − t_{i−1} e^(−b t_{i−1})) / (e^(−b t_{i−1}) − e^(−b t_i)) = f_W t_W / (e^(b t_W) − 1), where

W = current number of weeks of QA test
t_i = cumulative test time at the end of the ith week
f_i = cumulative number of failures at the end of the ith week.

From Equation (12.134) of [Musa,87], the α percent confidence interval (e.g., 95%) for b is given by:

(2)  b ± z_{1−α/2} / (I_0(b))^0.5, where

z_{1−α/2} is the value of the standard Normal, e.g., 1.645 for a 90% confidence interval, and

I_0(b) = Σ_{i=1..W} (f_i − f_{i−1})(t_i − t_{i−1})² e^(−b(t_i + t_{i−1})) / (e^(−b t_{i−1}) − e^(−b t_i))² − f_W t_W² e^(b t_W) / (e^(b t_W) − 1)².

The parameter a and its confidence interval can then be estimated by solving:

(3)  a = f_W / (1 − e^(−b t_W)), where b is one of the values obtained above.

When implementing these equations, equation (1) is solved numerically to derive an estimate of b, and the appropriate confidence interval for b is then calculated from (2).
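For illustration, equations (1)–(3) can be implemented directly. The sketch below is not from the paper: it solves equation (1) for b by simple bisection, assuming a sign change lies in the chosen bracket, and the weekly data used in the usage note are invented for the example rather than taken from the paper's releases.

```python
import math

def go_mle(t, f, b_lo=1e-6, b_hi=1.0, iters=100):
    """Maximum likelihood estimates of the G-O parameters a and b.
    t[i] = cumulative test time at the end of week i+1 (t_1..t_W)
    f[i] = cumulative failures at the end of week i+1 (f_1..f_W)
    Solves equation (1) for b by bisection, then a from equation (3).
    Assumes a root of (1) lies in [b_lo, b_hi]."""
    def g(b):
        # Left side minus right side of equation (1)
        s = 0.0
        for i in range(len(t)):
            tp = t[i - 1] if i > 0 else 0.0
            fp = f[i - 1] if i > 0 else 0.0
            num = t[i] * math.exp(-b * t[i]) - tp * math.exp(-b * tp)
            den = math.exp(-b * tp) - math.exp(-b * t[i])
            s += (f[i] - fp) * num / den
        return s - f[-1] * t[-1] / (math.exp(b * t[-1]) - 1.0)

    lo, hi = b_lo, b_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    b = 0.5 * (lo + hi)
    a = f[-1] / (1.0 - math.exp(-b * t[-1]))  # equation (3)
    return a, b

def b_confidence_interval(t, f, b, z=1.645):
    """Confidence interval for b from equation (2); z = 1.645 gives 90%."""
    s = 0.0
    for i in range(len(t)):
        tp = t[i - 1] if i > 0 else 0.0
        fp = f[i - 1] if i > 0 else 0.0
        s += ((f[i] - fp) * (t[i] - tp) ** 2 * math.exp(-b * (t[i] + tp))
              / (math.exp(-b * tp) - math.exp(-b * t[i])) ** 2)
    s -= f[-1] * t[-1] ** 2 * math.exp(b * t[-1]) / (math.exp(b * t[-1]) - 1.0) ** 2
    half = z / math.sqrt(s)  # s is I_0(b), the observed Fisher information
    return b - half, b + half
```

For example, ten weeks of cumulative failure counts generated from a G-O curve with a = 100 and b = 0.1 (with rounding) yield estimates close to those values, and the interval from equation (2) brackets the estimated b.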