Discrete Time NHPP Models for Software Reliability Growth Phenomenon
Omar Shatnawi
Al al-Bayt University
Abstract: Nonhomogeneous Poisson process (NHPP) based software reliability growth models are generally classified into two groups. The first group contains models that use machine execution time or calendar time as the unit of the fault detection/removal period; such models are called continuous time models. The second group contains models that use the number of test occasions/cases as the unit of the fault detection period; such models are called discrete time models, since the unit of the software fault detection period is countable. A large number of models have been developed in the first group, while there are fewer in the second group. Discrete time models in software reliability are important, yet little effort has been made in this direction. In this paper, we develop two discrete time SRGMs for the software failure occurrence/fault detection phenomenon based on an NHPP, using the probability generating function: a basic and an extended model. The basic model exploits the fault detection/removal rate during the initial and final test cases, whereas the extended model incorporates fault generation and imperfect debugging with learning. Actual software reliability data have been used to demonstrate the proposed models. The results are fairly encouraging in terms of goodness-of-fit and predictive validity criteria due to the applicability and flexibility of the proposed models, as they can capture a wide class of reliability growth curves ranging from purely exponential to highly S-shaped.

Keywords: Software engineering, software testing, software reliability, software reliability growth model, nonhomogeneous Poisson process, test occasions.
is known as the exponential model. Kapur et al. [6] proposed a model in which the testing phase is assumed to have two different processes, namely fault isolation and fault detection; this model is known as the delayed S-shaped model. Further, they proposed another model based on the assumption that software may contain several types of faults. In these models the fault removal process (i.e., the debugging process) is assumed to be perfect. But due to the complexity of the software system and the incomplete understanding of software requirements, specifications, and structure, the testing team may not be able to remove a fault perfectly, and the original fault may remain or be replaced by another fault. The first phenomenon is known as imperfect debugging; the second is called error/fault generation.

The concept of imperfect debugging was first introduced by Goel [3], who introduced the probability of imperfect debugging into the Jelinski and Moranda model [5]. Kapur et al. [6] introduced imperfect debugging into the Goel and Okumoto model [4]. They assumed that the fault removal rate per remaining fault is reduced due to imperfect debugging; thus the number of failures observed by time infinity is more than the initial fault content. Although these two models describe the imperfect debugging phenomenon, the software reliability growth curve of these models is always exponential. Moreover, they assume that the probability of imperfect debugging is independent of the testing time. Thus, they ignore the role of the learning process during the testing phase by not accounting for the experience gained with the progress of software testing. Actually, the probability of imperfect debugging is supposed to be at a maximum in the early stage of the testing phase and to reduce with the progress of testing [1, 6, 7, 8, 10, 12, 13, 14, 15, 16].

In this paper, we propose two discrete time NHPP based SRGMs for the situation given above. The assumptions in this case are with respect to the number of test cases instead of time. The rest of this paper is organized as follows. Section 2 derives the two proposed flexible discrete time models. Sections 3 and 4 discuss the methods used for parameter estimation and the criteria used for validation and evaluation of the proposed models. The applications of the proposed discrete time models to actual software reliability data through data analyses and model comparisons are shown in section 5. We conclude this paper in section 6.

2. Software Reliability Modelling

2.1. Basic Discrete Time Model

2.1.1. Model Development

The main assumption of an SRGM is that the failure observation/fault detection depends linearly upon the number of remaining faults [4]. This assumption implies that all faults are equally likely to be detected. In practical situations it has been observed that a large number of simple faults is easily detected at the early stages of testing, while fault detection may become extremely difficult in the later stages of the testing phase. In this case, the fault detection rate has a high value at the beginning of testing compared to its value at the end. On the contrary, it may happen that the detection of faults increases the skill of the testing team, leading to an increase in efficiency; in other words, the testing team detects more faults in less time. This implies that the value of the fault detection rate at the end of the testing phase is higher than its value at the beginning. Bittanti et al. [1] exploited this change in fault detection rate, which they termed the Fault Exposure Coefficient (FEC), for their SRGM.

Through real data experiments and analysis on several software development projects, the fault detection rate shows three possible trends as time progresses: increasing, decreasing, or constant [1, 10]. It decreases when the software has been used and tested repeatedly, showing reliability growth. It can also increase if the testing techniques/requirements are changed, or if new faults are introduced due to new software features or imperfect debugging. Thus, we treat the fault detection rate as a function of the number of test cases to interpret these trends.

2.1.2. Model Assumptions, Notations, and Formulation

• Failure observation/fault detection phenomenon is modelled by an NHPP.
• Software is subject to failures during execution caused by faults remaining in the software.
• Each time a failure occurs, the fault that caused it is immediately and perfectly detected, and no new faults are introduced, i.e., the debugging process is perfect.

a: Initial fault content of the software.
b(n+1): Fault detection rate function, dependent on the number of test cases and linear in the number of faults remaining.
bi: Initial fault detection rate.
bf: Final fault detection rate.
m(n): Expected mean number of faults detected by the nth test case.

Under the given model assumptions, the expected cumulative number of faults detected between the nth and (n+1)th test cases is proportional to the number of faults remaining after the execution of the nth test run, and satisfies the following difference equation:

m(n+1) − m(n) = b(n+1)(a − m(n))   (1)
The fault detection rate is given as a function of the number of faults detected, as in the following equation:

b(n+1) = bi + (bf − bi) m(n+1)/a   (2)

According to the values of bi and bf, we can distinguish between the following cases [1]:

• Constant fault detection rate: bi = bf = b.
• Increasing fault detection rate: bf > bi.
• Decreasing fault detection rate: bi > bf.
• Vanishing fault detection rate: bf = 0, bi > 0.

By substituting equation 2 into the difference equation 1 and then solving it using the Probability Generating Function (PGF) with initial condition m(n=0) = 0, one can get the closed form solution given below:

m(n) = a(1 − (1 − bf)^n) / (1 + ((bf − bi)/bi)(1 − bf)^n)   (3)
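The intermediate steps are omitted above; one route to equation 3 (a sketch, bypassing the PGF machinery) is to note that, along the solution, the detection rate of equation 2 is equivalent to the logistic form

b(n+1) = bf / (1 + ((bf − bi)/bi)(1 − bf)^(n+1))

With the substitution u(n) = (1 + ((bf − bi)/bi)(1 − bf)^n) m(n), equation 1 then reduces to the linear recurrence u(n+1) = (1 − bf) u(n) + a bf with u(0) = 0, whose solution is u(n) = a(1 − (1 − bf)^n); dividing out the logistic denominator again recovers equation 3.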
The structure of the model is flexible. The shape of the growth curve is determined by the parameters bi and bf and can be either exponential or S-shaped for the four cases discussed above. In the case of a constant fault detection rate, equation 3 can be written as

m(n) = a(1 − (1 − b)^n)   (4)

If bf < bi, it is apparent from equation 1 that b(n+1) decreases to zero more rapidly than linearly; the smaller the ratio bf/bi, the faster the convergence of equation 1. If bf = 0, then equation 1 becomes:

m(n+1) − m(n) = (bi/a)(a − m(n+1))(a − m(n))   (5)

the solution of which is given by:

m(n) = a bi n / (1 + bi n)   (6)

If bf > bi, then equation 3 has an inflection point and the growth curve is S-shaped. Such behaviour of the SRGM can be attributed to the increased skill of the test team or a modification of the testing strategy.

The proposed discrete time model defined in equation 3 is very interesting from various points of view. Besides the above-mentioned interpretation as a flexible S-shaped fault detection model, this model has the exponential model [17] as a special case, as given in equation 4.

Note that this proposed basic discrete time model is able to model both the case of strictly decreasing failure intensity and the case of increasing-and-decreasing failure intensity. Neither the exponential model [17] nor the ordinary delayed S-shaped model [6] can do both.
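As a quick illustration, the following Python sketch (with parameter values of our own choosing, purely illustrative) checks the closed form of equation 3 against the recursion of equations 1 and 2, which can be iterated by solving equation 1 for m(n+1):

    # Numerical sketch: equation 3 versus the recursion of equations 1 and 2.
    # Solving equation 1 for m(n+1), with b(n+1) taken from equation 2, gives
    #   m(n+1) = (m(n) + bi(a - m(n))) / (1 - (bf - bi)(a - m(n))/a).
    # All parameter values below are illustrative, not taken from the paper.

    def m_closed(n, a, bi, bf):
        x = (1.0 - bf) ** n
        return a * (1.0 - x) / (1.0 + (bf - bi) / bi * x)   # equation 3

    def m_recursive(n, a, bi, bf):
        m = 0.0                                             # m(0) = 0
        for _ in range(n):
            m = (m + bi * (a - m)) / (1.0 - (bf - bi) * (a - m) / a)
        return m

    a, bi, bf = 100.0, 0.05, 0.20                           # bf > bi: S-shaped case
    for n in (1, 5, 10, 20):
        assert abs(m_closed(n, a, bi, bf) - m_recursive(n, a, bi, bf)) < 1e-6

The same loop with bf < bi or bf = 0 reproduces the exponential-like shapes of equations 4 and 6.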
2.2. Extended Discrete Time Model

2.2.1. Model Development

During the debugging process, faults are identified and removed upon failures. In reality this may not always be true. The corrections may themselves introduce new faults, or they may inadvertently create conditions, not previously experienced, that enable other faults to cause failures. This results in situations where the actual fault removals are fewer than the removal attempts. Therefore, the fault detection rate is reduced by the probability of imperfect debugging [6, 8, 13, 14]. Besides, there is a good chance that some new faults might be introduced during the course of debugging.

The learning process of the software developers has also been studied [1, 6, 12, 16]. Learning occurs if testing appears to improve dynamically in efficiency as one progresses through the testing phase. Learning usually manifests itself as a changing fault detection rate. By assuming that the fault detection rate is dependent on the number of test cases, the role of the learning process during the testing phase can be established. The extended model proposed below incorporates these three factors.

To avoid mathematical complexity, many simplifying assumptions were made while developing the basic model. One of them is that the initial fault content is constant. Modifications to the basic model are proposed here for an increasing over-all fault content, where a of equation 1 is substituted by a(n) as given in equation 8. The nature of a(n) depends upon various factors like the skill of the test team, the testing strategy, the number of test cases, and software size and complexity. Hence no single functional form can describe the growth in the number of faults during the testing phase and debugging process. This necessitates a modelling approach that can be modified without unnecessarily increasing the complexity of the resultant model [8].

Here, we show how this can be achieved for the proposed basic discrete time model by changing the fault detection rate. Before the spread of flexible models, a number of models were proposed that linked the fault detection rate to the initial fault content; the basic model, due to its inherent flexibility, can describe many of them. Hence it is imperative to capture this flexibility, and in this paper we do so by proposing a logistic, number-of-test-cases-dependent fault detection rate, as given in equation 9.

2.2.2. Model Assumptions

In addition to the basic discrete time model assumptions, except for assumption 3, we have:

• Over-all fault content is linearly dependent on the number of test cases.
• Faults can be introduced during the testing phase, and the debugging process may not lead to the complete removal of the faults, i.e., the debugging process is imperfect.

2.2.3. Model Notations

In addition to the basic discrete time model notations, we include the following:

a(n): Over-all fault content dependent on the number of test cases, which includes the initial fault content and the number of faults introduced.
b(n+1): Fault detection rate function, dependent on the number of test cases.
α: Fault introduction rate per detected fault per test case.
p: Probability of fault removal on a failure, i.e., the probability of perfect debugging.

2.2.4. Model Formulation

Under the above extended model assumptions, the expected cumulative number of faults detected between the nth and (n+1)th test cases is proportional to the number of faults remaining after the execution of the nth test run, and satisfies the following difference equation:

m(n+1) − m(n) = b(n+1)(a(n) − m(n))   (7)

where a(n) and b(n+1) reflect fault generation and the probability of fault removal on a failure, and they are given as:

a(n) = a + αn   (8)

b(n+1) = bf p / (1 + ((bf − bi)/bi)(1 − bf p)^(n+1))   (9)

Fault generation and imperfect debugging with learning are integrated to form the extended discrete time model as given in equation 10:

m(n) = ((a − α/(bf p))(1 − (1 − bf p)^n) + αn) / (1 + ((bf − bi)/bi)(1 − bf p)^n)   (10)

According to the values of the parameters, we can distinguish between the following cases:

• Constant fault detection rate (bi = bf = b), no faults introduced (α = 0), and perfect debugging (p = 1).
• No faults introduced (α = 0) and perfect debugging (p = 1).

In case 1, equation 10 reduces to equation 4, whereas in case 2 it reduces to equation 3.
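As with the basic model, the closed form can be checked numerically. The sketch below (again with illustrative parameters of our own choosing) iterates equation 7 under the stated assumptions, i.e., the linear fault content of equation 8 and the logistic, imperfect-debugging-reduced rate of equation 9, and compares the result against equation 10:

    # Numerical sketch: equation 10 solves equation 7 when a(n) follows
    # equation 8 and b(n+1) follows equation 9. Parameters are illustrative.

    def m_extended(n, a, bi, bf, p, alpha):
        q = 1.0 - bf * p
        r = (bf - bi) / bi
        return ((a - alpha / (bf * p)) * (1.0 - q ** n) + alpha * n) / (1.0 + r * q ** n)

    def m_recursive(n, a, bi, bf, p, alpha):
        q = 1.0 - bf * p
        r = (bf - bi) / bi
        m = 0.0                                       # m(0) = 0
        for k in range(n):
            b = bf * p / (1.0 + r * q ** (k + 1))     # equation 9
            m = m + b * (a + alpha * k - m)           # equation 7 with equation 8
        return m

    for n in (1, 10, 30):
        assert abs(m_extended(n, 100.0, 0.05, 0.20, 0.95, 0.5)
                   - m_recursive(n, 100.0, 0.05, 0.20, 0.95, 0.5)) < 1e-6

Setting alpha = 0 and p = 1 in either function reproduces the basic model of equation 3, and additionally setting bi = bf reproduces equation 4, matching the two reduction cases above.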
3. Parameter Estimation

The MLE method is used to estimate the unknown parameters of the developed framework. All data sets used are given in the form of pairs (ni, xi), i = 1, 2, …, f, where xi is the cumulative number of faults detected by ni test cases (0 < n1 < n2 < … < nf) and ni is the accumulated number of test runs executed to detect xi faults.

The likelihood function L for the unknown parameters with the superposed mean value function is given as (with n0 = 0 and x0 = 0):

L(parameters | (ni, xi)) = ∏_{i=1}^{f} ([m(ni) − m(ni−1)]^(xi − xi−1) / (xi − xi−1)!) exp(−[m(ni) − m(ni−1)])   (11)

Taking the natural logarithm of equation 11, the log-likelihood function to be maximised is:

ln L = ∑_{i=1}^{f} (xi − xi−1) ln[m(ni) − m(ni−1)] − ∑_{i=1}^{f} [m(ni) − m(ni−1)] − ∑_{i=1}^{f} ln[(xi − xi−1)!]   (12)
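The paper does not specify the numerical optimiser; a minimal sketch of one way to maximise equation 12 for the basic model, using scipy's Nelder-Mead on hypothetical data, is:

    # Sketch of the MLE fit of section 3 for the basic model. Nelder-Mead is
    # our choice, not the paper's, and the parameter-free ln[(xi - xi-1)!]
    # term of equation 12 is dropped from the objective. Data are hypothetical.
    import numpy as np
    from scipy.optimize import minimize

    def m_basic(n, a, bi, bf):
        x = (1.0 - bf) ** n
        return a * (1.0 - x) / (1.0 + (bf - bi) / bi * x)   # equation 3

    def neg_log_likelihood(theta, n, x):
        a, bi, bf = theta
        if a <= x[-1] or not 0.0 < bi < 1.0 or not 0.0 < bf < 1.0:
            return np.inf                                   # outside the valid region
        dm = np.diff(m_basic(n, a, bi, bf), prepend=0.0)    # m(ni) - m(ni-1)
        dx = np.diff(x, prepend=0.0)                        # xi - xi-1
        if np.any(dm <= 0.0):
            return np.inf
        return -(np.sum(dx * np.log(dm)) - np.sum(dm))      # negative of equation 12

    n = np.arange(1.0, 11.0)                                 # hypothetical test occasions
    x = np.array([5.0, 11, 19, 29, 40, 50, 58, 63, 66, 68])  # hypothetical cumulative faults
    fit = minimize(neg_log_likelihood, x0=[80.0, 0.05, 0.2],
                   args=(n, x), method="Nelder-Mead")
    print(fit.x)                                             # estimates of a, bi, bf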
4. Model Validation

4.1. Failure Data Sets

The first data set (DS-I) was collected from the test of a network-management system at AT&T Bell Laboratories, which was tested for 20 weeks, during which 100 faults were detected [13]. The second (DS-II) was collected during 21 days of testing, in which 46 faults were detected [12]. The third (DS-III) was for a radar system of size 124 KLOC, tested for 35 months, during which 1301 faults were detected [2]. The fourth (DS-IV) was collected during 25 days of testing, in which 136 faults were detected [13].

4.2. Comparison Criteria

The performance of an SRGM is judged by its ability to fit the past software reliability data (goodness-of-fit) and to predict satisfactorily the future behaviour from present and past data (predictive validity) [6, 11].

4.2.1. Goodness of Fit

• The Sum of Squared Errors (SSE). The difference between the simulated data m̂(ni) and the observed (reported) data xi is measured by the SSE as

SSE = ∑_{i=1}^{f} (m̂(ni) − xi)²   (13)

where f is the number of observations. A lower value of SSE indicates less fitting error, thus better goodness-of-fit.

• The Akaike Information Criterion (AIC). This criterion was first proposed as an SRGM selection tool by [9]. It is defined as:

AIC = −2(value of max. log likelihood function) + 2(number of parameters used in the model)   (14)

A lower value of AIC indicates more confidence in the model, thus a better fit and better predictive validity.

• Coefficient of multiple determination (R2). This measure can be used to investigate whether a significant trend exists in the observed failure intensity. The coefficient is defined as 1 minus the ratio of the Sum of Squares (SS) resulting from the trend model to that from a constant model, that is:

R2 = 1 − residual SS / corrected SS   (15)

R2 measures the percentage of the total variation about the mean accounted for by the fitted curve. It ranges in value from 0 to 1; small values indicate that the model does not fit the data well [11].

4.2.2. Predictive Validity

Predictive validity is defined as the ability of the model to determine the future failure behaviour from present and past failure behaviour [11]. The relative prediction error (RPE) is defined as

RPE = (m̂(nf) − xf) / xf   (16)

where xf is the cumulative number of faults removed after the execution of the last test run nf, and m̂(nf) is the estimated value of the SRGM m(nf), determined using the actually observed data up to an arbitrary test case ne (≤ nf).

If the RPE value is negative/positive, the model is said to underestimate/overestimate the fault removal process. A value close to zero indicates more accurate prediction, and thus more confidence in the model. The value is said to be acceptable if it is within ±10% [6].
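A compact sketch of these criteria in Python (the function interfaces are ours):

    # Sketch of the comparison criteria of equations 13-16.
    import numpy as np

    def sse(m_hat, x):
        return float(np.sum((np.asarray(m_hat) - np.asarray(x)) ** 2))   # equation 13

    def aic(max_log_likelihood, n_params):
        return -2.0 * max_log_likelihood + 2.0 * n_params                # equation 14

    def r_squared(m_hat, x):
        x = np.asarray(x)
        residual_ss = np.sum((x - np.asarray(m_hat)) ** 2)
        corrected_ss = np.sum((x - x.mean()) ** 2)
        return float(1.0 - residual_ss / corrected_ss)                   # equation 15

    def rpe(m_hat_nf, x_f):
        return (m_hat_nf - x_f) / x_f                                    # equation 16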
5. Data Analysis and Model Comparison

5.1. For the Basic Discrete Time Model

5.1.1. Goodness of Fit Analysis

Using the MLE method, the estimated values of the proposed basic discrete time model parameters for DS-I and DS-II are given in Table 1. According to the estimated values of the initial and final fault detection rates (bi and bf), the skill of the test team does improve with time in both DS-I and DS-II. That is why the fault detection/removal process resembles an S-shaped growth curve in both DS-I and DS-II.

Table 1. Parameter estimates for DS-I and DS-II.

Model            Data Set   a     bi      bf
Proposed Basic   DS-I       111   .0717   .1581
Proposed Basic   DS-II      59    .0167   .1550

The fitting of the proposed model to DS-I and DS-II is graphically illustrated in Figures 1 and 2, respectively. It is clearly seen from both figures that the proposed model fits both DS-I and DS-II excellently. This highlights its flexibility.

Figure 1. Goodness of fit (DS-I): cumulative faults versus test cases (weeks), actual data and estimated values.
Figure 2. Goodness of fit (DS-II): cumulative faults versus test cases, actual data and estimated values.

A comparison of the proposed model with well-documented discrete time NHPP based SRGMs in terms of goodness of fit is given in Tables 2 and 3. It is observed that the exponential model [17] fails to give any plausible result, as it overestimates the fault content, and no estimates were obtained for DS-II. It is clearly seen from Tables 2 and 3 that the proposed model is the best among the models under comparison in terms of the SSE, AIC, and R2 metric values, which is very encouraging.

Table 2. Goodness of fit for DS-I.

Models under Comparison   a     b       SSE   AIC   R2
Exponential [17]          130   .0798   232   92    .9857
Delayed S-shaped [6]      106   .2165   357   99    .9781
Proposed Basic            see Table 1   180   90    .9890

Table 3. Goodness of fit for DS-II.

Models under Comparison   a     b       SSE   AIC   R2
Exponential [17]          *     *       *     *     *
Delayed S-shaped [6]      84    .0831   28    79    .9938
Proposed Basic            see Table 1   25    77    .9944

Hence, the proposed basic discrete time model fits better than the existing models on both DS-I and DS-II.

5.1.2. Predictive Validity Analysis

DS-I and DS-II are truncated into different proportions and used to estimate the parameters of the proposed basic discrete time model. For each truncation, one value of the RPE ratio is obtained.

Figures 3 and 4 graphically illustrate the results of the predictive validity. It is observed that the predictive validity of the proposed model varies from one truncation to another. The RPE of the proposed model overestimates the fault removal process in DS-I and DS-II except when the testing progress ratio is small.

Figure 3. Predictive validity (DS-I): RPE versus testing progress ratio (50%-100%).

Figure 4. Predictive validity (DS-II): RPE versus testing progress ratio (50%-100%).

It is clearly seen from both Figures 3 and 4 that 55% of the total test time is sufficient to predict the future behaviour of the fault removal process reasonably for DS-I and DS-II.
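The truncation procedure behind these figures can be sketched as follows, where fit_basic is a hypothetical wrapper returning (a, bi, bf) from the MLE fit of section 3, and m_basic and rpe are as sketched earlier:

    # Sketch of the predictive validity procedure: refit the model on the
    # first k observations, then compare the prediction at n_f with x_f.
    def predictive_validity(n, x, ratios=(0.5, 0.6, 0.7, 0.8, 0.9, 1.0)):
        curve = []
        for ratio in ratios:
            k = max(3, int(round(ratio * len(n))))   # at least 3 points to fit
            a, bi, bf = fit_basic(n[:k], x[:k])      # re-estimate on truncated data
            curve.append((ratio, rpe(m_basic(n[-1], a, bi, bf), x[-1])))
        return curve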
5.2. For the Extended Discrete Time Model

5.2.1. Goodness of Fit Analysis

Using the MLE method, the estimated values of the proposed extended discrete time model parameters for DS-III and DS-IV are given in Table 4. According to the estimated values of the initial and final fault detection rates (bi and bf), the skill of the test team improves with time in DS-III but not in DS-IV. That is why the fault detection process resembles an S-shaped growth curve in DS-III and an exponential curve in DS-IV. According to the estimated values of the fault introduction rate parameter (α), the debugging process in DS-III is perfect, with no faults introduced during debugging, whereas in DS-IV it is not.

Table 4. Parameter estimates for DS-III and DS-IV.

Model               Data Set   a      bi      bf      p       α
Proposed Extended   DS-III     1352   .0087   .1832   .9922   0
Proposed Extended   DS-IV      156    .1525   .0005   .9965   .0036

The fitting of the proposed model to DS-III and DS-IV is graphically illustrated in Figures 5 and 6. It may be noticed that the relationship between the cumulative number of faults and the number of test cases varies from purely exponential to highly S-shaped. It is clearly seen from both figures that the
proposed model fits both DS-III and DS-IV excellently. This highlights its flexibility.

Figure 5. Goodness of fit (DS-III): cumulative faults versus test cases, actual data and estimated values.

Figure 6. Goodness of fit (DS-IV): cumulative faults versus test cases (days), actual data and estimated values.

A comparison of the proposed model with well-documented discrete time NHPP based SRGMs in terms of goodness of fit is given in Tables 5 and 6 for DS-III and DS-IV, respectively. It is clearly seen from both tables that the proposed model is the best among the models under comparison in terms of the SSE, AIC, and R2 metric values, which is very encouraging. Hence, the proposed extended discrete time model fits better than the existing models on both DS-III and DS-IV.

Table 5. Goodness of fit for DS-III.

Models under Comparison   a      b       SSE      AIC   R2
Exponential [17]          *      *       *        *     *
Delayed S-shaped [6]      1735   .0814   107324   518   .9856
Proposed Extended         see Table 4    7133     342   .9990

Table 6. Goodness of fit for DS-IV.

Models under Comparison   a     b       SSE    AIC   R2
Exponential [17]          136   .1291   766    119   .9664
Delayed S-shaped [6]      126   .2763   2426   176   .8936
Proposed Extended         see Table 4   306    116   .9866

5.2.2. Predictive Validity Analysis

DS-III and DS-IV are truncated into different proportions and used to estimate the parameters of the proposed extended discrete time model. For each truncation, one value of the RPE ratio is obtained.

Figure 7. Predictive validity (DS-III): RPE versus testing progress ratio (50%-100%).

Figure 8. Predictive validity (DS-IV): RPE versus testing progress ratio (50%-100%).

It is clearly seen from both Figures 7 and 8 that 50% of the total test time is sufficient to predict the future behaviour of the fault removal process reasonably for DS-III and DS-IV.

6. Conclusion

In this paper, newly developed discrete time SRGMs based on an NHPP are proposed to describe a variety of reliability growth curves and to capture the increased skill (efficiency) of the testing team or a modification of the testing strategy during the testing phase.

The proposed discrete time models have been validated and evaluated on actual software reliability data cited from real software development projects and compared with existing discrete time NHPP based models. The results are encouraging in terms of goodness of fit and predictive validity due to their applicability and flexibility. Hence, we conclude that
the two proposed discrete time models not only fit the past well but also predict the future reasonably well.

Acknowledgements

I take this opportunity to thank Prof. P. K. Kapur, Dr. A. K. Bardhan, and Dr. P. C. Jha of Delhi University for their support every moment I sought it. The suggestions, comments, and criticisms of these people have greatly improved this manuscript.

References

[1] Bittanti S., Bolzern P., Pedrotti E., Pozzi M., and Scattolini A., Software Reliability Modeling and Identification, Springer-Verlag, USA, 1988.
[2] Brooks D. and Motley W., "Analysis of Discrete Software Reliability Models," Technical Report, New York, 1980.
[3] Goel L., "Software Reliability Models: Assumptions, Limitations and Applicability," IEEE Transactions on Software Engineering, vol. 11, no. 12, pp. 1411-1423, 1985.
[4] Goel L. and Okumoto K., "Time Dependent Error Detection Rate Model for Software Reliability and Other Performance Measures," IEEE Transactions on Reliability, vol. 28, no. 3, pp. 206-211, 1979.
[5] Jelinski Z. and Moranda B., Statistical Computer Performance Evaluation, Academic Press, New York, 1972.
[6] Kapur K., Garg B., and Kumar S., Contributions to Hardware and Software Reliability, World Scientific, New York, 1999.
[7] Kapur K., Shatnawi O., and Yadavalli S., "A Software Fault Classification Model," South African Computer Journal, vol. 33, no. 33, pp. 1-9, 2004.
[8] Kapur K., Singh O., Shatnawi O., and Gupta A., "A Discrete Nonhomogeneous Poisson Process Model for Software Reliability Growth with Imperfect Debugging and Fault Generation," International Journal of Performability Engineering, vol. 2, no. 4, pp. 351-368, 2006.
[9] Khoshgoftaar T. and Woodcock G., "Software Reliability Model Selection: A Case Study," in Proceedings of the International Symposium on Software Reliability Engineering, pp. 183-191, USA, 1991.
[10] Kuo S., Huang H., and Lyu R., "Framework for Modelling Software Reliability Using Various Testing-Effort and Fault-Detection Rates," IEEE Transactions on Reliability, vol. 50, no. 3, pp. 310-320, 2001.
[11] Musa D., Iannino A., and Okumoto K., Software Reliability, McGraw-Hill, New York, 1987.
[12] Ohba M., "Software Reliability Analysis Models," IBM Journal of Research and Development, vol. 28, no. 1, pp. 428-443, 1984.
[13] Pham H., Software Reliability, Springer-Verlag, USA, 2000.
[14] Pham H., Nordmann L., and Zhang X., "A General Imperfect Software-Debugging Model with S-shaped Fault Detection Rate," IEEE Transactions on Reliability, vol. 48, no. 3, pp. 169-175, 1999.
[15] Shatnawi O., "Modelling Software Fault Dependency Using Lag Function," Al Manarah Journal for Research and Studies, vol. 15, no. 6, pp. 261-300, 2007.
[16] Yamada S., Ohba M., and Osaki S., "S-shaped Software Reliability Growth Models and Their Applications," IEEE Transactions on Reliability, vol. 33, no. 1, pp. 169-175, 1984.
[17] Yamada S. and Osaki S., "Discrete Software Reliability Growth Models," Applied Stochastic Models and Data Analysis, vol. 1, no. 1, pp. 65-77, 1985.
[18] Xie M., Software Reliability Modelling, World Scientific, New York, 1991.

Omar Shatnawi received his PhD in computer science and his MSc in operational research from the University of Delhi in 2004 and 1999, respectively. Currently, he is head of the Department of Information Systems at Al al-Bayt University. His research interests are in software engineering, with an emphasis on improving software reliability and dependability.