Unit 1: A Perspective On Testing: Basic Definitions
Basic Definitions
Error:
• A good synonym for error is mistake. When people make mistakes while coding, we call
these mistakes bugs.
• Errors tend to propagate. For example, a requirements error may be magnified during
design and amplified still more during coding.
Fault:
• A fault is the result of an error. A fault is the representation of an error, where
representation is the mode of expression, such as narrative text, dataflow diagrams,
hierarchy charts, source code, and so on.
• Defect is a good synonym for fault.
• Faults can be elusive. When a designer makes an error of omission, the resulting fault is
that something is missing that should be present in the representation.
• A fault of commission occurs when we enter something into a representation that is
incorrect.
• Faults of omission occur when we fail to enter correct information. Of these two types,
faults of omission are more difficult to detect and resolve.
Failure:
• A failure occurs when a fault executes.
• Two subtleties arise here:
1. Failures only occur in an executable representation, which is usually taken to be
source code or loaded object code.
2. This definition relates failures only to faults of commission. How do we deal with failures
that correspond to faults of omission? What about faults that never happen to
execute, or do not execute for a long time? Reviews prevent many failures by finding
faults, and well-done reviews can find faults of omission.
Incident:
• When a failure occurs, it may or may not be readily apparent to the user. An incident is
the symptom associated with a failure that alerts the user to the occurrence of a failure.
Test:
• A test is the act of exercising software with test cases.
• A test has two distinct goals: to find failures and to demonstrate correct execution.
Test case:
• A test case has an identity and is associated with a program behavior.
• A test case also has a set of inputs and expected outputs.
Figure 1.1 portrays a life cycle model for testing. In the development phases, three opportunities
arise for errors to be made, resulting in faults that propagate through the remainder of the
development process.
When a fix causes formerly correct software to misbehave, the fix is deficient.
Test Cases
The essence of software testing is to determine a set of test cases for the item to be tested.
Before going on, we need to clarify what information should be in a test case:
Test Case ID
Purpose
Preconditions
Inputs
Expected Outputs
Postconditions
Execution History (Date, Result, Version, Run By)
The act of testing entails establishing the necessary preconditions, providing the test case inputs,
observing the outputs, comparing these with the expected outputs, and then ensuring that the
expected postconditions exist to determine whether the test passed.
From all of this it becomes clear that test cases are valuable. Test cases need to be developed,
reviewed, used, managed, and saved.
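The test case fields and the act of testing described above can be sketched in Python. This is a minimal illustration, not a prescribed format: the class names `SoftwareTestCase` and `Execution`, the `run` method, and the choice to store preconditions and postconditions as plain descriptive strings are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Execution:
    """One row of the execution history (hypothetical layout)."""
    date: str
    result: str      # "pass" or "fail"
    version: str
    run_by: str


@dataclass
class SoftwareTestCase:
    test_case_id: str
    purpose: str
    preconditions: list      # descriptive strings, checked by the tester
    inputs: tuple
    expected_outputs: Any
    postconditions: list     # descriptive strings, checked by the tester
    execution_history: list = field(default_factory=list)

    def run(self, item_under_test: Callable, date: str, version: str, run_by: str) -> bool:
        """Provide the inputs, observe the output, compare it with the expected output."""
        actual = item_under_test(*self.inputs)
        passed = actual == self.expected_outputs
        self.execution_history.append(
            Execution(date, "pass" if passed else "fail", version, run_by)
        )
        return passed
```

A call such as `SoftwareTestCase("TC-1", "add two integers", [], (2, 3), 5, []).run(lambda x, y: x + y, "2024-01-01", "1.0", "tester")` records one history entry and returns whether the observed output matched. Establishing preconditions and confirming postconditions are left to the tester in this sketch.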
Here is a simple Venn diagram that clarifies several nagging questions about testing.
• Figure 1.3 shows the relationship among the universe of program behaviors and the
specified and programmed behaviors. Of all the possible program behaviors, the specified
behaviors are in the circle labeled S, and all those behaviors actually programmed
are in P. The intersection of S and P is the "correct" portion, that is, behaviors that are
both specified and implemented.
• With this diagram, we can see more clearly the problems faced by a tester. What if
certain specified behaviors have not been programmed? These are faults of omission.
What if certain programmed behaviors have not been specified? These correspond to
faults of commission.
• A view of testing is that it is the determination of the extent of program behavior that is
both specified and implemented.
Consider the relationships among the sets S, P, and T. There may be specified behaviors that are
not tested (regions 2 and 5), specified behaviors that are tested (regions 1 and 4), and test cases
that correspond to unspecified behaviors (regions 3 and 7).
• There may be programmed behaviors that are not tested (regions 2 and 6), programmed
behaviors that are tested (regions 1 and 3), and test cases that correspond to
unprogrammed behaviors (regions 4 and 7).
• If specified behaviors exist for which no test cases are available, the testing is
incomplete. If certain test cases correspond to unspecified behaviors, then either such a
test case is unwarranted, the specification is deficient, or the tester wishes to determine
that specified non-behavior does not occur.
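The region arithmetic above can be checked with ordinary set operations. The sketch below is illustrative only: the behavior labels b1 to b7 are hypothetical, and the numbering of the seven regions is an assumption chosen to agree with the region pairs quoted in the text, since the figure itself is not reproduced here.

```python
# Hypothetical behavior identifiers, one per region of the three-circle diagram.
S = {"b1", "b2", "b4", "b5"}   # specified behaviors
P = {"b1", "b2", "b3", "b6"}   # programmed behaviors
T = {"b1", "b3", "b4", "b7"}   # behaviors exercised by test cases

# Assumed region numbering, consistent with the pairs named in the text.
regions = {
    1: S & P & T,      # specified, programmed, and tested
    2: (S & P) - T,    # specified and programmed, not tested
    3: (P & T) - S,    # programmed and tested, not specified
    4: (S & T) - P,    # specified and tested, not programmed
    5: S - P - T,      # specified only
    6: P - S - T,      # programmed only
    7: T - S - P,      # tested only
}

untested_specified = S - T      # regions 2 and 5
tests_on_unspecified = T - S    # regions 3 and 7
untested_programmed = P - T     # regions 2 and 6
tests_on_unprogrammed = T - P   # regions 4 and 7
```

With these sets, a nonempty `untested_specified` signals incomplete testing, and a nonempty `tests_on_unspecified` prompts the question of whether the test case is unwarranted or the specification is deficient.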
Functional Testing
Functional testing considers any program to be a function that maps values from its input domain
to values in its output range. This leads to the term black box testing, in which the content
(implementation) of a black box is not known, and the function of the black box is understood
completely in terms of its inputs and outputs.
In the functional approach to test case identification, the only information used is the
specification of the software.
Functional test cases may contain significant redundancies, compounded by the possibility of gaps of
untested software.
Figure 1.6 shows the results of test cases identified by two functional methods. Method A
identifies a larger set of test cases than does Method B. Notice that, for both methods, the set of
test cases is completely contained within the set of specified behavior. Because functional
methods are based on the specified behavior, it is hard to imagine these methods identifying
behaviors that are not specified.
Structural Testing
Structural testing is the other fundamental approach to test case identification. To contrast it with
functional testing, it is sometimes called white box (or even clear box) testing. The clear box metaphor
is probably more appropriate, because the essential difference is that the implementation (of the black
box) is known and used to identify test cases. The ability to "see inside" the black box allows the tester
to identify test cases based on how the function is actually implemented.
Structural testing has been the subject of some fairly strong theory. To really understand structural
testing, familiarity with the concepts of linear graph theory is essential.
With these concepts, the tester can rigorously describe exactly what is tested. Because of its strong
theoretical basis, structural testing lends itself to the definition and use of test coverage metrics. Test
coverage metrics provide a way to explicitly state the extent to which a software item has been tested,
and this in turn makes testing management more meaningful.
Figure 1.7 shows the results of test cases identified by two structural methods. As before, Method A
identifies a larger set of test cases than does Method B. Is a larger set of test cases necessarily better?
This is an excellent question, and structural testing provides important ways to develop an answer.
Notice that, for both methods, the set of test cases is completely contained within the set of programmed
behavior. Because structural methods are based on the program, it is hard to imagine these methods
identifying behaviors that are not programmed. It is easy to imagine, however, that a set of structural test
cases is relatively small with respect to the full set of programmed behaviors. In Chapter 11 we will see
direct comparisons of test cases generated by various structural methods.
In defense of structural testing, Edward Miller writes, "Branch coverage [a structural test coverage
metric], if attained at the 85% or better level, tends to identify twice the number of defects that would
have been found by 'intuitive' [functional] testing" (Miller, 1991).
The Venn diagrams presented earlier yield a strong resolution to the functional versus structural debate. Recall that the goal of both
approaches is to identify test cases. Functional testing uses only the specification to identify test cases,
while structural testing uses the program source code (implementation) as the basis of test case
identification. Our earlier discussion forces the conclusion that neither approach alone is sufficient.
Consider program behaviors: if all specified behaviors have not been implemented, structural test cases
will never be able to recognize this. Conversely, if the program implements behaviors that have not been
specified, this will never be revealed by functional test cases. (A Trojan horse is a good example of such
unspecified behavior.) The quick answer is that both approaches are needed; the testing craftsperson's
answer is that a judicious combination will provide the confidence of functional testing and the
measurement of structural testing.
Earlier, we asserted that functional testing often suffers from twin problems of redundancies and gaps.
When functional test cases are executed in combination with structural test coverage metrics, both of
these problems can be recognized and resolved (Figure 1.8).
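A small Python sketch shows how running functional test cases under a structural coverage measure can expose both problems. Everything here is hypothetical: the unit `grade`, its thresholds, the hand-written branch bookkeeping, and the working definition of a redundant test as one that covers no new branch.

```python
covered = set()   # branch labels reached so far

ALL_BRANCHES = {"A", "pass", "fail"}


def grade(score: int) -> str:
    """Hypothetical unit under test with three branches."""
    if score >= 90:
        covered.add("A")
        return "A"
    if score >= 50:
        covered.add("pass")
        return "pass"
    covered.add("fail")
    return "fail"


# Functional test cases derived from an assumed specification: (input, expected output).
functional_tests = [(95, "A"), (99, "A"), (60, "pass")]


def run_with_coverage(tests):
    """Run each test; report tests that add no coverage and branches never reached."""
    covered.clear()
    redundant = []
    for score, expected in tests:
        before = set(covered)
        assert grade(score) == expected
        if covered == before:          # this test exercised nothing new
            redundant.append((score, expected))
    gaps = ALL_BRANCHES - covered
    return redundant, gaps
```

For the three tests listed, the second test repeats the branch reached by the first (a redundancy), and the "fail" branch is never executed (a gap). In practice a coverage tool would do the bookkeeping that is written by hand here.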
The Venn diagram view of testing provides one final insight. What is the relationship between the set T
of test cases and the sets S and P of specified and implemented behaviors? Clearly, the test cases in T
are determined by the test case identification method used. A very good question to ask is how
appropriate (or effective) is this method? To close a loop from an earlier discussion, recall the causal trail
from error to fault, failure, and incident.
If we know what kind of errors we are prone to make, and if we know what kinds of faults are likely to
reside in the software to be tested, we can use this to employ more appropriate test case identification
methods. This is the point at which testing really becomes a craft.
Faults can be classified in several ways: the development phase in which the corresponding error
occurred, the consequences of corresponding failures, difficulty to resolve, risk of no resolution, and so
on. My favorite is based on anomaly occurrence: one time only, intermittent, recurring, or repeatable.
Figure 1.9 contains a fault taxonomy (Beizer, 1984) that distinguishes faults by the severity of their
consequences.
For a comprehensive treatment of types of faults, see the IEEE Standard Classification for Software
Anomalies (IEEE, 1993). (A software anomaly is defined in that document as "a departure from the
expected," which is pretty close to our definition.) The IEEE standard defines a detailed anomaly
resolution process built around four phases (another life cycle): recognition, investigation, action, and
disposition. Some of the more useful anomalies are given in Table 1.1 through Table 1.5; most of these
are from the IEEE standard, but I have added some of my favorites.
Levels of Testing
Thus far, we have said nothing about one of the key concepts of testing — levels of abstraction. Levels
of testing echo the levels of abstraction found in the waterfall model of the software development life
cycle. Although this model has its drawbacks, it is useful for testing as a means of identifying distinct
levels of testing and for clarifying the objectives that pertain to each level.
A practical relationship exists between the levels of testing and the functional and structural approaches. Most
practitioners agree that structural testing is most appropriate at the unit level, while functional testing is
most appropriate at the system level. This is generally true, but it is also a likely consequence of the base
information produced during the requirements specification, preliminary design, and detailed design
phases. The constructs defined for structural testing make the most sense at the unit level, and similar
constructs are only now becoming available for the integration and system levels of testing. We develop
such structures in Part IV to support structural testing at the integration and system levels for both
traditional and object-oriented software.
Examples
Three examples will be used to illustrate the various unit testing methods. They are the triangle problem
(a venerable example in testing circles); a logically complex function, NextDate; and an example that
typifies Management Information Systems (MIS) applications, known here as the commission problem.
Taken together, these examples raise most of the issues that testing craftspersons will encounter at the
unit level. The discussion of integration and system testing in Part IV uses three other examples: a
simplified version of an automated teller machine (ATM), known here as the simple ATM (SATM)
system; the currency converter, an event-driven application typical of graphical user interface (GUI)
applications; and the windshield wiper control device from the Saturn™ automobile. Finally, an object-
oriented version of NextDate is provided, called o-oCalendar, which is used to illustrate aspects of
testing object-oriented software.
For the purposes of structural testing, pseudocode implementations of the three unit-level examples are
given in this chapter. System-level descriptions of the SATM system, the currency converter, and the
Saturn windshield wiper system are also given. These applications are described both traditionally (with E/R diagrams,
dataflow diagrams, and finite state machines) and with the de facto object-oriented standard, the Unified
Modeling Language (UML).
Generalized Pseudocode
Pseudocode provides a "language neutral" way to express program source code. This version is loosely
based on Visual Basic and has constructs at two levels: unit and program components. Units can be
interpreted either as traditional components (procedures and functions) or as object-oriented components
(classes and objects). This definition is somewhat informal; terms such as expression, variable list, and
field description are used with no formal definition. Items in angle brackets indicate language elements
that can be used at the identified positions. Part of the value of any pseudocode is the suppression of
unwanted detail; here, we illustrate this by allowing natural language phrases in place of more formal,
complex conditions (see Table 2.1).
Problem Statement
Simple version: The triangle program accepts three integers, a, b, and c, as input. These are taken to be
sides of a triangle. The output of the program is the type of triangle determined by the three sides:
Equilateral, Isosceles, Scalene, or Not A Triangle. Sometimes this problem is extended to include right
triangles as a fifth type; we will use this extension in some of the exercises.
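Improved version: The triangle program accepts three integers, a, b, and c, as input. These are taken
to be sides of a triangle. The integers a, b, and c must satisfy the following conditions:
c1. 1 ≤ a ≤ 200
c2. 1 ≤ b ≤ 200
c3. 1 ≤ c ≤ 200
c4. a < b + c
c5. b < a + c
c6. c < a + b
The output of the program is the type of triangle determined by the three sides: Equilateral,
Isosceles, Scalene, or NotATriangle. If an input value fails any of conditions c1, c2, or c3, the program
notes this with an output message, for example, "Value of b is not in the range of permitted values." If
values of a, b, and c satisfy conditions c1, c2, and c3, one of four mutually exclusive outputs is given:
1. If all three sides are equal, the program output is Equilateral.
2. If exactly one pair of sides is equal, the program output is Isosceles.
3. If no pair of sides is equal, the program output is Scalene.
4. If any of conditions c4, c5, and c6 is not met, the program output is NotATriangle.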
Discussion
Perhaps one of the reasons for the longevity of this example is that it contains clear but complex logic. It
also typifies some of the incomplete definitions that impair communication among customers,
developers, and testers. The first specification presumes the developers know some details about
triangles, particularly the triangle inequality: the sum of any pair of sides must be strictly greater than
the third side. The upper limit of 200 is both arbitrary and convenient; it will be used when we develop
boundary value test cases.
Traditional Implementation
The "traditional" implementation of this grandfather of all examples has a rather Fortran-like style. The
flowchart for this implementation appears in Figure 2.1. The flowchart box numbers correspond to
comment numbers in the (Fortran-like) pseudocode program given next.
If match = 0
   Then If (a+b) <= c
           Then Output ("NotATriangle")
           Else If (b+c) <= a
                   Then Output ("NotATriangle")
                   Else If (a+c) <= b ' (10)
                           Then Output ("NotATriangle")
                           Else Output ("Scalene")
                        EndIf
                EndIf
        EndIf
   Else If match = 1 ' a = b, so compare the two equal sides with c
           Then If (a+b) <= c
                   Then Output ("NotATriangle")
                   Else Output ("Isosceles")
                EndIf
           Else If match = 2 ' a = c
                   Then If (a+c) <= b
                           Then Output ("NotATriangle")
                           Else Output ("Isosceles")
                        EndIf
                   Else If match = 3 ' b = c
                           Then If (b+c) <= a
                                   Then Output ("NotATriangle")
                                   Else Output ("Isosceles")
                                EndIf
                           Else Output ("Equilateral")
                        EndIf
                EndIf
        EndIf
EndIf
End Triangle1
The variable "match" is used to record equality among pairs of the sides. A classical intricacy of the
Fortran style is connected with the variable "match": notice that not all three tests for the triangle
inequality are performed in every case. If two sides are equal, say, a and c, it is only necessary to compare a + c with b.
(Because b must be greater than zero, a + b must be greater than c, because c equals a.) This observation
clearly reduces the number of comparisons that must be made.
The efficiency of this version is obtained at the expense of clarity (and ease of testing). Notice that six
ways are used to reach the NotATriangle box, and three ways are used to reach the Isosceles box.
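The "match" technique can be sketched end to end in Python. This is an illustration under assumptions rather than the original listing: the function name `triangle1`, the way `match` is built (add 1 when a = b, 2 when a = c, 3 when b = c), and the restriction to positive integer sides are inferred from the values tested in the pseudocode and from the observation that only the two equal sides need to be compared with the third.

```python
def triangle1(a: int, b: int, c: int) -> str:
    """Classify a triangle using the Fortran-style 'match' encoding (sides assumed positive)."""
    # Assumed encoding: 0 = no equal pair, 1 = a==b, 2 = a==c, 3 = b==c, 6 = all equal.
    match = 0
    if a == b:
        match += 1
    if a == c:
        match += 2
    if b == c:
        match += 3

    if match == 0:
        # No equal sides: all three triangle inequality tests are needed.
        if a + b <= c or b + c <= a or a + c <= b:
            return "NotATriangle"
        return "Scalene"
    if match == 1:   # a == b: only a + b versus c can fail
        return "NotATriangle" if a + b <= c else "Isosceles"
    if match == 2:   # a == c: only a + c versus b can fail
        return "NotATriangle" if a + c <= b else "Isosceles"
    if match == 3:   # b == c: only b + c versus a can fail
        return "NotATriangle" if b + c <= a else "Isosceles"
    return "Equilateral"
```

For example, `triangle1(1, 1, 5)` has match = 1 and fails the single comparison 1 + 1 <= 5, so it is reported as NotATriangle. The many return paths mirror the multiple routes to the NotATriangle and Isosceles boxes noted above.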
Structured Implementation
Figure 2.2 is a dataflow diagram description of the triangle program. We could implement it as a main
program with the three indicated procedures. We will use this example later for unit testing; therefore, the three
procedures have been merged into one pseudocode program. Comment lines relate sections of the code
to the decomposition given in Figure 2.2.
' Step 2: Is A Triangle?
If (a < b + c) AND (b < a + c) AND (c < a + b)
   Then IsATriangle = True
   Else IsATriangle = False
EndIf
' Step 3: Determine Triangle Type
If IsATriangle
   Then If (a = b) AND (b = c)
           Then Output ("Equilateral")
           Else If (a ≠ b) AND (a ≠ c) AND (b ≠ c)
                   Then Output ("Scalene")
                   Else Output ("Isosceles")
                EndIf
        EndIf
   Else Output ("Not a Triangle")
EndIf
End Triangle2
If NOT (c3)
   Then Output ("Value of c is not in the range of permitted values")
EndIf
' Step 2: Is A Triangle?
If (a < b + c) AND (b < a + c) AND (c < a + b)
   Then IsATriangle = True
   Else IsATriangle = False
EndIf
' Step 3: Determine Triangle Type
If IsATriangle
   Then If (a = b) AND (b = c)
           Then Output ("Equilateral")
           Else If (a ≠ b) AND (a ≠ c) AND (b ≠ c)
                   Then Output ("Scalene")
                   Else Output ("Isosceles")
                EndIf
        EndIf
   Else Output ("Not a Triangle")
EndIf
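The structured version with input validation can be sketched in Python as follows. The function name `triangle3`, the returned list of output lines, and the lower bound of 1 on each side are assumptions for illustration; the upper limit of 200 and the message wording come from the text. Unlike the pseudocode fragment, this sketch stops after reporting range errors instead of continuing to the triangle test.

```python
def triangle3(a: int, b: int, c: int) -> list:
    """Return the lines the program would output for sides a, b, c."""
    # Step 1: range checks (assumed range 1..200 for each side).
    messages = []
    for name, value in (("a", a), ("b", b), ("c", c)):
        if not 1 <= value <= 200:
            messages.append(f"Value of {name} is not in the range of permitted values")
    if messages:
        return messages

    # Step 2: Is it a triangle?
    is_a_triangle = (a < b + c) and (b < a + c) and (c < a + b)
    if not is_a_triangle:
        return ["Not a Triangle"]

    # Step 3: Determine the triangle type.
    if a == b and b == c:
        return ["Equilateral"]
    if a != b and a != c and b != c:
        return ["Scalene"]
    return ["Isosceles"]
```

Separating the three steps makes each one testable on its own, which is the clarity that the Fortran-style version trades away for fewer comparisons.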
Problem Statement
NextDate is a function of three variables: month, day, and year. It returns the date of the day
after the input date. The month, day, and year variables have integer values subject to these
conditions:
c1. 1 ≤ month ≤ 12
c2. 1 ≤ day ≤ 31
c3. 1812 ≤ year ≤ 2012
As we did with the triangle program, we can make our specification stricter. This entails defining
responses for invalid values of the input day, month, and year. We can also
define responses for invalid combinations of inputs, such as June 31 of any year. If any of
conditions c1, c2, or c3 fails, NextDate produces an output indicating the corresponding variable
has an out-of-range value, for example, "Value of month not in the range 1..12." Because
numerous invalid day-month-year combinations exist, NextDate collapses these into one
message: "Invalid Input Date."
Discussion
Two sources of complexity exist in the NextDate function: the complexity of the input domain
discussed previously, and the rule that determines when a year is a leap year. A year is 365.2422
days long; therefore, leap years are used for the "extra day" problem. If we declared a leap year
every fourth year, a slight error would occur. The Gregorian calendar (after Pope Gregory)
resolves this by adjusting leap years on century years. Thus, a year is a leap year if it is divisible
by 4, unless it is a century year. Century years are leap years only if they are multiples of 400
(Inglis, 1961), so 1992, 1996, and 2000 are leap years, while the year 1900 is not. The NextDate
function also illustrates a sidelight of software testing. Many times, we find examples of Zipf's
law, which states that 80% of the activity occurs in 20% of the space. Notice how much of the
source code is devoted to leap year considerations. In the second implementation, notice how
much code is devoted to input value validation.
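The leap year rule just described fits in a few lines of Python. The function name `is_leap_year` is chosen for this sketch; the rule itself is the Gregorian one stated above.

```python
def is_leap_year(year: int) -> bool:
    """Gregorian rule: divisible by 4, except century years, which must be divisible by 400."""
    if year % 100 == 0:          # century year
        return year % 400 == 0
    return year % 4 == 0
```

With this rule, 1992, 1996, and 2000 are leap years, while 1900 is not, matching the examples in the text.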
Implementation
         Else If day = 29
                 Then tomorrowDay = 1
                      tomorrowMonth = 3
                 Else Output ("Cannot have Feb.", day)
              EndIf
      EndIf
   EndIf
EndCase
End NextDate
Do
   Output ("Enter today's date in the form MM DD YYYY")
   Input (month, day, year)
   c1 = (1 <= day) AND (day <= 31)
   c2 = (1 <= month) AND (month <= 12)
   c3 = (1812 <= year) AND (year <= 2012)
   If NOT (c1)
      Then Output ("Value of day not in the range 1..31")
   EndIf
   If NOT (c2)
      Then Output ("Value of month not in the range 1..12")
   EndIf
   If NOT (c3)
      Then Output ("Value of year not in the range 1812..2012")
   EndIf
Until c1 AND c2 AND c3
Case month Of
Case 1: month Is 1,3,5,7,8, Or 10: '31 day months (except Dec.)
   If day < 31
      Then tomorrowDay = day + 1
      Else
         tomorrowDay = 1
         tomorrowMonth = month + 1
   EndIf
EndIf
EndIf
EndCase
Output ("Tomorrow's date is", tomorrowMonth, tomorrowDay, tomorrowYear)
End NextDate2
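A compact Python sketch of the same behavior follows. It is not a transcription of the pseudocode: the function name `next_date`, the use of `ValueError` to carry the messages, and the table-style lookup of the last day of each month are choices made for this example, and the leap year test is delegated to the standard library's `calendar.isleap`.

```python
import calendar


def next_date(month: int, day: int, year: int) -> tuple:
    """Return (month, day, year) of the day after the given date."""
    # Range checks corresponding to the three input conditions.
    errors = []
    if not 1 <= day <= 31:
        errors.append("Value of day not in the range 1..31")
    if not 1 <= month <= 12:
        errors.append("Value of month not in the range 1..12")
    if not 1812 <= year <= 2012:
        errors.append("Value of year not in the range 1812..2012")
    if errors:
        raise ValueError("; ".join(errors))

    # Last valid day of the given month.
    if month == 2:
        last_day = 29 if calendar.isleap(year) else 28
    elif month in (4, 6, 9, 11):
        last_day = 30
    else:
        last_day = 31

    # Invalid combinations such as June 31 collapse into one message.
    if day > last_day:
        raise ValueError("Invalid Input Date.")

    if day < last_day:
        return (month, day + 1, year)
    if month == 12:
        return (1, 1, year + 1)
    return (month + 1, 1, year)
```

For instance, `next_date(2, 28, 2000)` gives February 29, 2000, while `next_date(2, 28, 1900)` rolls over to March 1, 1900. Even in this shorter form, most of the lines deal with validation and month lengths rather than the increment itself.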
Problem Statement
A rifle salesperson in the former Arizona Territory sold rifle locks, stocks, and barrels made by a gunsmith
in Missouri. Locks cost $45, stocks cost $30, and barrels cost $25. The salesperson had to sell at
least one complete rifle per month, and production limits were such that the most the salesperson could
sell in a month was 70 locks, 80 stocks, and 90 barrels. After each town visit, the salesperson sent a
telegram to the Missouri gunsmith with the number of locks, stocks, and barrels sold in that town. At the
end of a month, the salesperson sent a very short telegram showing -1 lock sold. The gunsmith then
knew the sales for the month were complete and computed the salesperson's commission as follows:
10% on sales up to (and including) $1000, 15% on the next $800, and 20% on any sales in excess of
$1800. The commission program produced a monthly sales report that gave the total number of locks,
stocks, and barrels sold, the salesperson's total dollar sales, and, finally, the commission.
Discussion
This example is somewhat contrived to make the arithmetic quickly visible to the reader. It might be
more realistic to consider some other additive function of several variables, such as various calculations
found in filling out a U.S. 1040 income tax form. (We will stay with rifles.) This problem separates into
three distinct pieces: the input data portion, in which we could deal with input data validation (as we did
for the triangle and NextDate programs); the sales calculation; and the commission calculation portion.
This time, we will omit the input data validation portion. We will replicate the telegram convention with
a sentinel-controlled While loop that is typical of MIS data gathering applications.
Implementation
totalBarrels = 0
Input(locks)
While NOT(locks = -1) 'Input device uses -1 to indicate end of data
   Input(stocks, barrels)
   totalLocks = totalLocks + locks
   totalStocks = totalStocks + stocks
   totalBarrels = totalBarrels + barrels
   Input(locks)
EndWhile
Output("Locks sold: ", totalLocks)
Output("Stocks sold: ", totalStocks)
Output("Barrels sold: ", totalBarrels)
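The sentinel-controlled loop and the commission rule from the problem statement can be sketched in Python. The names `commission` and `monthly_report`, and the representation of each telegram as a tuple whose first element is the lock count, are assumptions for this example; the prices and commission tiers are those given in the problem statement. Integer arithmetic is used for the percentages to avoid floating-point rounding.

```python
LOCK_PRICE, STOCK_PRICE, BARREL_PRICE = 45, 30, 25


def commission(sales: int) -> float:
    """10% up to $1000, 15% on the next $800, 20% on anything above $1800."""
    if sales > 1800:
        hundredths = 10 * 1000 + 15 * 800 + 20 * (sales - 1800)
    elif sales > 1000:
        hundredths = 10 * 1000 + 15 * (sales - 1000)
    else:
        hundredths = 10 * sales
    return hundredths / 100


def monthly_report(telegrams):
    """Total the telegrams until a lock count of -1 marks the end of the month."""
    total_locks = total_stocks = total_barrels = 0
    for telegram in telegrams:
        locks = telegram[0]
        if locks == -1:          # sentinel: sales for the month are complete
            break
        total_locks += locks
        total_stocks += telegram[1]
        total_barrels += telegram[2]
    sales = (LOCK_PRICE * total_locks
             + STOCK_PRICE * total_stocks
             + BARREL_PRICE * total_barrels)
    return total_locks, total_stocks, total_barrels, sales, commission(sales)
```

As a worked check, the monthly maximum of 70 locks, 80 stocks, and 90 barrels gives sales of 3150 + 2400 + 2250 = $7800, and a commission of 100 + 120 + 0.20 × 6000 = $1420.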
Problem Statement
The SATM system communicates with bank customers via the 15 screens shown in Figure 2.4. Using a
terminal with features as shown in Figure 2.3, SATM customers can select any of three transaction
types: deposits, withdrawals, and balance inquiries. These transactions can be done on two types of
accounts: checking and savings.
When a bank customer arrives at an SATM station, screen 1 is displayed. The bank customer accesses
the SATM system with a plastic card encoded with a personal account number (PAN), which is a key to
an internal customer account file, containing, among other things, the customer's name and account
information. If the customer's PAN matches the information in the customer account file, the system
presents screen 2 to the customer. If the customer's PAN is not found, screen 4 is displayed, and the card
is kept.
At screen 2, the customer is prompted to enter his or her personal identification number (PIN). If the
PIN is correct (i.e., matches the information in the customer account file), the system displays screen 5;
otherwise, screen 3 is displayed. The customer has three chances to get the PIN correct; after three
failures, screen 4 is displayed, and the card is kept.
On entry to screen 5, the system adds two pieces of information to the customer's account file: the
current date and an increment to the number of ATM sessions. The customer selects the desired
transaction from the options shown on screen 5; then the system immediately displays screen 6, where
the customer chooses the account to which the selected transaction will be applied.
If a balance is requested, the system checks the local ATM file for any unposted transactions and
reconciles these with the beginning balance for that day from the customer account file. Screen 14 is
then displayed.
If a deposit is requested, the status of the deposit envelope slot is determined from a field in the
terminal control file. If no problem is known, the system displays screen 7 to get the transaction amount.
If a problem occurs with the deposit envelope slot, the system displays screen 12. Once the deposit
amount has been entered, the system displays screen 13, accepts the deposit envelope, and processes the
deposit. The deposit amount is entered as an unposted amount in the local ATM file, and the count of
deposits per month is incremented. Both of these (and other information) are processed by the master
ATM (centralized) system once a day. The system then displays screen 14.
If a withdrawal is requested, the system checks the status (jammed or free) of the withdrawal
chute in the terminal control file. If jammed, screen 10 is displayed; otherwise, screen 7 is displayed so
the customer can enter the withdrawal amount. Once the withdrawal amount is entered, the system
checks the terminal status file to see if it has enough money to dispense. If it does not, screen 9 is
displayed; otherwise, the withdrawal is processed. The system checks the customer balance (as
described in the balance request transaction); if the funds are insufficient, screen 8 is displayed. If the
account balance is sufficient, screen 11 is displayed and the money is dispensed.
The withdrawal amount is written to the unposted local ATM file, and the count of withdrawals per
month is incremented. The balance is printed on the transaction receipt as it is for a balance request
transaction. After the cash has been removed, the system displays screen 14.
When the "No" button is pressed in screen 10, 12, or 14, the system presents screen 15 and
returns the customer's ATM card. Once the card is removed from the card slot, screen 1 is displayed.
When the "Yes" button is pressed in screen 10, 12, or 14, the system presents screen 5 so the customer
can select additional transactions.
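Descriptions like this one translate naturally into screen-transition logic that can be tested. The Python sketch below covers only the PIN entry rule: the function name `pin_entry` and its list-of-screens return value are hypothetical, and it assumes that the third failed attempt leads directly to screen 4 without showing screen 3 again, a detail the description leaves open.

```python
def pin_entry(entered_pins, correct_pin, max_attempts=3):
    """Return the sequence of screens shown, starting at screen 2 (PIN prompt)."""
    screens = [2]
    for attempt, pin in enumerate(entered_pins[:max_attempts], start=1):
        if pin == correct_pin:
            screens.append(5)        # transaction selection
            return screens
        if attempt == max_attempts:
            screens.append(4)        # card is kept
            return screens
        screens.append(3)            # incorrect PIN, try again
    return screens                   # customer stopped before a decision was reached
```

A correct PIN on the second try yields the sequence 2, 3, 5, and three wrong PINs yield 2, 3, 3, 4. Writing the rule down this way surfaces exactly the kind of buried assumption that a tester would want confirmed.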
Discussion
A surprising amount of information is "buried" in the system description just given. For instance, if you
read it closely, you can infer that the terminal only contains $10 bills (see screen 7). This textual
definition is probably more precise than what is usually encountered in practice. The example is
deliberately simple (hence the name).
A plethora of questions could be resolved by a list of assumptions. For example: Is there a borrowing
limit? What keeps a customer from taking out more than his actual balance if he goes to several
ATM terminals? Many start-up questions also arise: How much cash is initially in the machine? How
are new customers added to the system? These and other real-world refinements are eliminated to
maintain simplicity.