
BLACK BOX TESTING

Black-box Testing
[Figure: the black-box testing model. Input test data (Ie) is fed to the System; the tester looks for inputs causing anomalous behaviour and for output test results (Oe) that reveal the presence of defects.]
Black-box Testing
• Black-box testing, also called behavioral
testing, focuses on the functional
requirements of the software.
• That is, black-box testing enables the software
engineer to derive sets of input conditions
that will fully exercise all functional
requirements for a program.
Relationship Between Black Box and
White Box Testing
Black-box Testing
• Black-box testing is not an alternative to
white-box techniques.
• Rather, it is a complementary approach that is
likely to uncover a different class of errors
than white-box methods.
What Kinds of Errors Can We Identify in Black Box Testing
Black-box Testing
• Tests are designed to answer the following questions:
• How is functional validity tested?
• How is system behavior and performance tested?
• What classes of input will make good test cases?
• Is the system particularly sensitive to certain input
values?
• How are the boundaries of a data class isolated?
• What data rates and data volume can the system
tolerate?
• What effect will specific combinations of data have
on system operation?
Black-box Testing
• Black-box testing attempts to find errors in the
following categories:
(1) incorrect or missing functions,
(2) interface errors,
(3) errors in data structures or external data
base access,
(4) behavior or performance errors, and
(5) initialization and termination errors.
When to Do Black Box Testing
Black-box Testing
• Unlike white-box testing, which is performed
early in the testing process, black-box testing
tends to be applied during later stages of
testing.
• Because black-box testing purposely
disregards control structure, attention is
focused on the information domain.
Types of Black Box Testing
• Equivalence Partitioning
• Boundary value analysis
• Graph-Based Testing Methods
• Requirements-Based Testing
• Random Testing
Equivalence Partitioning
• Partition system inputs and outputs into
“equivalence sets”:
– If the input is a 5-digit integer between 10,000 and 99,999, the equivalence partitions are values less than 10,000, values in the range 10,000-99,999, and values greater than 99,999.
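To make the partitions concrete, here is a minimal Python sketch (added for illustration, not part of the original slides) that classifies a candidate input into one of the three partitions; the representative test values are illustrative assumptions.

```python
# Equivalence partitioning for the 5-digit integer example.
# Partition bounds (10,000 and 99,999) come from the slide;
# the representative test values below are illustrative choices.

def classify(value: int) -> str:
    """Return the equivalence partition a candidate input falls into."""
    if value < 10_000:
        return "invalid: below range"
    if value > 99_999:
        return "invalid: above range"
    return "valid: in range 10,000..99,999"

# One representative per partition suffices under the
# equivalence-partitioning assumption.
for v in (9_999, 50_000, 100_000):
    print(v, "->", classify(v))
```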
Equivalence Partitioning
• An equivalence class represents a set of valid or invalid states for input conditions.
• Typically, an input condition is either a specific numeric value, a range of values, a set of related values, or a Boolean condition.
[Figure: sets of invalid and valid inputs feeding the System, which produces Outputs.]
Equivalence partitioning
• Test selection using equivalence partitioning allows a tester to subdivide the input domain into a relatively small number of sub-domains.
• Each subset is known as an equivalence class.
Aim
• The equivalence class testing strategy is developed with the following aims:
– To test as few input values as possible
– To spot and eliminate redundant tests
– To tell how many tests are necessary to test a given piece of software to a known level
Example 1
• Consider an application A that takes an integer denoted by age as input. Suppose that the only legal values of age are in the range [1..120]. The set of input values is now divided into a set E containing all integers in the range [1..120] and a set U containing the remaining integers.
[Figure: the set of all integers partitioned into [1..120] and the other integers.]
Example 1 (contd.)
• Further, assume that the application is required to process all values in the range [1..61] in accordance with requirement R1 and those in the range [62..120] according to requirement R2. Thus E is further subdivided into two regions depending on the expected behavior.
• Similarly, it is expected that all invalid inputs less than 1 are to be treated in one way while all greater than 120 are to be treated differently. This leads to a subdivision of U into two categories.
Example 1 (contd.)
[Figure: the set of all integers partitioned into four regions: less than 1, [1..61], [62..120], and greater than 120.]
Example 1 (contd.)
Tests selected using the equivalence partitioning technique aim at
targeting faults in the application under test with respect to inputs in
any of the four regions, i.e. two regions containing expected inputs
and two regions containing the unexpected inputs.

It is expected that any single test selected from the range [1..61]
will reveal any fault with respect to R1. Similarly, any test
selected from the region [62..120] will reveal any fault with
respect to R2. A similar expectation applies to the two regions
containing the unexpected inputs.

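As a worked sketch (added here for illustration, not from the slides), the four regions of Example 1 can be captured in Python; `region` is a hypothetical stand-in for application A's handling of age, and one test is drawn from each region.

```python
# The four regions from Example 1: two valid (R1, R2) and two invalid.

def region(age: int) -> str:
    """Hypothetical stand-in for application A's classification of age."""
    if age < 1:
        return "invalid: age < 1"
    if age <= 61:
        return "valid: requirement R1 applies ([1..61])"
    if age <= 120:
        return "valid: requirement R2 applies ([62..120])"
    return "invalid: age > 120"

# One test selected from each of the four regions.
for age in (0, 30, 90, 121):
    print(age, "->", region(age))
```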
Types of Equivalence Class Partitioning
• Uni-dimensional
• Multi-dimensional
Unidimensional partitioning
One way to partition the input domain is to consider one
input variable at a time. Thus each input variable leads to
a partition of the input domain. We refer to this style of
partitioning as unidimensional equivalence partitioning or
simply unidimensional partitioning.

This type of partitioning is commonly used.

Multidimensional partitioning
Another way is to consider the input domain I as the set product
of the input variables and define a relation on I. This procedure
creates one partition consisting of several equivalence classes.
We refer to this method as multidimensional equivalence
partitioning or simply multidimensional partitioning.
Multidimensional partitioning leads to a large number of
equivalence classes that are difficult to manage manually.

Partitioning Example
• Consider an application that requires two integer inputs x and y. Each of these inputs is expected to lie in the following ranges: 3 ≤ x ≤ 7 and 5 ≤ y ≤ 9.
• For unidimensional partitioning we apply the partitioning guidelines to x and y individually. This leads to the following six equivalence classes.
Partitioning Example (contd.)
• E1: x < 3, E2: 3 ≤ x ≤ 7, E3: x > 7 (y ignored)
• E4: y < 5, E5: 5 ≤ y ≤ 9, E6: y > 9 (x ignored)
• For multidimensional partitioning we consider the input domain to be the set product X × Y. This leads to 9 equivalence classes.
Partitioning Example (contd.)
• E1: x < 3, y < 5    E2: x < 3, 5 ≤ y ≤ 9    E3: x < 3, y > 9
• E4: 3 ≤ x ≤ 7, y < 5    E5: 3 ≤ x ≤ 7, 5 ≤ y ≤ 9    E6: 3 ≤ x ≤ 7, y > 9
• E7: x > 7, y < 5    E8: x > 7, 5 ≤ y ≤ 9    E9: x > 7, y > 9
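The nine classes need not be written out by hand; a short sketch (illustrative, not from the slides) derives them as the set product of the three unidimensional classes of each variable:

```python
# Enumerate the multidimensional equivalence classes for
# 3 <= x <= 7 and 5 <= y <= 9 as the set product X x Y.
from itertools import product

x_classes = ["x < 3", "3 <= x <= 7", "x > 7"]
y_classes = ["y < 5", "5 <= y <= 9", "y > 9"]

for i, (cx, cy) in enumerate(product(x_classes, y_classes), start=1):
    print(f"E{i}: {cx}, {cy}")  # prints E1..E9 as listed above
```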
Black Box Testing in Detail
• Black box testing is based on the functional specification, i.e., on what we are given in the functional specifications.
Boundary value analysis

Errors at the boundaries
• Experience indicates that programmers make mistakes in processing values at and near the boundaries of equivalence classes.
• For example, suppose that method M is required to compute a function f1 when x ≤ 0 is true and function f2 otherwise. However, M has an error due to which it computes f1 for x < 0 and f2 otherwise.
• This fault is revealed (though not necessarily) when M is tested against x = 0, but not if the input test set is, for example, {-4, 7} derived using equivalence partitioning. In this example, the value x = 0 lies at the boundary of the equivalence classes x ≤ 0 and x > 0.
Boundary value analysis (BVA)
• Boundary value analysis is a test selection technique that targets faults in applications at the boundaries of equivalence classes.
• While equivalence partitioning selects tests from within equivalence classes, boundary value analysis focuses on tests at and near the boundaries of equivalence classes.
• Certainly, tests derived using either of the two techniques may overlap.
BVA: Procedure
1. Partition the input domain using unidimensional partitioning. This leads to as many partitions as there are input variables. Alternately, a single partition of an input domain can be created using multidimensional partitioning. We will generate several sub-domains in this step.
2. Identify the boundaries for each partition. Boundaries may also be identified using special relationships amongst the inputs.
3. Select test data such that each boundary value occurs in at least one test input.
BVA: Example: 1. Create equivalence classes
• Assume that an item code must be in the range 99-999 and quantity in the range 1-100.
• Equivalence classes for code:
E1: Values less than 99.
E2: Values in the range.
E3: Values greater than 999.
• Equivalence classes for qty:
E4: Values less than 1.
E5: Values in the range.
E6: Values greater than 100.
BVA: Example: 2. Identify boundaries
[Figure: equivalence classes and boundaries for findPrice. For code, the boundaries of E2 are 99 and 999, with nearby points 98, 100, 998, and 1000 falling in E1/E3. For qty, the boundaries of E5 are 1 and 100, with nearby points 0, 2, 99, and 101 falling in E4/E6. Boundaries are indicated with an x; points near the boundary are marked *.]
BVA: Example: 3. Construct test set
• Test selection based on the boundary value analysis technique requires that tests must include, for each variable, values at and around the boundary. Consider the following test set, which includes illegal values of code and qty:
T = { t1: (code=98, qty=0),
t2: (code=99, qty=1),
t3: (code=100, qty=2),
t4: (code=998, qty=99),
t5: (code=999, qty=100),
t6: (code=1000, qty=101) }
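The test set T can be generated mechanically. The sketch below (an illustration, not from the slides) applies the usual BVA convention of taking values just below, at, and just above each boundary, then pairs them up to reproduce t1..t6:

```python
# Boundary-value test generation for findPrice.
# Ranges come from the slides: code in 99..999, qty in 1..100.

def boundary_values(lo: int, hi: int) -> list[int]:
    """Values just below, at, and just above each boundary of a range."""
    return [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]

codes = boundary_values(99, 999)  # [98, 99, 100, 998, 999, 1000]
qtys = boundary_values(1, 100)    # [0, 1, 2, 99, 100, 101]

# Pair the i-th code with the i-th qty, matching tests t1..t6 above.
for i, (code, qty) in enumerate(zip(codes, qtys), start=1):
    print(f"t{i}: (code={code}, qty={qty})")
```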
Important Observation
• Functionality coverage testing versus acceptance testing
How to Choose the Input in Functionality Testing
• What is experimental design?
Input Coverage Methods
• Types of input coverage methods
Output Coverage Testing
• What we need for output coverage testing
What Is Grey Box Testing
• Example of grey box testing
Graph-Based Testing Methods
• The first step in black-box testing is to
understand the objects that are modeled in
software and the relationships that connect
these objects.
• Once this has been accomplished, the next
step is to define a series of tests that verify “all
objects have the expected relationship to one
another [BEI95].”
Graph-Based Testing Methods
• Stated in another way, software testing begins
by creating a graph of important objects and
their relationships and then devising a series
of tests that will cover the graph so that each
object and relationship is exercised and errors
are uncovered.
Graph-Based Testing Methods
• Nodes are represented as circles connected by links that take a number of
different forms. A directed link (represented by an arrow) indicates that a
relationship moves in only one direction. A bidirectional link, also called a
symmetric link, implies that the relationship applies in both directions.
Parallel links are used when a number of different relationships are
established between graph nodes.
Graph-Based Testing Methods
• Object #1 = new file menu select
• Object #2 = document window
• Object #3 = document text
Transaction flow modeling
• The nodes represent steps in some transaction (e.g., the steps required to make an airline reservation using an on-line service), and the links represent the logical connection between steps.
• The data flow diagram can be used to assist in creating graphs of this type.
Finite state modeling
• The nodes represent different user observable
states of the software (e.g., each of the
“screens” that appear as an order entry clerk
takes a phone order), and the links represent
the transitions that occur to move from state
to state.
Timing modeling
• The nodes are program objects and the links
are the sequential connections between those
objects.
• Link weights are used to specify the required
execution times as the program executes.
Example: A Very Simple Finite State Model of the Clock
[Figure: a simple state model of the Clock showing its display modes (Analog, Digital) and the transitions between them.]
Test cases
• We could use this very simple state model as a
basis for tests, where following a path in the
model is equivalent to running a test:
– Setup:
• Put the Clock into its Analog display mode
– Action:
• Click on “Settings\Digital”
– Outcome:
• Does the Clock correctly change to the Digital display?

Actions
• Consider the following actions in the Clock:
– Start the Clock application
– Stop the Clock application
– Select Analog setting
– Select Digital setting
The rules for actions
• The rules for these actions in the Clock application are as follows:
• Start
– If the application is NOT running, the user can execute the Start command.
– If the application is running, the user cannot execute the Start command.
– After the Start command executes, the application is running.
• Stop
– If the application is NOT running, the user cannot execute the Stop command.
– If the application is running, the user can execute the Stop command.
– After the Stop command executes, the application is not running.

• Analog
– If the application is NOT running, the user cannot execute the Analog command.
– If the application is running, the user can execute the Analog command.
– After the Analog command executes, the application is in Analog display mode.
• Digital
– If the application is NOT running, the user cannot execute the Digital command.
– If the application is running, the user can execute the Digital command.
– After the Digital command executes, the application is in Digital display mode.
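These rules translate directly into an executable model. The sketch below is illustrative; the ClockModel class and its method names are assumptions, not the real application's API. The rules are encoded as assertions so that an illegal command fails loudly.

```python
# A minimal executable model of the Clock rules above.

class ClockModel:
    def __init__(self):
        self.running = False
        self.display = "ANALOG"  # assumed initial display mode

    def start(self):
        assert not self.running, "Start is illegal while running"
        self.running = True

    def stop(self):
        assert self.running, "Stop is illegal while not running"
        self.running = False

    def analog(self):
        assert self.running, "Analog is illegal while not running"
        self.display = "ANALOG"

    def digital(self):
        assert self.running, "Digital is illegal while not running"
        self.display = "DIGITAL"

# One path through the model, mirroring the earlier test case:
clock = ClockModel()
clock.start()
clock.digital()
assert clock.display == "DIGITAL"  # expected outcome
clock.stop()
```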

Operational modes
• Two modes of operations:
– system mode
• NOT_RUNNING means Clock is not running
• RUNNING means Clock is running
– setting mode
• ANALOG means Analog display is set
• DIGITAL means Digital display is set

State Transition Diagram for the Clock Model
[Figure: state transition diagram combining the system mode (NOT_RUNNING, RUNNING) and setting mode (ANALOG, DIGITAL), with Start, Stop, Analog, and Digital transitions.]
An Example
[Figure: a state model of a phone call with states Idle, Ringing, Connected, and On Hold; "Caller hung up" returns Ringing to Idle, and "You hung up" returns Connected to Idle.]
Other Black Box Testing Techniques
– Domain
– Function
– Regression
– Specification-based
– User
– Scenario
– Risk-based
– Stress
– High volume stochastic or random
Domain Testing
Key Idea: “Divide and conquer the data”
Summary:
- Look for any data processed by the product. Look at outputs as well as inputs.
- Decide which data to test with. Consider things like boundary values, typical values, convenient values, invalid values, or best representatives.
- Consider combinations of data worth testing together.
Good for: all purposes
Domain Testing
• AKA partitioning, equivalence analysis, boundary
analysis
• Fundamental question or goal:
– This confronts the problem that there are too
many test cases for anyone to run.
– This is a sampling strategy that provides a
rationale for selecting a few test cases from a huge
population.
Domain Testing
• General approach:
– Divide the set of possible values of a field into subsets,
pick values to represent each subset. Typical values will
be at boundaries. More generally, the goal is to find a
“best representative” for each subset, and to run tests
with these representatives.
– Advanced approach: combine tests of several “best representatives”. There are several approaches to choosing an optimal small set of combinations; see the sketch below.
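A sketch of that advanced approach (illustrative; the field names and values are assumptions): take the best representatives of each field and combine them, here with a full cross product, though practical tools usually thin this out (e.g., with pairwise selection).

```python
# Combining "best representatives" of several fields into test cases.
from itertools import product

representatives = {
    "age": [0, 30, 121],       # one representative per equivalence class
    "country": ["US", "ZZ"],   # a valid and an invalid code
    "premium": [True, False],
}

combinations = list(product(*representatives.values()))
print(len(combinations), "combined test cases")  # 3 * 2 * 2 = 12
for combo in combinations:
    print(dict(zip(representatives, combo)))
```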
Domain Testing
• Paradigmatic case(s)
– Equivalence analysis of a simple numeric field.
Domain Testing
• Strengths
– Find highest probability errors with a relatively
small set of tests.
– Intuitively clear approach, generalizes well
• Blind spots
– Errors that are not at boundaries or in obvious
special cases.
– Also, the actual domains are often unknowable.
Domain Testing
• Some Key Tasks
– Partitioning into equivalence classes
– Discovering best representatives of the sub-classes
– Combining tests of several fields
– Create boundary charts
– Find fields / variables / environmental conditions
– Identify constraints (non-independence) in the relationships among variables.
Domain Testing
• Some Relevant Skills
– Identify ambiguities in specifications or descriptions of fields
– Find biggest / smallest values of a field
– Discover common and distinguishing characteristics of multi-dimensional fields that would justify classifying some values as “equivalent” to each other and different from other groups of values.
– Standard variable combination methods.
Function Testing
Key Idea: “Test what it can do”
Summary:
- A function is something the product can do.
- Identify each function and sub-function.
- Determine how you would know if they worked.
- Test each function, one at a time.
- See that each function does what it’s supposed to do, and not what it isn’t.
Good for: assessing capability rather than reliability
Function Testing
• Tag line
– “Black box unit testing.”
• Fundamental question or goal
– Test each function thoroughly, one at a time.
• Paradigmatic case(s)
– Spreadsheet, test each item in isolation.
– Database, test each report in isolation
• Strengths
– Thorough analysis of each item tested
• Blind spots
– Misses interactions, misses exploration of the benefits
offered by the program.
Some Function Testing Tasks
• Identify the program’s features / commands
– From specifications or the draft user manual
– From walking through the user interface
– From trying commands at the command line
– From searching the program or resource files
for command names
• Identify variables used by the functions and test their
boundaries.
• Identify environmental variables that may constrain the
function under test.
• Use each function in a mainstream way (positive testing) and
push it in as many ways as possible, as hard as possible.
Regression Testing
Key Idea: “Repeat testing after changes.”
Summary:
- Build a suite of tests.
- Run the tests when anything changes.
- Bug regression (show that a bug was not fixed)
- Old fix regression (show that an old bug fix was broken)
- General functional regression (show that a change caused a working area to break)
Good for: building confidence
Regression Testing
• Blind spots / weaknesses
– Anything not covered in the regression series.
– Repeating the same tests means not looking for the bugs
that can be found by other tests.
– Pesticide paradox
– Low yield from automated regression tests
– Maintenance of this standard list can be costly and
distracting from the search for defects.
Automating Regression Testing
• This is the most commonly discussed
automation approach:
– create a test case
– run it and inspect the output
– if the program fails, report a bug and try again
later
– if the program passes the test, save the resulting
outputs
– in future tests, run the program and compare the
output to the saved results. Report an exception
whenever the current output and the saved output
don’t match.
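A minimal golden-master sketch of this cycle is shown below (illustrative; `run_program` is a hypothetical stand-in for executing the system under test): the first run saves the output, and later runs compare against it.

```python
# Golden-master regression check: save output once, compare thereafter.
from pathlib import Path

def regression_check(test_id: str, run_program) -> str:
    golden = Path("golden") / f"{test_id}.txt"
    output = run_program()
    if not golden.exists():
        golden.parent.mkdir(exist_ok=True)
        golden.write_text(output)     # first run: save the baseline
        return "baseline saved"
    if output != golden.read_text():  # later runs: compare
        return "EXCEPTION: output differs from saved baseline"
    return "pass"

# Hypothetical program under test, standing in for the real system.
print(regression_check("t1", lambda: "hello\n"))
```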
Potential Regression Advantages
– Dominant paradigm for automated testing.
– Straightforward
– Same approach for all tests
– Relatively fast implementation
– Variations may be easy
– Repeatable tests
Testing Analogy: Clearing Mines
• This analogy was first presented by Brian Marick; these slides are from James Bach.
[Figure: minefield diagrams showing that totally repeatable tests won’t clear the minefield, while variable tests are often more effective at finding the mines that remain after fixes.]
Specification Based Testing
Key Idea: “Verify every claim”
Summary:
- Identify specifications (implicit or explicit).
- Analyze individual claims about the product.
- Work to clarify vague claims.
- Verify that each claim about the product is true.
- Expect the specification and product to be brought into alignment.
Good for: simultaneously testing the product and specification, while refining expectations
Specification-Driven Testing
• Strengths
– Critical defense against warranty claims, fraud charges,
loss of credibility with customers.
– Effective for managing scope / expectations of regulatory-
driven testing
– Reduces support costs / customer complaints by ensuring
that no false or misleading representations are made to
customers.
• Blind spots
– Any issues not in the specs or treated badly in the specs
/documentation.
Traceability Matrix
[Table: rows Test 1 through Test 6, columns Var 1 through Var 5; an X marks each variable a test covers. In the example, Test 1 and Test 3 each cover three variables, and Tests 2, 4, 5, and 6 each cover two.]
• A Var can be anything identified as needing testing (e.g., a feature, input, or result).
Traceability Matrix
• The columns involve different test items. A test item might
be a function, a variable, an assertion in a specification or
requirements document, a device that must be tested, any
item that must be shown to have been tested.
• The rows are test cases.
• The cells show which test case tests which items.
• If a feature changes, you can quickly see which tests must
be reanalyzed, probably rewritten.
• In general, you can trace back from a given item of interest
to the tests that cover it.
• This doesn’t specify the tests, it merely maps their
coverage.
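In code, the matrix reduces to a simple mapping, and the reverse lookup described above is one line. This sketch is illustrative; the test and item names are assumptions.

```python
# A traceability matrix as a mapping from test cases to covered items.
matrix = {
    "Test 1": {"Var 1", "Var 2", "Var 3"},
    "Test 2": {"Var 1", "Var 4"},
    "Test 3": {"Var 2", "Var 3", "Var 5"},
}

def tests_covering(item: str) -> list[str]:
    """Trace back from a changed item to the tests that must be reanalyzed."""
    return [test for test, items in matrix.items() if item in items]

print(tests_covering("Var 2"))  # -> ['Test 1', 'Test 3']
```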
Specification
• Tasks
– review specifications for
• Ambiguity
• Adequacy (it covers the issues)
• Correctness (it describes the program)
• Content (not a source of design errors)
• Testability support
– Create traceability matrices
– Document management (spec versions, file
comparison utilities for comparing two spec versions,
etc.)
– Participate in review meetings
Specification
• Skills
– Understand the level of generality called for when testing a
spec item. For example, imagine a field X:
• We could test a single use of X
• Or we could partition possible values of X and test
boundary values
• Or we could test X in various scenarios
• Which is the right one?
– Ambiguity analysis
Breaking Statements into Elements: An Example
• Quality is value to some person
– Quality
– Value
– Some
– Person
User Testing
Key Idea: “Involve the users”
Summary:
- Identify categories and roles of users.
- Determine what each category of user will do, how they will do it, and what they value.
- Get real user data, or bring real users in to test.
- Otherwise, systematically simulate a user.
- Powerful user testing involves a variety of users and user roles, not just one.
Good for: all purposes
User Testing
• Tag line
– Strive for realism
– Let’s try this with real humans (for a change).
• Fundamental question or goal
– Identify failures that will arise in the hands of a person, i.e.
breakdowns in the overall human/machine/software
system.
• Paradigmatic case(s)
– Beta testing
– In-house experiments using a stratified sample of target
market
– Usability testing
User Testing
• Strengths
– Design issues are more credibly exposed.
– Can demonstrate that some aspects of product are incomprehensible or lead
to high error rates in use.
– In-house tests can be monitored with flight recorders (capture/replay, video),
debuggers, other tools.
– In-house tests can focus on areas / tasks that you think are (or should be)
controversial.
• Blind spots
– Coverage is not assured (serious misses from beta test, other user tests)
– Test cases can be poorly designed, trivial, unlikely to detect subtle errors.
– Beta testing is not free, beta testers are not skilled as testers, the technical
results are mixed. Distinguish marketing betas from technical betas.
Scenario Testing
Key Idea: “Do one thing after another”
Summary:
- Define test procedures or high level cases that incorporate multiple activities connected end to end.
- Don’t reset the system between events.
- Can vary timing and sequencing, and try parallel threads.
Good for: finding problems fast (however, bug analysis is more difficult)
Scenario Testing
• Tag lines
– “Do something useful and interesting”
– “Do one thing after another.”
• Fundamental question or goal
– Challenging cases that reflect real use.
• Paradigmatic case(s)
– Appraise product against business rules, customer data,
competitors’ output
– Life history testing
– Use cases are a simpler form, often derived from
product capabilities and user model rather than from
naturalistic observation of systems of this kind.
Scenario Testing
• The ideal scenario has several characteristics:
– It is realistic (e.g. it comes from actual customer or competitor
situations).
– There is no ambiguity about whether a test passed or failed.
– The test is complex, that is, it uses several features and
functions.
– There is a stakeholder who has influence and will protest if the
program doesn’t pass this scenario.
• Strengths
– Complex, realistic events. Can handle (help with) situations that
are too complex to model.
– Exposes failures that occur (develop) over time
• Blind spots
– Single function failures can make this test inefficient.
– Must think carefully to achieve good coverage.
Scenarios
•Some ways to trigger thinking about scenarios:
– Benefits-driven: People want to achieve X. How will they do it,
for the following X’s?
– Sequence-driven: People (or the system) typically does task X
in an order. What are the most common orders (sequences) of
subtasks in achieving X?
– Transaction-driven: We are trying to complete a specific
transaction, such as opening a bank account or sending a
message. What are all the steps, data items, outputs and
displays, etc.?
– Get use ideas from competing product: Their docs,
advertisements, help, etc., all suggest best or most interesting
uses of their products. How would our product do these
things?
Scenarios
• Some ways to trigger thinking about scenarios:
– Competitor’s output driven: Hey, look at these cool documents they can make. Look (think of Netscape’s superb handling of often screwy HTML code) at how well they display things. How do we do with these?
– Customer’s forms driven: Here are the forms the customer produces in her business. How can we work with (read, fill out, display, verify, whatever) them?
Soap Operas
– Build a scenario based on real-life experience. This means
client/customer experience.
– Exaggerate each aspect of it:
• example, for each variable, substitute a more extreme
value
• example, if a scenario can include a repeating element,
repeat it lots of times
• make the environment less hospitable to the case
(increase or decrease memory, printer resolution, video
resolution, etc.)
– Create a real-life story that combines all of the elements into a test case
narrative.
Risk Based Testing
Key Idea: “Imagine a problem, then look for it.”
Summary:
- What kinds of problems could the product have?
- Which problems matter most? Focus on those.
- How would you detect them if they were there?
- Make a list of interesting problems and design tests specifically to reveal them.
- It may help to consult experts, design documentation, past bug reports, or apply risk heuristics.
Good for: making best use of testing resources; leveraging experience
Risk-Based Testing
• Tag line
– “Find big bugs first.”
• Fundamental question or goal
– Define and refine tests in terms of the kind of
problem (or risk) that you are trying to manage
– Prioritize the testing effort in terms of the relative
risk of different areas or issues we could test for.
• Paradigmatic case(s)
– Equivalence class analysis, reformulated.
– Test in order of frequency of use.
– Stress tests, error handling tests, security tests,
tests looking for predicted or feared errors.
– Sample from predicted-bugs list.
– Failure Mode and Effects Analysis (FMEA)
Equivalence and Risk
• Our working definition of equivalence:
Two test cases are equivalent if you expect the
same result from each.
• This is fundamentally subjective. It depends on what you expect. And what
you expect depends on what errors you can anticipate:
Two test cases can only be equivalent by
reference to a specifiable risk.
• Two different testers will have different theories about how programs can fail,
and therefore they will come up with different classes.
• A boundary case in this system is a “best representative.”
A best representative of an equivalence class is
a test that is at least as likely to expose a fault
as every other member of the class.
Risk-Based Testing
• Strengths
– Optimal prioritization (assuming we
correctly identify and prioritize the
risks)
– High power tests
• Blind spots
– Risks that were not identified or that
are surprisingly more likely.
– Some “risk-driven” testers seem to operate too subjectively. How will I know what level of coverage I’ve reached? How do I know that I haven’t missed something critical?
Evaluating Risk
• Several approaches that call themselves “risk-based
testing” ask which tests we should run and which we
should skip if we run out of time.
• I think this is only half of the risk story. The other half
focuses on test design.
– A key purpose of testing is to find defects. So, a key
strategy for testing should be defect-based. Every test
should be questioned:
• How will this test find a defect?
• What kind of defect do you have in mind?
• What power does this test have against that kind of defect? Is
there a more powerful test? A more powerful suite of tests?
Risk-Based Testing
• Many of us who think about testing in terms of risk, analogize
testing of software to the testing of theories:
– Karl Popper, in his famous essay Conjectures and
Refutations, lays out the proposition that a scientific
theory gains credibility by being subjected to (and
passing) harsh tests that are intended to refute the
theory.
– We can gain confidence in a program by testing it
harshly (if it passes the tests).
– Subjecting a program to easy tests doesn’t tell us
much about what will happen to the program in the
field.
• In risk-based testing, we create harsh tests for vulnerable
areas of the program.
Risk-Based Testing
• Two key dimensions:
– Find errors (risk-based approach to the
technical tasks of testing)
– Manage the process of finding errors (risk-
based test management)
Risk-Based Testing: Definitions
– Hazard:
• A dangerous condition (something that could
trigger an accident)

– Risk:
• Possibility of suffering loss or harm (probability of
an accident caused by a given hazard).

– Accident:
• A hazard is encountered, resulting in loss or harm.
Risks: Where to look for errors
• Quality Categories: each quality category is a risk category, as in “the risk of unreliability.”
– Capability
– Reliability
– Usability
– Performance
– Installability
– Compatibility
– Efficiency
– Maintainability
– Supportability
– Localizability
– Extendibility
– Testability
– Portability
» Derived from James Bach’s Satisfice Model
Risks: Where to look for errors
• New things: newer features may fail.
• New technology: new concepts lead to new mistakes.
• Learning Curve: mistakes due to ignorance.
• Changed things: changes may break old code.
• Late change: rushed decisions, rushed or demoralized staff
lead to mistakes.
• Rushed work: some tasks or projects are chronically
underfunded and all aspects of work quality suffer.
• Tired programmers: long overtime over several weeks or
months yields inefficiencies and errors
» Adapted from James Bach’s lecture notes
Risks: Where to look for errors
• Other staff issues: alcoholic, mother died, two
programmers who won’t talk to each other (neither
will their code)…
• Just slipping it in: pet feature not on plan may
interact badly with other code.
• N.I.H.: external components can cause problems.
• N.I.B.: (not in budget) Unbudgeted tasks may be
done shoddily.
• Ambiguity: ambiguous descriptions (in specs or
other docs) can lead to incorrect or conflicting
implementations.
» Adapted from James Bach’s lecture notes
Risks: Where to look for errors
• Conflicting requirements: ambiguity often hides conflict, result is loss
of value for some person.
• Unknown requirements: requirements surface throughout
development. Failure to meet a legitimate requirement is a failure of
quality for that stakeholder.
• Evolving requirements: people realize what they want as the product
develops. Adhering to a start-of-the-project requirements list may
meet the contract but fail the product. (Check out
http://www.agilealliance.org/.)
• Complexity: complex code may be buggy.
• Bugginess: features with many known bugs may also have many
unknown bugs.
» Adapted from James Bach’s lecture notes
Risks: Where to look for errors
• Dependencies: failures may trigger other failures.
• Untestability: risk of slow, inefficient testing.
• Little unit testing: programmers find and fix most of their own
bugs. Shortcutting here is a risk.
• Little system testing so far: untested software may fail.
• Previous reliance on narrow testing strategies: (e.g. regression,
function tests), can yield a backlog of errors surviving across
versions.
• Weak testing tools: if tools don’t exist to help identify / isolate a
class of error (e.g. wild pointers), the error is more likely to
survive to testing and beyond.
» Adapted from James Bach’s lecture notes
Risks: Where to look for errors
• Unfixability: risk of not being able to fix a bug.
• Language-typical errors: such as wild pointers in C. See
– Bruce Webster, Pitfalls of Object-Oriented Development
– Michael Daconta et al. Java Pitfalls
• Criticality: severity of failure of very important features.
• Popularity: likelihood or consequence if much used features fail.
• Market: severity of failure of key differentiating features.
• Bad publicity: a bug may appear in PC Week.
• Liability: being sued.
» Adapted from James Bach’s lecture notes
Bug Patterns as a Source of Risk
• Testing Computer Software lays out a set of 480 common defects. You can use
these or develop your own list.
– Find a defect in the list
– Ask whether the software under test could have this
defect
– If it is theoretically possible that the program could
have the defect, ask how you could find the bug if it
was there.
– Ask how plausible it is that this bug could be in the
program and how serious the failure would be if it
was there.
– If appropriate, design a test or series of tests for bugs
of this type.
Risk-Based Testing
• Tasks
– Identify risk factors (hazards: ways in which the program
could go wrong)
– For each risk factor, create tests that have power against it.
– Assess coverage of the testing effort, given a set
of risk-based tests. Find holes in the testing effort.
– Build lists of bug histories, configuration problems, tech
support requests and obvious customer confusions.
– Evaluate a series of tests to determine what risk they are
testing for and whether more powerful variants can be
created.
Risk-Based Test Management
• Project risk management involves
– Identification of the different risks to the project (issues
that might cause the project to fail or to fall behind
schedule or to cost too much or to dissatisfy customers or
other stakeholders)
– Analysis of the potential costs associated with each risk
– Development of plans and actions to reduce the likelihood
of the risk or the magnitude of the harm
– Continuous assessment or monitoring of the risks (or the
actions taken to manage them)
Risk-Driven Testing Cycle
[Figure: a cycle spanning pre-ship and post-ship activities: Analyze & Prioritize Hazards → Perform Appropriate Tests → Report & Resolve Problems → Analyze Failures: Reassess Risks → Improve Risk Analysis Process, and back to the start.]
Risk-Based Test Management
• Tasks
– List all areas of the program that could require testing
– On a scale of 1-5, assign a probability-of-failure estimate to each
– On a scale of 1-5, assign a severity-of-failure estimate to each
– For each area, identify the specific ways that the program might fail
and assign probability-of-failure and severity-of-failure estimates
for those
– Prioritize based on estimated risk
– Develop a stop-loss strategy for testing untested or lightly-tested
areas, to check whether there is easy-to-find evidence that the
areas estimated as low risk are not actually low risk.
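The prioritization step can be as simple as multiplying the two 1-5 estimates into a risk score, as in this sketch (the program areas and estimates are illustrative assumptions):

```python
# Rank program areas by estimated risk = probability * severity (1-5 each).
areas = [
    # (area, probability of failure, severity of failure)
    ("installer", 4, 3),
    ("report generation", 2, 5),
    ("login", 3, 5),
]

for name, prob, sev in sorted(areas, key=lambda a: a[1] * a[2], reverse=True):
    print(f"{name}: risk score = {prob * sev}")
```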
Stress Testing
Key Idea: “Overwhelm the product”
Summary:
- Look for functions or sub-systems of the product that may be vulnerable to failure due to challenging input or constrained resources.
- Identify input or resources related to those functions or sub-systems.
- Select or generate challenging data and platform configurations to test with: e.g., large or complex data structures, high loads, long test runs, many test cases, limited memory, etc.
Good for: performance, reliability, and efficiency assessment
Stress Testing
• Tag line
– “Overwhelm the product.”
• Fundamental question or goal
– Learn about the capabilities and weaknesses of the product by driving it
through failure and beyond. What does failure at extremes tell us about
changes needed in the program’s handling of normal cases?
• Paradigmatic case(s)
– Buffer overflow bugs
– High volumes of data, device connections, long transaction chains
– Low memory conditions, device failures, viruses, other crises.
• Strengths
– Expose weaknesses that will arise in the field.
– Expose security risks.
• Blind spots
– Weaknesses that are not made more visible by stress.
Volume Random Testing
Key Idea: “Run a million different tests”
Summary:
- Look for an opportunity to automatically generate thousands of slightly different tests.
- Create an automated, high speed oracle.
- Write a program to generate, execute, and evaluate all the tests.
Good for: assessing reliability across input and time.
Random / Statistical Testing
• Strengths
– Regression doesn’t depend on same old test every time.
– Partial oracles can find errors in young code quickly and
cheaply.
– Less likely to miss internal optimizations that are invisible
from outside.
– Can detect failures arising out of long, complex chains that
would be hard to create as planned tests.
• Blind spots
– Need to be able to distinguish pass from failure. Too many
people think “Not crash = not fail.”
– Executive expectations must be carefully managed.
– Also, these methods will often cover many types of risks,
but will obscure the need for other tests that are not
amenable to automation.
Random / Statistical Testing
• Blind spots
– Testers might spend much more time analyzing
the code and too little time analyzing the
customer and her uses of the software.
– Potential to create an inappropriate prestige hierarchy, devaluing the skills of subject matter experts who understand the product and its defects much better than the automators.
Random Testing: Model-based Stochastic Tests
• Fundamental Question or Goal
– Build a state model of the software. (The analysis will
reveal several defects in itself.) Generate random events
/ inputs to the program. The program responds by
moving to a new state. Test whether the program has
reached the expected state.
• Paradigmatic case(s)
– Walking a UI menu tree using a state transition table
State Transition Table Example

Initial State | Event | Result     | New State
S1            | E1    | <none>     | S1
S1            | E2    | logged in  | S2
S1            | E3    | SU log in  | S3
S2            | E4    | …          | S4
S2            | E5    | <none>     | S2
S2            | E6    | logged out | Exit
S3            | E4    | …          | S4
S3            | E5    | admin      | S3
S3            | E6    | logged out | Exit

[Figure: the corresponding state transition diagram over states S1-S4 and Exit; S4 reaches Exit via E6.]
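A stochastic test generator for this table is a short random walk, as sketched below (illustrative; in a real harness each fired event would drive the program and the observed state would be compared with the model's expected state):

```python
# Random walk over the state transition table above.
import random

# (current state, event) -> new state
transitions = {
    ("S1", "E1"): "S1", ("S1", "E2"): "S2", ("S1", "E3"): "S3",
    ("S2", "E4"): "S4", ("S2", "E5"): "S2", ("S2", "E6"): "Exit",
    ("S3", "E4"): "S4", ("S3", "E5"): "S3", ("S3", "E6"): "Exit",
    ("S4", "E6"): "Exit",  # assumed from the diagram
}

state = "S1"
while state != "Exit":
    legal_events = [e for (s, e) in transitions if s == state]
    event = random.choice(legal_events)
    state = transitions[(state, event)]
    print(f"fired {event} -> now in {state}")
    # Here: drive the real program with `event` and assert that its
    # observable state matches `state`.
```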
E6
Model Complexity
• One major issue with model based testing is the complexity
of the models required for most programs:
– Difficult to create the model
– Difficult to enter the model in a machine readable form
– Maintenance is a critical issue because design changes add
or subtract nodes and transitions, forcing regeneration of
the model.

• Likely conclusions:
– Works poorly for a complex product like Word
– Likely to work well for embedded software and simple
menus (think of the brakes of your car)
– In general, well suited to a limited-functionality client that
will not be powered down or rebooted very often.
Model-Based Testing
• Strengths
– Doesn’t depend on same old test every time.
– Model unambiguously defines [part of] the product.
– Can detect failures arising out of long, complex chains that
would be hard to create as planned tests.
– Tests can be reconfigured automatically by changing the
model.
• Blind spots
– Need to be able to distinguish pass from fail.
– Model has to match the product.
– Covers some types of risks, but can obscure the need for
other tests that are not amenable to modeling or
automation.
