Evaluating
Diagnostic Tests
Payam Kabiri, MD. PhD.
Clinical Epidemiologist
Tehran University of Medical Sciences
Seven questions to evaluate the
utility of a diagnostic test
Can the test be reliably performed?
Was the test evaluated on an appropriate
population?
Was an appropriate gold standard used?
Was an appropriate cut-off value chosen
to optimize sensitivity and specificity?
Seven questions to evaluate the
utility of a diagnostic test
What are the positive and negative
likelihood ratios?
How well does the test perform in specific
populations?
What is the balance between cost of the
disease and cost of the test?
Which one of these tests is the best
for SLE Dx?
Test Sensitivity% Specificity%
ANA 99 80
dsDNA 70 95
ssDNA 80 50
Histone 30-80 50
Nucleoprotein 58 50
Sm 25 99
RNP 50 87-94
PCNA 5 95
Diagnostic Tests Characteristics
Sensitivity
Specificity
Predictive Value
Likelihood Ratio
Validity of Screening Tests
                        True Disease Status
                           +        -
Results of         +       a        b
Screening Test     -       c        d
Sensitivity: The probability of testing
positive if the disease is truly present
Sensitivity = a / (a + c)
Validity of Screening Tests
                        True Disease Status
                           +        -
Results of         +       a        b
Screening Test     -       c        d
Specificity: The probability of screening
negative if the disease is truly absent
Specificity = d / (b + d)
Two-by-two tables can also be used for
calculating the false positive and false
negative rates.
The false positive rate = false positives / (false positives + true negatives). It is also equal to 1 – specificity.
The false negative rate = false negatives / (false negatives + true positives). It is also equal to 1 – sensitivity.
An ideal test maximizes both sensitivity
and specificity, thereby minimizing the
false positive and false negative rates.
Validity of Screening Tests
                         Breast Cancer
                           +        -
Physical Exam      +      132      983
and Mammography    -       45    63650
Sensitivity: a / (a + c)
Sensitivity =
Specificity: d / (b + d)
Specificity =
Validity of Screening Tests
                         Breast Cancer
                           +        -
Physical Exam      +      132      983
and Mammography    -       45    63650
Sensitivity: a / (a + c)
Sensitivity = 132 / (132 + 45) = 74.6%
Specificity: d / (b + d)
Specificity = 63650 / (983 + 63650) = 98.5%
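The two formulas above are easy to check mechanically. A minimal sketch in Python (the cell labels a-d follow the slides' 2 x 2 table; the numbers are the physical exam + mammography example):

```python
def sensitivity(a, c):
    """Probability of testing positive when disease is truly present: a / (a + c)."""
    return a / (a + c)

def specificity(b, d):
    """Probability of testing negative when disease is truly absent: d / (b + d)."""
    return d / (b + d)

# Physical exam + mammography example:
a, b, c, d = 132, 983, 45, 63650
print(f"Sensitivity = {sensitivity(a, c):.1%}")  # Sensitivity = 74.6%
print(f"Specificity = {specificity(b, d):.1%}")  # Specificity = 98.5%
```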
2 x 2 table

              Disease
               +    -
Test     +     a    b   -> (read across) Positive predictive value
         -     c    d
               |
               v
          Sensitivity (read down)
Natural Frequencies Tree
Population
100
In Every 100 People, 4 Will Have The Disease

Population: 100
 |- Disease +: 4
 |- Disease -: 96

If these 100 people are representative of the population at risk, the assessed rate of those with the disease (4%) represents the PREVALENCE of the disease – it can also be considered the PRE-TEST PROBABILITY of having the disease.
OF THE 4 PEOPLE WITH THE DISEASE, THE TEST WILL DETECT 3

Population: 100
 |- Disease +: 4 -> Test +: 3, Test -: 1
 |- Disease -: 96

In other words, the sensitivity is 75%.
AMONG THE 96 PEOPLE WITHOUT THE DISEASE, 7 WILL TEST POSITIVE

Population: 100
 |- Disease +: 4 -> Test +: 3, Test -: 1
 |- Disease -: 96 -> Test +: 7, Test -: 89

In other words, the specificity is 93%.
AMONG THOSE WHO TEST POSITIVE, 3 IN 10 WILL ACTUALLY HAVE THE DISEASE

Population: 100
 |- Disease +: 4 -> Test +: 3, Test -: 1
 |- Disease -: 96 -> Test +: 7, Test -: 89

POSITIVE PREDICTIVE VALUE = 3 / (3 + 7) = 30%
This is also the POST-TEST PROBABILITY of having the disease.
AMONG THOSE WHO TEST NEGATIVE, 89 OF 90 WILL NOT HAVE THE DISEASE

Population: 100
 |- Disease +: 4 -> Test +: 3, Test -: 1
 |- Disease -: 96 -> Test +: 7, Test -: 89

NEGATIVE PREDICTIVE VALUE = 89 / (89 + 1) = 99%
CONVERSELY, IF SOMEONE TESTS NEGATIVE, THE CHANCE OF HAVING THE DISEASE IS ONLY 1 IN 90
PREDICTIVE VALUES AND CHANGING PREVALENCE

Population: 1000
 |- Disease +: 4
 |- Disease -: 996

Prevalence reduced by an order of magnitude, from 4% to 0.4%.
PREDICTIVE VALUE AND CHANGING PREVALENCE

Sensitivity and specificity unchanged:

Population: 1000
 |- Disease +: 4 -> Test +: 3, Test -: 1
 |- Disease -: 996 -> Test +: 70, Test -: 926
POSITIVE PREDICTIVE VALUE AT LOW PREVALENCE

Population: 1000
 |- Disease +: 4 -> Test +: 3, Test -: 1
 |- Disease -: 996 -> Test +: 70, Test -: 926

POSITIVE PREDICTIVE VALUE = 3 / (3 + 70) = 4% (previously, PPV was 30%)
NEGATIVE PREDICTIVE VALUE AT LOW PREVALENCE

Population: 1000
 |- Disease +: 4 -> Test +: 3, Test -: 1
 |- Disease -: 996 -> Test +: 70, Test -: 926

NEGATIVE PREDICTIVE VALUE = 926 / (926 + 1) > 99% (previously, NPV was 99%)
Prediction Of Low Prevalence
Events
Even highly specific tests, when applied to
low prevalence events, yield a high number
of false positive results
Because of this, under such circumstances,
the Positive Predictive Value of a test is low
However, this has much less influence on
the Negative Predictive Value
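This prevalence effect can be reproduced exactly with the rounded counts from the natural-frequencies tree (sensitivity 3/4, specificity 89/96). A minimal sketch:

```python
from fractions import Fraction

SENS = Fraction(3, 4)    # 75%, as in the tree
SPEC = Fraction(89, 96)  # ~93%, as in the tree

def ppv(prevalence):
    """Positive predictive value at a given prevalence (Bayes' theorem)."""
    true_pos = prevalence * SENS
    false_pos = (1 - prevalence) * (1 - SPEC)
    return true_pos / (true_pos + false_pos)

print(float(ppv(Fraction(4, 100))))   # 0.3   -> PPV = 30% at 4% prevalence
print(float(ppv(Fraction(4, 1000))))  # ~0.04 -> PPV of about 4% at 0.4% prevalence
```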
Relationship Between
Prevalence and Predictive Value
[Figure: PPV and NPV plotted against pre-test probability (prevalence), from 0.05 to 0.95. At high prevalence the difference between PPV and NPV is relatively small; at low prevalence the difference is relatively large.]
Based on a test with 90% sensitivity and 82% specificity
Relationship Between
Prevalence And Predictive Value
[Figure: predictive value plotted against prevalence]
Based on a test with 75% sensitivity and 93% specificity
Performance of A Test With
Changing Prevalence

[Figure: post-test probability plotted against pre-test probability for three tests]
A: Sensitivity = Specificity = 0.9, LR+ = 9.0
B: Sensitivity = Specificity = 0.7, LR+ = 3.0
C: Sensitivity = Specificity = 0.5, LR+ = 1.0
2 x 2 table: Sensitivity

[Figure: 2 x 2 table with the FALSE NEGATIVES cell highlighted]

Sensitivity: the proportion of people with the diagnosis (N = 4) who are correctly identified (N = 3).
Sensitivity = a/(a+c) = 3/4 = 75%
2 x 2 table: Specificity

[Figure: 2 x 2 table with the FALSE POSITIVES cell highlighted]

Specificity: the proportion of people without the diagnosis (N = 96) who are correctly identified (N = 89).
Specificity = d/(b+d) = 89/96 = 93%
Value of a diagnostic test depends
on the prior probability of disease

                Prevalence = 5%    Prevalence = 90%
Sensitivity     90%                90%
Specificity     85%                85%
PV+             24%                98%
PV-             99%                49%

When disease is unlikely (5%), the low PV+ makes the test less useful for ruling in; when disease is likely (90%), the low PV- makes it less useful for ruling out.
A Test With Normally
Distributed Values

[Figure: overlapping distributions of test values (% of group) in the NON-DISEASED and the DISEASED, with a test cut-off separating negative from positive results along the degree of 'positivity' on the test]

Assessing the performance of the test assumes that these two distributions remain constant. However, each of them will vary (particularly through spectrum or selection bias).
Performance of A Diagnostic
Test

[Figure: the same overlapping distributions of NON-CASES and CASES; non-cases above the cut-off are FALSE POSITIVES, cases below the cut-off are FALSE NEGATIVES]
Minimising False Negatives: A
Sensitive Test

[Figure: cut-off shifted to minimise false negatives, i.e. to optimise sensitivity]

CONSEQUENCES:
- Specificity reduced
- A Negative result from a seNsitive test rules out the diagnosis - snNout
Minimising False Positives: A
Specific Test

[Figure: cut-off shifted to minimise false positives, i.e. to optimise specificity]

CONSEQUENCES:
- Sensitivity reduced
- A Positive result from a sPecific test rules in the diagnosis - spPin
Receiver Operating Characteristics (ROC)

[Figure: distributions of non-diseased and diseased cases along the test result value, or the subjective judgment of the likelihood that a case is diseased, separated by a threshold]
More typically, the two distributions overlap. The threshold (cut-off point) then fixes both the FP rate (the fraction of non-diseased cases above it) and the TP rate (the fraction of diseased cases above it).
[Figure series: each threshold, from a less aggressive through a moderate to a more aggressive mindset, yields one point (FPF, TPF) = (1 - specificity, sensitivity); sweeping the threshold across all values traces out the entire ROC curve]
[Figure: the entire ROC curve, TPF (sensitivity) plotted against FPF (1 - specificity), with the diagonal chance line for reference]
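A minimal sketch of how an ROC curve is traced: sweep the cut-off over all observed values and record one (FPF, TPF) pair per threshold. The scores below are made-up illustrative values, not data from these slides.

```python
def roc_points(diseased_scores, non_diseased_scores):
    """One (FPF, TPF) point per candidate threshold, strictest first."""
    points = [(0.0, 0.0)]  # infinitely strict threshold: nothing called positive
    for t in sorted(set(diseased_scores) | set(non_diseased_scores), reverse=True):
        tpf = sum(s >= t for s in diseased_scores) / len(diseased_scores)
        fpf = sum(s >= t for s in non_diseased_scores) / len(non_diseased_scores)
        points.append((fpf, tpf))
    return points

diseased = [3, 5, 6, 7, 8]      # hypothetical test values in cases
non_diseased = [1, 2, 2, 3, 4]  # hypothetical test values in non-cases
for fpf, tpf in roc_points(diseased, non_diseased):
    print(f"FPF = {fpf:.1f}, TPF = {tpf:.1f}")
```

Lowering the threshold (a more "aggressive mindset") moves the point up and to the right, which is exactly the trade-off shown in the figures above.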
Check this out:
[Link]
Likelihood Ratios
Pre-test & post-test probability
The pre-test probability of disease can be compared with the post-test probability estimated using the information provided by a diagnostic test.
The difference between the pre-test probability and the post-test probability is an effective way to assess the usefulness of a diagnostic method.
It tells you how much a positive or negative result changes the likelihood that a patient has the disease.
The likelihood ratio incorporates both the
sensitivity and specificity of the test and
provides a direct estimate of how much a test
result will change the odds of having a
disease
The likelihood ratio for a positive result
(LR+) tells you how much the odds of the
disease increase when a test is positive.
The likelihood ratio for a negative result
(LR-) tells you how much the odds of the
disease decrease when a test is negative.
Positive & Negative Likelihood
Ratios

Diagnostic tests can be judged by their positive and negative likelihood ratios.
Like sensitivity and specificity, likelihood ratios are independent of disease prevalence.
Likelihood Ratios (Odds)
The probability of a test result in those
with the disease divided by the
probability of the result in those without
the disease.
How many more (or fewer) times likely a test result is to be found in the diseased compared with the non-diseased.
Positive Likelihood Ratios
This ratio divides the probability that a
diseased patient will test positive by the
probability that a healthy patient will test
positive.
The positive likelihood ratio
+LR = sensitivity/(1 – specificity)
False Positive Rate

The false positive rate = false positives / (false positives + true negatives). It is also equal to 1 – specificity.
The false negative rate = false negatives / (false negatives + true positives). It is also equal to 1 – sensitivity.
Positive Likelihood Ratios
It can also be written as the
true positive rate/false positive rate.
Thus, the higher the positive likelihood
ratio, the better the test (a perfect test has
a positive likelihood ratio equal to infinity).
Negative Likelihood Ratio
This ratio divides the probability that a
diseased patient will test negative by the
probability that a healthy patient will test
negative.
The negative likelihood ratio
–LR = (1 – sensitivity)/specificity.
False Negative Rate
The false negative rate = false negatives /
(false negatives + true positives).
It is also equal to 1 – sensitivity.
Negative Likelihood Ratio
It can also be written as the
false negative rate/true negative rate.
Therefore, the lower the negative
likelihood ratio, the better the test (a
perfect test has a negative likelihood ratio
of zero).
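The two definitions above can be bundled into a short sketch (plain Python; the 88%/82% figures are the FNA biopsy example quoted later in these slides):

```python
def lr_positive(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity) = true positive rate / false positive rate."""
    return sensitivity / (1 - specificity)

def lr_negative(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity = false negative rate / true negative rate."""
    return (1 - sensitivity) / specificity

print(round(lr_positive(0.88, 0.82), 2))  # 4.89
print(round(lr_negative(0.88, 0.82), 2))  # 0.15
```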
Positive & Negative Likelihood
Ratios

Although likelihood ratios are independent of disease prevalence, they are strictly valid only for populations similar to the original study population.
Probability of Disease

Pre-test probability of disease = disease prevalence
Post-test probability of disease:
  If the test is positive (abnormal), a/(a+b)
  If the test is negative (normal), c/(c+d)

                        Disease present,      Disease absent,
                        gold standard         gold standard
Test result positive    True positives (a)    False positives (b)
Test result negative    False negatives (c)   True negatives (d)
Bayes Theorem
Post-test Odds =
Likelihood Ratio X Pre-test Odds
Using Likelihood Ratios to Determine Post-
Test Disease Probability

Pre-test probability of disease -> pre-test odds of disease -> (multiply by the likelihood ratio) -> post-test odds of disease -> post-test probability of disease
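Bayes' theorem in odds form is only a few lines of arithmetic. A sketch (the 4% prevalence and LR of 10.7 come from the natural-frequencies example elsewhere in these slides):

```python
def to_odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def to_probability(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)

def post_test_probability(pre_test_prob, likelihood_ratio):
    """Bayes' theorem in odds form: post-test odds = LR x pre-test odds."""
    return to_probability(to_odds(pre_test_prob) * likelihood_ratio)

# Prevalence 4%, LR+ = 10.7:
print(f"{post_test_probability(0.04, 10.7):.0%}")  # 31% (the tree gave 30%; rounding)
```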
Pre-test & post-test probability
“Post-test probability” depends on the
accuracy of the diagnostic test and the
pre-test probability of disease
A test result cannot be interpreted without
some knowledge of the pre-test probability
Where does “pre-test
probability” come from?
Clinical experience
Epidemiological data
“Clinical decision rules”
Guess
What is the likelihood that this
patient has the disease?
A disease with a prevalence of 30% must
be diagnosed.
There is a test for this disease.
It has a sensitivity of 50% and a specificity
of 90%.
Likelihood Ratios

             Sensitivity   Specificity
FNA Biopsy   88%           82%
(From: J Clin End & Metab. 2006; 91(11):4295-4301.)

LR+ = Sensitivity / (1 – Specificity) = 0.88 / (1 – 0.82) = 4.89

This means that Anne's positive FNA biopsy is approximately 5 times as likely to be seen with, as opposed to without, thyroid cancer.
Prevalence of 30%, sensitivity of 50%, specificity of 90%:

Population: 100
 |- Disease +ve: 30 -> Test +: 15, Test -: 15
 |- Disease -ve: 70 -> Test +: 70 – 63 = 7, Test -: 63

22 positive tests in total, of which 15 have the disease: post-test probability = 15/22, about 70%.
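The same arithmetic, sketched as a quick natural-frequencies script:

```python
population = 100
prevalence, sens, spec = 0.30, 0.50, 0.90

diseased = population * prevalence       # 30 people with the disease
true_pos = diseased * sens               # 15 of them test positive
healthy = population - diseased          # 70 people without the disease
false_pos = healthy * (1 - spec)         # 7 of them test positive

ppv = true_pos / (true_pos + false_pos)  # 15 of 22 positives are truly diseased
print(f"{true_pos + false_pos:.0f} positive tests, PPV = {ppv:.0%}")
# 22 positive tests, PPV = 68% (the slide rounds this to about 70%)
```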
Likelihood

Population: 100
 |- Disease +: 4 -> Test +: 3, Test -: 1

The likelihood that someone with the disease will have a positive test is 3/4, or 75%. This is the same as the sensitivity.
Likelihood II

Population: 100
 |- Disease -: 96 -> Test +: 7, Test -: 89

The likelihood that someone without the disease will have a positive test is 7/96, or 7%. This is the same as (1 - specificity).
Likelihood Ratio

Likelihood Ratio = (likelihood of a positive test given the disease) / (likelihood of a positive test in the absence of the disease)

                 = Sensitivity / (1 - Specificity) = 0.75 / 0.07 = 10.7

A Likelihood Ratio of 1.0 indicates an uninformative test (this occurs, for example, when sensitivity and specificity are both 50%).
The higher the Likelihood Ratio, the better the test (other factors being equal).
Diagnostic Odds Ratio

Potentially useful as an overall summary measure, but only in conjunction with other measures (LR, sensitivity, specificity).

The Diagnostic Odds Ratio is the ratio of the odds of having the diagnosis given a positive test to the odds of having the diagnosis given a negative test:

DOR = (3/7) / (1/89) = 0.429 / 0.011 = 38.2
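A sketch of the same calculation from the natural-frequencies counts (tp = 3, fp = 7, fn = 1, tn = 89):

```python
tp, fp, fn, tn = 3, 7, 1, 89  # counts from the natural-frequencies tree

odds_if_positive = tp / fp    # odds of the diagnosis given a positive test
odds_if_negative = fn / tn    # odds of the diagnosis given a negative test
dor = odds_if_positive / odds_if_negative  # equivalently (tp * tn) / (fp * fn)

print(round(dor, 1))  # 38.1 (the slide's 38.2 comes from rounded intermediate values)
```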
Is there an
easier way?
Likelihood Ratio And Pre- And
Post-test Probabilities
For a given test with a
given likelihood ratio, the
post-test probability will
depend on the pre-test
probability (that is, the
prevalence of the condition
in the sample being
assessed)
Sensitivity Analysis of A
Diagnostic Test

                       Value   95% CI
Pre-test probability   35%     26% to 44%
Sensitivity Analysis of A
Diagnostic Test

                       Value   95% CI
Pre-test probability   35%     26% to 44%
Likelihood ratio       5.0     3.0 to 8.5

Applying the 95% confidence intervals above to the nomogram, the post-test probability is likely to lie in the range 55-85%.
Applying A Diagnostic Test In
Different Settings
The Positive Predictive Value of a test will vary (according to the prevalence of the condition in the chosen setting).
Sensitivity and Specificity are usually considered properties of the test rather than the setting, and are therefore usually assumed to remain constant.
However, sensitivity and specificity are likely to be
influenced by complexity of differential diagnoses and a
multitude of other factors (cf spectrum bias)
Likelihood Ratios (Odds)

This is an alternative way of describing the performance of a diagnostic test. Like sensitivity and specificity, likelihood ratios characterise the test itself, and they can be used to calculate the probability of disease after a positive or negative test (the predictive value). An advantage is that they can be used at multiple levels of test results.
Likelihood Ratio Positive
Multiplied by any patient’s pretest odds
gives you their posttest odds.
Comparing LR+ of different tests is
comparing their ability to “rule in” a
diagnosis.
As specificity increases LR+ increases and
PPV increases (Sp P In)
Clinical interpretation of post-
test probability

Probability of disease runs from 0 to 1, with two cut-points:

0 ----- Testing threshold ----- Treatment threshold ----- 1
  Don't treat         Do further             Treat for
  for disease         diagnostic testing     disease
  (disease ruled out)                        (disease ruled in)

If the probability lies between the two thresholds, a test will help move it toward one end, either 0 or 1, to reach the final decision.
Values of Positive and Negative
Likelihood Ratios (LR)

LR                          Poor-fair   Good       Excellent
Positive likelihood ratio   2.1-5       5.1-10     >10
Negative likelihood ratio   0.5-0.2     0.19-0.1   <0.1
Likelihood Ratios & You
Allows us to determine the accuracy with which a
test identifies the target disorder
As the LR becomes larger, the likelihood of the
target disease increases:
Likelihood ratio Interpretation
>10 Strong evidence to rule in disease
5-10 Moderate evidence to rule in disease
2-5 Weak evidence to rule in disease
0.5-2 No significant change in the likelihood of disease
0.2-0.5 Weak evidence to rule out disease
0.1-0.2 Moderate evidence to rule out disease
<0.1 Strong evidence to rule out disease
Advantages of LRs
The higher or lower the LR, the higher or lower
the post-test disease probability
Which test will result in the highest post-test
probability in a given patient?
The test with the largest LR+
Which test will result in the lowest post-test
probability in a given patient?
The test with the smallest LR-
Advantages of LRs
Clear separation of test characteristics
from disease probability.
Likelihood Ratios - Advantage

Provide a measure of a test's ability to rule in or rule out disease independent of disease probability:
If Test A LR+ > Test B LR+, then Test A PV+ > Test B PV+ in the same patient, always.
If Test A LR- < Test B LR-, then Test A PV- > Test B PV- in the same patient, always.
Predictive Values

Alternate formulations: Bayes' Theorem

PV+ = (Se x Pre-test Prevalence) / [Se x Pre-test Prevalence + (1 - Sp) x (1 - Pre-test Prevalence)]
High specificity to "rule in" disease.

PV- = [Sp x (1 - Pre-test Prevalence)] / [Sp x (1 - Pre-test Prevalence) + (1 - Se) x Pre-test Prevalence]
High sensitivity to "rule out" disease.
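The two formulas above, sketched directly (the 90%/85% test at 5% and 90% prevalence is the example from the earlier slide):

```python
def pv_positive(se, sp, prevalence):
    """PV+ = Se*P / [Se*P + (1 - Sp)*(1 - P)]"""
    return (se * prevalence) / (se * prevalence + (1 - sp) * (1 - prevalence))

def pv_negative(se, sp, prevalence):
    """PV- = Sp*(1 - P) / [Sp*(1 - P) + (1 - Se)*P]"""
    return (sp * (1 - prevalence)) / (sp * (1 - prevalence) + (1 - se) * prevalence)

for prev in (0.05, 0.90):
    print(f"prevalence {prev:.0%}: PV+ = {pv_positive(0.90, 0.85, prev):.0%}, "
          f"PV- = {pv_negative(0.90, 0.85, prev):.0%}")
# prevalence 5%:  PV+ = 24%, PV- = 99%
# prevalence 90%: PV+ = 98%, PV- = 49%
```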
Clinical Interpretation: Predictive Values

PV+ and PV- [1] of electrocardiographic status [2] for angiographically verified [3] coronary artery disease, by age and sex of patient
Sex Age PV+ (%) PV- (%)
F <40 32 88
F 40-50 46 80
F 50+ 62 68
M <40 62 68
M 40-50 75 54
M 50+ 85 38
1. Based on statistical smoothing of results from 78 patients referred to NC
Memorial Hospital for chest pain. Each value has a standard error of 6-7%.
2. At least one millivolt horizontal ST segment depression.
3. At least 50% stenosis in one or more main coronary vessels.
If predictive value is more
useful, why is it not reported?

Should they report it? Only if everyone is tested, and even then with caution.
You need sensitivity and specificity from the literature, then add YOUR OWN pre-test probability.
So how do you figure pretest
probability?
Start with disease prevalence.
Refine to local population.
Refine to population you serve.
Refine according to patient’s presentation.
Add in results of history and exam (clinical
suspicion).
Also consider your own threshold for testing.
Pretest Probability: Clinical
Significance
Expected test result means more than
unexpected.
Same clinical findings have different
meaning in different settings
([Link] versus unscheduled visit).
Heart sound, tender area.
Neurosurgeon.
Lupus nephritis.
What proportion of all patients
will test positive?
Diseased x sensitivity + Healthy x (1 - specificity)
= Prevalence x sensitivity + (1 - prevalence) x (1 - specificity)

We call this "test prevalence", i.e. prevalence according to the test.
Some Examples from
Essential Evidence Plus

Check these out:

Disease                                       Link Address
Diabetes Mellitus (type 2)                    [Link]
Deep Vein Thrombosis                          [Link]
Arrhythmia (Atrial Fibrillation & Flutter)    [Link]
[Link]
Which one of these tests is the best
for SLE Dx?
Test Sensitivity Specificity LR(+)
ANA 99 80 4.95
dsDNA 70 95 14
ssDNA 80 50 1.6
Histone 30-80 50 1.1
Nucleoprotein 58 50 1.16
Sm 25 99 25
RNP 50 87-94 3.8-8.3
PCNA 5 95 1
Was it clear enough?
Key References
Sedlmeier P and Gigerenzer G. Teaching Bayesian
reasoning in less than two hours. Journal of Experimental
Psychology: General. 130 (3):380-400, 2001.
Knottnerus JA (ed). The Evidence Base of Clinical
Diagnosis. London: BMJ Books, 2002.
Sackett DL, Haynes RB, Guyatt G, and Tugwell P. Clinical
Epidemiology : A Basic Science for Clinical Medicine.
Boston, Mass: Little, Brown & Co, 1991.
Loong TW. Understanding sensitivity and specificity with
the right side of the brain. BMJ 2003: 327: 716-19.
If you'd like, send an email!
[Link]@[Link]