
Statistics

Common pitfalls in statistical analysis: Understanding the properties of diagnostic tests – Part 1
Priya Ranganathan, Rakesh Aggarwal1
Department of Anaesthesiology, Tata Memorial Centre, Mumbai, Maharashtra, 1Department of Gastroenterology, Sanjay Gandhi
Postgraduate Institute of Medical Sciences, Lucknow, Uttar Pradesh, India

Abstract: In this article in our series on common pitfalls in statistical analysis, we look at some of the attributes of diagnostic tests (i.e., tests which are used to determine whether an individual does or does not have disease). The next article in this series will focus on further issues related to diagnostic tests.

Keywords: Biostatistics, predictive values, sensitivity, specificity

Address for correspondence: Dr. Priya Ranganathan, Department of Anaesthesiology, Tata Memorial Centre, Ernest Borges Road, Parel, Mumbai ‑ 400 012,
Maharashtra, India.
E‑mail: [email protected]

INTRODUCTION

Diagnostic tests are used to differentiate between individuals with and without a particular disease. However, most diagnostic tests are imperfect, and provide some false-positive (the test is positive though the individual does not have the disease) and false-negative (the test is negative though the individual has the disease) results.

Most diseases have a gold standard diagnostic test, which is used to establish a diagnosis. This concept has some limitations, but let us assume for now that such a "gold standard" does exist for the disease that we are studying. However, such gold standard tests are usually difficult to perform, costly, invasive, time-consuming, or not easily accessible. Hence, we often look to substitute the gold standard with another test, in order to decrease costs, minimize invasiveness or save time, etc. In these cases, we are interested in knowing how the "substitute" test performs in comparison with the gold standard for differentiating between the diseased and the non-diseased individuals. In this article, we explain some of the attributes of diagnostic tests, and some issues related to their clinical interpretation. Another article in the next issue will focus on some additional issues related to diagnostic tests.

Let us look at the example of diagnosis of pulmonary embolism. A perfusion scan is the gold standard for its diagnosis, but is often not available. Also, it is costly and invasive. Hence, we wish to use a blood test (D-dimer level) for the detection of pulmonary embolism. To assess the performance of this test for the diagnosis of pulmonary embolism, one would perform this test in a group of patients suspected to have pulmonary embolism who have also undergone the perfusion scan [Table 1].

SENSITIVITY AND SPECIFICITY

Sensitivity and specificity are the most commonly used measures of the performance of a diagnostic test as compared to an existing gold standard.

Sensitivity is defined as the proportion of individuals with the disease (as detected by the gold standard test) who have a positive result on the new test. In Table 1, of the ten individuals with pulmonary embolism, seven had a positive result on the D-dimer test; therefore, the sensitivity of the D-dimer test for the detection of pulmonary embolism is 7/10 = 70%.

Specificity is defined as the proportion of individuals without the disease (as detected by the gold standard test) who have a negative result on the new test. In Table 1, of the ninety individuals without pulmonary embolism, 77 had a negative result on the D-dimer test; therefore, the specificity of the D-dimer test for the detection of pulmonary embolism is 77/90 = 85.6%.

If we were to replace the cells in the example above with generic terms, we get a 2 x 2 contingency table [Table 2], which can be used for all diagnostic tests.

Table 1: Number of individuals in whom pulmonary embolism was detected using the perfusion scan (gold standard) versus the results of the blood test for D-dimer

                     Pulmonary embolism present   Pulmonary embolism absent   Row total
D-dimer positive                  7                           13                  20
D-dimer negative                  3                           77                  80
Column total                     10                           90                 100

Table 2: 2 x 2 contingency table for assessing the sensitivity and specificity of a diagnostic test

                     Disease present   Disease absent   Row totals
Test positive        a (TP)            b (FP)           a+b
Test negative        c (FN)            d (TN)           c+d
Column totals        a+c               b+d              a+b+c+d

Sensitivity = TP/(TP + FN) = a/(a + c); Specificity = TN/(FP + TN) = d/(b + d); Positive predictive value = TP/(TP + FP) = a/(a + b); Negative predictive value = TN/(TN + FN) = d/(c + d). TP = true positive, FP = false positive, FN = false negative, TN = true negative.
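
As a quick illustration (our own addition, not part of the original article), the two calculations above can be reproduced from the counts in Table 1, labelled as in Table 2. This is a minimal Python sketch with variable names of our own choosing.

    # Counts from Table 1 (D-dimer test vs. perfusion scan gold standard),
    # labelled as in Table 2: a = TP, b = FP, c = FN, d = TN.
    tp = 7    # D-dimer positive, pulmonary embolism present
    fp = 13   # D-dimer positive, pulmonary embolism absent
    fn = 3    # D-dimer negative, pulmonary embolism present
    tn = 77   # D-dimer negative, pulmonary embolism absent

    sensitivity = tp / (tp + fn)   # a/(a + c) = 7/10
    specificity = tn / (fp + tn)   # d/(b + d) = 77/90

    print(f"Sensitivity = {sensitivity:.1%}")   # 70.0%
    print(f"Specificity = {specificity:.1%}")   # 85.6%
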

A highly sensitive test will be positive in almost everyone with the disease of interest, but may also be positive in some individuals without the disease. However, it would hardly ever be negative in a person with the disease. Thus, if a highly sensitive test is negative, it almost definitely rules out the disease (hence the mnemonic SNOUT: SnNOut = Sensitive…Negative…Rules Out).

A highly specific test will be negative in almost everyone without the disease, but may be negative in some with the disease. However, it would hardly ever be positive in an individual without the disease. If a highly specific test is positive, it almost definitely rules in the disease (hence the mnemonic SPIN: SpPIn = Specific…Positive…Rules In).

Sensitivity and specificity are useful attributes for comparing a new test against the gold standard test. However, these measures have some limitations. First, the sensitivity is calculated based on individuals with the disease and fails to give any information about people without the disease. Similarly, specificity is calculated based on individuals without the disease and does not tell us anything about individuals with the disease. Second, in the clinic, we see a patient with a particular set of symptoms and are unsure whether he has the disease or not. We then do the test and obtain its result. What we need at that point is not what proportion of individuals with the disease have a positive test; instead, we want to predict whether the particular individual has the disease, based on the positive or negative test result. This can be done much better using measures that are referred to as the predictive values of the test result.

POSITIVE AND NEGATIVE PREDICTIVE VALUES

Predictive values refer to the ability of a test result to confirm the presence or absence of a disease, based on whether it is positive or negative, respectively.

Referring to the previous example, of the twenty individuals in whom the D-dimer test was positive, only seven actually had pulmonary embolism; therefore, the positive predictive value (PPV; also sometimes more appropriately referred to as the predictive value of a positive test result) of this test is 7/20, or 35%. PPV reflects the probability that an individual with a positive test result truly has the disease.

Similarly, of the eighty individuals with a negative D-dimer test, 77 did not have pulmonary embolism; therefore, the negative predictive value (NPV; or predictive value of a negative test result) is 77/80, or 96%. NPV is the probability that an individual with a negative test result truly does not have the disease.
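
Continuing the same illustrative sketch (again ours, not the authors' code), the predictive values follow directly from the row totals of Table 1:

    # Same counts as before: a = TP, b = FP, c = FN, d = TN.
    tp, fp, fn, tn = 7, 13, 3, 77

    ppv = tp / (tp + fp)   # a/(a + b) = 7/20
    npv = tn / (tn + fn)   # d/(c + d) = 77/80

    print(f"PPV = {ppv:.0%}")   # 35%
    print(f"NPV = {npv:.0%}")   # 96%
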

RELATIONSHIP OF POSITIVE PREDICTIVE VALUE AND NEGATIVE PREDICTIVE VALUE WITH THE PREVALENCE OF DISEASE

It is important to note that sensitivity and specificity are properties of a test and are usually not influenced by the prevalence of disease in the population. By contrast, PPV and NPV are heavily influenced by the prevalence of the disease in the population tested/studied. With all the other factors remaining constant, the PPV increases with increasing prevalence and the NPV decreases with increasing prevalence.

To illustrate this, let us look at the use of the D-dimer test among three separate groups of individuals: all patients admitted to a hospital (with a hypothetical prevalence of pulmonary embolism of 1%), cancer patients undergoing chemotherapy (10%), and critically ill cancer patients in an intensive care unit (30%). Let us do the D-dimer test in 1000 individuals from each of these groups.

Among the 1000 inpatients, the prevalence being 1%, 10 will have pulmonary embolism. By comparison, in the other two groups, 100 and 300 patients, respectively, will have pulmonary embolism. We have already established that the D-dimer test is 70% sensitive and 85.6% specific. Using these numbers, the number of individuals with positive and negative test results in the three groups can be calculated and are shown in Tables 3a-c, respectively.

Table 3a: Performance of D-dimer test for pulmonary embolism in 1000 unselected inpatients in a hospital (with hypothetical disease prevalence of 1%)

                     Pulmonary embolism present   Pulmonary embolism absent   Total
D-dimer positive                  7                          143                150
D-dimer negative                  3                          847                850
Total                            10                          990               1000

Table 3b: Performance of D-dimer test for pulmonary embolism among 1000 cancer patients in a hospital (with hypothetical disease prevalence of 10%)

                     Pulmonary embolism present   Pulmonary embolism absent   Total
D-dimer positive                 70                          130                200
D-dimer negative                 30                          770                800
Total                           100                          900               1000

Table 3c: Performance of D-dimer test for pulmonary embolism among 1000 critically ill cancer patients in an Intensive Care Unit (with hypothetical disease prevalence of 30%)

                     Pulmonary embolism present   Pulmonary embolism absent   Total
D-dimer positive                210                          101                311
D-dimer negative                 90                          599                689
Total                           300                          700               1000

Let us now calculate PPV and NPV in each situation. Thus, when the test is done in all inpatients, its PPV is 7/150 = 4.7% and the NPV is 847/850 = 99.6% [Table 3a]. By comparison, when it is done in all cancer patients, the PPV is 70/200 = 35.0% and the NPV is 770/800 = 96.2% [Table 3b]. Further, when it is done in critically ill cancer patients, the PPV is 210/311 = 67.5% and the NPV is 599/689 = 86.9% [Table 3c]. It is apparent that the values of PPV and NPV are quite different in the three situations.
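
The figures in Tables 3a-c can also be reproduced directly from sensitivity, specificity, and prevalence using Bayes' theorem. The short function below is a sketch we have added for illustration (it is not from the original article); the exact fraction 77/90 is used for specificity so that the outputs match the worked example.

    def predictive_values(sensitivity, specificity, prevalence):
        """Return (PPV, NPV) for a test with the given properties and disease prevalence."""
        tp = sensitivity * prevalence                # true positives per person tested
        fp = (1 - specificity) * (1 - prevalence)    # false positives
        tn = specificity * (1 - prevalence)          # true negatives
        fn = (1 - sensitivity) * prevalence          # false negatives
        return tp / (tp + fp), tn / (tn + fn)

    # D-dimer example: 70% sensitive, 85.6% (77/90) specific, at the three
    # hypothetical prevalences used above.
    for prevalence in (0.01, 0.10, 0.30):
        ppv, npv = predictive_values(0.70, 77 / 90, prevalence)
        print(f"Prevalence {prevalence:.0%}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
    # Output (to one decimal place): roughly 4.7%/99.6%, 35.0%/96.2%, and
    # 67.5%/86.9%, matching Tables 3a-c up to rounding.
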

The above example shows us the importance of the likelihood of the disease of interest in the individual in whom the test has been done (also referred to as the pretest probability of disease). Thus, even a test with good sensitivity and specificity has low PPV when used in a population where the likelihood of the disease is low (low pretest probability, as in all inpatients in the example in Table 3a). This is not infrequent. When a test is initially developed, it is costly and is used primarily in those with a high likelihood of disease. However, later, when the test becomes cheaper and more widely available, it is often used more indiscriminately even among those with a low likelihood of disease, resulting in a lower PPV. In view of this phenomenon, it is prudent to apply a diagnostic test only in those with a high pretest probability of the disease (based on symptoms and signs). Similarly, among persons with a strong suspicion of disease (high pretest probability), the NPV of a test may not be high – i.e., in this situation, even a negative test result may not reliably rule out a disease.

Some tests have a clear dichotomous result – the test is either positive or negative – for example, presence or absence of pus cells in urine or of HBsAg in the blood. For such tests, interobserver variability is negligible. However, when we look at the results of tests such as chest radiographs, interpretation depends on the experience of the assessor, and the sensitivity and specificity of the test can vary, depending on the accuracy of reporting. For tests which report on a continuous scale, for example, random blood sugar for the diagnosis of diabetes, choosing a cutoff point to define disease can change the sensitivity and specificity. We will discuss this in the next article.
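
To make that last point concrete, here is a small illustration with invented numbers (not data from the article): for a marker reported on a continuous scale, raising the positivity cutoff increases specificity at the expense of sensitivity.

    # Hypothetical marker values for diseased and non-diseased individuals;
    # the numbers are made up purely to illustrate the cutoff trade-off.
    diseased     = [180, 210, 130, 240, 150, 200, 175, 220, 160, 195]
    non_diseased = [110, 130,  95, 145, 120, 105, 155, 100, 135, 125]

    for cutoff in (120, 140, 160):   # "test positive" means value >= cutoff
        sens = sum(v >= cutoff for v in diseased) / len(diseased)
        spec = sum(v < cutoff for v in non_diseased) / len(non_diseased)
        print(f"Cutoff {cutoff}: sensitivity {sens:.0%}, specificity {spec:.0%}")
    # Cutoff 120: sensitivity 100%, specificity 40%
    # Cutoff 140: sensitivity 90%,  specificity 80%
    # Cutoff 160: sensitivity 80%,  specificity 100%
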

EXAMPLES OF DIAGNOSTIC TESTS IN PRACTICE

Enzyme-linked immunosorbent assay (ELISA) tests are generally used as the initial screening tests for HIV infection. This is because they are highly sensitive (and therefore pick up most people with infection). Since the sensitivity is high, a negative ELISA almost certainly rules out infection (recall the SnNOut mnemonic). However, the problem with highly sensitive tests is that they may also have a number of false-positive results. Therefore, anyone with a positive ELISA should be subjected to another test with a high specificity, such as polymerase chain reaction, to confirm the presence of HIV infection.

Low-dose computed tomography scan (LDCT) has been recommended as a screening tool for lung cancer. This is a highly sensitive test (sensitivity reported from 80% to 100%) – this means that almost every cancerous lung nodule will be detected on LDCT. The problem here is that the LDCT also picks up benign calcific nodules and therefore has several false positives; the specificity is low (reported around 20%). Since the prevalence of lung cancer in the average population is low, using this test for screening this population will have a high NPV but a low PPV. If we apply clinical criteria and do the test only in individuals at a high probability of lung cancer (e.g., elderly, heavy smokers, and those with hemoptysis), the pretest probability of lung cancer is higher, and the test would have a higher PPV.

SUGGESTED READING

The readers may want to read an article by Kim and colleagues who assessed the use of hip radiographs as an aid to diagnose hip osteoarthritis. This article examines the sensitivity, specificity, PPV, and NPV of this test as a diagnostic tool.[1]

Financial support and sponsorship
Nil.

Conflicts of interest
There are no conflicts of interest.

REFERENCE
1. Kim C, Nevitt MC, Niu J, Clancy MM, Lane NE, Link TM, et al. Association of hip pain with radiographic evidence of hip osteoarthritis: Diagnostic test study. BMJ 2015;351:h5983.
