
Technical Review

Chapter 1. Introduction
The often-cited Institute of Medicine report, To Err Is Human: Building a Safer
Health System,1 crystallized widespread public concern about the need to take action to
reduce the occurrence of apparently common, serious medical errors. Achieving this goal
involves identifying errors in practice, and undertaking initiatives to avoid and prevent
them. It also requires national and regional attention to monitor and report to the public
about patient safety. Widespread consensus exists that health care organizations can
reduce patient injuries by learning from successful safety-improvement initiatives in
other industries. Such initiatives have focused on systematically reducing opportunities
for errors to occur, by improving the environment for safety. These diverse steps range
from technical changes, such as implementing electronic medical record systems, to
cultural ones, such as improving staff awareness of patient safety risks. Clinical process
interventions also have strong evidence for reducing the risk of adverse events related to
a patient’s exposure to hospital care.2 However, local and national initiatives may be
better prioritized and evaluated through the use of adequate data on patient safety
problems. This report reviews previous studies and presents new empirical evidence on
one potentially important source of such data: computerized hospital discharge abstracts
from the Agency for Healthcare Research and Quality (AHRQ) Healthcare Cost and
Utilization Project (HCUP). Analyses of these and similar inexpensive, readily available
administrative data sets may provide a screen for potential medical errors, and a method
for monitoring trends over time.

Using Administrative Data

Although prior studies of the utility of routinely available administrative data sets,
like the HCUP Nationwide Inpatient Sample (NIS), leave many questions unanswered
and raise some important concerns, the careful use of these sources of information holds
promise for screening in order to target further data collection and analysis. The ability to
assess all patients at risk for a particular patient safety problem, along with the relatively
low cost, are particular strengths of these data sets. However, two broad areas of concern
also hold true for these data sets. First, questions about the clinical accuracy of discharge-
based diagnosis coding lead to concerns about the interpretation of reported diagnoses
that may represent safety problems. Specifically, administrative data are unlikely to
capture every case of a complication, regardless of its preventability, without some false
positives and false negatives (that is, with imperfect sensitivity and specificity). Further, even when the codes are
accurate in defining an event, the clinical vagueness inherent in the description of the
code itself (e.g., “hypotension”) may lead to a highly heterogeneous pool of clinical
states represented by that code. A final issue in accuracy of any data source used for
identifying patient safety problems is the possibility of incomplete reporting, as medical
providers might fear adverse consequences to reputation, disciplinary action, and lawsuits
as a result of “full disclosure” in potentially public records such as discharge abstracts.
A second area of concern relates to the limited information about the ability of
these data to distinguish adverse events in which no error occurred from true medical
errors. A number of factors, such as the heterogeneity of clinical conditions included in
some codes, lack of information about event timing available in these data sets, and
limited clinical detail for risk adjustment, contribute to the difficulty in identifying
complications that represent medical error or may be at least in some part preventable.
These factors may exist for other sources of patient safety data as well. For example, they
have been raised in the context of the Joint Commission on Accreditation of Healthcare
Organizations (JCAHO) implementation of a “sentinel event” program geared at
identifying serious adverse events that may be related to underlying safety problems.
Given the importance of patient safety, it is perhaps surprising that only a
relatively limited literature exists related to the potential use of discharge data and other
widely-used data sources in documenting patient safety problems and improving patient
safety. While these limited studies have identified some discharge-based measures
applicable to addressing patient safety problems that seem highly predictive of true
errors, many discharge-based measures appear to have relatively low sensitivity and
specificity for identifying potentially preventable complications or true errors.
However, virtually all of these studies failed to account for many potentially
avoidable limitations of discharge data, including measurement error (“noise”) and bias.
Moreover, most of these studies have been conducted at the patient level, and have
focused on answering the question: does the discharge information identify a patient
safety problem in this particular case? Despite the fact that most initiatives to improve
patient safety focus on organizational or process change, almost no studies have
addressed the question: can discharge data be used to identify systematic patient safety
problems, and thereby target areas for opportunity at the level of groups of patients?

Patient Safety Indicators Evidence Project

The Evidence-based Practice Center (EPC) at the University of California San
Francisco and Stanford University (UCSF-Stanford), with collaboration from the
University of California Davis, contracted with the AHRQ to review and improve the
evidence base related to potential patient safety indicators (PSIs) that can be developed
from administrative data. The term “patient safety indicator,” for the purposes of this
report, refers to measures that screen for potential problems that patients experience
resulting from exposure to the health care system, and that are likely amenable to
prevention by changes at the level of the system. The PSIs are thus intended as a
“screening tool” or “starting point” for further analysis to reduce “potentially preventable
errors” through system or process changes.
In addition to the need for data to guide quality improvement initiatives, there is a
public mandate to monitor patient safety as part of quality in general. Measures are
needed for aggregate statistical reporting, as planned for the National Quality Report. The
PSIs developed and evaluated by the EPC will be shared with the AHRQ-directed task
force charged to develop this national report regarding national, regional (e.g., Northeast,
South, Midwest, West) and state statistics about health care quality and patient safety.
This report follows the approach of a previous quality indicator development and
evaluation project described in a companion technical report from the EPC, and
published by AHRQ (available at: http://www.ahrq.gov/data/hcup/qirefine.htm).3
Similarly, this report takes a multifaceted approach to evaluating the validity of potential
indicators, applying the same validation framework. This report documents the
background literature review and empirical analyses performed to develop
recommendations for and provide information about AHRQ PSIs. In addition, the project
included consultation with expert coders from the American Health Information
Management Association (AHIMA), and clinical panel reviews based on a process
adapted from RAND and the University of California Los Angeles (RAND/UCLA)
Appropriateness Method. We present new evidence on the ability of a broad range of
discharge-based PSIs to identify systematic differences across hospitals, and potentially
to monitor trends on a national or regional basis. The research reported here reflects an
examination of the face validity of these indicators, and as such is subject to limitations.
Primarily, due to the paucity of evidence available in the literature, this review relied on
the expert opinion of clinician panels. The limitations are fully discussed in the final
chapter of this report. Further research will be needed to establish the validity of these
indicators in identifying potential patient safety concerns.
The PSIs developed here follow some of the same goals as the refined quality
indicators (QIs) reviewed in the companion report. AHRQ QIs (referred to as HCUP II
Quality Indicators in the companion report)3 were developed as a screening tool to
provide an accessible and low-cost approach to identifying potential problems in quality
of care for organizations that lack the resources to develop their own quality assessment
program. The initial version of the QI software was based mostly on quality measures
already reported in the literature. The principal requirement was that the measures could
be derived from common denominator discharge data sets composed of variables that are
available from most state-level hospital administrative data. Data elements in these sets
include, but may not be limited to, International Classification of Diseases, Ninth Revision, Clinical
Modification (ICD-9-CM) discharge diagnosis and procedure codes; dates of admission,
discharge, and major procedures; age; gender; and diagnosis-related group (DRG). In
addition, the measures could not require linkages outside the hospital stay (e.g., post-
hospital mortality or readmissions) because most state databases do not accommodate
such linkages. The HCUP State Inpatient Databases (SID) is an example of such a
common denominator discharge data set, and was used for the development of the AHRQ
PSIs, reported here. While similar goals for the development of the previous AHRQ QIs
apply to the PSIs reported here, the relevant literature is considerably less extensive.
Consequently, we review the literature in a more general way for indicators as a whole,
and for specific indicators we only review those studies validating the indicator use,
rather than the clinical soundness of the concept of the indicator. As a result, we devote
more attention to the development and validation of the most promising PSIs.
The report reviews the methods applied in our survey of discharge-based patient
safety indicators, further development and selection of indicators, detailed clinician panel
review, and empirical analysis of the most promising indicators. The bulk of the report
then presents the results of these activities. We conclude with recommendations about
how the most promising discharge-based PSIs can be applied and improved.

Anticipated Uses of Evidence Report

The approach to identification and evaluation of PSIs presented in this report
serves as the basis for development of Version 1.0 of AHRQ PSI software. The primary
goal of the report is to document the evidence from the literature, clinician review,
and data analysis on suitable PSIs that can be derived from hospital discharge abstract
data. By transparently inventorying and evaluating potential indicators and risk
adjustment strategies, we anticipate that this report will provide detailed context for users
who apply these measures to facilitate identifying promising areas for researching and
improving patient safety in a number of settings. The clear message throughout this
report is that these indicators are developed for use as an initial screen that can target
promising areas for in-depth review.
The discharge-based PSIs may be useful screens for organizations, purchasers,
and policymakers to identify problems at the hospital level, as well as to document
systematic area level differences in potentially preventable adverse events or patient
safety problems. Additionally, PSI rates would be amenable to monitoring over time by
region (e.g., geographical area, nation), setting (e.g., urban vs. rural) or specific hospital
type (e.g., teaching vs. community, large vs. small). The PSI rates calculated at the state
or national level would also be useful to individual hospitals seeking to compare their
own performance to a benchmark. However, these measures are neither designed nor
suitable for public reporting for the purpose of comparing providers, because of the
limitations of discharge-based data sources, although public reporting at the aggregate
level (e.g., state or national) may be appropriate. Further discussion of the appropriate
uses of these indicators is included in Chapter 4, Conclusions.
Finally, this report may also serve as a reference for background material on
patient safety measurement using routinely collected administrative data, and as a
summary for the current state of discharge-based patient safety indicators and risk
adjustment methods. In addition to the companion technical report on quality indicators,
it documents a novel integration of evidence-based methods with other approaches to
develop and evaluate health care measures related to patient safety.

Chapter 2. Methodology
Section 2A. Conceptual Framework and Definitions
In approaching the task of evaluating patient safety indicators based on
administrative data, we developed a conceptual framework and standardized definitions
of commonly used terms. In the literature, the distinctions between medical error, adverse
events, complications of care, and other terms pertinent to patient safety are not well
established and are often used interchangeably. In this report, the terms medical error,
adverse events or complications, and similar concepts are defined as follows:

• Quality: “Quality of care is the degree to which health services for individuals and
populations increase the likelihood of desired health outcomes and are consistent with
current professional knowledge.” In this definition, “the term health services refers to a
wide array of services that affect health…(and) applies to many types of health care
practitioners (physicians, nurses, and various other health professionals) and to all
settings of care…”4

• Quality indicators: Screening tools for the purpose of identifying potential areas of
concern regarding the quality of clinical care. For the purpose of this report, we focus on
indicators that reflect the quality of care inside hospitals. Quality indicators may assess
any of the four system components of health care quality: patient safety (see
below), effectiveness (i.e., “providing services based on scientific knowledge to all who
could benefit, and refraining from providing services to those not likely to benefit”),
patient centeredness, and timeliness (i.e., “minimizing unnecessary delays”).4

• Patient safety: “Freedom from accidental injury,” or “avoiding injuries or harm to
patients from care that is intended to help them.” Ensuring patient safety “involves the
establishment of operational systems and processes that minimize the likelihood of errors
and maximize the likelihood of intercepting them when they occur.”5

• Patient safety indicators: Specific quality indicators which also reflect the quality of
care inside hospitals, but focus on aspects of patient safety. Specifically, PSIs screen for
problems that patients experience as a result of exposure to the healthcare system, and
that are likely amenable to prevention by changes at the system or provider level.

• Medical error: “The failure of a planned action to be completed as intended (i.e.,
error of execution) or the use of a wrong plan to achieve an aim (i.e., error of planning).”1
The definition includes errors committed by any individual, or set of individuals, working
in a health care organization.

• Complication or adverse event: “An injury caused by medical management rather
than by the underlying disease or condition of the patient.”6 In general, adverse events
prolong the hospitalization, produce a disability at the time of discharge, or both. As used in
this report, “complication” does not refer to the sequelae of diseases, such as neuropathy as
a “complication” of diabetes. Throughout the report, “sequelae” is used to refer to these
conditions.

• Preventable adverse event: An adverse event attributable to error is a “preventable
adverse event.”6 In this report, the term refers to a condition for which reasonable steps
may reduce (but not necessarily eliminate) the risk of that complication occurring.

• Case finding indicators: Indicators for which the primary purpose is to identify
specific cases in which a medical error may have occurred, for further investigation.

• Rate based indicators: Indicators for which the primary purpose is to identify the
rate of a complication rather than to identify specific cases.

While the definitions above are intended to distinguish events that are
less preventable from those that are more preventable, the difference is best described as
a spectrum. To conceptualize this spectrum we developed the following three categories
of conditions:

1. Conditions which could be either a comorbidity or a complication. These
conditions, such as congestive heart failure, may be present on admission and
due to the patient’s underlying disease rather than caused by medical
management. It is extremely difficult to distinguish
complications from comorbidities for these conditions using administrative data.
As a result, these conditions were not considered in this report.

2. Conditions which are likely to reflect medical error. These conditions, such as
foreign body accidentally left during a procedure, are likely to have been caused
by medical error. Most of these conditions appear infrequently in administrative
data, and thus rates of events lack the precision to allow for comparisons between
providers. However, these conditions may be the subject of case finding
indicators.

3. Conditions which conceivably, but not definitively, reflect medical error. These
conditions represent a spectrum of preventability between the previous two
categories from those which are mostly unpreventable to those which are mostly
preventable (i.e., category 2 above). Because of the uncertainty regarding the
preventability of these conditions and the likely heterogeneity of cases with the
condition, indicators utilizing these conditions are less useful as case finding
indicators. However, examining the rate of these conditions may highlight
potential areas of concern.

Evaluation Framework

To evaluate the soundness of each indicator we applied the same framework as
was applied in the companion QI report.3 This included six areas of evidence:

Framework for Evaluating the Quality Indicators

1. Face validity: Does the indicator capture an aspect of quality that is widely
regarded as important and subject to provider or public health system
control? Consensual validity expands face validity beyond one person to
the opinion of a panel of experts.
2. Precision: Is there a substantial amount of provider or community level
variation that is not attributable to random variation?
3. Minimum bias: Is there either little effect on the indicator of variations in
patient disease severity and comorbidities, or is it possible to apply risk
adjustment and statistical methods to remove most or all bias?
4. Construct validity: Does the indicator perform well in identifying true (or
actual) quality of care problems?
5. Fosters real quality improvement: Is the indicator insulated from perverse
incentives for providers to improve their reported performance by avoiding
difficult or complex cases, or by other responses that do not improve
quality of care?
6. Application: Has the measure been used effectively in practice? Does it
have potential for working well with other indicators?

A full discussion of this framework is available in the companion QI report.3
Since the literature surrounding PSIs is sparse, this report uses a variety of techniques to
evaluate each indicator. Specifically, face validity (consensual validity) was evaluated
using a structured panel review (Section 2D. Clinician Panel Review Methods), minimum
bias was explored empirically (Section 3E. Comparative Empirical Results) and briefly
during the panel review, and construct validity was evaluated using the limited literature
available (Section 3A. Literature Review Results).
The relative importance of each of these evaluation areas may differ for the PSIs
as compared to the QIs. For indicators which are primarily designed to screen only for
medical error, precision and minimum bias may be less important, since these events are
relatively rare, and in general are better utilized as case-finding indicators. For these
indicators comparisons between rates are less relevant. However, for rate-based
indicators, concerns of precision and minimum bias remain, if indicators are used in any
comparison of rates (comparison to national averages, peer group, etc.).

Section 2B. Literature Review Methods


The literature searches performed in connection with assessing potential HCUP
QIs in previous work3 identified many references relevant to potential PSIs. In addition,
we performed the electronic searches outlined below for articles published before
February 2002 followed by hand searching the bibliographies of identified references.
Members of the project team were queried to supplement this list, based on their personal
knowledge of recent work in the field. Because Iezzoni et al.’s Complications Screening
Program (CSP)7 included numerous candidate indicators, we also performed an author
search using her name. Forthcoming articles and Federal reports in press, but not
published, were also included when identified through personal contacts. The search
strategy is shown in Table 1.

Table 1. Electronic Search Strategy for Articles Pertaining to Patient Safety Indicators

MEDLINE Search String
1) medical error [mh] OR iatrogenic disease [mh] OR sentinel surveillance [mh] OR safety [mh]
2) (adverse [ti] AND events [ti]) OR complications [ti] OR iatrogenesis [ti] OR iatrogenic [ti]
3) epidemiologic studies [mh] OR quality of health care [mh] OR comparative study [mh] OR disease/classification [mh]
4) (#1 OR #2) AND #3
5) health services research [mh] OR abstracting and indexing [mh] OR medical records [mh] OR medical audit [mh] OR hospitalization [mh] OR patient readmission [mh] OR patient discharge [mh]
6) reproducibility of results [mh] OR sensitivity and specificity [mh]
7) #4 AND #5 AND #6
8) #7 BUTNOT (case report [mh] OR case* [ti] OR report [ti] OR editorial [pt] OR comment [pt] OR letter [pt])

EMBASE Search String
1) iatrogenic disease [em] OR health survey [em] OR danger, risk, safety & related phenomenon[em] OR drug safety [em] OR error[em]/all exploded
2) (adverse AND events).ti OR complication$.ti OR iatrogen$.ti OR mistake$.ti OR error$.ti
3) health care quality[em] OR epidemiology[em]
4) (#1 OR #2) AND #3
5) health services research[em] OR documentation[em] OR medical record[em] OR medical audit[em] OR hospitalization[em] OR child hospitalization[em] OR hospital admission[em]
6) reproducibility[em] OR reproducib$.kw OR (sensitive$ or specific$).kw
7) #4 AND #5 AND #6

Limits: English Language
MEDLINE and EMBASE database search from January, 1990 to February, 2002.
Abbreviations: [mh] = [MeSH terms], [ti] = [Title word]

Three hundred twenty-six articles were identified from the MEDLINE search.
Articles were screened using both the titles and abstracts. To qualify for abstraction, an
article must have described, evaluated, or validated a potential indicator of medical
errors, patient safety, or potentially preventable complications based on International
Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) coded
administrative (hospital discharge or claims) data. Some indicators were also considered
if they appeared to be readily translated into ICD-9-CM, even if the original authors did
not use ICD-9-CM codes.
This search was adapted slightly and repeated using the OVID interface with
EMBASE8, limited to articles published from January 1990 through the end of first

20
quarter 2002. Our EMBASE search identified 463 references. These articles were
screened in the same manner, after elimination of articles that had already been identified
using MEDLINE9 and the other approaches described above. Only 9 additional articles
met criteria for abstraction.

Section 2C. Development of Initial Candidate List of Indicators

Two types of indicators measuring rates of complications were considered: hospital
level and area level. A flow diagram outlining the selection of indicators is included in
Section 3B. Indicator Selection. The intent of a hospital level indicator is to provide a measure of the
potentially preventable complication for patients who received their initial care and the
complication of care within the same hospitalization. On the other hand, the intent of an
area level indicator is to capture all cases of the potentially preventable complication that
occur in a given area (e.g., metropolitan service area or county). Thus, hospital level
measures typically include only cases where a secondary diagnosis code flags a
potentially preventable complication since the patient was being hospitalized for a
different principal diagnosis. In contrast, area level measures would be specified to
include principal diagnosis, as well as secondary diagnoses, for the complications of care,
thereby adding cases where the patient’s exposure to the risk of the complication occurred in a separate
hospitalization. The denominator specification for these two types of indicators is
described in Section 2E. Empirical Methods.
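
The difference between the two numerator definitions can be illustrated with a short sketch. The Python below is a minimal illustration only, assuming a hypothetical, simplified discharge record with one principal diagnosis and a list of secondary diagnoses. The record layout, the function names, and the single-code complication set are assumptions made for the example and are not the indicator specifications used in this report.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Discharge:
    """Hypothetical, simplified discharge abstract (illustrative layout only)."""
    principal_dx: str
    secondary_dx: List[str] = field(default_factory=list)


def flags_hospital_level(d: Discharge, complication_codes: Set[str]) -> bool:
    # Hospital level: count the stay only when the complication appears as a
    # secondary diagnosis, i.e., the patient was admitted for something else.
    return any(code in complication_codes for code in d.secondary_dx)


def flags_area_level(d: Discharge, complication_codes: Set[str]) -> bool:
    # Area level: also count stays where the complication is the principal
    # diagnosis, which adds cases whose exposure occurred in an earlier stay.
    return (d.principal_dx in complication_codes
            or flags_hospital_level(d, complication_codes))


# Illustrative code set: ICD-9-CM 512.1 (iatrogenic pneumothorax) stands in
# for a full indicator definition, which would normally list several codes.
codes = {"512.1"}

# "486" is used here only as a placeholder for an unrelated principal diagnosis.
same_stay = Discharge(principal_dx="486", secondary_dx=["512.1"])
later_stay = Discharge(principal_dx="512.1")

print(flags_hospital_level(same_stay, codes), flags_area_level(same_stay, codes))
print(flags_hospital_level(later_stay, codes), flags_area_level(later_stay, codes))
```

Under these assumptions, a stay in which the complication code appears only as the principal diagnosis counts toward the area level numerator but not toward the hospital level numerator.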
The literature search located relatively few indicators amenable to identifying
patient safety concerns (see Appendix A) that could be defined using unlinked
administrative data. The majority of such indicators were from the Complications
Screening Program (described below).7 Several similar, but less comprehensive,
measures of potentially preventable complications were identified from other sources in
the literature.

Identifying Potential Indicators


Complications Screening Program
The Complications Screening Program (CSP) was developed by Lisa Iezzoni et
al.7 for the purpose of identifying potentially preventable complications of adult medical
and surgical hospital care, using commonly available administrative data. The algorithm
utilizes discharge abstract data, specifically, ICD-9-CM diagnosis and procedure codes,
patient age, sex, DRG, and date of procedure, to identify 28 complications “that raise
concern about the quality of care based on the rate of such occurrence at individual
hospitals.” 7 The CSP was initially developed using the clinical judgment of the
developers, complemented by “detailed consideration of the ICD-9-CM codebook, and an
extensive review” of the literature on health services research, quality assurance, and
clinical indicators.7 Each of the complications is applied to some or all of the following
specified “risk pools” separately: major surgery, minor surgery, invasive cardiac
procedure, endoscopy, medical patients, all patients. In addition, specified inclusion and
exclusion criteria are applied to each complication. These criteria are aimed at ensuring
that the complication developed in-hospital, as opposed to being present on admission,
and that the complication was potentially preventable.
Iezzoni and colleagues published a series of four papers in the mid 1990s on the
face validity and construct validity of the CSP.7, 10-12 First, they asked each of 29
physicians who were not involved in the development of the CSP to review 100 randomly
selected hospital discharge abstracts, including 53 flagged and 47 not flagged by the
algorithm. These physicians were asked whether “on the basis of your review, is there
anything about this summary that would make you want to review the care rendered at
hospitals with high rates of this type of case for potentially avoidable quality-of-care
problems.” Of the 30 cases targeted by a majority of physicians, the CSP flagged 28
(sensitivity=93%); of the 70 cases not targeted by a majority of physicians, the CSP
screens also did not flag 45 (specificity=64%). Second, they reported relationships
between the CSP and hospital characteristics, patient characteristics, and utilization.
Using California discharge abstract data, researchers found that patients with CSP-
defined complications were more likely to be older, to die before discharge, to have
longer lengths of stay, and to incur higher hospital charges, than cases with none of these
complications. Having a chronic condition raised the probability of experiencing a
complication (after adjusting for age), especially among major surgery patients, but the
predictive power of models that used these chronic conditions to predict complications
was relatively poor. More surprisingly, larger and major teaching hospitals, including
hospitals equipped to perform open heart surgery, appeared to have higher complication
rates than smaller and non-teaching hospitals. However, all findings appeared to be
dependent on the risk pool being examined.7, 10-12 It was also notable that hospital ranks
based on indirectly standardized CSP complication rates were not significantly correlated
with hospital ranks based on indirectly standardized Medicare mortality rates (with the
exception of medical cases, among whom the correlation was inverse). Intra-hospital
correlations across the six risk pools were weak.
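
The sensitivity and specificity reported above for the physician review follow directly from the stated counts. The short Python sketch below reproduces that arithmetic, treating the majority physician judgment as the reference standard; the function and variable names are assumptions introduced for the example.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of reference-positive cases that the screen flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of reference-negative cases that the screen did not flag."""
    return true_neg / (true_neg + false_pos)


# Counts from the review of 100 discharge abstracts described above:
# 30 cases were targeted by a majority of physicians, of which the CSP flagged 28;
# 70 cases were not targeted, of which the CSP did not flag 45.
tp, fn = 28, 30 - 28
tn, fp = 45, 70 - 45

# Consistency check against the sample design: 53 flagged, 47 not flagged.
assert tp + fp == 53 and fn + tn == 47

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

The derived counts (2 targeted but unflagged cases, 25 flagged but untargeted cases) sum to the 53 flagged and 47 unflagged abstracts in the sample, so the reported figures are internally consistent.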
Four later studies were designed to test criterion and construct validity by
validating the data used to construct CSP screens, validating the screens as a flag for
actual quality problems, and validating the replicability of hospital-level results using
different data sources.13-16 First, Iezzoni et al. trained expert coders to re-abstract ICD-9-
CM diagnosis and procedure codes on a random sample of hospital records from
Connecticut and California, and then assessed how often CSP trigger codes were
corroborated by re-review of the medical record.13 The predictive value of medical
complications was relatively poor, because 58% of the flagged complications in this risk
pool were actually present at admission. Corroboration rates were often even lower when
Iezzoni et al. used objective clinical criteria, abstracted by nurses, to diagnose
complications.14 The last two studies in this series utilized implicit physician review and
explicit nurse review to identify potential quality-of-care problems and process-of-care
failures, respectively, among CSP-flagged cases and unflagged controls. These studies
also raised concerns about the validity of the CSP, as for most indicators flagged cases
were no more likely than unflagged controls to have suffered explicit process
failures.15, 16 It should be noted that potential process failures were perhaps undetectable
by this study, because of limitations in medical record documentation. Details of the
performance of the individual complications are contained in Section 3A. Literature
Review Results.
The Complications Screening Program has been purchased by HCIA-Sachs (now
Solucient), although additional development and research completed by this company
was not available to the researchers of this report.

Miller et al. PSIs


Researchers at AHRQ reviewed all ICD-9-CM codes implemented in or before
1999, identifying codes that possibly describe medical errors or reflect the consequences
of such errors.17 Examples of codes identified by AHRQ include iatrogenic
pneumothorax, iatrogenic hypotension, and several “external cause-of-injury codes” (E-
codes). In addition, AHRQ researchers reviewed all codes included in the CSP indicators.
AHRQ investigators applied clinical and coding knowledge to identify those codes most
likely to identify medical error. These codes included foreign body left in during a
procedure, suture of laceration codes, and several other sentinel event codes. These
efforts at AHRQ provided the foundation for the candidate list of potential PSIs for this
report. This initial set of PSIs will be referred to in this report as the Miller et al. PSIs. 17

UCSF-Stanford EPC Development


The EPC team reviewed and updated the Miller et al. PSIs. Additions included
relevant codes from the 2000 and 2001 revisions of ICD-9-CM, and selected codes from
the CSP, such as those not clearly reflective of medical error, but representing a
potentially preventable complication. This process was guided principally by conceptual
considerations. For example, postoperative acute myocardial infarction was included
since recent evidence suggests that it is a potentially preventable complication. 2 A few
codes were also deleted from the initial list based on a review of ICD-9-CM coding
guidelines, described in Coding Clinics for ICD-9-CM and the American Hospital
Association’s ICD-9-CM Coding Handbook. For example, the code 259.3 for
hypoglycemic coma specifically excludes patients with diabetes mellitus, the population
for which this complication is most preventable. This process of updating the Miller et al.
PSIs resulted in a list of over 200 ICD-9-CM codes (valid in 2001) potentially related to
medical error.
Codes were then grouped into indicators. Where feasible, codes were compiled as
they were in the CSP, or in some cases the Miller et al. PSIs,17 depending on which
grouping yielded more clinically homogeneous groups. In most cases the resulting
indicators were not identical to the CSP indicators, although they were closely related, as
some of the specific codes included in the original CSP had been eliminated after our
review of coding guidelines. Five indicators were identical to the CSP indicators. The
remaining codes were then incorporated into the most appropriate CSP-based indicator,
or were grouped into clinically meaningful concepts to define novel indicators. Exclusion
criteria were added based on CSP methods and clinical judgment. As a result, over 40
patient safety indicators were defined that, while building on prior work, reflected
significantly changed measures to focus more narrowly on the most preventable
complications.
Indicators were defined with both a numerator (complication of interest) and a
denominator (population at risk). Different patient subpopulations have inherently
different risks for developing a complication, with some patients having almost no risk.
Thus, for each indicator a population at risk was specified as the denominator.
The intention was to restrict the complication (and consequently the rate) to a more
homogeneous population who are actually at risk for that complication. The population at
risk for the candidate indicators tended to be narrower than the combination of all risk
pools available in the CSP definitions, and was intended to reflect the population for
which the complication is more likely to reflect a potentially preventable complication. In
general, the population at risk corresponded to one risk pool (e.g., major surgery) from
the CSP, if applicable, or was defined more narrowly.
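
The numerator and denominator logic described above can be illustrated with a minimal sketch. The field names (`risk_pool`, `secondary_dx`) and the placeholder code `X1` are hypothetical and are not taken from the indicator definitions in this report; the sketch only shows how restricting the denominator to the population at risk changes which flagged discharges count toward the rate.

```python
from dataclasses import dataclass, field


@dataclass
class Discharge:
    """One discharge abstract (field names are hypothetical)."""
    risk_pool: str                                   # e.g. "major_surgery"
    secondary_dx: set = field(default_factory=set)   # secondary diagnosis codes


def indicator_rate(discharges, at_risk_pool, complication_codes):
    """Return (numerator, denominator, rate) for one indicator.

    Denominator: discharges that belong to the population at risk.
    Numerator: at-risk discharges carrying any of the complication codes.
    The rate is None when no discharge is at risk.
    """
    at_risk = [d for d in discharges if d.risk_pool == at_risk_pool]
    flagged = [d for d in at_risk if d.secondary_dx & complication_codes]
    rate = len(flagged) / len(at_risk) if at_risk else None
    return len(flagged), len(at_risk), rate


sample = [
    Discharge("major_surgery", {"X1"}),
    Discharge("major_surgery"),
    Discharge("major_surgery"),
    Discharge("medical", {"X1"}),  # coded complication, but outside the population at risk
]
print(indicator_rate(sample, "major_surgery", {"X1"}))
```

In this toy example the medical discharge carries the complication code but is excluded from both numerator and denominator, because it falls outside the specified population at risk.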

Initial Selection of Indicators

After the development of this list of potential indicators, a subset of indicators
was selected to undergo face validity testing by clinician panels (see Section 2D.
Clinician Panel Review Methods). Two sources of information guided the selection
process.
First, validation data from previous studies were reviewed and thresholds were set
for retention of CSP-based indicators. Four studies were identified that
evaluated the CSP indicators. Three of these studies13-15 examined the predictive value of
each indicator in identifying a complication that occurred in-hospital, regardless of
whether this complication was due to medical error or was preventable. Coder, physician,
and nurse reviewers examined medical charts and used specified criteria to judge whether
or not the flagged complication had indeed occurred during the hospitalization (as
opposed to being present on admission, or not having occurred at all). In a fourth study,16
nurses identified specific process failures that may have contributed to complications. In
order to be retained as a potential PSI, at least one of the first three studies corroborating
the ICD-9-CM code with an actual in-hospital complication needed to demonstrate a
positive predictive value of at least 75%, meaning that 3 out of 4 patients identified by
the measure did indeed have the complication of interest. In addition, the positive
predictive value of a "process failure" identified in the fourth study needed to reach or
exceed 46%, which was the average rate for surgical cases that were not flagged by any
of the CSP indicators. In other words, by this criterion, potential PSIs must have
demonstrated that approximately half or more of the patients flagged received care where
a process failure contributed to a complication, indicating a potentially preventable error.
As a result, we only retained CSP-derived indicators that were at least somewhat
predictive of objectively defined process failures, or medical errors.
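
The two retention thresholds described above can be sketched as a simple rule. This is an illustrative reading of the text, not code from the study: the function name is hypothetical, and treating a missing value as not meeting a threshold is an assumption, since the report does not say how missing validation data were handled.

```python
PPV_THRESHOLD = 0.75              # at least one of the three chart-review studies must reach this
PROCESS_FAILURE_THRESHOLD = 0.46  # average rate among unflagged surgical cases


def retain_indicator(ppv_by_study, process_failure_rate):
    """Apply both retention criteria to one CSP-derived indicator.

    ppv_by_study: positive predictive values from the three corroboration
        studies (None where a study did not evaluate the indicator).
    process_failure_rate: share of flagged cases with a process failure in
        the fourth study (None if not evaluated).
    Assumption: a missing value never satisfies a threshold.
    """
    ppv_ok = any(p is not None and p >= PPV_THRESHOLD for p in ppv_by_study)
    failure_ok = (
        process_failure_rate is not None
        and process_failure_rate >= PROCESS_FAILURE_THRESHOLD
    )
    return ppv_ok and failure_ok


print(retain_indicator([0.60, 0.80, None], 0.50))
print(retain_indicator([0.60, 0.70, 0.74], 0.50))
```
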
Second, the team considered specific changes to previous indicator definitions or
constructs when selecting the candidate set for face validity testing. These changes,
which were also discussed during the clinician panel review process (see Section 2D.
Clinician Panel Review Methods), fell into the following general categories:

1. Changes to the denominator definitions (inclusion or exclusion criteria),
intended to reduce bias due to the inclusion of atypical patients or to improve
generalizability to a broader set of patients at risk.
2. Elimination of selected ICD-9-CM codes from numerator definitions,
intended to focus attention on more clinically significant complications, or
complications more likely to result from medical errors.
3. Addition of selected ICD-9-CM codes to numerator definitions, intended to
capture related complications that could result from the same or similar
medical errors.
4. Division of a single indicator into two or more related indicators, intended to
create more clinically meaningful and conceptually coherent indicators.
5. Stratification or adjustment by relevant patient characteristics, intended to
reflect fundamental clinical differences among procedures (e.g., vaginal
delivery with or without instrumentation) and the complications that result
from them, or fundamental differences in patient risk (e.g., decubitus ulcer in
lower-risk versus high-risk patients).

A total of 34 indicators, intended to be applied to all age groups, were retained for
face validity testing by clinician panels (Appendix A). Because of the primary intent in
the development of these indicators to detect potentially preventable complications
related to health care exposure, the final definitions for this set of indicators represented
mostly new measures that built upon previous work.

Coding Review
Concurrent with clinician panel review, we contracted with a consultant from
AHIMA to review each of the 34 indicators. The consultant, an expert in ICD-9-CM
coding guidelines, reviewed each code for how accurately it captured the complication
in question and the population at risk, according to current coding guidelines. She
consulted additional resources, including members of the central staff of ICD-9-CM, as
appropriate. In some cases, additional codes or other refinements to the indicators were
suggested, based on current coding guidelines. For example, clarification of the
procedure codes included in the indicator "Reopening of a surgical site" revealed that the
nature of these codes was substantially different from what the team and panels had
assumed. This resulted in a change to the overall rating of this indicator.

Section 2D. Clinician Panel Review Methods


A structured review of each indicator was undertaken to evaluate the face validity
(from a clinical perspective) of the indicators. Specifically, the panel approach sought to
establish consensual validity, which “extends face validity from one expert to a panel of
experts who examine and rate the appropriateness of each item….”18 The methodology
for the structured review was adapted from the RAND/UCLA Appropriateness Method19
and consisted of an initial independent assessment of each indicator by clinician panelists
using an initial questionnaire, a conference call among all panelists, followed by a final
independent assessment by clinician panelists using the same questionnaire. The panel
process served to refine definitions of some indicators, add new measures, and dismiss
indicators with major concerns from further consideration.

A similar standardized panel approach, although differing somewhat from the
approach used in this report, has been used to evaluate potential indicators of primary care
quality20, 21 as well as ambulatory care sensitive conditions.22

Panel Selection

Twenty-one professional clinical organizations were invited to submit
nominations. These organizations were selected based on the applicability of the specialty
or subspecialty to our quality indicators. Organizations that represented general
practitioners (e.g., general surgeons, internists, critical care physicians, perioperative
nurses, and critical care nurses) were asked to nominate more panelists than those
representing sub-specialties. Fifteen organizations submitted nominations: American
Association of Critical-Care Nurses; American Academy of Family Physicians;
American College of Cardiology; American College of Nurse-Midwives; American
College of Obstetricians and Gynecologists; American College of Physicians/American
Society of Internal Medicine; American College of Radiology; American College of
Surgeons; American Geriatric Society; Association of Perioperative Nurses; American
Society of Anesthesiologists; American Society of Health-system Pharmacists; American
Thoracic Society; Association of Women's Health Obstetric and Neonatal Nurses; and
National Association of Inpatient Physicians.
These professional organizations nominated a total of 162 clinicians. Each
nominee was invited to participate in the evaluation. In order to be eligible to participate,
nominees were required to spend at least 30% of their work time on patient care,
including hospitalized patients. Ninety-two nominees accepted this invitation. Five
nominees were ineligible to participate. Nominees were asked to provide information
regarding their practice characteristics, including specialty and subspecialty and setting
(i.e., urban vs. rural location, region of country, and service to underserved populations),
information regarding primary hospital of practice (i.e., funding source) and personal
information (i.e., clinical education history, academic affiliation).
For assignments to each panel, a list of applicable specialties was identified for
the indicators to be evaluated by a given panel. Panelists were selected so that each panel
had diverse membership in terms of practice characteristics and setting. Thus, when a
specific area was over-represented by the pool of eligible nominees, randomly drawn
members from that specific sub-group were contacted first to fill the panels. In addition,
conference call scheduling logistics influenced assignments. Fifty-seven of the eligible
panelists accepted the invitation to participate on specific panels. Four did not participate
in the conference call, and thus were removed from the panels. All other panelists (53)
completed the evaluation in full.

Panel Composition
Eight panels were formed. Complications of medical care indicators were
examined by two panels. Surgical complications indicators were reviewed by three
panels. Another panel assessed indicators related to procedural complications. Finally,
two panels examined obstetric complications indicators. Participants in the panels are
listed in Appendix B. All panels had diversity in the geographic location of panelists, and
the type of practice (see Table 2).

Table 2. Multi-specialty Panel Composition

Characteristic                   % (N)
Gender
  Female                         38% (20)
Academic Affiliation (a)
  Yes                            64% (34)
  No                             26% (14)
  Not reported                    9% (5)
Geographic Region
  East                           26% (14)
  West                           21% (11)
  South                          21% (11)
  Midwest                        32% (17)
Community
  Urban                          49% (26)
  Suburban                       19% (10)
  Rural                          16% (9)
  Not reported                   15% (8)
Funding of Primary Hospital
  Private                        42% (22)
  Public                         32% (17)
  Both                            6% (3)
  Not reported                   21% (11)
Patient Population Served
  Underserved                    47% (25)
  General                        28% (15)
  Not reported                   25% (13)

(a) Clinical and/or research affiliation

Initial Evaluation

After agreeing to evaluate each indicator, panelists were sent information (see
Appendix C) regarding administrative data, ICD-9-CM coding, assignment of Diagnostic
Related Groups (DRGs) and Major Diagnostic Categories (MDCs), and specific
definitions for “adverse events or complications,” “preventability,” and “medical error.”
The definitions of these terms, including the distinctions among them, are available in Appendix C and in
Section 2A. Framework and Definitions. Panelists were presented with four to five
indicators. The standardized text used to describe each ICD-9-CM code was presented
along with the specific numeric code. Exclusion and inclusion criteria were also given, as
well as the clinical rationale for the indicator and the specification criteria. Panelists were
provided potential questions regarding the indicator definition that the study team
planned to explore during the conference call.
Each of the 5 to 9 panelists from a given panel provided input for a given
indicator by completing a 10-item questionnaire (see Appendix C). This questionnaire
asked panelists to consider the ability of this indicator to screen out conditions present on
admission, the potential preventability of the complication and the ability of the indicator
to identify medical error. In addition, the questionnaire asked panelists to consider the
potential bias, reporting or charting problems, potential for gaming the indicator, and
adverse effects of implementing the indicator. Finally, panelists were invited to suggest
changes to the indicator.

Conference Call

Following the submission of the initial evaluation questionnaires, all panelists
participated in a 90-minute conference call for their panel to discuss the indicators. The
purpose of each conference call was to allow panelists to discuss their opinions regarding
each indicator. Following the instructions in the RAND/UCLA method where the
primary goal of interaction between panelists is to allow room for varied opinions about
the appropriateness of an indicator, panelists were explicitly told that consensus was not
the goal of discussion. In some cases, panelists agreed on proposed changes to the
indicator definitions, and such consensus was noted and the definition was modified
accordingly before the final round of rating. Each call was moderated by a team member
(KM), who directed the structure of the call, and ensured that all panelists had a chance to
share their opinions. Also present was a technical expert, who answered questions
regarding administrative data and coding (PR), and a silent observer, who maintained
comprehensive notes of the call (SD). All team members refrained from offering opinion
regarding indicators during the call. Each indicator was discussed for approximately 15
minutes. Agenda items were set based on the feedback received from the initial
evaluation, and in general focused on points of disagreement among panelists. Panelists
were prompted throughout the process to consider the appropriate population at risk for
each indicator (specifically inclusion and exclusion criteria) in addition to the
complication of interest. However, if panelists wished to discuss other aspects of the
indicator, this discussion was allowed within the time allotted for that indicator. If time
remained at the end of a call, topics that were not fully addressed previously were
revisited.

Final Evaluation

Following each conference call, changes to each indicator were made where
suggested by panelists. In each case, near consensus of the panelists must have been
reached during the conference call for the change to be implemented. The indicators were
then redistributed to panelists along with questionnaires used in the initial evaluation.
Each indicator description included explication of any definition changes made and the
reason. Panelists were asked to re-rate each indicator based on their current opinion. They
were asked to keep in mind the discussion during the conference call.

Tabulation of Results

To examine the results of the panels, we applied a modified version of the
“appropriateness” criteria outlined in the RAND/UCLA Appropriateness Method. Results
from the final evaluation questionnaire were used to calculate median scores from the
9-point scale for each question and to categorize the degree of agreement among panelists
(see Table 3). Median scores determined the level of acceptability of the indicator, and
dispersion of ratings across the panel for each applicable question determined the
agreement status. Thus, the median and agreement status were independent
measurements for each question. Panel opinions (i.e., median and agreement status
category) were summarized for the following six criteria covered in the
questionnaire:
1. Overall usefulness of the indicator,
2. Likelihood that indicator measures a complication and not a comorbidity
(specifically, present on admission),
3. Preventability of complication,
4. Extent to which complication is due to medical error,
5. Likelihood that complication is charted given that it occurs, and
6. Extent that indicator is subject to bias (systematic differences, such as case
mix that could affect the indicator, in a way not related to quality of care).
These evaluations are included in the summary of results for each indicator (Section
3D. Detailed Panel Results by Indicator).

Table 3. Criteria for Agreement Status

Category        Panel size        Criteria
Agreement       8-10 panelists    Two or fewer panelists rated indicator outside the
                                  specific three-point range (1-3.9, 4-6.9, 7-9) in
                                  which the median falls.
                5-7 panelists     One or fewer panelists rated indicator outside the
                                  specific three-point range (1-3.9, 4-6.9, 7-9) in
                                  which the median falls.
Disagreement    8-10 panelists    Three or more panelists rated indicator in each of
                                  the extreme three-point ranges (1-3.9, 7-9).
                5-7 panelists     Two or more panelists rated indicator in each of
                                  the extreme three-point ranges (1-3.9, 7-9).
Indeterminate   All panel sizes   Any panel rating not qualifying as either “agreement”
Agreement                         or “disagreement” by above criteria.
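
The criteria in Table 3 can be expressed as a short function. This is a sketch based only on the table as printed; the function names and the labels "low", "middle", and "high" for the three-point ranges are illustrative, and the error raised for panels outside 5 to 10 members is an assumption, since the table does not cover other sizes.

```python
from statistics import median


def rating_range(score):
    """Map a score on the 9-point scale to its three-point range."""
    if score < 4:
        return "low"       # 1-3.9
    if score < 7:
        return "middle"    # 4-6.9
    return "high"          # 7-9


def agreement_status(ratings):
    """Classify one panel's ratings for one question per Table 3."""
    n = len(ratings)
    if 8 <= n <= 10:
        max_outside, min_each_extreme = 2, 3
    elif 5 <= n <= 7:
        max_outside, min_each_extreme = 1, 2
    else:
        raise ValueError("Table 3 covers panels of 5 to 10 members only")

    median_range = rating_range(median(ratings))
    outside = sum(rating_range(r) != median_range for r in ratings)
    low = sum(rating_range(r) == "low" for r in ratings)
    high = sum(rating_range(r) == "high" for r in ratings)

    if outside <= max_outside:
        return "agreement"
    if low >= min_each_extreme and high >= min_each_extreme:
        return "disagreement"
    return "indeterminate"


print(agreement_status([7, 8, 8, 9, 7, 8, 3]))
print(agreement_status([1, 2, 3, 5, 7, 8, 9, 9]))
```

The two conditions cannot both hold for the same panel: if enough panelists sit in each extreme range to count as disagreement, more than the allowed number necessarily fall outside the range containing the median.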

We used the ratings regarding the overall appropriateness of the indicator (i.e.,
criterion number 1 above based on question #8 on questionnaire in Appendix C) to assess
the overall usefulness as a screen for potential patient safety problems (see Table 4). The
median score and agreement category for this usefulness question were combined into
modified RAND groupings. Akin to the RAND “Appropriate” level, we created two
categories, “Acceptable” and “Acceptable (-).” “Acceptable (-)” refers to indicators
which were considered acceptable, but this distinction was not as clear as for those
receiving a pure “Acceptable” rating. The RAND “Uncertain” level was likewise divided
into two parts, “Unclear,” and the slightly worse category, “Unclear (-).” The RAND
“Inappropriate” level was defined identically but named “Unacceptable.” These
designations, along with some initial administrative data testing and subsequent coding
clarifications, were used to triage indicators into three sets: Accepted Indicators,
Experimental Indicators, and Rejected Indicators (see Tables 11-13 in Section 3B.
Indicator Selection).

Table 4. Definitions for Overall Appropriateness of Indicator

Acceptable:       Median falls between 7 and 9 (inclusive of both), agreement.
Acceptable (-):   Median falls between 7 and 9 (inclusive of both), indeterminate
                  agreement.
Unclear:          Median falls between 7 and 9 (inclusive of both), disagreement, OR
                  Median falls between 5 and 7 (inclusive of neither), agreement or
                  indeterminate agreement.
Unclear (-):      Median falls between 4 and 5 (inclusive of both), agreement,
                  indeterminate agreement or disagreement, OR
                  Median falls between 1 and 3.9 with disagreement.
Unacceptable:     Median falls between 1 and 3.9, agreement or indeterminate
                  agreement.
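
The mapping in Table 4 can be written out as a lookup on the median score and agreement category. This is a sketch of the table as printed, with a hypothetical function name. One combination, a median strictly between 5 and 7 with disagreement, is not defined in the table; returning `None` for it is a choice made here, not a rule from the report.

```python
def overall_rating(median_score, status):
    """Return the Table 4 category for a median score and agreement status.

    status is one of "agreement", "indeterminate", "disagreement".
    Returns None for the one combination Table 4 does not define
    (median strictly between 5 and 7 with disagreement).
    """
    if status not in {"agreement", "indeterminate", "disagreement"}:
        raise ValueError(f"unknown agreement status: {status!r}")

    if 7 <= median_score <= 9:
        return {
            "agreement": "Acceptable",
            "indeterminate": "Acceptable (-)",
            "disagreement": "Unclear",
        }[status]
    if 5 < median_score < 7:
        return None if status == "disagreement" else "Unclear"
    if 4 <= median_score <= 5:
        return "Unclear (-)"
    if 1 <= median_score < 4:
        return "Unclear (-)" if status == "disagreement" else "Unacceptable"
    raise ValueError("median must lie on the 1-9 scale")


for example in [(8, "agreement"), (7, "indeterminate"), (6, "agreement"),
                (4.5, "disagreement"), (2, "agreement")]:
    print(example, "->", overall_rating(*example))
```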

Surgical Panels

The multi-specialty panels had limited surgeon participation because of the need
to include a variety of specialties without expanding the panel. No surgical subspecialties
were represented, and each panel had at most two participating surgeons. As a result of
panelists frequently requesting more surgical input for some of the indicators, we
convened three additional panels consisting of only surgeons from various subspecialties
to complete a second round of review. The method of review was identical to the
previous panels. The surgeons reviewed the same indicators as were reviewed by the
initial multi-specialty panels. Each panel received the same combinations of indicators, in
their originally proposed form, with two exceptions. One panel received "Minor
Perioperative Physical Injuries" and another "Malignant Hypertension" in addition to the
group of four indicators originally reviewed as a packet by a multi-specialty panel. These
two additional surgical indicators were created based on suggestions by the multi-
specialty panels during the discussion of an indicator called “Complications of
Anesthesia.”
Sixteen organizations representing surgical subspecialties were invited to
nominate ten panelists. Nine organizations submitted at least one nomination, including:
American Association of Hip and Knee Surgeons; American Association of Hand
Surgeons; American Association of Neurological Surgeons; American Academy of
Orthopedic Surgeons; American Society of Colon and Rectal Surgeons; American
Urologic Association; North American Spine Society; Society of Thoracic Surgeons; and
American Society of Transplant Surgeons. In addition to recruiting subspecialists, we
contacted state chapters of the American College of Surgeons from the five most
populous states, to obtain one or two nominations of general surgeons. Four of the 22
contacted chapters sent nominations: San Diego, Southern California, Metropolitan
Chicago, and Central Pennsylvania. We received names of 79 nominees, forty-two of
whom accepted our invitation to participate. Twenty-five were assigned to panels, based
on their availability to participate and their subspecialty. Three panels were constructed
with a variety of specialties represented (see Appendix B). Two panelists did not
complete the entire review.
The demographic composition of the surgical panels (see Table 5) differed
significantly from that of the multi-specialty panels only by gender (p<.05), with more
males on the surgical panels than on the multi-specialty panels. No other differences were
significant.

Table 5. Surgical Panel Composition

Characteristic                   % (N)
Gender
  Female                          9% (2)
Academic Affiliation
  Yes                            87% (20)
  No                             13% (3)
Geographic Region
  East                           26% (6)
  West                           17% (4)
  South                          30% (7)
  Midwest                        26% (6)
Community
  Urban                          39% (9)
  Suburban                       17% (4)
  Rural                          17% (4)
  Not reported                   26% (6)
Hospital Affiliation
  Private                        52% (12)
  Public                         22% (5)
  Both                            9% (2)
  Not reported                   17% (4)
Population
  Underserved                    43% (10)
  General                        22% (5)
  Not reported                   35% (8)

Surgical panelists followed the same procedure as the multi-specialty panels in
rating each indicator. In order to ensure that similar topics were discussed in the
conference calls of both the multi-specialty and surgical panels, and to obtain surgeon
feedback on changes suggested by the multi-specialty panels, agendas for the conference
calls included those topics discussed by the multi-specialty panels (though the source of
these topics was not noted). As with the multi-specialty panels, the agenda also included
concerns and areas of disagreement based on panelists’ responses to the first round
questionnaire. Panelists then re-rated each indicator based on the suggestions of their own
panel. In some cases, the final definitions suggested by consensus in the surgical panel
calls, and therefore proposed in the second-round questionnaire, differed substantially
from those rated by the multi-specialty panels. For these cases, the study team reviewed
the reasons for differences in definitions proposed, and defined the indicator based on
input from both panels if possible. Panel results for each indicator note any differences
between panels, and explain final decisions regarding indicator definitions and
acceptability.

Section 2E. Empirical Methods


Purpose of Analyses

Empirical analyses were conducted to provide additional information about the
indicators. These analyses were intended not as decision-making tools, but rather as
explorations of the characteristics of the indicators. Specifically, these analyses explore
the frequency and variation of the indicators, the potential bias (based on limited risk
adjustment), and the relationship between indicators.

Analysis Approach
Data Sources
The data sources used in the empirical analyses were the 1997 Florida State
Inpatient Database (SID) (for initial testing and development; 1995-1997 used for
persistence analysis) and the 1997 State Inpatient Databases (SID) for 19 HCUP
participating states, referred to in this report as the National SID, (for the final empirical
analysis). The Florida SID consists of about 2,000,000 discharges from over 200
hospitals; Florida was chosen because it is a large, diverse state. The National SID consists of
about 19,000,000 discharges from over 2,300 hospitals. The National SID contains all-
payer data on hospital inpatient stays from participating states (Arizona, California,
Colorado, Connecticut, Florida, Illinois, Iowa, Kansas, Maryland, Massachusetts,
Missouri, New Jersey, New York, Oregon, Pennsylvania, South Carolina, Tennessee,
Washington, Wisconsin). All discharges from participating States’ community hospitals
are included in the SID database, which defines community hospitals as nonfederal,
short-term, general, and other specialty hospitals, excluding long-term hospitals and
hospital units of long-term care institutions, psychiatric hospitals, and
alcoholism/chemical dependency treatment facilities. A complete description of the
content of the SID, including details of the participating States’ discharge abstracts, can
be found on the Agency for Healthcare Research and Quality web site
(www.ahrq.gov/data/hcup/hcupsid.htm). Because the Florida SID was used only for
initial testing and development, the empirical results reported are from the National SID.
Descriptive results from the Florida SID are reported for comparison to ensure that the
hospital level results were similar in both data sources. Differences between Florida and
national results are pointed out in the text. The National SID data were also used for the
construction of area measures, with data from the U.S. Census Bureau used to construct
the denominator of these rates.

Reported Patient Safety Indicators


Three sets of patient safety indicators were examined. First, the Accepted patient
safety indicators met the face validity criteria established through the literature review
and clinician panel review. Second, the Experimental patient safety indicators did not
meet those criteria, but appeared to warrant further testing and evaluation. Third, several
Accepted patient safety indicators were modified into area indicators, which were
designed to assess the total incidence of the adverse event within geographic areas. For
example, we constructed an indicator for “Transfusion reaction” at both the hospital and
area level. Transfusion reactions that occur after discharge from a hospitalization would
result in a readmission. The area level indicator includes these cases, while the hospital
level restricts the number of transfusion reactions to only those that occur during the
same hospitalization that exposed the patient to this risk.
All potential indicators were examined empirically by developing and conducting
statistical tests for precision, bias, and relatedness of indicators. For each indicator, we
calculated five different estimates of hospital performance. First, we calculated the raw
indicator rate using the number of adverse events in the numerator divided by the number
of discharges in the population at risk by hospital. For the area indicators, the
denominator is the population of the Metropolitan Statistical Area (MSA), New England
County Metropolitan Area (for the New England states) or county (for non-MSA areas)
of the hospital. Second, we adjusted the raw indicator using a logistic regression to
account for differences among hospitals (and areas) in demographics (specifically, age
and gender). Age was modeled using a set of dummy variables to represent 10-year
categories except for young children whose age categories are narrower (i.e., less than 1,
1-4, 5-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, and 85 or more years), along
with a parallel set of age-gender interactions. Because of sparse cells, certain age
categories were combined or omitted for selected indicators, such as the obstetric
indicators. Third, we adjusted the raw indicator to account for differences among
hospitals in age, gender and modified DRG category (as described below). Fourth, we
adjusted the raw indicator to account for differences among hospitals in age, gender,
modified DRG and comorbidities (defined using an adaptation of the AHRQ comorbidity
software) of patients. Finally, we applied multivariate signal extraction (MSX) methods
to adjust for reliability by estimating the amount of “noise” (i.e., variation due to random
error) relative to the amount of “signal” (i.e., systematic variation in hospital performance
or the ‘reliability’) for each indicator. This or similar “reliability adjustment” has been
used in the literature for similar purposes.23, 24 Multivariate methods (taking into account
correlations among indicators in order to extract additional ‘signal’) were applied to most
of the accepted indicators. The exceptions were Death in Low Mortality DRGs and
Failure to Rescue. Only univariate signal extraction methods (smoothing) were applied
to these two indicators and to the experimental indicators, because these indicators
possibly cover broader clinical concepts. Correlations between these indicators and other
indicators may not reflect correlations due to quality of care, and thus inclusion of these
indicators may adversely affect the MSX approximations. For additional details on the
empirical methods, refer to the companion EPC HCUP Quality Indicator Report,
published by AHRQ (http://www.ahrq.gov/data/hcup/qirefine.htm). Additional details on
the modifications made to the DRG and comorbidity categories are described below.
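
The idea of adjusting for reliability by weighing "signal" against "noise" can be illustrated with a generic univariate shrinkage estimator. This sketch is not the MSX method used in the report, whose formulas are given in the companion report; it assumes a binomial sampling variance evaluated at the overall rate, and the value passed as `signal_variance` is a hypothetical input rather than an estimate from the data.

```python
def smoothed_rate(raw_rate, n_at_risk, overall_rate, signal_variance):
    """Shrink one hospital's raw rate toward the overall rate.

    reliability = signal / (signal + noise), where noise is the sampling
    variance of the raw rate (assumed binomial, evaluated at the overall
    rate). Returns (smoothed rate, reliability).
    """
    if n_at_risk <= 0:
        raise ValueError("n_at_risk must be positive")
    noise_variance = overall_rate * (1 - overall_rate) / n_at_risk
    reliability = signal_variance / (signal_variance + noise_variance)
    smoothed = overall_rate + reliability * (raw_rate - overall_rate)
    return smoothed, reliability


# Same raw rate, different denominators (all numbers are made up).
small = smoothed_rate(raw_rate=0.04, n_at_risk=50,
                      overall_rate=0.01, signal_variance=0.0001)
large = smoothed_rate(raw_rate=0.04, n_at_risk=5000,
                      overall_rate=0.01, signal_variance=0.0001)
print("small hospital:", small)
print("large hospital:", large)
```

With identical raw rates, the hospital with the smaller population at risk is pulled further toward the overall rate, because more of its observed variation is attributed to random error.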

Hospital Fixed Effects


In our risk-adjustment models, we calculated hospital fixed effects using the
standard method with logistic models of first estimating the predicted value for each
discharge, then subtracting the actual outcome from the predicted, and averaging the
difference for each hospital to get the hospital fixed effect estimate. In the companion
Quality Indicator Report,3 we used linear regression models with hospital fixed effects
included, arguing that the logistic approach yielded biased estimates due to the omission
of a variable (the hospital) correlated with both the dependent (e.g., in-hospital mortality)
and the independent (e.g., age, gender, APR-DRG) variables in the model. Given the rare
occurrence of many of the PSIs, however, the logistic approach may be more appropriate
for this application. Linear methods assume that the error term is normally
distributed. This assumption is violated when the outcome is dichotomous. The
QI means were generally an order of magnitude higher than the PSI means, so the
assumption was less problematic for the QIs. However, the most appropriate method depends on
the particular characteristics of each indicator, whether QI or PSI. To the extent that bias
is a concern, accounting for the clustering of patients by using a hospital fixed effect is
advantageous. To the extent that extreme values are a concern, then imposing structure
on the error term with logistic methods is advantageous. In the end, the two approaches
can be compared in terms of how much difference it makes in the relative assessment of
provider performance. This is an issue that warrants further analysis, in order to better
understand the trade-offs and limitations of each approach, and under what conditions
and for what indicators each approach might best apply.
Specifically, the risk-adjusted “raw” estimate of a hospital’s performance is
constructed in two steps. In the first step, if we denote whether or not the event
associated with a particular indicator Yk (k=1,…,K) was observed for a particular patient
i in year t (t=1,…,T), then the regression to construct a risk-adjusted “raw” estimate of a
particular patient’s performance on each indicator can be written as:

(1) $Y^{k}_{it} = Z_{it}\,\theta^{k}_{t} + \varepsilon^{k}_{it}$, where

$Y^{k}_{it}$ is the $k$th PSI for patient $i$ in year $t$ (i.e., whether or not the event associated with
the indicator occurred on that discharge);
$Z_{it}$ is a vector of patient covariates for patient $i$ in year $t$ (i.e., the patient-level measures
used as risk adjusters);
$\theta^{k}_{t}$ is a vector of parameters in each year $t$, giving the effect of each patient risk
adjuster on indicator $k$ (i.e., the magnitude of the risk adjustment associated with each
patient measure); and
$\varepsilon^{k}_{it}$ is the unexplained residual in this patient-level model.

In the second step, we estimated the hospital effect by subtracting the resulting
predictions from this patient-level regression from the actual observed patient-level
outcomes, and taking the mean of this difference for each hospital. That is, for each
hospital j (j=1,…,J),

(2) $M^{k}_{jt} = Y^{k}_{ijt} - (Z_{it}\,\theta^{k}_{t} + \varepsilon^{k}_{it})$, where

$M^{k}_{jt}$ is the “raw” adjusted measure for indicator $k$ for hospital $j$ in year $t$ (i.e., the
hospital “fixed effect” in the patient-level regression); and
$Z_{it}$ is the vector of patient covariates for patient $i$ in year $t$ estimated in Step 1.
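The two steps described in the prose above (predict each discharge's risk, then average observed minus predicted within each hospital) can be sketched as follows. This is a minimal illustration, not the report's estimation code. As a simplifying assumption, the patient-level prediction is the pooled event rate within each covariate cell (a saturated model on categorical risk adjusters) standing in for the logistic regression; the function name and tuple layout are hypothetical.

```python
from collections import defaultdict


def raw_adjusted_measures(discharges):
    """Return {hospital: mean(observed - predicted)} for one indicator.

    Each discharge is a (hospital, covariate_cell, event) tuple, where event is
    0 or 1 and covariate_cell is any hashable risk-adjustment stratum
    (for example, an age group, sex, and modified DRG combination).
    """
    # Step 1: predicted risk = pooled event rate in the discharge's covariate cell.
    cell_events = defaultdict(int)
    cell_counts = defaultdict(int)
    for _, cell, event in discharges:
        cell_events[cell] += event
        cell_counts[cell] += 1
    predicted = {cell: cell_events[cell] / cell_counts[cell] for cell in cell_counts}

    # Step 2: average the observed-minus-predicted difference within each hospital.
    diff_sum = defaultdict(float)
    n = defaultdict(int)
    for hospital, cell, event in discharges:
        diff_sum[hospital] += event - predicted[cell]
        n[hospital] += 1
    return {hospital: diff_sum[hospital] / n[hospital] for hospital in n}
```

A positive value means the hospital had more events than its case mix predicts; a negative value means fewer.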

In addition to age, sex, and age*sex interactions as adjusters in our model, we also
included a modified DRG and comorbidity category for the admission.

Modified DRG Categories


We made two modifications to the Centers for Medicare and Medicaid Services
(CMS, formerly Health Care Financing Administration) Diagnosis-Related Groups
(DRGs). First, we collapsed adjacent DRG categories that were separated by the
presence or absence of comorbidities or complications. For example, DRGs 076
(OTHER RESP SYSTEM OPERATING ROOM PROCEDURES W CC) and 077
(OTHER RESP SYSTEM OPERATING ROOM PROCEDURES W/O CC) were
grouped into one category. The purpose was to avoid adjusting for the complication we
were trying to measure. Appendix D Section 1 lists the categories that were grouped.
Second, we excluded from the logistic models most of the super-MDC DRG categories.
Excluding these categories also avoids adjusting for the complications we were trying to
measure. For example, tracheostomies (DRG 482-483) often result from potentially
preventable respiratory complications that require long-term mechanical ventilation.
Similarly, operating room procedures unrelated to the principal diagnosis (DRG 468,
477) often result from potentially preventable complications that require surgical repair
(i.e., fractures, lacerations). Appendix D Section 2 lists the super-MDC categories that
were excluded and other DRGs that were excluded because they were no longer valid.
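As a rough sketch of the two modifications just described, the mapping below collapses a CC/no-CC pair into one category and drops excluded DRGs. Only the DRG numbers named in the text are used; the complete lists are in Appendix D, and the function and constant names are hypothetical.

```python
# Illustrative subsets only; the full lists are in Appendix D, Sections 1 and 2.
COLLAPSED_PAIRS = {76: "076-077", 77: "076-077"}  # with CC / without CC
EXCLUDED_DRGS = {468, 477, 482, 483}              # examples cited in the text


def modified_drg(drg):
    """Return the risk-adjustment category for a DRG, or None if it is excluded."""
    if drg in EXCLUDED_DRGS:
        return None
    return COLLAPSED_PAIRS.get(drg, f"{drg:03d}")
```

Discharges that map to None would simply be left out of the risk-adjustment model in this sketch.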
In the companion technical report on quality indicators, the risk adjustment
method implemented All Patient Refined (APR)-DRGs, a refinement of DRGs to capture
different levels of complications. However, patient safety indicators, designed to detect
potentially preventable complications, require a risk adjustment approach that does not
inherently remove the differences between patients based on their complications. The
APR-DRGs could be modified to remove applicable complications, on an indicator by
indicator basis, but implementation of such an approach was beyond the scope of the
current project. In this report, APR-DRG risk adjustment was not implemented.

Modified Comorbidity Software


To adjust for comorbidities, we used an updated adaptation of AHRQ
Comorbidity Software (http://www.ahrq.gov/data/hcup/comorbid.htm). The ICD-9-CM
codes used to define the comorbidity categories were modified to address four main
issues. First, we excluded comorbidity categories in the current software that include
conditions likely to represent potentially preventable complications in certain settings,
such as after elective surgery. Specifically, three comorbidity categories (cardiac arrhythmia,
coagulopathy, and fluid/electrolyte disorders) were removed from the comorbidity
adjustment. Second, most adaptations were designed to capture acute sequelae of chronic
comorbidities, where both conditions are represented by a single ICD-9-CM code. For
example, the definition of hypertension was broadened to include malignant
hypertension, which usually arises in the setting of chronic hypertension. Unless these
"acute on chronic" comorbidities are captured, some patients with especially severe
comorbidities would be mislabeled as not having conditions of interest. Third, the
comorbidity definitions did not include obstetric comorbidity codes, which are relevant
for our obstetric indicators. Codes, when available, for these comorbidities in obstetric
patients were added. Fourth, slight updating was necessary based on recent ICD-9-CM
code changes. Modifications made to the AHRQ comorbidity software are explained in
detail in Appendix D, Section 3.

Low Mortality DRGs


In order to be included in the “Low Mortality DRG” indicator, the DRG had to
have an overall in-hospital mortality rate (based on the National SID sample) of less than
0.5%. In addition, if a DRG category was split based on the presence of comorbidities or
complications, then we only included the category if both DRGs (with and without
comorbidities or complications) met the mortality threshold. Otherwise the category was
not included in the “Low Mortality DRG” PSI. The indicator is reported as a single
measure and stratified into medical (adult and pediatric), surgical (adult and pediatric),
neonatal, obstetric and psychiatric DRGs. The 126 DRGs included in the measure are
listed in Appendix D, Section 4 by stratification category.
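The inclusion rule above can be sketched as a small filter. This is an illustration under stated assumptions: the mortality rates and DRG numbers in any call are hypothetical inputs, and the function name is not from the report.

```python
THRESHOLD = 0.005  # overall in-hospital mortality rate below 0.5%


def low_mortality_drgs(mortality, split_pairs):
    """Select DRGs for the Low Mortality DRG indicator.

    mortality maps DRG -> overall in-hospital mortality rate.
    split_pairs lists (with_cc, without_cc) DRG pairs that form one category.
    """
    eligible = {drg for drg, rate in mortality.items() if rate < THRESHOLD}
    for with_cc, without_cc in split_pairs:
        # A split category is kept only if both halves meet the threshold.
        if not (with_cc in eligible and without_cc in eligible):
            eligible.discard(with_cc)
            eligible.discard(without_cc)
    return sorted(eligible)
```

For example, if a "without CC" DRG has a rate of 0.2% but its "with CC" partner has a rate of 0.9%, neither is included.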

Empirical Analysis Statistics

Using these methods we constructed a set of statistical tests to examine precision,
bias, and relatedness of indicators for all accepted hospital level indicators, and precision
and bias for all accepted area level and experimental indicators. Each of the key statistical
test results was summarized and explained in the overview section of the companion
HCUP Quality Indicator report.3 Tables 6-8 provide a summary of the statistical analyses
and their interpretation.

Table 6. Precision Tests
Precision. Is most of the variation in an indicator at the level of the hospital? Do smoothed estimates of quality lead to more precise measures?

| Measure | Statistic | Adjustments | Interpretation |
|---|---|---|---|
| a. Observed variation in indicator | Hospital Level Standard Deviation; Hospital Level Skew Statistic | Unadjusted; Age-gender adjusted; Modified DRG adjusted; Modified AHRQ Comorbidity adjusted | Risk adjustment can either increase or decrease observed variation. If increase, then differences in patient characteristics mask provider differences. If decrease, then differences in patient characteristics account for provider differences. |
| b. MSX methods | Signal Standard Deviation; Signal Share; Signal Ratio | Reliability adjusted | Estimates what percentage of the observed variation between hospitals reflects systematic differences versus random noise. Signal share is a measure of how much of the total variation (patient and provider) is potentially subject to hospital control. |
Table 7. Bias Tests
Bias. Does risk adjustment change our assessment of relative hospital performance, after accounting for reliability? Is the impact greatest among the best or worst performers, or overall? What is the magnitude of the change in performance?

| Measure | Statistic | Interpretation |
|---|---|---|
| MSX methods: unadjusted vs. age, sex, Modified DRG, Comorbidity risk adjustment | Spearman Rank Correlation Coefficient (Before and After Risk Adjustment) | Risk adjustment matters to the extent that it alters the assessment of relative hospital performance. This test determines the impact overall. |
| | Average Absolute Value Of Change Relative To Mean (After Risk Adjustment) | This test determines whether the absolute change in performance was large or small relative to the overall mean. |
| | Percentage of The Top 10% Of Hospitals That Remains The Same (After Risk Adjustment) | This test measures the impact at the highest rates (in general, the worse performers). |
| | Percentage of The Bottom 10% Of Hospitals That Remains The Same (After Risk Adjustment) | This test measures the impact at the lowest rates (in general, the better performers). |
| | Percentage of hospitals that move more than two deciles in rank (up or down) (After Risk Adjustment) | This test determines the magnitude of the relative changes. |
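Several of the bias statistics in Table 7 can be computed directly from two lists of hospital rates, one before and one after risk adjustment. The sketch below is illustrative only and is not the code used for this report. Its assumptions: hospitals are assigned to deciles by rank, there are at least ten hospitals, the rates are not all tied, and all function names are hypothetical.

```python
from math import sqrt


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            result[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return result


def spearman(before, after):
    """Spearman rank correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(before), ranks(after)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))


def deciles(values):
    """Decile (0-9) of each hospital by rank; decile 9 holds the highest rates."""
    n = len(values)
    return [min(int((r - 1) * 10 / n), 9) for r in ranks(values)]


def bias_summary(before, after):
    """Compare hospital rates before and after risk adjustment."""
    d0, d1 = deciles(before), deciles(after)
    top = [i for i, d in enumerate(d0) if d == 9]
    bottom = [i for i, d in enumerate(d0) if d == 0]
    return {
        "spearman": spearman(before, after),
        "top_same": sum(d1[i] == 9 for i in top) / len(top),
        "bottom_same": sum(d1[i] == 0 for i in bottom) / len(bottom),
        "moved_over_two_deciles": sum(abs(a - b) > 2 for a, b in zip(d0, d1)) / len(d0),
    }
```

If risk adjustment left every hospital's rank unchanged, the correlation would be 1, both "same" shares would be 1, and no hospital would move more than two deciles.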

Table 8. Relatedness Tests
3. Relatedness of indicators. Is the indicator related to other indicators in a way that makes clinical sense? Do methods that remove noise and bias make the relationship clearer?

| Measure | Statistic | Interpretation |
|---|---|---|
| a. Correlation of indicator with other indicators | Spearman correlation coefficient | Are indicators correlated with other indicators in the direction one might expect? |
| b. Factor loadings of indicator | Factor loadings, based on Spearman correlation, Principal Component Analysis | Do indicators load on factors with other indicators that one might expect? |

Chapter 3. Results
The results are presented in four sections. Within each section, the indicators are
presented within their final designated set – Accepted or Experimental, in alphabetical
order. Non-obstetric indicators are followed by obstetric indicators, also in alphabetical
order. The results for each of the rejected indicators are contained in Appendix F. The
first section presents the results of the literature review. The second section presents the
overall results of the clinician review; the third section also reports the results for the
clinician review, but for specific indicators. The final section contains the comparative
empirical results.
Obstetric indicators are grouped together in the results presentations to convey a
number of differences from the other PSIs more clearly. First, the obstetric indicators, for
the most part, were created after a review of the ICD-9-CM codes. There is little or no
precedent for using most of these indicators, and little literature based evidence
discussing these complications as measures of quality of care. In addition, little evidence
of the coding validity of obstetric codes exists. Second, at the end of the clinician review
it appeared that the obstetric panels treated similar complications differently from the
other panels. For example, the diagnosis code for wound dehiscence was rejected by the
multi-specialty panel, due to the ambiguity of the code. The obstetric panel, however,
accepted the ambiguity of the parallel code for cesarean wound dehiscence. Third, an
entirely different set of physicians and nurses, as well as only a subset of hospitals
provide obstetric care. Fourth, empirical analyses found that obstetric PSIs on average
tend to have considerably higher rates than non-obstetric PSIs. In addition, DRG and
comorbidity risk adjustment is likely inadequate for these indicators (DRGs are split only
by delivery type and the presence or absence of any complication or comorbidity, and the
comorbidities examined in the risk adjustment are rare in this population and potentially
not the most important comorbidities for which to risk adjust). A factor analysis found
that these indicators tend to load onto one factor, while non-obstetric indicators appear to
load on a separate factor, for the most part. Because of these considerations, the obstetric
indicators are presented separately in this report, following the non-obstetric indicators in
each subsection.

Section 3A. Literature Review Results


Background

In the context of widespread current interest in measuring and improving patient
safety, potential quality indicators related to potentially preventable complications of
medical care merit special attention. In this section, we review the literature on the
application of administrative data to screening for such complications.
The seminal studies that defined the epidemiology of medical errors6, 25, 26 were
based on a methodology that was pioneered by the California Medical Association
(CMA) in 1976.27 Specially trained nurses and medical records administrators screened
inpatient records for any of 18 possible indicators of an adverse event.28 Records that met
one or more of these criteria were then reviewed independently by two board-certified
physicians to identify “injuries due to medical management”; all differences were
reconciled by a third independent reviewer. Injuries “caused by the failure to meet
standards reasonably expected of the average physician...” were labeled as “negligent”
adverse events. Another seminal study employed “ethnographers trained in qualitative
observational research” who prospectively identified “situations in which an
inappropriate decision was made...” by attending all rounds, nursing sign-outs, case
conferences, and other “organized settings in which health care providers discussed
adverse events.”29 Neither of these methodologies uses ICD-9-CM codes to identify
adverse events. Another set of studies defined postoperative adverse events based on
unusual occurrences and key clinical findings that are included in a proprietary clinical
data system.30-33 Some investigators have defined adverse events de novo, based on
clinical experience and prior literature.34-37 Others have estimated the incidence of adverse
drug events using various pharmacy-based surveillance systems.38, 39
By contrast, relatively few studies have evaluated ICD-9-CM diagnosis or
procedure codes as a method for finding adverse events or medical errors. Numerous
investigators have proposed various ICD-9-CM definitions of adverse events or medical
errors; some are limited to specific conditions or procedures40-43 while others are
applicable to broad groups of hospitalized patients.10, 11, 44-48 However, most of these
investigators initially validated their measures principally by assessing content validity7
or by demonstrating that they were associated with substantially higher mortality, longer
lengths of stay, and higher charges at the patient level,40, 47, 48 even after adjusting for
demographic characteristics and comorbidities.10, 12 Brailer et al.47 also found a strong
association at the patient level (at 6 hospitals) between their proprietary (CareScience,
Inc.), comorbidity-adjusted complication measure and a composite measure of 15
different adverse events (based on Maryland Hospital Association indicators). Among
these 15 categories, inpatient mortality and unscheduled return to the operating room or
special care unit (among others) were strongly associated with comorbidity-adjusted
complications. Several other proprietary systems (e.g., Risk adjusted Major
Complications, HealthGrades, Inc.; CareEnhance Resource Management Systems,
McKesson Health Solutions; Disease Staging, MEDSTAT, Santa Barbara CA;
Performance Measurement, QuadraMed, Larkspur CA; Intelligent Disease Analysis,
MedAI Inc., Orlando FL) that estimate crude or risk adjusted complication rates based on
administrative data have never been publicly validated.
Although these early studies generally supported the validity of using
administrative data to ascertain adverse events, they also identified several sources of
concern:
1. The ratio of observed to predicted complications, based on ICD-9-CM codes
(predominantly 997.xx through 999.9x) from 776 acute care hospitals, increased
substantially between 1983 and 1984, reflecting the impact of prospective
payment on the reporting of complications.45 Conversely, recent evidence
suggests a significant decrease between 1997 and 1998 in the coding of acute
posthemorrhagic anemia and selected other complications among Medicare
inpatients undergoing hip and femur procedures (perhaps in response to the
Office of the Inspector General’s aggressive compliance program).49
Proprietary data from Solucient, LLC also suggest a sudden 35% decrease in
risk adjusted complications across nearly 3,000 hospitals between 1998 and
1999.50

2. Unlike analogous ratios for mortality and readmissions, hospitals’ ratios of
observed to predicted complications varied significantly by region and hospital
case-mix index; such associations would not be expected for a valid measure.45
In other studies, ICD-9-CM coded complications were more frequent at large
hospitals than at smaller hospitals,10 and complication rates were higher at large
hospitals and academic medical centers.11, 41 These findings contradict numerous
studies suggesting better outcomes and processes of care, for at least some
conditions, at high-volume and teaching hospitals.51-53 The most plausible
explanations for this finding (i.e., greater unmeasured severity of illness, more
frequent use of invasive therapies, and more aggressive coding of complications
at teaching hospitals) suggest the possibility of substantial bias in comparing
performance across hospitals of different types.
3. There was minimal association between measures of risk adjusted complications
and other outcome measures (e.g., rates of death, readmission, and major
morbidity) at the hospital level (Spearman r=-0.01 to -0.05 [ref. 46]; partial r=0.09-0.11
[ref. 47]; Spearman r=-0.01 for surgical patients, r=-0.12 for medical patients).11
Although this finding has been interpreted as “desirable because (complications
measures are) intended to provide information not captured by other outcome
measures”,47 it is concerning that complication measures correlate so poorly
with somewhat better validated measures of quality.54-65 Two studies of adverse
events after coronary artery bypass surgery represent notable exceptions to these
findings. Specifically, risk adjusted death rates were significantly correlated
with risk adjusted complication rates, according to Ghali et al. (r=0.73-0.74
[p<0.01]43), and risk adjusted “major nonfatal” complication rates, according to
Hartz et al. (r=0.31 and r=0.79 [p=0.035], before and after eliminating a single
outlier.)66
4. Logistic regression models to predict complications, using information available
from administrative data, are generally weaker than models to predict death or
readmission, with areas under the receiver operating characteristic (ROC) curve, or
c-statistics (measuring the model’s ability to discriminate between patients with and
without adverse outcomes), of 0.6-0.7 [refs. 10, 41-43] and R-squared statistics
(correlating observed and expected complication rates at the hospital level) of
0.42-0.48 [ref. 45] or 0.16 (for medical cases) to 0.42 (for major surgery).11 The difficulty of predicting
complications suggests that underlying patient characteristics or other
unmeasured factors may introduce even more bias than in comparative
evaluations of other outcomes.
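For readers less familiar with the c-statistic cited in item 4, it can be read as the share of all (patient with event, patient without event) pairs in which the model assigns the higher predicted risk to the patient with the event. The sketch below illustrates that definition; the function name and inputs are hypothetical, and it assumes at least one patient with and one without the event.

```python
def c_statistic(predicted, outcomes):
    """Share of (case, non-case) pairs where the case has the higher predicted risk.

    predicted holds model-predicted risks; outcomes holds 1 (event) or 0 (no event).
    Ties in predicted risk count as half a concordant pair.
    """
    cases = [p for p, y in zip(predicted, outcomes) if y == 1]
    controls = [p for p, y in zip(predicted, outcomes) if y == 0]
    concordant = 0.0
    for case in cases:
        for control in controls:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))
```

A value of 0.5 corresponds to no discrimination (equivalent to chance), and 1.0 to perfect discrimination, so values of 0.6-0.7 indicate relatively weak models.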

It should be noted that problems 2-4 above may not be unique to administrative
data, but may apply to clinically derived measures of complications as well. For example,
two studies by the same researchers, using different data sources, found no correlations
between risk adjusted complication measures and hospital/operator volume for PTCA and
CABG.35, 67 Studies based on MedisGroups data32, 68 have confirmed that complications,
adjusting for patient risk, are more frequent at large hospitals, hospitals with approved
residency training programs, hospitals with high nurse-to-bed ratios and high proportions
of board-certified anesthesiologists, and hospitals that offer subspecialty services (e.g.,
magnetic resonance imaging, bone marrow transplantation) - precisely the hospitals that
would be expected to provide better care. There was essentially no association at the
hospital level between measures of risk adjusted complications and risk adjusted
mortality for CABG (r=0.07, p=0.58),32 and a weak association (r=0.21, 95% CI 0.04-
0.38)69 for elective adult general surgery after full risk adjustment (versus r=0.55, 95% CI
0.38-0.72, without risk adjustment). Similarly, the Department of Veterans’ Affairs (VA)
National VA Surgical Risk Study found significantly higher risk adjusted, 30-day
postoperative morbidity at teaching hospitals than at non-teaching hospitals for general,
orthopedic, urologic, and vascular (but not thoracic, neurologic, or otolaryngologic)
surgery,70 and essentially no association with risk adjusted mortality at the hospital level
(r=-0.01 overall, range r=-0.03 for neurosurgery to r=0.28 for otolaryngologic surgery).60
Finally, discrimination in predicting complications has also been relatively weak (c<0.79)
in these detailed clinical data systems.31, 33, 60, 69

General Issues in Using Complications To Screen for Quality Problems

The companion technical report on the development of the AHRQ Quality
Indicators3 describes three areas important to the evaluation of a measure (i.e., precision,
minimum bias and construct validity) that are pertinent to potential PSIs.

Precision

As with mortality rates, variations in complication rates may reflect random
variation. However, the higher incidence of most complications compared to mortality
reduces random variation, and provides an important incentive for using complication
rates as quality measures. In addition, precision may be less important for PSIs than for
other types of QIs. To the extent that these indicators capture preventable iatrogenesis,
the precision with which prevalence is estimated at the provider level may be
unimportant. The primary intended use of these indicators is not to compare performance
across providers, but instead to assess the overall performance of the health care system
at the regional, state, or national level, and to provide a screening tool that providers can
use to identify cases that merit internal review.
It should be noted that the ICD-9-CM codes that are most likely to represent
preventable adverse events are also relatively rare (see detailed reviews below). The ICD-
9-CM codes for general complications are more common, but are subject to considerable
coding error and may include a mix of preventable and non-preventable events. Efforts to
focus on ICD-9-CM coded complications that are likely to reflect medical errors will
inevitably increase random variation across providers.

Minimum Bias

All quality indicators, including the proposed PSIs, are susceptible to bias of three
general types: selection effects, confounding, and misclassification. Selection bias arises
when the sample available for quality measurement is not representative of the target
population. In the current context, this problem arises principally for conditions that may
be treated, or procedures that may be performed, in either inpatient or outpatient
(short-stay) settings. For these conditions and procedures, HCUP data may not adequately
represent the population of interest. For example, in areas where freestanding birthing
centers have a substantial market share, PSI rates based on HCUP data are likely to be
biased.
Confounding arises in comparing PSI rates across hospitals, health systems, or
regions because of differences in patients’ underlying risk of these events. Patients who
undergo certain procedures, or have certain diagnoses, are inherently at higher risk of
experiencing adverse events, including adverse events due to medical error. Age is also a
known risk factor for medical error, although its effect may be explained by the greater
clinical complexity of care for elderly patients and their greater exposure to potential
hazards.6, 26 Well-established clinical prediction rules allow risk adjustment for patients
experiencing perioperative cardiac and pulmonary complications,71-77 but risk adjustment
systems remain relatively unstudied for most other complications.78 Specific clinical
prediction rules have been developed for morbidity after coronary artery bypass
surgery,79 carotid endarterectomy,80-83 and percutaneous coronary interventions,84 but not
for many other high-risk procedures. In general, clinical factors such as the serum
albumin level and functional status37 are clearly associated with the risk of adverse events
among both medical and surgical inpatients. These factors potentially confound the
observed associations between hospital categories and adverse event rates,25, 52 as well as
the performance ranking of individual hospitals. For example, Hartz et al.35 reported that
the Wisconsin hospital with the highest unadjusted rate of major complications after
Coronary Artery Bypass Graft (CABG) had an adjusted relative odds of 0.98, placing it
right in the middle after risk adjustment.
Multiple studies have explored the relative performance of risk adjustment models
for mortality, using administrative versus clinical data (or proprietary systems based on
such data).85-90 Although there is less evidence regarding the relative performance of risk
adjustment models for adverse events, the same findings are likely to apply. For example,
Hartz et al. reported c statistics of 0.71 using ICD-9-CM codes, and 0.80 using clinical
variables, to predict adverse outcomes after stroke among Medicare patients.91 Substantial
opportunity for confounding bias therefore exists when provider-specific adverse event
rates are compared.
Misclassification bias is likely to result from variation in coding practices across
hospitals. As detailed below, we carefully reviewed the available literature to select PSIs
for which the positive predictive value of coding appears to be at least 75%. However,
there is less evidence on sensitivity (i.e., undercoding) than on predictive value (i.e.,
overcoding), so several of the accepted and experimental indicators may suffer from
significant undercoding. Based on current guidelines that only require coding of
“conditions that affect patient care in terms of requiring clinical evaluation... therapeutic
treatment...diagnostic procedures...extended length of hospital stay...increased nursing
care and/or monitoring,”92 we avoided including potentially inconsequential diagnoses in
the PSI definitions. However, we could not always do so, due to the ambiguity of ICD-9-
CM. One recent study suggests that the sensitivity of coding postoperative complications
after elective back surgery varies markedly across hospitals, such that about half of the
difference in risk-adjusted complication rates between low and high outlier hospitals is
attributable to reporting variation.93
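The distinction drawn above between predictive value (overcoding) and sensitivity (undercoding) can be made concrete with a short calculation. The counts in the usage note are hypothetical and chosen only for illustration; the function name is likewise an assumption.

```python
def coding_accuracy(true_pos, false_pos, false_neg):
    """Return (positive predictive value, sensitivity) from chart-review counts.

    true_pos:  coded events confirmed by chart review
    false_pos: coded events not confirmed (overcoding)
    false_neg: real events that were never coded (undercoding)
    """
    ppv = true_pos / (true_pos + false_pos)
    sensitivity = true_pos / (true_pos + false_neg)
    return ppv, sensitivity
```

With 60 confirmed coded events, 20 unconfirmed coded events, and 60 uncoded events, the positive predictive value is 75% while sensitivity is only 50%: an indicator could meet the 75% threshold and still miss half of the true events.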

Construct Validity
The literature identifies only a small number of explicit processes of care that
have proven beneficial in randomized, placebo-controlled trials for preventing certain
complications: (1) thromboembolism prophylaxis for most major surgeries94-102; (2)
perioperative antibiotics for a smaller but still substantial number of surgical
procedures103-110; (3) perioperative nutritional support for severely malnourished patients
requiring laparotomy, thoracotomy111, 112 and hip fracture repair113; (4) perioperative beta
blockers to prevent cardiac complications among high-risk patients undergoing cardiac,114
noncardiac115 or vascular116 surgery; and (5) antiplatelet agents to prevent early restenosis
after percutaneous coronary interventions.117, 118 Other potential interventions to improve
patient safety have been thoroughly reviewed in a recent report.2 To our knowledge, no
additional studies to date have linked these specific processes of care with differences in
risk adjusted rates of adverse outcomes across hospitals or physicians.
Given the small number of evidence-based processes-of-care related to the
prevention of adverse events, one could argue for broad explicit review criteria that
incorporate standards of care based on expert recommendations, rather than insisting on
processes strongly supported by evidence. Condition-specific provider adherence
measures of this type have been associated with the risk of in-hospital complications
among adults admitted for diabetes and chronic obstructive pulmonary disease (COPD),
but not congestive heart failure (CHF).36 Iezzoni and colleagues developed a similar set
of review instruments to compare Medicare cases flagged by the Complications
Screening Program (CSP) in California and Connecticut in 1994 with unflagged cases.16
Even with this broader look at processes of care, flagged cases did not differ significantly
from unflagged cases in terms of the prevalence of generic quality problems. Specifically,
53% of 351 flagged surgical cases demonstrated one or more of 17 process-of-care
problems, versus 46% of 140 unflagged surgical cases. Among medical cases, 5% of both
flagged and unflagged cases demonstrated one or more process-of-care problems. None
of the specific flags proved useful in identifying patients with a higher risk of these
generic process deficiencies, except deep vein thrombosis/pulmonary embolism
(DVT/PE) (11% flagged versus 4% unflagged, p=0.09) and miscellaneous complications
(62% flagged versus 46% unflagged, p=0.06).
Implicit review is based upon global assessment of quality of care by physician
peers.119 In another recent evaluation of the Complications Screening Program, Weingart
and colleagues15 compared flagged and unflagged cases on the prevalence of quality
problems identified by implicit review. Physician reviewers identified potential quality
problems in 29.5% of flagged surgical cases and 15.7% of flagged medical cases,
compared with 2.1% of unflagged medical and surgical controls. However, substantial
variation across specific screens was noted. Potential quality problems were identified in
50% of surgical cases flagged for DVT/PE, but only 5% of surgical cases flagged for
postoperative pneumonia. Potential quality problems were identified in less than 20% of
medical cases flagged by each screen, except for post-procedural hemorrhage or
hematoma (31%). Of two other studies involving structured implicit review by
physicians as a “gold standard” for quality assessment, one confirmed the potential value
of various morbidity-based screening tools based on nurse/staff review,120 but another
found that quality of care was equal between patients with and without complications,
and between hospitals with low and high risk adjusted complication rates.121 In neither of
these studies did the authors report the predictive validity of specific adverse outcome
measures.
Part of the difficulty with linking adverse events and processes of care relates to
the inherent lack of reproducibility in implicit assessments of quality. For instance, a
well-known study in the 1980s examining deaths due to pneumonia, myocardial
infarction and stroke reported inter-rater reliability for physicians’ judgment of
“preventable death” as 0.11, 0.51 and 0.55, respectively.122 (The first value falls in the
range conventionally regarded as “poor,” while the other two values indicate “moderate”
agreement.) In the Harvard Medical Practice Study, physician reviewers exhibited
substantial agreement in identifying the presence of adverse events (kappa=0.61), but
only “fair” agreement in identifying negligent care (kappa=0.24).6 Two later studies
reported moderate agreement among physician reviewers for the presence of an adverse
event (kappa = 0.41-0.57), but only fair agreement for the judgment of preventability
(kappa = 0.30)123 or negligence (kappa = 0.19-0.24).124 Weingart et al. reported borderline
poor agreement among physician reviewers about both the presence of a CSP
complication (kappa=0.22) and a potential quality problem (kappa = 0.22).15 Agreement
was somewhat better in the National VA Surgical Risk Study, in which physicians used a
5-point scale to rate overall quality of care (ICC=0.40-0.56).121 A more recent study
examined the impact of discussion between reviewers on agreement in assessing
preventability of adverse events.125 The authors created 7 different pairs among 13
reviewers participating in the study. They showed that discussion between the two
physicians in a pair substantially improved their agreement in assessing an adverse event as
iatrogenic (kappa = 0.46 before versus 0.71 after discussion). However, the agreement across pairs remained
relatively unchanged by discussion (kappa = 0.36 before to 0.40 after discussion).
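The kappa statistics cited above can be reproduced from a two-by-two table of paired reviewer judgments. The following is a minimal sketch in Python, not code from any of the cited studies. The function names, the example counts, and the cut points used for the verbal labels (0.20, 0.40, 0.60, 0.80) are assumptions chosen to match the descriptive terms used in this section.

```python
def cohen_kappa(both_yes, only_a, only_b, both_no):
    """Cohen's kappa for two reviewers making a yes/no judgment on the same cases.

    both_yes: cases both reviewers rated "yes" (e.g., adverse event present)
    only_a:   cases only reviewer A rated "yes"
    only_b:   cases only reviewer B rated "yes"
    both_no:  cases both reviewers rated "no"
    """
    n = both_yes + only_a + only_b + both_no
    if n == 0:
        raise ValueError("no cases reviewed")
    observed = (both_yes + both_no) / n      # raw proportion of agreement
    a_yes = (both_yes + only_a) / n          # reviewer A's "yes" rate
    b_yes = (both_yes + only_b) / n          # reviewer B's "yes" rate
    # Agreement expected by chance if the two reviewers rated independently.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Verbal label for a kappa value; the cut points are assumed conventions."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical example: 50 charts, each reviewed independently by two physicians.
kappa = cohen_kappa(both_yes=20, only_a=5, only_b=10, both_no=15)
print(round(kappa, 2), agreement_label(kappa))
```

In this hypothetical example the reviewers agree on 70% of charts, but half of that agreement would be expected by chance alone, which is why kappa is much lower than the raw agreement rate.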
In the absence of identifiable differences in processes-of-care in most cases
studied, residual variation in complication rates after risk adjustment presumably reflects
either unmeasured processes of care or differences in patients' baseline risk of
complications that are not captured through risk adjustment. By definition, these
concepts are difficult to measure, making it difficult to establish the construct validity of
many potential PSIs.
Finally, correlations between adverse events and structural characteristics of
hospitals have been cited as evidence of construct validity. However, these findings are
often difficult to interpret because of uncertainty about which structural characteristics
are truly associated with better care. Structural characteristics are also often difficult to
modify; hence, identifying them has limited value for quality improvement. In evaluating
the Complications Screening Program, Iezzoni and colleagues found that large hospitals,
hospitals performing open heart surgery, and members of the Council of Teaching
Hospitals (COTH) had 10-33% more complications than expected across most risk pools,
whereas small hospitals, hospitals without open heart surgery facilities, and nonmembers
of COTH, had 4-26% fewer complications than expected.11 Similarly, patients at
hospitals with fewer than 100 beds consistently had a 22-49% lower risk of complications
than patients at hospitals with 500 or more beds.10 A study of factors associated with
adverse events after surgery, based on AHRQ’s original HCUP Quality Indicators,
revealed associations between four of these nine indicators and registered nurse staffing
(as detailed below), including three of the five indicators that were judged a priori to be
“nurse-sensitive.”126 Differences in risk-adjusted QI rates across regions and hospital
ownership categories were also noted. In evaluating a Risk-Adjusted Complications
Index (RACI) based on administrative data, DesHarnais and colleagues found that
hospitals’ risk-adjusted complication rates were positively associated with their range of
services, but not with their ownership, size, or teaching status.46 Conversely, Myers found
significantly higher complication rates after hysterectomy at teaching hospitals than at
nonteaching hospitals.41 These findings are probably attributable to bias from unmeasured
case mix or differential reporting of complications. Studies based on chart review have
suggested that major teaching hospitals experience more complications than nonteaching
hospitals, but they are better at “rescuing” patients after complications, and relatively few
of their complications (especially adverse drug events) are due to negligence.25, 32, 52
Patient volume should be inversely associated with valid outcome rates, at least for
procedures requiring technical skill, but the literature on this topic has generally focused
on mortality and resource use, with complications of percutaneous coronary
interventions127-135 and stroke after endarterectomy the notable exceptions.136 With the
exception of a few recent studies on nurse staffing and hospital outcomes,126, 137, 138
analyses of structural aspects of care have not been particularly helpful in establishing the
construct validity of morbidity indicators based on administrative data, or suggesting
interventions to improve patient outcomes.

Specific Review of the Evidence for Indicators

The potential patient safety indicators identified through literature and coding
reviews are listed in Appendix A. These indicators were assigned to one of three
categories: Accepted PSIs, Experimental PSIs and Rejected PSIs. Those in the last
category were removed from further analyses based on evidence of poor coding or
construct validity, poor ratings by panelists, or inability to implement the desired
specification after receiving expert coding input. Indicators in the Accepted indicator set
were rated favorably by clinical panels as being useful screens for potentially preventable
complications. Finally, those in the Experimental indicator set fell between the other two
categories, and underwent less extensive empirical analyses. This set is not recommended
without considerable further testing, as described in Section 3B, Indicator Selection.
This section reviews the literature on the derivation and validity of each indicator,
or the ICD-9-CM codes upon which it is based. We briefly compare the definitions
reported in the literature with the final PSI definition. More detailed descriptions of the
definitions, and explanations of differences, are presented in section 3D, Detailed
Clinician Panel Results by Indicator. Literature reviews were performed on all indicators,
including those that were rejected based on poor panel ratings and some that were
rejected for other reasons. Literature reviews for the rejected indicators are presented in
Appendix F rather than in this section. For each indicator, we report separately on
whether it is coded accurately (“coding validity”) and whether it is empirically associated
with substandard quality or errors in processes of care (“construct validity”).
The literature review results are provided to help researchers and providers assess
the usefulness of each indicator in their own epidemiologic or quality improvement work.
It was beyond the scope of this project to review clinical studies linking specific
processes of care to specific, prospectively ascertained complications. Much of this
literature has been summarized in a recent AHRQ report on evidence-based practices to
prevent medical errors.2 For example, numerous randomized controlled trials have
proven that thromboembolism prophylaxis reduces the risk of postoperative DVT/PE,
and therefore that higher DVT/PE rates are likely to be associated with poorer quality of
care. This literature review focuses instead on the validity of complication indicators
based on ICD-9-CM diagnosis and/or procedure codes. Tables 9 and 10 summarize the
strength of evidence for each Accepted and Experimental indicator respectively.

Table 9. Summary of Strength of Evidence in Literature for Accepted Indicators

Indicator                                                     Coding (a,b)    Construct:      Construct:      Construct:
                                                                              explicit        implicit        staffing (a,b)
                                                                              process (a,b)   process (a,b)
Complications of anesthesia                                   0               0               0               0
Death in low mortality DRGs                                   +               0               +               0
Decubitus ulcer                                               -               0               0               ±
Failure to rescue                                             +               0               0               ++
Foreign body left in during procedure                         0               0               0               0
Iatrogenic pneumothorax                                       0               0               0               0
Infection due to medical care                                 0               0               0               0
Postoperative hip fracture                                    +               +               +               0
Postoperative hemorrhage or hematoma                          ±               ±               +               0
Postoperative physiologic and metabolic derangements          -               0               0               -
Postoperative respiratory failure                             +               ±               +               ±
Postoperative PE or DVT                                       +               +               +               ±
Postoperative sepsis                                          ±               0               0               -
Technical difficulty with procedure                           ±               0               0               0
Transfusion reaction                                          0               0               0               0
Postoperative wound dehiscence                                0               0               0               0
Birth trauma                                                  -               0               0               0
Obstetric trauma – vaginal delivery with instrumentation      +               0               0               0
Obstetric trauma – vaginal delivery without instrumentation   +               0               0               0
Obstetric trauma – cesarean delivery                          +               0               0               0
a Level of evidence
(-) Published evidence suggests that the indicator lacks validity in this domain (i.e., less than 50% sensitivity or predictive value;
explicit or implicit process failure rates no more frequent than among control patients).
(0) No published evidence regarding this domain of validity.
(±) Published evidence suggests that the indicator may be valid in this domain, but different studies offer conflicting results (although
study quality may account for these conflicts).
(+) Published evidence suggests that the indicator IS valid, or is likely to be valid, in this domain (i.e., one favorable study).
(++) There is strong evidence supporting the validity of this indicator in this domain (i.e., multiple studies with consistent results, or
studies showing both high sensitivity and high predictive value).
b Coding: Sensitivity is the proportion of patients who suffered an adverse event, based on detailed chart review or prospective data
collection, for whom that event was coded on a discharge abstract or Medicare claim. Predictive value is the proportion of patients
with a coded adverse event who were confirmed as having suffered that event, based on detailed chart review or prospective data
collection.
Construct, explicit process: Adherence to specific, evidence-based or expert-endorsed processes of care, such as appropriate use of
diagnostic modalities and effective therapies. Our construct is that hospitals that provide better processes of care should experience
fewer adverse events.
Construct, implicit process: Adherence to the “standard of care” for similar patients, based on global assessment of quality by
physician chart reviewers. Our construct is that hospitals that provide better overall care should experience fewer adverse events.
Construct, staffing: Our construct is that hospitals that offer more nursing hours per patient day, better nursing skill mix, better
physician skill mix, or more experienced physicians, should have fewer adverse events.
c Note that when content validity is exceptionally high, as for transfusion reaction or iatrogenic pneumothorax, construct validity
becomes less important.

Table 10. Summary of Strength of Evidence in Literature for Experimental Indicators (a)

Indicator                                                     Coding          Construct:      Construct:      Construct:
                                                                              explicit        implicit        staffing
                                                                              process         process
Postoperative aspiration pneumonia                            +               ±               +               +
CABG following PTCA                                           +               0               0               ++
Decubitus ulcer in high-risk patients                         -               0               0               0
Postoperative fractures potentially related to falls          +               0               0               0
Intraoperative nerve compression injuries                     0               0               0               0
Malignant hyperthermia                                        0               0               0               0
Postoperative acute myocardial infarction                     ++              -               +               -
Postoperative iatrogenic complications – cardiac              ±               0               +               0
Postoperative iatrogenic complications – nervous system       0               0               0               0
Postoperative reopening of surgical site                      +               -               +               0
Postoperative suture of laceration                            +               0               +               +
Obstetric wound complications – cesarean                      ±               0               0               0
Obstetric wound complications – vaginal                       ±               0               0               0
Other obstetric complications of delivery                     ±               0               0               0
Third or fourth degree obstetric lacerations                  +               0               0               0
Uterine rupture                                               +               0               0               0
Postpartum urinary tract infection                            -               0               0               0
a See footnotes to Table 9.

Accepted Indicators

Complications of Anesthesia
Source. A subset of this indicator was originally proposed by Iezzoni et al.10 as
part of the CSP (CSP 21, “Complications relating to anesthetic agents and other CNS
depressants”). Their definition also includes poisoning due to centrally acting muscle
relaxants (968.0) and accidental poisoning by nitrogen oxides (E869.0), which were
omitted from this PSI. Their definition excludes other codes included in this PSI,
namely, poisoning by other and unspecified general anesthetics and external cause of
injury codes for “endotracheal tube wrongly placed during anesthetic procedure”
(E876.3) and adverse effects of anesthetics in therapeutic use (E938.1-E938.9).

Evidence
We were unable to find evidence on validity from prior studies.

Death in Low Mortality DRGs


Source. This indicator was originally proposed by Hannan et al. as a criterion for
targeting “cases that would have a higher percentage of quality of care problems than
cases without the criterion, as judged by medical record review.”139 An alternative form
of this indicator focused on “primary surgical procedures,” rather than DRGs, with less
than 0.5% inpatient mortality.

Evidence
Construct validity. Based on two-stage implicit review of 8,109 randomly
selected deaths from 104 New York hospitals in 1985-86, Hannan et al. found that
patients in low-mortality DRGs (<0.5%) were 5.2 times more likely than all other
patients who died (9.8% versus 1.7%) to have received “care that departed from
professionally recognized standards,” after adjusting for patient demographic,
geographic, and hospital characteristics. In 15 of these 26 cases (58%) of substandard
care, the patient’s death was attributed at least partially to that care. The association with
substandard care was stronger for the DRG-based definition of this indicator than for the
procedure-based definition (5.7% versus 1.7%, OR=3.2). We were unable to find other
evidence on the validity of this indicator.
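The ratios reported by Hannan et al. are adjusted estimates, so they need not equal the ratios computed directly from the crude percentages quoted above. As an illustrative sketch only (the helper names are hypothetical, and no adjustment for patient, geographic, or hospital characteristics is attempted), the crude risk ratio and odds ratio implied by two proportions can be computed as follows.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) to odds."""
    if not 0 < p < 1:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1 - p)


def crude_risk_ratio(p_flagged, p_comparison):
    """Unadjusted ratio of two proportions."""
    return p_flagged / p_comparison


def crude_odds_ratio(p_flagged, p_comparison):
    """Unadjusted ratio of the odds implied by two proportions."""
    return odds(p_flagged) / odds(p_comparison)


# Proportions of deaths with substandard care, as quoted in the text:
# DRG-based definition (9.8%) and procedure-based definition (5.7%),
# each compared with all other deaths (1.7%).
for label, p in [("DRG-based", 0.098), ("procedure-based", 0.057)]:
    rr = crude_risk_ratio(p, 0.017)
    cor = crude_odds_ratio(p, 0.017)
    print(f"{label}: crude risk ratio {rr:.2f}, crude odds ratio {cor:.2f}")
```

Any gap between these crude values and the published figures reflects the covariate adjustment in the original analysis, which this sketch does not attempt to reproduce.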

Decubitus Ulcer
Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP (CSP 6, “cellulitis or decubitus ulcer”). Their definition also includes cellulitis of the
upper extremity (682.3-682.4), which was omitted from this PSI. Needleman and
Buerhaus137 identified decubitus ulcer as an “Outcome Potentially Sensitive to Nursing,”
but unlike this PSI their definition includes cellulitis of any site (682). The American
Nurses Association, its state associations, and the California Nursing Outcomes Coalition
have identified the total prevalence of inpatients with Stage I, II, III, or IV pressure ulcers
(based on clinical data collection) as a “nursing-sensitive quality indicator for acute care
settings.”140

Evidence
Coding validity. No evidence on validity is available from CSP studies. Geraci et
al.141 confirmed only 2 of 9 episodes of pressure ulcers (707.0) reported on discharge
abstracts of Veterans Affairs (VA) patients hospitalized in 1987-89 for congestive heart
failure (CHF), chronic obstructive pulmonary disease (COPD), or diabetes; the sensitivity
for a nosocomial ulcer was 40% (2/5). Among Medicare hip fracture patients from 297
hospitals in 1985-86, Keeler et al.51 confirmed 6 of 9 (67%) reported pressure ulcers, but
failed to ascertain 89 additional cases (6% sensitivity) using ICD-9-CM codes. In the
largest study to date, Berlowitz et al.142 found that the sensitivity of a discharge diagnosis
of pressure ulcer among all patients transferred from VA hospitals to VA nursing homes
in 1996 was 31% overall, or 54% for stage IV (deep) ulcers. The overall sensitivity
had increased modestly since 1992 (26.0%), and was slightly but statistically significantly
better among medical patients than among surgical patients (33% versus 26%).
Construct validity. Needleman and Buerhaus137 found that nurse staffing was
inconsistently associated with the occurrence of pressure ulcers among medical patients
from 799 hospitals in 11 states in 1997, and was independent of pressure ulcers among
major surgery patients. Nursing skill mix (RN hours/licensed nurse hours) was
significantly associated (in the expected direction) with the pressure ulcer rate among 352
and 295 California hospitals in 1992 and 1994, respectively, and also among 126 and 131
New York hospitals in the same years.138 Total licensed nurse hours per acuity-adjusted
patient day were inconsistently associated with the rate of pressure ulcers.

Failure To Rescue
Source. This indicator was originally proposed by Silber et al.31 as a more
powerful tool than the risk adjusted mortality rate to detect true differences in patient
outcomes across hospitals. The underlying premise was that better hospitals are
distinguished not by having fewer adverse occurrences but by more successfully averting
death among (i.e., rescuing) patients who experience such complications. Silber et al.’s
original definition was based on key clinical findings abstracted from the medical records
of 2,831 cholecystectomy patients and 3,141 transurethral prostatectomy patients
admitted to 531 hospitals in 1985. The key postoperative diagnoses that defined the
denominator at risk of “failure to rescue” included cardiac arrhythmias, congestive heart
failure, cardiac arrest, pneumonia, pulmonary embolus, pneumothorax, renal dysfunction,
stroke, wound infection, and unplanned return to surgery.
More recently, Needleman and Buerhaus137 adapted failure to rescue to
administrative data sets, hypothesizing that this outcome might be sensitive to nurse
staffing. Their denominator definition included the ICD-9-CM codes for sepsis,
pneumonia (including aspiration), acute upper gastrointestinal bleeding, shock,
cardiac/respiratory arrest, deep vein thrombosis (DVT), and pulmonary embolus (PE).
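Under either definition, failure to rescue is a conditional rate: deaths among patients who experienced at least one qualifying complication, divided by all patients who experienced such a complication. The sketch below illustrates that calculation alongside the ordinary complication rate. It assumes a hypothetical record layout with one row per discharge and two boolean fields; neither the field names nor the example data come from the cited studies.

```python
from dataclasses import dataclass


@dataclass
class Discharge:
    had_complication: bool  # any qualifying complication (definition-specific)
    died: bool              # in-hospital death


def complication_rate(discharges):
    """Share of all discharges with a qualifying complication."""
    if not discharges:
        return None
    return sum(d.had_complication for d in discharges) / len(discharges)


def failure_to_rescue_rate(discharges):
    """Deaths among patients with a complication, divided by those patients.

    Deaths without a recorded complication do not enter this rate.
    """
    at_risk = [d for d in discharges if d.had_complication]
    if not at_risk:
        return None  # no denominator, so the rate is undefined
    return sum(d.died for d in at_risk) / len(at_risk)


# Hypothetical hospital: 10 discharges, 4 with a complication (1 of whom died),
# plus 1 death with no recorded complication.
sample = (
    [Discharge(True, True)]
    + [Discharge(True, False)] * 3
    + [Discharge(False, True)]
    + [Discharge(False, False)] * 5
)
print(complication_rate(sample), failure_to_rescue_rate(sample))
```

Because the denominators differ, a hospital can have a high complication rate and a low failure to rescue rate at the same time, which is the distinction the indicator is designed to capture.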

Evidence
Construct validity. Silber and colleagues have published a series of studies
establishing the construct validity of failure to rescue rates through their associations with
hospital characteristics and other measures of hospital performance. Among patients
admitted for cholecystectomy and transurethral prostatectomy, failure to rescue was
independent of severity of illness at admission, but was significantly associated with the
presence of surgical housestaff and a lower percentage of board-certified
anesthesiologists.31 The adverse occurrence rate was independent of these hospital
characteristics. In a larger sample of 74,647 patients who underwent general surgical
procedures in 1991-92, lower failure to rescue rates were found at hospitals with high
ratios of registered nurses to beds.68 Failure rates were strongly associated with risk
adjusted mortality rates, as expected, but not with complication rates.143 Finally, among
16,673 patients admitted for coronary artery bypass surgery, failure rates were lower
(whereas complication rates were higher) at hospitals with magnetic resonance imaging
facilities, bone marrow transplantation units, or approved residency training programs.32
More recently, Needleman and Buerhaus137 confirmed that higher registered nurse
staffing (RN hours/adjusted patient day) and better nursing skill mix (RN hours/licensed
nurse hours) were consistently associated with lower failure to rescue rates among major
surgery patients from 799 hospitals in 11 states in 1997, even using administrative data to
define complications. An increase from the 25th to the 75th percentile on these two
measures of staffing was associated with 5.9% (95% CI, 1.5% to 10.2%) and 3.9% (95%
CI, -1.1% to 8.8%) decreases, respectively, in the rate of failure-to-rescue among major
surgery patients.138 These associations were inconsistent among medical patients, in that
nursing skill mix was associated with the failure-to-rescue rate (rate ratio 0.81, 95% CI
0.66-1.00) but aggregate registered nurse staffing was not (rate ratio 1.00, 95% CI 0.99-
1.01). An increase from the 25th to the 75th percentile on nursing skill mix was associated
with a 2.5% (95% CI, 0.0% to 5.0%) decrease in the failure-to-rescue rate among medical
patients.

Foreign Body Left in During Procedure
Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
Complications Screening Program (CSP “sentinel events”), along with gas gangrene,
CNS abscess, anoxic brain injury, accidental puncture or laceration, wound dehiscence,
and ABO/Rh transfusion reactions (all of which were omitted from this PSI). It was also
included as one component of a broader indicator (“adverse events and iatrogenic
complications”) in AHRQ’s original HCUP Quality Indicators.144 It was proposed by
Miller et al.17 in the “Patient Safety Indicator Algorithms and Groupings.” Based on
expert consensus panels, McKesson Health Solutions included this indicator in its
CareEnhance Resource Management Systems, Quality Profiler Complications Measures
Module.

Evidence
We were unable to find evidence on validity from prior studies, which is likely
due to the rarity of this diagnosis.

Iatrogenic Pneumothorax
Source. This diagnosis code was proposed by Miller et al.17 as one component of
a broader indicator (“iatrogenic conditions”) in the “Patient Safety Indicator Algorithms
and Groupings.” It was also included as one component of a broader indicator (“adverse
events and iatrogenic complications”) in AHRQ’s Version 1.3 HCUP Quality Indicators.

Evidence
We were unable to find evidence on validity from prior studies, which is probably
because this diagnosis code was introduced in 1994.

Infection Due to Medical Care


Source. This indicator was originally proposed by Iezzoni et al. as part of the
Complications Screening Program (CSP 11, “miscellaneous complications”). Their
definition also includes other specified and unspecified complications of procedures or
medical care, air embolism, persistent postoperative fistula, minor transfusion reactions,
and an array of external cause of injury codes representing various “misadventures” and
“abnormal reaction of patient” during medical care, including aspiration (which were
omitted from this PSI).10 The University HealthSystem Consortium adopted the CSP
indicator for major (#2933) and minor (#2961) surgery patients. A much narrower
definition, including only 999.3 (“other infection after infusion, injection, transfusion,
vaccination”) was proposed by Miller et al.17 in the “Patient Safety Indicator Algorithms
and Groupings.” The American Nurses Association and its state associations have
identified the number of laboratory-confirmed bacteremic episodes associated with
central lines per critical care patient day as a “nursing-sensitive quality indicator for acute
care settings.”140

Evidence
No evidence on validity is available from CSP studies, because this code was
grouped with “miscellaneous complications.” Geraci et al.141 grouped this code with
sepsis (see below). Keeler et al.51 grouped this code with pneumonia and hip joint
infection. We were unable to find other evidence on the validity of this indicator.

Postoperative Hemorrhage or Hematoma


Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
Complications Screening Program (CSP 24, “post-procedural hemorrhage or
hematoma”), although their definition allowed either procedure (i.e., control of
hemorrhage) or diagnosis (i.e., hemorrhage, hematoma, or seroma) codes. By contrast,
the current definition requires either a hemorrhage diagnosis with an associated
procedure to control that hemorrhage, or a hematoma diagnosis with an associated
procedure to drain that hematoma. The University HealthSystem Consortium adopted the
CSP indicator for medical (#2804), cardiac procedure (#2912), and major surgery
(#2947) patients. It was also included as one component of a broader indicator (“adverse
events and iatrogenic complications”) in AHRQ’s original HCUP Quality Indicators.144

Evidence
Coding validity. The original CSP definition had a relatively high confirmation
rate among major surgical cases in the FY1994 Medicare inpatient claims files from
California and Connecticut (83% by coders’ review, 57% by physicians’ review, 52% by
nurse-abstracted clinical documentation, and 76% if nurses also accepted physicians’
notes as adequate documentation).13-15 Its confirmation rate was moderate among medical
cases (49% by coders’ review, 55% by physicians’ review, 29% by nurse-abstracted
clinical documentation, and 65% if nurses also accepted physicians’ notes), partially
because some cases were present at admission. An earlier study of elderly Medicare
beneficiaries from Massachusetts, Alabama, Iowa, and New York in FY1993 revealed
poorer confirmation rates of 34% (35/104) among major surgical cases (of whom 17 or
49% lacked laboratory or clinical evidence of significant blood loss) and 28% (24/85)
among medical cases (of whom 10 or 42% lacked laboratory or clinical evidence of
significant blood loss).145
Among 185 total knee replacement patients from 5 Ontario hospitals in 1984-90,
Hawker et al.146 found that the sensitivity and predictive value of hemorrhage codes
(definition not given) were 57% (8/14) and 80% (8/10), respectively. Faciszewski et al.147
aggregated postoperative hemorrhage or hematoma (998.1) with wound dehiscence
(998.3), and reported a pooled confirmation rate of 17% (1/6) with 3% (1/34) sensitivity
of coding among 310 patients who underwent spinal fusion at the Marshfield Clinic in
1991-92 (given an unusually broad clinical definition of these wound complications).
Romano et al.93 identified 6 of 16 episodes of hemorrhage or hematoma (998.1) using
discharge abstracts of diskectomy patients at 30 California hospitals in 1990-91; there
were no false positives.
At least two studies have estimated the validity of hemorrhage codes using a gold
standard based on transfusion “requirement.” Hartz and Kuhn identified only 146 of 568
(26%) episodes of bleeding (defined as requiring return to surgery or transfusion of at
least 6 units of blood products) by applying this indicator (998.1) to Medicare patients
who underwent coronary artery bypass surgery in Wisconsin in 1990-91; the predictive
value was 75% (146/195).66 In comparison with the VA’s National Surgical Quality
Improvement Program database from 123 hospitals in 1994-95, in which hemorrhage is
defined by transfusion of at least four units of blood products within 30 days after
surgery, the ICD-9-CM diagnosis (998.1) had a sensitivity of 13% and a predictive value
of 10%.148
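The sensitivity and predictive value figures in the coding validity studies above follow directly from counts of confirmed, missed, and unconfirmed cases. As a minimal sketch (the function and variable names are illustrative, and the false-negative and false-positive counts are derived by subtraction from the totals quoted in the text), the Hartz and Kuhn and Hawker et al. results can be recomputed as follows.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of events found on chart review that were also coded."""
    return true_pos / (true_pos + false_neg)


def predictive_value(true_pos, false_pos):
    """Proportion of coded events that were confirmed on chart review."""
    return true_pos / (true_pos + false_pos)


# Hartz and Kuhn: 146 of 568 bleeding episodes were coded;
# 146 of 195 coded cases were confirmed.
hartz_sens = sensitivity(true_pos=146, false_neg=568 - 146)
hartz_pv = predictive_value(true_pos=146, false_pos=195 - 146)

# Hawker et al.: 8 of 14 events were coded; 8 of 10 coded cases were confirmed.
hawker_sens = sensitivity(true_pos=8, false_neg=14 - 8)
hawker_pv = predictive_value(true_pos=8, false_pos=10 - 8)

for name, sens, pv in [
    ("Hartz and Kuhn", hartz_sens, hartz_pv),
    ("Hawker et al.", hawker_sens, hawker_pv),
]:
    print(f"{name}: sensitivity {sens:.0%}, predictive value {pv:.0%}")
```

The two measures answer different questions: sensitivity describes how many true events the code captures, while predictive value describes how many coded cases are real, so a code can perform well on one and poorly on the other.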
Construct validity. Explicit process of care failures in the CSP validation study
were relatively frequent among major surgical cases with CSP 24, but not among medical
cases (66% and 13%, respectively), after excluding patients who had hemorrhage or
hematoma at admission.16 Cases flagged on this indicator and unflagged controls did not
differ significantly on a composite of 17 generic process criteria. Similarly, cases flagged
on this indicator and unflagged controls did not differ significantly on a composite of 4
specific process criteria for major surgical cases and 2 specific process criteria for
medical cases in the earlier study of elderly Medicare beneficiaries from Massachusetts,
Alabama, Iowa, and New York.145 Physician reviewers identified potential quality
problems in 37% of major surgery patients and 31% of medical patients with CSP 24
(versus 2% of unflagged controls for each risk group).15

Postoperative Hip Fracture


Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP (CSP 25, “in-hospital hip fracture or fall”). Their definition also includes any
documented fall, based on external cause of injury codes, which was omitted from this
PSI. Needleman and Buerhaus137 considered in-hospital hip fracture as an “Outcome
Potentially Sensitive to Nursing,” based on input from their Technical Expert Panel, but
discarded it because the “event rate was too low to be useful.” The American Nurses
Association, its state associations, and the California Nursing Outcomes Coalition have
identified the number of patient falls leading to injury per 1,000 patient days (based on
clinical data collection) as a “nursing-sensitive quality indicator for acute care
settings.”140

Evidence
Coding validity. The original CSP definition had an adequate confirmation rate
among major surgical cases in the FY1994 Medicare inpatient claims files from
California and Connecticut (57% by coders’ review, 71% by physicians’ review), but a
very poor confirmation rate among medical cases (11% by both coders’ and physicians’
review).13, 15 This problem was attributable to the fact that most hip fractures among
medical inpatients were actually comorbid diagnoses present at admission rather than
complications of hospital care. Nurse reviews were not performed.
Construct validity. Explicit process of care failures in the CSP validation study
were relatively frequent among cases with CSP 25 (76% of major surgery patients, 54%
of medical patients), after excluding patients who had hip fractures at admission, but
unflagged controls were not evaluated on the same criteria.16 Physician reviewers
identified potential quality problems in 24% of major surgery patients and 5% of medical
patients with CSP 25 (versus 2% of unflagged controls for each risk group).15

Postoperative Physiologic and Metabolic Derangements
Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP (CSP 20, “postoperative physiologic and metabolic derangements”). Their definition
also includes (non-diabetic) hypoglycemic coma (251.0), postoperative shock (998.0),
and oliguria/anuria (788.5), which were omitted from this PSI, but it excludes several
codes that were included in this PSI, namely, diabetes with hyperosmolarity, diabetes
with other (hypoglycemic) coma, and acute renal failure. The University HealthSystem
Consortium adopted the CSP indicator for major surgery patients (#2945). Needleman
and Buerhaus137 identified postoperative physiologic/metabolic derangement as an
“Outcome Potentially Sensitive to Nursing,” but they added fluid and electrolyte
disorders (276) to the original CSP 20. Hannan et al. had earlier focused an analogous
indicator exclusively on those fluid and electrolyte disorders.139

Evidence
Coding validity. No evidence on validity is available from CSP studies. Geraci et
al.141 confirmed (by serum chemistry) only 5 of 15 (33%) episodes of acute renal failure
(584, 586) and 12 of 34 (35%) episodes of hypoglycemia (E932.3, 251.0, 251.2, 962.3)
reported on discharge abstracts of VA patients hospitalized in 1987-89 for CHF, COPD,
or diabetes. The sensitivity for a 2.0 mg/dL or greater increase in serum creatinine was
28% (5/18), while the sensitivity for symptomatic diabetic hypoglycemia less than 70
mg/dL was 16% (12/76). Romano et al.93 identified 2 of 2 episodes of acute renal failure
or hypoglycemia (251.0, 251.2, E932.3, 584.x) using discharge abstracts of diskectomy
patients at 30 California hospitals in 1990-91; there were no false positives. In
comparison with the VA’s National Surgical Quality Improvement Program database
from 123 hospitals in 1994-95, in which acute renal failure is defined as requiring
dialysis within 30 days after surgery, ICD-9-CM diagnoses (585 or 788.5) had a
sensitivity of 8% and a predictive value of 4%.148
Construct validity. Based on two-stage review of 8,109 randomly selected deaths
from 104 New York hospitals in 1985-86, Hannan et al.139 reported that cases with a
secondary diagnosis of fluid and electrolyte disorders were no more likely to have
received care that departed from professionally recognized standards than cases without
that code (2.2% versus 1.7%, OR=1.13), after adjusting for patient demographic,
geographic, and hospital characteristics. However, these ICD-9-CM codes were omitted
from the accepted AHRQ PSI. Needleman and Buerhaus137 found that nurse staffing was
independent of the occurrence of metabolic derangement among major surgery patients
from 799 hospitals in 11 states in 1997.
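As a rough illustration of how such comparisons are constructed, the sketch below computes a crude (unadjusted) odds ratio from the two proportions quoted above (2.2% versus 1.7%). The function name is hypothetical. The result is not expected to reproduce the reported OR of 1.13, which was adjusted for patient demographic, geographic, and hospital characteristics.

```python
def crude_odds_ratio(p_flagged: float, p_unflagged: float) -> float:
    """Unadjusted odds ratio comparing two proportions (each strictly between 0 and 1)."""
    odds_flagged = p_flagged / (1.0 - p_flagged)
    odds_unflagged = p_unflagged / (1.0 - p_unflagged)
    return odds_flagged / odds_unflagged


# Proportions of cases with care that departed from standards, as quoted above.
print(round(crude_odds_ratio(0.022, 0.017), 2))
```

Any gap between this crude value and the published figure may reflect both the covariate adjustment and rounding of the published percentages.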

Postoperative Pulmonary Embolism or Deep Vein Thrombosis


Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP (CSP 22, “venous thrombosis and pulmonary embolism”), although their definition
was slightly narrower. It was one of AHRQ’s original HCUP Quality Indicators144 for
major surgery and invasive vascular procedure patients. Needleman and Buerhaus137
identified DVT/PE as an “Outcome Potentially Sensitive to Nursing,” using the same
CSP definition. The Health Care Financing Administration (now CMS) selected “venous
thrombosis or pulmonary embolism following selected inpatient surgical procedures” as
one of its surveillance measures of Medicare quality of care.149 A code introduced in 1995
(415.11) that maps to this indicator in the final AHRQ PSI was proposed by Miller et al.17
as one component of a broader indicator (“iatrogenic conditions”) in the “Patient Safety
Indicator Algorithms and Groupings.”

Evidence
Coding validity. CSP 22 had a moderately high confirmation rate among major
surgical cases in the FY1994 Medicare inpatient claims files from California and
Connecticut (59% by coders’ review, 70% by physicians’ review, 60% by nurse-
abstracted clinical documentation, and 68% if nurses also accepted physicians’ notes as
adequate documentation). Its confirmation rate among medical cases was poor (32% by
coders’ review, 28% by physicians’ review, 32% by nurse-abstracted clinical
documentation, and 39% if nurses also accepted physicians’ notes as adequate
documentation) because many cases were present at admission.13-15
Geraci et al.34 confirmed only 1 of 6 episodes of DVT (451.1x) or PE (415.1)
reported on discharge abstracts of Veterans Affairs (VA) patients hospitalized in 1987-89
for CHF, COPD, or diabetes; the sensitivity was 100% (1/1). Among Medicare hip
fracture patients from 297 hospitals in 1985-86, by contrast, Keeler et al.51 confirmed 11
of 20 (55%) reported PE cases, and failed to ascertain just 6 cases (65% sensitivity) using
ICD-9-CM codes. For DVT (451.x, 453.x, 997.2), they found just 1 of 6 cases using ICD-
9-CM codes (but no false positive codes). Among 185 total knee replacement patients
from 5 Ontario hospitals in 1984-90, Hawker et al.146 found that the sensitivity and
predictive value of DVT codes (definition not given) were 50% (4/8) and 100%,
respectively. Romano et al.93 identified 5 of 6 episodes of thromboembolic disease
(415.1x, 451.1x, 451.2, 451.8x, 451.9, 453.2, 453.8, 453.9) using discharge abstracts of
diskectomy patients at 30 California hospitals; there was one false positive. In
comparison with the VA’s National Surgical Quality Improvement Program database
from 123 hospitals in 1994-95, the ICD-9-CM diagnosis of PE (415.1) had a sensitivity
of 49% and a predictive value of 48% for PE within 30 days after surgery.148 Although
Best et al. also reported on the ability to use administrative data to find cases of DVT,
their results cannot be interpreted due to misapplication of ICD-9-CM.
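The studies above identify cases by matching discharge diagnoses against lists of ICD-9-CM codes. A minimal sketch of that screening step follows, using the thromboembolic disease codes listed for Romano et al. It assumes that a trailing "x" in a pattern stands for any further digits (or none); the function names and the example abstract are hypothetical, not taken from any of the cited studies.

```python
# Code patterns for thromboembolic disease as listed in the text above.
# Assumption: a trailing "x" stands for any further digits (or none).
THROMBOEMBOLISM_PATTERNS = [
    "415.1x", "451.1x", "451.2", "451.8x", "451.9", "453.2", "453.8", "453.9",
]


def matches(code: str, pattern: str) -> bool:
    """Return True if an ICD-9-CM code fits a pattern such as '415.1x' or '453.8'."""
    prefix = pattern.rstrip("x")
    if prefix == pattern:
        # No wildcard in the pattern: require an exact match.
        return code == pattern
    # Trailing wildcard: accept any code that begins with the fixed part.
    return code.startswith(prefix)


def is_flagged(secondary_diagnoses: list[str], patterns: list[str]) -> bool:
    """Flag an abstract if any secondary diagnosis matches any pattern."""
    return any(matches(code, p) for code in secondary_diagnoses for p in patterns)


example_abstract = ["715.36", "415.11"]  # hypothetical secondary diagnoses
print(is_flagged(example_abstract, THROMBOEMBOLISM_PATTERNS))
```

Restricting the check to secondary diagnoses is a design choice made here for illustration; it does not distinguish conditions present at admission from those acquired in hospital.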
Other studies using the California patient discharge data set have demonstrated
that ICD-9-CM codes for DVT and PE have high predictive value when listed as the
principal diagnosis for readmissions after major orthopedic surgery (i.e., 17/17 or 100%)
or after inferior vena cava filter placement (i.e., 64/65 or 98%).150 However, these
findings do not directly address the validity of DVT/PE as a secondary diagnosis among
patients treated by anticoagulation.
Construct validity. Explicit process of care failures in the CSP validation study
were relatively frequent among both major surgical and medical cases with CSP 22 (72%
and 69%, respectively), after disqualifying cases in which DVT/PE was actually present
at admission.16 Major surgical cases flagged on this indicator and unflagged controls
differed marginally (11% versus 4%, p=0.09) on a composite of 17 generic process
criteria; medical cases and controls were not evaluated on the same criteria. Physician
reviewers identified potential quality problems in 50% of major surgery patients and 20%
of medical patients with CSP 22 (versus 2% of unflagged controls for each risk group).15

Needleman and Buerhaus137 found that nurse staffing was independent of the
occurrence of DVT/PE among both major surgical and medical patients from 799 hospitals
in 11 states in 1997. However, Kovner and Gergen reported that among 506 community
hospitals in the 1993 NIS, having more registered nurse hours and non-RN hours per
adjusted patient day were both associated with a lower rate of DVT/PE after major
surgery.126 Nurse staffing was not associated with the rate of DVT/PE after invasive
vascular procedures.

Postoperative Respiratory Failure


Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP (CSP 3, “postoperative pulmonary compromise”). Their broader definition
includes not just respiratory failure but also pulmonary congestion, other (or
postoperative) pulmonary insufficiency, and acute pulmonary edema, which were omitted
from this PSI. The University HealthSystem Consortium (#2927) and AHRQ’s original
HCUP Quality Indicators144 adopted the CSP indicator for major surgery patients.
Needleman and Buerhaus137 identified postoperative pulmonary failure as an “Outcome
Potentially Sensitive to Nursing,” using the original CSP definition.

Evidence
Coding validity. CSP 3 had a relatively high confirmation rate among major
surgical cases in the FY1994 Medicare inpatient claims files from California and
Connecticut (72% by coders’ review, 75% by physicians’ review).13, 15 Nurse reviews
were not performed. An earlier study of elderly Medicare beneficiaries from
Massachusetts, Alabama, Iowa, and New York in FY1993 revealed a similarly high
confirmation rate of 72% (66/92) among major surgical cases, although 27% of those
patients (18/66) had inadequate clinical documentation of the diagnosis.145
Geraci et al.34 confirmed 1 of 2 episodes of respiratory failure (518.81, 518.82)
reported on discharge abstracts of VA patients hospitalized in 1987-89 for CHF or
diabetes; the sensitivity for respiratory decompensation requiring mechanical ventilation
was 25% (1/4). Best et al.148 reported on the ability to use administrative data to find
cases of “unplanned intubation,” but their results cannot be interpreted due to
misapplication of ICD-9-CM.
Construct validity. Explicit process of care failures in the CSP validation study
were slightly but not significantly more frequent among major surgical cases with CSP 3
than among unflagged controls (52% versus 46%).16 Indeed, cases flagged on this
indicator were significantly less likely than unflagged controls (24% versus 64%) to have
at least one of four specific process-of-care problems in the earlier study of elderly
Medicare beneficiaries from Massachusetts, Alabama, Iowa, and New York.145 Physician
reviewers identified potential quality problems in 20% of major surgery patients with
CSP 3 (versus 2% of unflagged controls).15
Needleman and Buerhaus137 found that nurse staffing was independent of the
occurrence of pulmonary failure among major surgery patients from 799 hospitals in 11
states in 1997. However, Kovner and Gergen reported that among 506 community
hospitals in the 1993 NIS, having more registered nurse hours per adjusted patient day
was associated with a lower rate of “pulmonary compromise” after major surgery.126

Postoperative Sepsis
Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
Complications Screening Program (CSP 7, “septicemia”), although their definition also
includes unspecified bacteremia, which was omitted from this PSI. Needleman and
Buerhaus137 identified sepsis as an “Outcome Potentially Sensitive to Nursing,” using the
same CSP definition.

Evidence
Coding validity. No evidence on validity is available from CSP studies.
Barbour151 reported that only 38% (53/141) of discharge abstracts from 5 VA medical
centers in 1990 with a diagnosis of sepsis (038.x) actually had hospital-acquired sepsis.
However, this review was not limited to cases with a secondary diagnosis of sepsis, and
sensitivity could not be evaluated. Massanari et al.152 identified 79% of cases of
“nosocomial bacteremia” using 1984 hospital discharge data from the University of Iowa,
but no definitions were provided. Geraci et al.34 confirmed (by blood culture) only 2 of 15
episodes of sepsis or “other infection” (038.x, 999.3) reported on discharge abstracts of
VA patients hospitalized in 1987-89 for CHF, COPD, or diabetes; the sensitivity for a
positive blood culture was 50% (2/4). Romano et al.93 identified 2 of 3 episodes of sepsis
or bacteremia (038.x, 790.7) using discharge abstracts of diskectomy patients at 30
California hospitals in 1990-91; there were no false positives. Belio-Blasco et al.153
reported that “discharge forms” had a sensitivity of 18% (7/39) and a specificity of 100%
for identifying nosocomial bacteremia among surgical patients in a Spanish teaching
hospital. In comparison with the VA’s National Surgical Quality Improvement Program
database from 123 hospitals in 1994-95, in which “systemic sepsis” is defined by a
positive blood culture and systemic manifestations of sepsis within 30 days after surgery,
the ICD-9-CM diagnosis (038.x) had a sensitivity of 37% and a predictive value of
30%.148
Construct validity. Needleman and Buerhaus137 found that nurse staffing was
independent of the occurrence of sepsis among both major surgical and medical patients
from 799 hospitals in 11 states in 1997.

Postoperative Wound Dehiscence


Source. An indicator on this topic (998.3) was originally proposed by Hannan et
al. to target “cases that would have a higher percentage of quality of care problems than
cases without the criterion, as judged by medical record review.”139 The same code was
also included as one component of a broader indicator (“adverse events and iatrogenic
complications”) in AHRQ’s original HCUP Quality Indicators.144 Iezzoni et al.10
identified an associated procedure code for reclosure of an abdominal wall dehiscence
(54.61), and included both codes in the CSP (CSP “sentinel events” and CSP 9,
“reopening of surgical site,” respectively). Miller et al.17 suggested the use of both codes
(as “wound disruption”) in the original “AHRQ PSI Algorithms and Groupings.”

Evidence
Coding validity. No evidence on validity is available from CSP studies. Among
185 total knee replacement patients from 5 Ontario hospitals in 1984-90, Hawker et al.146
found that the sensitivity and predictive value of 998.3 were both 100% (4/4).
Faciszewski et al.147 aggregated wound dehiscence (998.3) with postoperative
hemorrhage or hematoma (998.1), and reported a pooled confirmation rate of 17% (1/6)
with 3% (1/34) sensitivity of coding among 310 patients who underwent spinal fusion at
the Marshfield Clinic in 1991-92 (given an unusually broad clinical definition of these
wound complications). In comparison with the VA’s National Surgical Quality
Improvement Program database from 123 hospitals in 1994-95, in which dehiscence is
defined as fascial disruption within 30 days after surgery, the ICD-9-CM diagnosis of
wound dehiscence (998.3) had a sensitivity of 25% and a predictive value of 23%.148 This
code (998.3) was ultimately removed from the accepted PSI because our clinical panel
was concerned that the ICD-9-CM definition was too broad and failed to distinguish skin
from fascial separation.
Construct validity. Based on two-stage review of 8,109 randomly selected deaths
from 104 New York hospitals in 1985-86, Hannan et al.139 reported that cases with a
secondary diagnosis of 998.3 (wound disruption) were 3.0 times more likely to have
received care that departed from professionally recognized standards than cases without
that code (4.3% versus 1.7%), after adjusting for patient demographic, geographic, and
hospital characteristics. In 3 of these 7 cases (43%) of substandard care, the patient’s
death was attributed at least partially to that care. However, this code was removed from
the accepted PSI after discussions with our clinical panel.

Technical Difficulty With Procedure


Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP, although unlike the final PSI, its codes were split between two CSP indicators (CSP
27, “technical difficulty with medical care,” and “sentinel events”). The latter indicator
also includes gas gangrene, CNS abscess, anoxic brain injury, foreign body left in, wound
dehiscence, and ABO/Rh transfusion reactions, all of which were omitted from this PSI.
The former indicator also includes failure of sterile precautions, mechanical failure of
instrument or apparatus, and “contaminated or infected blood, other fluid, drug,” etc.,
although these codes were not included in the final definition of this PSI. It was also
included as one component of a broader indicator (“adverse events and iatrogenic
complications”) in AHRQ’s original HCUP Quality Indicators.144 The University
HealthSystem Consortium adopted CSP 27 as an indicator for medical (#2806) and major
surgery (#2956) patients. Miller et al.17 also split this set of ICD-9-CM codes into two
broader indicators (“miscellaneous misadventures” and “E codes”) in the original
“AHRQ PSI Algorithms and Groupings.” Based on expert consensus panels, McKesson
Health Solutions included one component of this PSI (998.2, “Accidental Puncture or
Laceration”) in its CareEnhance Resource Management Systems, Quality Profiler
Complications Measures Module.

Evidence
Coding validity. No evidence on validity is available from CSP studies. A study
of laparoscopic cholecystectomy in 18 Ontario hospitals in 1991-95154 found that 95%
(99/104) of patients with an ICD-9 code of 998.2 or E870.0 had a confirmed injury to the
bile duct or gallbladder. However, only 27% had a clinically significant injury that
required any intervention; sensitivity of reporting was not evaluated. A similar study of
all cholecystectomies performed in Western Australia between 1988 and 1994 reported
that these two ICD-9 codes had a sensitivity of 40% (19/48) and a predictive value of
23% (19/84) in identifying bile duct injuries.155 Among 185 total knee replacement
patients from 5 Ontario hospitals in 1984-90, Hawker et al.146 found that the sensitivity
and predictive value of codes describing “miscellaneous mishaps during or as a direct
result of surgery” (definition not given) were 86% (6/7) and 55% (6/11), respectively.
Romano et al.93 identified 19 of 45 episodes of accidental puncture or laceration (998.2,
E870.0, or related procedure) using discharge abstracts of diskectomy patients at 30
California hospitals in 1990-91; there was one false positive.
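The sensitivity and predictive value figures reported throughout this section use different denominators: all confirmed events for sensitivity, and all coded events for predictive value. The sketch below works through the Western Australia figures quoted above (19 coded and confirmed injuries, 48 confirmed injuries, 84 coded cases). The function and variable names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of confirmed events that were also coded."""
    return true_pos / (true_pos + false_neg)


def predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of coded events that were confirmed on review."""
    return true_pos / (true_pos + false_pos)


# Counts taken from the fractions quoted in the text (19/48 and 19/84).
tp = 19            # coded and confirmed
fn = 48 - tp       # confirmed but not coded
fp = 84 - tp       # coded but not confirmed

print(f"sensitivity = {sensitivity(tp, fn):.0%}")
print(f"predictive value = {predictive_value(tp, fp):.0%}")
```

The same two functions apply to any study above that reports both fractions with a shared numerator.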

Transfusion Reaction
Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
Complications Screening Program (CSP “sentinel events”), along with gas gangrene,
CNS abscess, anoxic brain injury, accidental puncture or laceration, wound dehiscence,
and foreign body left in (all of which were omitted from this PSI). It was also included as
one component of a broader indicator (“adverse events and iatrogenic complications”) in
AHRQ’s original HCUP Quality Indicators.144 It was proposed by Miller et al.17 in the
original “AHRQ PSI Algorithms and Groupings,” although their definition also includes
minor transfusion reactions (999.8), which were omitted from this PSI.

Evidence
We were unable to find evidence on validity from prior studies, most likely
because this complication is quite rare.

Accepted Obstetric Indicators

Birth Trauma – Injury to Neonate


Source. This indicator has been widely used in the obstetric community, although
it is most commonly based on chart review rather than administrative data. It was
proposed by Miller et al.17 in the original “AHRQ PSI Algorithms and Groupings,”
although their definition also includes injury to the brachial plexus (767.6), which was
excluded from this PSI. Based on expert consensus panels, McKesson Health Solutions
included a broader version of this indicator (767.xx) in its CareEnhance Resource
Management Systems, Quality Profiler Complications Measures Module.

Evidence
Coding validity. A study of 669 newborns at Georgetown University Hospital
who had a discharge diagnosis of birth trauma (codes not specified) found that only 25%
(164/669) had sustained a significant injury to the head, neck, or shoulder.156 The
remaining patients either had superficial injuries or injuries inferior to the neck. We were
unable to find other evidence on the validity of this indicator. Towner et al. linked
California maternal and infant discharge abstracts from 1992 through 1994, but they used
only infant discharge abstracts to describe the incidence of neonatal intracranial injury,
and they did not report the extent of agreement between the two data sets.157

Obstetric Trauma (All Delivery Types)
Source. An overlapping subset of this indicator (third or fourth-degree perineal
laceration [664.2x-664.3x]) has been adopted by the Joint Commission on
Accreditation of Healthcare Organizations (JCAHO) as a core performance measure for
“pregnancy and related conditions” (PR-25). (The JCAHO indicator was less preferred by
the clinical panelists than a definition restricted to fourth degree lacerations, so the
JCAHO definition was retained for exploration as an Experimental indicator.) Based on
expert consensus panels, McKesson Health Solutions included the JCAHO indicator in its
CareEnhance Resource Management Systems, Quality Profiler Complications Measures
Module. Fourth degree laceration (664.3x), one of the codes mapped to this PSI, was
included as one component of a broader indicator (“obstetrical complications”) in
AHRQ’s original HCUP Quality Indicators.144

Evidence
Coding validity. In a stratified probability sample of 1,611 vaginal and cesarean
deliveries from 51 California hospitals in 1992-93, the weighted sensitivity and predictive
value of coding for third and fourth degree lacerations and vulvar/perineal hematomas
(based on either diagnosis or procedure codes) were 89% (311/340) and 90% (311/337),
respectively.158 The authors did not report coding validity for third and fourth degree
lacerations separately. We were unable to find other evidence on validity from prior
studies.

Experimental Indicators

Aspiration Pneumonia
Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP (CSP 2, “aspiration pneumonia”). Needleman and Buerhaus137 identified
postoperative pneumonia as an “Outcome Potentially Sensitive to Nursing,” but their
definition aggregated bacterial, aspiration (507.0), and “hypostatic” (514) pneumonia.
The University HealthSystem Consortium adopted the CSP indicator for major surgery
patients (#2924).

Evidence
Coding validity. CSP 2 had a moderate confirmation rate among major surgical
cases in the FY1994 Medicare inpatient claims files from California and Connecticut
(77% by coders’ review, 59% by physicians’ review, 50% by nurse-abstracted clinical
documentation, and 85% if nurses also accepted physicians’ notes as adequate
documentation).13-15 Geraci et al.34 confirmed (by chest radiography) 0 of 7 episodes of
aspiration pneumonia (482.9, 507.0) reported on discharge abstracts of VA patients
hospitalized in 1987-89 for CHF, COPD, or diabetes; the sensitivity for a new alveolar
infiltrate was 0% (0/5).

Construct validity. Explicit process of care failures in the CSP validation study
were relatively frequent among major surgical cases with CSP 2 (69%), after excluding
two patients who had aspiration pneumonia at admission.16 Cases flagged on this
indicator and unflagged controls did not differ significantly on a composite of 17 generic
process criteria. Physician reviewers identified potential quality problems in 21% of
major surgery patients with CSP 2 (versus 2% of unflagged controls).15
Needleman and Buerhaus137 found that higher registered nurse staffing (RN
hours/adjusted patient day) and better nursing skill mix (RN hours/licensed nurse hours)
were consistently associated with a lower occurrence of pneumonia (including aspiration and
“hypostatic” pneumonia) among medical patients from 799 hospitals in 11 states in 1997.
An increase from the 25th to the 75th percentile on these two measures of staffing was
associated with 2.7% (95% CI, -0.4% to 5.8%) and 6.4% (95% CI, 2.8% to 10.0%)
decreases, respectively, in the rate of pneumonia.159 Skill mix was “weakly” associated
with the rate of pneumonia among major surgical patients. Nursing skill mix was
significantly associated (in the expected direction) with the pneumonia rate among 352
and 295 California hospitals in 1992 and 1994, respectively, but not among 126 and 131
New York hospitals in the same years.138 Total licensed nurse hours per acuity-adjusted
patient day were not associated with the pneumonia rate, except in California in 1994,
where the association was actually positive.

CABG Following PTCA


Source. This indicator was developed by the University HealthSystem
Consortium (#2906) to identify patients who experienced a complication of PTCA that
required urgent surgical repair. This indicator has been used in several studies of PTCA
outcomes and the relationship between volume and outcome.127-135

Evidence
We were unable to find evidence on validity from prior studies, except insofar as
higher hospital angioplasty volume has consistently been associated with lower risk of
CABG following PTCA.127-135 Physician volume generally has an independent effect on
the risk of CABG following PTCA, confirming that this measure is sensitive to operator
experience and skill,132-135 although some recent data suggest that this effect may
disappear at high-volume hospitals.160 One study involving Medicare inpatient claims
from 1987 through 1990 also showed that CABG following PTCA was slightly less
frequent at hospitals with “major” medical school affiliations than at other hospitals.131

Decubitus Ulcer in High-Risk Patients
Source. This variation of Accepted PSI “Decubitus ulcer” was designed in
response to concerns that the accepted indicator excludes the subset of patients at highest
risk of developing pressure ulcers if they receive inadequate care in the hospital. It
differs from Accepted PSI “Decubitus Ulcer” in that the denominator population is
limited to patients with hemiplegia, paraplegia, or quadriplegia, and patients admitted
from long term care facilities. The American Nurses Association, its state associations,
and the California Nursing Outcomes Coalition have identified the total prevalence of
inpatients with Stage I, II, III, or IV pressure ulcers (based on clinical data collection) as
a “nursing-sensitive quality indicator for acute care settings.”140

Evidence
We were unable to find evidence on validity from prior studies, but this is simply
a modified version of an indicator on the accepted list. Validity may be lower in this
setting, if a substantial proportion of pressure sores are pre-existing, but may be higher if
these patients are especially sensitive to the effects of suboptimal nursing care.

In-Hospital Fractures Possibly Related to Falls


Source. This indicator was developed by our clinical panels, based on Accepted
indicator “Postoperative hip fracture.” Needleman and Buerhaus137 considered in-hospital
fall or fracture as an “Outcome Potentially Sensitive to Nursing,” based on input
from their Technical Expert Panel, but discarded it because the “event rate was too low to
be useful.” The American Nurses Association, its state associations, and the California
Nursing Outcomes Coalition have identified the number of patient falls leading to injury
per 1,000 patient days (based on clinical data collection) as a “nursing-sensitive quality
indicator for acute care settings.”140

Evidence
Coding validity. Among 185 total knee replacement patients from 5 Ontario
hospitals in 1984-90, Hawker et al.146 found that the sensitivity and predictive value of
“fall and fracture” codes (definition not given) were 80% (4/5) and 100%, respectively.
We were unable to find other evidence for this indicator.

Intraoperative Nerve Compression Injuries


Source. A subset of this indicator (brachial plexus lesions [353.0]) was originally
proposed by Iezzoni et al.10 as part of the CSP (CSP 13, “postoperative complications
relating to central or peripheral nervous system”). The University HealthSystem
Consortium adopted this CSP indicator for major surgery patients (#2934). However, this
indicator was extensively revised after discussions with our clinical panels.

Evidence
We were unable to find evidence on validity from prior studies, because this
complication is quite rare. Best et al.148 reported on the ability to use administrative data
to find cases of “other neurologic” (including peripheral nerve) deficits, but their results
cannot be interpreted due to misapplication of ICD-9-CM.

Malignant Hyperthermia

Source. This indicator was created after review of ICD-9-CM codes, and
discussions with our clinical panel.

Evidence
We were unable to find evidence on validity from prior studies, because this
diagnosis code was introduced in 1998.

Postoperative Acute Myocardial Infarction


Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP (CSP 14, “postoperative acute myocardial infarction”). The University
HealthSystem Consortium (#2935) and AHRQ’s original HCUP Quality Indicators144
adopted this CSP indicator for major surgery patients.

Evidence
Coding validity. CSP 14 had a high confirmation rate among major surgical
cases in the FY1994 Medicare inpatient claims files from California and Connecticut
(84% by coders’ review, 95% by physicians’ review, 81% by nurse-abstracted clinical
documentation, and 89% if nurses also accepted physicians’ notes as adequate
documentation).13-15 An earlier study of elderly Medicare beneficiaries from
Massachusetts, Alabama, Iowa, and New York in FY1993 revealed a similarly high
confirmation rate of 84% (69/82) among major surgical cases, although 39% of those
patients (27/69) had neither electrocardiographic nor enzyme evidence supporting the
diagnosis.145
Geraci et al.141 identified 0 of 3 AMI episodes (410.x1) using the discharge
abstracts of VA patients hospitalized in 1987-89 for CHF, COPD, or diabetes. In
comparison with the VA’s National Surgical Quality Improvement Program database
from 123 hospitals in 1994-95, the ICD-9-CM diagnosis of AMI (410.xx) had a
sensitivity of 58% and a predictive value of 47% for Q-wave infarctions within 30 days
after surgery.148 By contrast, the 1985 National DRG Validation Study suggested that the
sensitivity of ICD-9-CM 410.xx exceeds 75%, even when it is coded as a secondary
diagnosis (n=67) rather than as the reason for admission.161

Construct validity. Explicit process of care failures in the CSP validation study
were only moderately frequent among major surgical cases with CSP 14 (46%).16 Cases
flagged by this indicator and unflagged controls differed significantly (p<0.02) on a
composite of 17 generic process criteria, but the latter group actually demonstrated worse
performance. Similarly, cases flagged on this indicator were significantly less likely than
unflagged controls (29% versus 57%) to have at least one of seven specific process-of-
care problems in the earlier study of elderly Medicare beneficiaries from Massachusetts,
Alabama, Iowa, and New York.145 Physician reviewers identified potential quality
problems in 22% of major surgery patients with CSP 14 (versus 2% of unflagged
controls).15 Kovner and Gergen reported that among 506 community hospitals in the 1993
NIS, having more registered nurses per adjusted patient day was not associated with
lower rates of AMI after major surgery.126

Postoperative Iatrogenic Complications – Cardiac System


Source. This indicator was originally proposed by Hannan et al. as a criterion for
targeting “cases that would have a higher percentage of quality of care problems than
cases without the criterion, as judged by medical record review.”139 It was endorsed by
Iezzoni et al.10 as one component of a much broader indicator (CSP 26, “iatrogenic
complications”) in the CSP. The definition of that indicator includes central nervous
system, cardiac, peripheral vascular, respiratory, gastrointestinal, urinary, and unspecified
amputation stump complications, as well as complications affecting other body systems.
It was also included as one component of a broader indicator (“adverse events and
iatrogenic complications”) in AHRQ’s original HCUP Quality Indicators.144 The
University HealthSystem Consortium adopted this CSP indicator for cardiac procedure
patients (#2913).

Evidence
Coding validity. CSP 26 had a very high confirmation rate among major surgical
cases in the FY1994 Medicare inpatient claims files from California and Connecticut
(92% by coders’ review) and a borderline confirmation rate among medical cases (59%
by coders’ review).13 Physician reviews were not performed. Faciszewski et al.147
confirmed only 20% (2/10) of reported cases of cardiac complications (997.1) among 310
patients who underwent spinal fusion at the Marshfield Clinic in 1991-92. The sensitivity
of coding for this complication was 40% (2/5). Among 185 total knee replacement
patients from 5 Ontario hospitals in 1984-90, Hawker et al.146 found that the sensitivity
and predictive value of cardiac complication codes (definition not given) were 67% (6/9)
and 86% (6/7), respectively. Romano et al.93 identified 2 of 5 episodes of cardiac
complications (with 2 false positives) using discharge abstracts of diskectomy patients at
30 California hospitals in 1990-91.
Construct validity. Explicit process of care failures in the CSP validation study
were slightly but not significantly more frequent among cases with CSP 26 (58%
surgical, 9% medical) than among unflagged controls (46% surgical, 5% medical). Based
on two-stage review of 8,109 randomly selected deaths from 104 New York hospitals in
1985-86, Hannan et al.139 reported that cases with a secondary diagnosis of 997.1
(cardiac) were 3.4 times more likely to have received care that departed from
professionally recognized standards than cases without that code (7.1% versus 1.7%),
after adjusting for patient demographic, geographic, and hospital characteristics. In 25 of
these 33 cases (76%) of substandard care, the patient’s death was attributed at least
partially to that care.

Postoperative Iatrogenic Complications – Nervous System


Source. This diagnosis code was originally proposed by Iezzoni et al.10 as one
component of a much broader indicator (CSP 26, “iatrogenic complications”), which was
part of the CSP. Their definition includes central nervous system, cardiac, peripheral
vascular, respiratory, gastrointestinal, urinary, and unspecified amputation stump
complications, as well as complications affecting other body systems. It was also
included as one component of a broader indicator (“adverse events and iatrogenic
complications”) in AHRQ’s original HCUP Quality Indicators.144 The University
HealthSystem Consortium adopted this CSP indicator for cardiac procedure patients
(#2913).

Evidence
Coding validity. CSP 26 had a very high confirmation rate among major surgical
cases in the FY1994 Medicare inpatient claims files from California and Connecticut
(92% by coders’ review) and a borderline confirmation rate among medical cases (59%
by coders’ review).13 Physician reviews were not performed. Romano et al.93 identified 1
of 2 episodes of CNS complications (with 4 false positives) using discharge abstracts of
diskectomy patients at 30 California hospitals in 1990-91.
Construct validity. Explicit process of care failures in the CSP validation study
were slightly but not significantly more frequent among cases with CSP 26 (58%
surgical, 9% medical) than among unflagged controls (46% surgical, 5% medical).

Reopening of Surgical Site


Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP (CSP 9, “reopening of surgical site”), although their definition was slightly broader
than the proposed PSI (i.e., it includes revision of corrective procedure on heart (35.95)
and reclosure of postoperative disruption of the abdominal wall (54.61)). The University
HealthSystem Consortium adopted this CSP indicator for major surgery patients (#2930).

Evidence
Coding validity. CSP 9 had a relatively high confirmation rate among major
surgical cases in the FY1994 Medicare inpatient claims files from California and
Connecticut (97% by coders’ review, 61% by physicians’ review, 84% by nurse-
abstracted clinical documentation).13-15
Construct validity. Explicit process of care failures in the CSP validation study
were only moderately frequent among major surgical cases with CSP 9 (43%), after
excluding one patient who had this complication at admission,16 but unflagged controls
were not evaluated on the same criteria. Physician reviewers identified potential quality
problems in 48% of major surgery patients with CSP 9 (versus 2% of unflagged
controls).15

Suture of Laceration
Source. This indicator was originally proposed by Iezzoni et al.10 as part of the
CSP (CSP 17, “procedure-related perforation or laceration”). Their definition includes
diagnosis codes (not included in this PSI) for spontaneous perforation of the esophagus
(530.4), intestine (569.83), gallbladder (575.4), or bile duct (576.3), as well as procedure
codes for repair of various organ lacerations. It was utilized by Miller et al.17 in the
original “AHRQ PSI Algorithms and Groupings,” although their definition added suture
of laceration of diaphragm (34.82), small intestine (46.73), and anus (49.71). These
additional codes were included in this PSI, along with a few more codes (e.g., laceration
of nerve). The University HealthSystem Consortium adopted this CSP indicator for major
surgery patients (#2941).

Evidence
Coding validity. This cluster is very similar to CSP 17, which had a relatively
high confirmation rate among major surgical cases in the FY1994 Medicare inpatient
claims files from California and Connecticut (71% by coders’ review, 58% by
physicians’ review, 69% by nurse-abstracted clinical documentation, and 75% if nurses
also accepted physicians’ notes as adequate documentation).13-15 The CSP criteria were
not fully successful in excluding pre-admission trauma, but it is not clear which code(s)
accounted for this problem. An earlier study of elderly Medicare beneficiaries from
Massachusetts, Alabama, Iowa, and New York in FY1993 revealed a similar
confirmation rate of 70% (65/93) among major surgical cases, although 18% of those
patients (12/65) lacked clear physical examination evidence of the diagnosis.145
Construct validity. Physician reviewers identified potential quality problems in
36% of major surgery patients with CSP 17 (versus 2% of unflagged controls).15 In the
New York SID from 1997, nursing expertise (full-time and part-time RNs as a proportion
of all licensed nurses) below the statewide median level was associated with a higher
unadjusted rate of this indicator (24 versus 15 events per 10,000 discharges).17

Experimental Obstetric Indicators


Obstetric Wound Complications – Cesarean Delivery
Source. Disruption of a cesarean wound (674.1x) was proposed by Miller et al.17
as part of a broader indicator (“obstetrical misadventures”) in the original “AHRQ PSI
Algorithms and Groupings.” It was also included as one component of a broader indicator
(“obstetrical complications”) in AHRQ’s original HCUP Quality Indicators.144

Evidence
Coding validity. Weiss et al.162 reviewed 636 deliveries in Massachusetts
hospitals in 1990-97 reported to have had cesarean wound disruption (674.1x), and found
that 29% (179/636) were actually uterine ruptures before or during labor. Therefore, the
maximum possible predictive value of this diagnosis was 71%. In a stratified probability
sample of 1,611 vaginal and cesarean deliveries from 51 California hospitals in 1992-93,
the sensitivity and predictive value of wound disruption, hematoma, or infection (based
on either diagnosis or procedure codes) were 27% and 91%, respectively.163 We were
unable to find other evidence on validity from prior studies.

Obstetric Wound Complications – Vaginal Delivery


Source. This variation of the above PSI was designed as a “sister” measure for
vaginal deliveries, based on review of ICD-9-CM codes and discussions with the clinical
panel. Perineal wound disruption (674.2x), one of the codes mapped to this PSI, was also
included as one component of a broader indicator (“obstetrical complications”) in
AHRQ’s original HCUP Quality Indicators.

Evidence
Coding validity. In a stratified probability sample of 1,611 vaginal and cesarean
deliveries from 51 California hospitals in 1992-93, the weighted sensitivity and predictive
value of wound disruption, hematoma, or infection (based on either diagnosis or
procedure codes) were 27% (18/37) and 91% (18/21), respectively.163 We were unable to
find other evidence on validity from prior studies.

Other Obstetric Complications


Source. These diagnosis codes were proposed by Miller et al.17 as part of a
broader indicator (“obstetrical misadventures”) in the original “AHRQ PSI Algorithms
and Groupings.” They include codes 668.x and 669.x (pulmonary, cardiac, and central
nervous system complications, other specified and unspecified complications of
anesthesia or sedation, shock and other major complications of obstetric procedures,
acute postpartum renal failure). All of the codes mapped to this PSI were included as part
of a broader indicator (“obstetrical complications”) in AHRQ’s original HCUP Quality
Indicators.144

Evidence
Coding validity. In a stratified probability sample of 1,611 vaginal and cesarean
deliveries from 51 California hospitals in 1992-93, the weighted sensitivity and predictive
value of coding for cardiac (668.1x, 995.4) and pulmonary (668.2x) complications of
obstetric anesthesia or analgesia were 24% (8/16) and 97% (8/9), respectively.163 The
authors did not report coding validity for the other components of this PSI. We were
unable to find other evidence on validity from prior studies.

Postpartum Urinary Tract Infection


Source. This indicator was created after review of ICD-9-CM codes and
discussions with the clinical panel. The definition is specific to “infections of the
genitourinary tract” that are labeled as postpartum complications, although some of these
infections may have originated in the antepartum period.

Evidence

Coding validity. In a stratified probability sample of 1,611 vaginal and cesarean
deliveries from 51 California hospitals in 1992-93, the weighted sensitivity and predictive
value of postpartum urinary tract infection were 20% (5/13) and 41% (5/8),
respectively.163 We were unable to find other evidence on validity from prior studies,
because this indicator has not previously been used as a measure of quality.

Third or Fourth Degree Obstetric Lacerations


Source. This indicator has been adopted by the JCAHO as a core performance
measure for “pregnancy and related conditions” (PR-25). A revised version of this
indicator, based on input from our clinical panel, qualified for the Accepted indicator
set as “Obstetric trauma.”

Evidence
Coding validity. In a stratified probability sample of 1,611 deliveries from 51
California hospitals in 1992-93, the weighted sensitivity and predictive value of coding
for third and fourth degree lacerations and vulvar/perineal hematomas (based on either
diagnosis or procedure codes) were 89% (311/340) and 90% (311/337), respectively.158
The authors did not report coding validity for third and fourth degree lacerations
separately. We were unable to find other evidence on validity from prior studies.

Uterine Rupture
Source. This indicator has been widely used for monitoring the impact of vaginal
birth after cesarean delivery, which is associated with an increased incidence of uterine
rupture.164, 165

Evidence
Coding validity. Weiss et al.162 reviewed 615 deliveries in Massachusetts
hospitals in 1990-97 reported to have had uterine rupture before or during labor (665.0x,
665.10, 665.11), and confirmed 51% (306/615). The maximum possible sensitivity was
64% (306/480), because some uterine ruptures were miscoded as cesarean wound
disruption (674.1x). We describe this estimate as the “maximum possible sensitivity”
because false negatives were captured only if they were miscoded as 674.1x.
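The “maximum possible sensitivity” above can be reproduced from the reported counts.
The sketch below is illustrative only: the function and argument names are assumptions,
and the count of known missed cases (174) is derived here as 480 minus 306 rather than
quoted from Weiss et al. It shows why the figure is an upper bound: any rupture missed
for a reason other than miscoding as cesarean wound disruption would enlarge the
denominator and lower the estimate.

```python
def sensitivity(confirmed, known_missed, unobserved_missed=0):
    """Share of true cases that carry the indicator code."""
    return confirmed / (confirmed + known_missed + unobserved_missed)


confirmed = 306            # confirmed uterine ruptures (from the text)
known_missed = 480 - 306   # derived: ruptures found under the other code

upper_bound = sensitivity(confirmed, known_missed)

# Hypothetical: 50 further ruptures that carried neither code.
with_hidden = sensitivity(confirmed, known_missed, unobserved_missed=50)

print(f"upper bound = {upper_bound:.0%}; with hidden cases = {with_hidden:.0%}")
```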
Construct validity. Although we found no data on how often quality-of-care
problems are associated with uterine rupture, Gregory et al. showed that women in
California who delivered at hospitals with high attempted VBAC (vaginal birth after
cesarean) rates in 1995 were more likely to have successful VBAC, but also more likely
to experience uterine rupture, than women who delivered at hospitals with lower VBAC
rates. This finding is consistent with the construct that high uterine rupture rates reflect
an overly aggressive approach to VBAC. Induction of labor with prostaglandins has been
associated with a major increase in the risk of uterine rupture (RR=15.6).164, 165

Section 3B. Indicator Selection


Indicator selection consisted of a multi-stage process, shown in Flow Diagram 1.
Promising indicators identified from the literature or other sources were assessed for face
validity by clinicians through a structured process. The first-round specifications of
indicators were usually modified to varying extents based on clinical and coding input.
Then, for each indicator, the revised specification was rated by panelists on a number of
dimensions, most importantly the likely usefulness of the indicator as a screen for
potentially preventable complications of care. The usefulness rating provided the primary
filter by which indicators were grouped into three categories, from more promising to
less useful: (a) Accepted, (b) Experimental, or (c) Rejected.
Table 11 summarizes the Accepted PSIs; the panel ratings show that these indicators
were rated as fairly useful either by practically all of the panelists (Acceptable) or by
most panelists, with minimal dissent from those rating them lower (Acceptable (-)).
Table 12 lists the Experimental PSIs, measures about which panelists were less sanguine
than those in the Accepted indicator set, or that were more problematic to specify
according to the intent of the panel discussion. Each indicator in the Experimental
indicator set has some positive characteristics, along with some relatively important
potential limitations. Table 13 lists the Rejected indicators, which received low ratings
from the panelists and did not merit further exploration. The footnotes to these tables
summarize the indicator-specific reasons for each categorization.

Table 11. Accepted Indicators (provider and area level)

| Indicator Name | Multi-specialty Panel Evaluation [a] | Surgical Panel Evaluation [a] | Definition Used |
|---|---|---|---|
| Complications of anesthesia | | 3 Acceptable (-) | Surgical |
| Death in low mortality DRGs | M2 Acceptable | | |
| Decubitus ulcer | M1 Acceptable | | |
| Failure to rescue | M2 Acceptable | | |
| Foreign body left in during procedure [b] | S2 Acceptable | 2 Acceptable (-) | Same |
| Iatrogenic pneumothorax [b] | P1 Acceptable | | |
| Infection due to medical care [b] | M1 Acceptable (-) | | |
| Postoperative hemorrhage or hematoma [d] | S1 Acceptable (-) | 3 Acceptable | Surgical |
| Postoperative hip fracture [c] | M1 Acceptable | | |
| Postoperative physiologic and metabolic derangements | S3 Acceptable (-) | 3 Unclear | Surgical |
| Postoperative respiratory failure | S2 Unclear | 2 Acceptable (-) | Surgical |
| Postoperative pulmonary embolism or deep venous thrombosis | S1 Acceptable (-) | 1 Acceptable | Same |
| Postoperative sepsis | M1 Acceptable (-) | | |
| Postoperative wound dehiscence [b] | S2 Acceptable (-) | 2 Acceptable (-) | Surgical |
| Technical difficulty with procedure [b] | P1 Acceptable | | |
| Transfusion reaction [b] | S3 Acceptable | 3 Acceptable | Same |
| Birth trauma-injury to neonate | O1 Acceptable | | |
| Obstetric trauma - cesarean section [e] | O1 Acceptable (-) | | |
| Obstetric trauma - vaginal with instrument [e] | O1 Acceptable (-) | | |
| Obstetric trauma - vaginal without instrument [e] | O1 Acceptable (-) | | |

[a] M, P, O, S refer to Medical, Procedure, Obstetric or Surgery Multi-specialty Panels and their identifying number (see
Appendix B for further detail). 1, 2, 3 refer to the Surgical Panel, if reviewed by Surgical Panel (see Appendix B).
“Acceptable” indicates that the indicator was rated as useful by almost all panelists. “Acceptable (-)” indicates that the
indicator was rated as useful by most panelists, although a few rated it as less useful (but not as poor). “Unclear”
indicates that panelists rated the usefulness of the indicator as moderate. Panel overall ratings are described in detail
in Clinician Panel Review Methods (Section 2D) under the Tabulation of Results subsection.
[b] Provider and area level indicators specified for this indicator.
[c] Panel requested other fractures in addition to hip fracture, but empirical analyses indicated concerns about ability to
operationalize well enough for accepted list.
[d] Codes for post-op hemorrhage or hematoma were expanded to include 5th digits in October 1996, and therefore this
indicator is invalid before that date.
[e] Obstetric trauma indicators were not rated separately, though panelists were informed that the indicator would be split
into three types of delivery.

Table 12. Experimental Indicators

| Indicator Name | Multi-specialty Panel Evaluation [a] | Surgical Panel Evaluation [a] | Definition Used |
|---|---|---|---|
| Aspiration pneumonia | S2 Unclear | 2 Unclear | Same |
| CABG after PTCA [b] | P1 Acceptable | | |
| Decubitus ulcer in high risk patients [c] | | | |
| In-hospital fractures possibly related to falls [d] | M1 Acceptable | | |
| Intraoperative nerve compression injuries [e] | S3 Acceptable | 3 Acceptable | Surgical |
| Malignant hyperthermia [f] | S3 Acceptable | 1 Acceptable (-) | Same |
| Postoperative acute myocardial infarction [g] | S1 Unclear (-) | 3 Acceptable (-) | Surgical |
| Postoperative iatrogenic complications – cardiac system [h] | P1 Not rated separately | | |
| Postoperative iatrogenic complications – nervous system [h, i] | P1 Not rated separately | | |
| Reopening of surgical site [j] | S2 Unclear | 3 Acceptable (-) | Surgical |
| Suture of laceration [k] | S2 Acceptable | 2 Unclear (-) | Surgical |
| Obstetric wound complications - cesarean section | O2 Acceptable | | |
| Obstetric wound complications - vaginal delivery | O2 Unclear | | |
| Other obstetric complications | O2 Unclear | | |
| Post-partum urinary tract infection | O2 Acceptable (-) | | |
| Third or fourth degree obstetric laceration (JCAHO) [l] | | | |
| Uterine rupture [m] | | | |

[a] M, P, O, S refer to Medical, Procedure, Obstetric or Surgery Multi-specialty Panels and their identifying number (see
Appendix B for further detail). 1, 2, 3 refer to the Surgical Panel, if reviewed by Surgical Panel (see Appendix B).
“Acceptable” indicates that the indicator was rated as useful by almost all panelists. “Acceptable (-)” indicates that the
indicator was rated as useful by most panelists, although a few rated it as less useful (but not as poor). “Unclear”
indicates that almost all panelists rated the usefulness of the indicator as moderate.
“Unclear (-)” indicates that most of the panelists rated the usefulness as moderate, although a few rated it as less useful.
Panel overall ratings are described in detail in Clinician Panel Review Methods (Section 2D) under the Tabulation of
Results subsection.
[b] Accepted by panel, but lack of review by physicians performing PTCA led to demoting indicator.
[c] Indicator suggested by panel, with concerns, and by AHRQ.
[d] This indicator was defined as closely to the panel suggestion as possible, but empirical analysis showed higher fracture
rates in non-elderly men. Further analysis led to exclusions and a more limited list of fractures to reduce the likelihood
of capturing fractures unrelated to falls. However, the problem still persists to some degree. We therefore demoted the
indicator to the experimental list and retained a CSP-based version of the hip fracture indicator on the accepted list.
[e] This indicator is extremely rare, leading to questions regarding coding and operationalization. This indicator requires
the code 997.09, which was not added until October 1995. This indicator is invalid before that date.
[f] This code (995.86) was added in October 1998 and thus this indicator is invalid before this date. Although accepted by
panels, with one dissent, we cannot evaluate because data sources date only to 1997.
[g] This indicator was rejected by the multi-specialty panel (median = 4), but accepted by the surgical panel.
[h] These indicators, although accepted by panel, were demoted due to concern that panel discussions were not
comprehensive enough to justify acceptance for each of the split indicators.
[i] Codes for iatrogenic nervous system complications were expanded to include 5th digits in October 1995, and therefore
this indicator is invalid before that date.
[j] Accepted by surgical panel only, but concerns about operationalization remain and cannot be easily resolved.
[k] This indicator was rejected by surgical panel (median = 5), accepted by multi-specialty.
[l] This indicator is a core JCAHO indicator, not reviewed by panel, although 4th degree lacerations are part of the
Obstetric Trauma indicator on the Accepted Listing.
[m] This indicator was split off from other Obstetric complications, due to questions on operationalization of panel
requests and strong arguments for splitting.

Table 13. Rejected Indicators

| Indicator Name | Multi-specialty Panel Evaluation [a] | Surgical Panel Evaluation [a] | Definition Used |
|---|---|---|---|
| Dosage complications | M2 Unclear (-) | | |
| Iatrogenic hypotension | P1 Unclear (-) | | |
| Intestinal infection due to C. difficile | M1 Unclear (-) | | |
| PO Iatrogenic complications – digestive complications [b] | P1 Not rated separately | | |
| PO Iatrogenic complications – respiratory complications [b] | P1 Not rated separately | | |
| PO Iatrogenic complications – urinary complications [b] | P1 Not rated separately | | |
| PO Iatrogenic complications – vascular complications [c] | P1 Not rated separately | | |
| Postoperative pneumonia | S1 Unclear (-) | 3 Unclear | Same |
| Unexpected LOS/Conditional LOS | M2 Unclear | | Unable to specify panel suggestions |
| Obstetric thrombosis or embolism | O2 Unclear (-) | | |
| Puerperal infection | O2 Unclear (-) | | |

[a] M, P, O, S refer to Medical, Procedure, Obstetric or Surgery Multi-specialty Panels and their identifying number (see
Appendix B for further detail). “Unclear” indicates that almost all panelists rated the usefulness of the indicator as
moderate. “Unclear (-)” indicates that most of the panelists rated the usefulness as moderate, although a few rated it as
less useful. Panel overall ratings are described in detail in Clinician Panel Review Methods (Section 2D) under the
Tabulation of Results subsection.
[b] Panel accepted the concept of capturing a set of iatrogenic complications, but empirical analyses suggest that most
complications in this category are clinically insignificant.
[c] Panel accepted, but covers same complications as vascular complications indicator, which is a more complete measure.

The degree to which panelists perceived indicators as preventable (e.g., “Foreign
body left in during procedure,” “Decubitus ulcer,” “Obstetric trauma-cesarean section”)
tended to relate to the usefulness rating. In other words, the higher the rating for
usefulness, the higher the rating for preventability. All indicators in the Accepted
indicator set received a median rating of at least 6 by one or more panels (on a scale from
1 to 9 where higher scores represent the opinion that a complication is preventable).
However, some rejected indicators that panelists thought would surely be preventable
(e.g., dosage complications received a median score of 8) were rated poorly overall
because of problems with the indicator (e.g., that it would be inconsistently documented).
The adapted UCLA/RAND method may be applied to the preventability ratings to
identify complications felt by panelists to be more or less preventable, although this
rating does not take into account other potential pitfalls of indicators, such as bias or
charting practices. Table 14 shows the results of this categorization for the preventability
ratings for the Accepted indicators.
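The criterion described above (a median preventability rating of at least 6, on the 1 to
9 scale, from one or more panels) can be sketched as follows. The ratings are
hypothetical values invented for illustration, and the function name is an assumption.
The full adapted UCLA/RAND categorization, which the table footnotes indicate also
reflects the degree of agreement among panelists, is described in Section 2D and is not
reproduced here.

```python
from statistics import median


def meets_median_threshold(ratings_by_panel, threshold=6):
    """True if at least one panel's median rating reaches the threshold."""
    return any(median(r) >= threshold for r in ratings_by_panel.values())


# Hypothetical 1-9 preventability ratings from two panels for one indicator.
ratings = {
    "multi-specialty": [7, 8, 6, 7, 5, 8, 7],
    "surgical": [5, 4, 6, 5, 5, 6, 4],
}

print({panel: median(r) for panel, r in ratings.items()})
print(meets_median_threshold(ratings))
```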
For most indicators, panelists rated the medical error scale lower than the
preventability scale. However, several indicators had relatively high scores (median, 7 –
8) equivalent for both of these scales – “Foreign body left in during procedure,”
“Decubitus ulcer,” “Iatrogenic pneumothorax,” “Dosage complications,” “In-hospital
fracture,” and “Transfusion reaction.” Again, the UCLA/RAND method may be applied
to the medical error ratings. Table 15 demonstrates the wider dispersion in Accepted
indicators when medical error ratings are used.

Table 14. Groupings Based on Preventability
Acceptable Acceptable (-) Unclear Unclear (-)
Decubitus ulcer Comp. of anesthesia Death in low Failure to rescue
mortality DRG
Foreign body Infection due to PO hemorrhage/ PO physio. or
med. care hematoma metab. derangement
Iatrogenic PO PE or DVTb PO pulmonary
pneumothoraxa compromise
In-hosp. fracturea Transfusion PO wound
reaction dehiscence
Tech. diff. with Birth trauma Postoperative
procedure sepsis
OB trauma (all Post-partum UTI OB wound comp. –
delivery types) c-sect
a
Panel ratings based on definitions different than final definitions. For “Iatrogenic pneumothorax,” the rated
denominator was restricted to patients receiving thoracentesis or central lines; the final definition expands the
denominator to all patients (with same exclusions). For “In-hospital fracture” panelists rated the broader Experimental
indicator, which was replaced in the Accepted set by “Postoperative hip fracture” due to operationalization concerns.
b
Vascular complications rated as Unclear (-) by surgical panel.

Table 15. Grouping Based on Medical Error


Acceptable Acceptable (-) Unclear Unclear (-)
Decubitus ulcerg Comp. of Death in low mort. Failure to rescue
anesthesiag DRG
c, g
Foreign body In-hosp. fracturea, g Infection due to PO hemorrhage/
med. care hematoma
Iatrogenic Transfusion PO PE or DVTb PO pulmonary
pneumothoraxa, g reactiond, g compromise
PO wound Birth trauma
dehiscencee
Postoperative OB trauma
sepsis
Tech. diff. with
procedure
PO physio. or meta.
Derangementf
a
Panel ratings based on definitions different than final definitions. (See Table 14 footnote)
b
Vascular complications rated as Unacceptable by surgical panel.
c
Foreign body rated as Acceptable (-) by surgical panel.
d
Transfusion reaction rated as Unclear (-) by surgical panel.
e
PO wound dehiscence rated as Unclear (-) by surgical panel.
f
PO physiologic and metabolic derangement rated as Unclear (-) by surgical panel.
g
Rated highly on both preventability and medical error questions.

Although the Accepted indicators did have relatively high ratings regarding the
overall usefulness of the indicator, the panel review only addressed the face validity of
the indicators. Additional research will be required to establish the validity of all
indicators. In general, Accepted indicators have more compelling validity based on the
current findings than do Experimental indicators. Each of the Experimental indicators is
subject to one or more major concerns that tend to group into three categories. First,
panelists rated some of the Experimental indicators lower than the Accepted indicators
because they had concerns regarding the construct validity of the indicator (the ability of
the indicator to measure potentially preventable complications). Additional research
utilizing other sources of data, such as medical charts, will help to determine the
construct validity of these indicators. Although all indicators have little or no current
evidence regarding their construct validity, panelists felt particularly concerned about
those indicators designated as Experimental. Second, a few indicators either did not have
adequate panel review, or were not evaluated by panels (since they were added after the
panel review). These indicators should be reviewed by clinical panels with appropriate
composition (e.g., inclusion of cardiac surgeons and interventional cardiologists for
“CABG after PTCA”). Finally, a few indicators were of interest to the panels, but could
not be operationalized adequately within the project timeframe and resources, and will
therefore require investigation into whether available codes capture the complication of
interest and risk pool adequately. Table 16 identifies the suggested research for each of
the Experimental indicators.

Table 16. Suggested Initial Further Research for Experimental Indicators

Indicator  Construct Validity  Clinician Panel Review  Operationalization Review
Aspiration pneumonia X
CABG after PTCA X
Decubitus ulcer in high risk patients X X
In-hospital Fractures possibly related to falls X
Intraoperative nerve compression injuries X X
Malignant hyperthermia X X
Postoperative acute myocardial infarction X Xa
Postoperative iatrogenic complications – cardiac system X
Postoperative iatrogenic complications – nervous system X
Reopening of surgical site X
Suture of laceration X Xa
Obstetric wound complications – cesarean section X
Obstetric wound complications - vaginal delivery X
Other obstetric complications X
Post-partum urinary tract infection X
Third or fourth degree obstetric laceration (JCAHO) X
Uterine rupture X X
a
Indicators were accepted by one panel, but rejected by another. Additional review may aid in interpreting these
differences of opinion.

Most of the indicators were specified to include pediatric patients. To assess the
applicability of the indicators to the pediatric population, rates were also calculated for
the following age strata: less than one year, 1 – 14 years, 15 – 24 years and 25 years and
older (see Appendix G, Supplemental Tables 3 and 4). Many indicators appear to have
similar rates among pediatric patients and adults. However, the mechanisms of
complication development may differ in the pediatric population. For instance, DVTs in a
pediatric population may be more reflective of catheter care and use than perioperative
prevention strategies. Where mechanisms or risk factors may differ from the adult
population, they are noted in Section 3D.
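The age stratification described above can be sketched as a simple binning step
followed by a rate calculation within each stratum. The four strata are those named in
the text; the function names, the handling of fractional ages, and the discharge records
are hypothetical and serve only to illustrate the calculation.

```python
from collections import defaultdict


def age_stratum(age_years):
    """Map age in years to the four strata named in the text."""
    if age_years < 1:
        return "<1"
    if age_years <= 14:
        return "1-14"
    if age_years <= 24:
        return "15-24"
    return "25+"


def rates_by_stratum(discharges):
    """discharges: iterable of (age_years, had_complication) pairs."""
    events, totals = defaultdict(int), defaultdict(int)
    for age, flagged in discharges:
        stratum = age_stratum(age)
        totals[stratum] += 1
        events[stratum] += int(flagged)
    return {s: events[s] / totals[s] for s in totals}


# Hypothetical discharge records, for illustration only.
sample = [(0.5, False), (3, True), (10, False), (17, False), (40, True), (70, False)]
print(rates_by_stratum(sample))
```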
The remaining portions of the report focus on reporting more details about these
indicators. Section 3C. Overall Clinician Review Results provides general themes related
to these indicators and highlighted by the panel discussions. Section 3D. Detailed Panel
Results by Indicator, provides details on the definition choices made for each indicator,
and the concerns raised specific to each indicator. Section 3E. Comparative Empirical
Results, relates the findings of the empirical analyses for indicators in the Accepted and
Experimental indicator sets. Appendix E provides the detailed specification for the final
definitions used for each indicator, and Section 3D. Detailed Panel Results by Indicator
also includes the basic definition and rationale for each indicator. As previously noted, all
of the results for and brief descriptions of the Rejected indicators are presented in
Appendix F.

Flow Diagram 1. Process for the Selection of Indicators

Potential indicators reported in literature, including Complications Screening Program
Miller et al. Patient Safety Indicators
Specific codes obtained from a review of ICD-9-CM
Selected codes included based on clinical logic and knowledge of coding practices
  -> Initial list of PSI codes (200+ codes)
Grouping of codes and assignment of inclusion/exclusion criteria based on CSP, Miller et al. PSIs and clinical knowledge.
  -> 40+ preliminary indicators
Selection of indicators for review based on coding knowledge and validity evidence reported in the literature.
  -> 34 indicators reviewed by multispecialty panels; 2 indicators created by multispecialty panels
Changes to indicators based on panel review and professional coding input.
  -> 15 indicators reviewed by surgical panels
Indicators assigned to sets based on panel ratings. Additional indicators were added post-review to experimental set
based on panel suggestions. Some indicators split into several indicators based on panel suggestion.
  -> 20 indicators assigned to Accepted set; 17 indicators assigned to Experimental set; 11 indicators rejected
Final revisions to indicators based on final coding input, and exploratory analyses.
  -> Final PSI set

Section 3C. Overall Clinician Panel Review Results


During the course of the clinician review, panelists discussed and offered both
specific suggestions regarding a specific indicator, as well as general themes about
quality indicator use. These "themes" provided important insights into how quality
improvement and indicators are viewed by clinicians, how such indicators are likely to be
used and interpreted, and the validity of such indicators from a clinical perspective.
While our sample of clinicians was diverse, it is not a nationally representative sample, as
these individuals were nominated and volunteered to participate. Nevertheless, the
themes that consistently arose in the process are important to address in the development
and use of quality indicators. While many of these themes reflect areas covered in
previous studies, the novel, though not surprising, finding is that clinician panelists
considered these areas vital to discuss as they provided input about the development of
patient safety and complications indicators.

Application of Quality Indicators

Panelists repeatedly discussed that the validity of quality indicators is dependent
on the intended use (e.g., public reporting of provider rates versus internal quality
improvement). For example, an indicator designed to be more specific increases the
surety that the indicator will most certainly flag only cases where a medical error or
process failure has occurred. The tradeoff, as with any diagnostic test, is that the indicator
will then be less sensitive, missing true instances of error. For internal quality
improvement, it may be more useful to identify changes in rates of complications that
may signal a potential process flaw. While this approach is less precise in terms of
yielding only cases of high concern, it would likely identify a broader range of potential
quality concerns. For public reporting of provider rates, however, a choice to emphasize
sensitivity over specificity in designing indicators may lead to misinterpretation about a
particular providers’ performance, as some that may use such data may be unfamiliar
with the extensive list of caveats that must be considered when interpreting results for
each quality indicator. The primary goal of the AHRQ quality indicators is to implement
screening tools, meaning that further investigation is expected to certify that an abnormal
rate is indeed due to a quality problem. Nonetheless, panelists remained concerned that if
these indicators were used to report rates publicly, such limitations would be obscured.

Purpose of Quality Indicators

Indicators may be designed for a variety of uses. There is a distinction between
the use of QIs as "case finding tools" and as "quality improvement" tools. Case finding
tools are primarily used to identify a specific case or patient in which a quality problem
may have led to the outcome in question. In some cases, this may be used for case
investigation, mortality and morbidity discussions, or negligence attributions. Another
way to use the indicators is as quality improvement tools, in which the rate of a
complication provides the most useful information. Unlike case finding tools, this
approach, which focuses on complication rates, acknowledges that not every case will reflect negligence
or medical error. However, hospitals with extremely high rates compared to similar
institutions may have cause for concern. Interventions may be able to reduce the rate of a
complication, but not always prevent a complication from occurring in a particular
patient. Panelists were told that this indicator set is designed as a quality improvement
tool. Like indicators used for public reporting of provider rates, indicators used for case
finding must be much more specific than quality improvement tools, since imprecision
from a more sensitive measure may cause problems. Panelists expressed concern that
some of the indicators under development may be construed as case finding tools, despite
being designed and validated as quality improvement tools. In this event, physicians or
other clinicians may be unfairly accused of negligence in a particular case, when, in fact,
the clinician could not have prevented the outcome for that particular patient.

Importance of Risk Adjustment or Stratification

Panelists noted that for many indicators, case mix, screening and charting
practices, and other factors vary systematically between providers. Panelists discussed
alternatives to address such bias, as outlined below.
For many indicators, the exclusion of certain high risk populations, such as
trauma patients, may increase the homogeneity of the population at risk. Such restrictions
would decrease bias that could result from inconsistent distribution among hospitals of
high risk populations. In some cases, panelists favored such exclusions when the
population was at such high risk that most of the complications would not be
preventable. Panelists noted that this approach has the undesired effect of obscuring
outstanding quality care, where some providers may be better at preventing complications
in high risk patients. This difference would be very important to illuminate, leading some
panelists to suggest stratification rather than exclusions.
Stratification has the advantage of allowing providers to view rates of
complications in patients with varying risks of developing that complication. Such
stratification would remove bias caused by high risk patients. For instance, deep vein
thromboses (DVT) and pulmonary embolism (PE) are more common after some
orthopedic surgeries. Providers specializing in orthopedic surgery may appear to have an
abnormally high rate of DVT/PE, although the rate is due primarily to case mix. Stratified
rates would allow the provider to view the orthopedic surgical complications rates
separately from other lower risk procedures, allowing exploration of whether the high
rate was indeed due to the provider’s orthopedic surgery case-mix. Panelists suggested
stratifying some indicators by primary procedure type, trauma, elective and urgent
admission, and specified comorbidities. In addition to singling out potentially high risk
strata, stratification may aid in illuminating the source of a particularly high rate, beyond
case mix differences. For example, panelists noted that DVT and PE are identified
differently by different providers. Some providers specifically screen for DVT after
surgery, while others do not. Thus, providers that screen will appear to have a higher rate,
simply because they detect more DVTs. Stratification by DVT rate versus PE rate would
allow providers to identify whether a high rate is driven by a higher rate of DVTs, which
may be due to screening, or whether the more serious and less ambiguous PE rate is also
high. The review of each specific indicator notes suggestions that panelists made
regarding stratification.
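The stratification idea described above can be illustrated with a minimal sketch. The record
layout, field names and counts below are hypothetical and chosen only for illustration; they
are not taken from the indicator specifications. The sketch reports DVT and PE rates
separately, by procedure type, so that a high combined rate can be traced to its stratum.

```python
from collections import defaultdict

def stratified_rates(discharges, stratum_key, event_key):
    """Return {stratum: events per 100 discharges} for one event flag."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [events, discharges]
    for d in discharges:
        tally = counts[d[stratum_key]]
        tally[0] += 1 if d[event_key] else 0
        tally[1] += 1
    return {s: 100.0 * e / n for s, (e, n) in counts.items()}

# Hypothetical discharge records (illustrative counts only).
discharges = (
    [{"procedure": "orthopedic", "dvt": True, "pe": False}] * 6
    + [{"procedure": "orthopedic", "dvt": False, "pe": True}] * 1
    + [{"procedure": "orthopedic", "dvt": False, "pe": False}] * 93
    + [{"procedure": "other", "dvt": True, "pe": False}] * 2
    + [{"procedure": "other", "dvt": False, "pe": False}] * 198
)

# DVT and PE are tabulated separately, each stratified by procedure type.
dvt_by_procedure = stratified_rates(discharges, "procedure", "dvt")
pe_by_procedure = stratified_rates(discharges, "procedure", "pe")
print(dvt_by_procedure)
print(pe_by_procedure)
```

In this made-up example the orthopedic stratum carries most of the DVT events, while the PE
rate stays low in both strata, which is the kind of pattern the panelists suggested
stratification could expose.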
In some cases, stratification may not be the best or only approach. Panelists noted
that case mix adjustment is desirable for many indicators, especially when a variety of
factors, such as age, sex, principal procedure or diagnosis, and comorbidities, may
influence the likelihood of complications occurring, and when many of these factors vary
systematically by providers. Under these circumstances, case-mix adjustment may be
easier to interpret than stratification or other approaches. However, case-mix adjustment
has many caveats, especially when limited to administrative data. Panelists noted that for
many of these indicators, risk adjustment using administrative data is a blunt tool.
Additional clinical data would provide much better risk adjustment information. Such
data are likely to differ by indicator, and often would require chart review. However,
even some risk adjustment may indicate whether or not there is a possibility that a high
rate could be due to differences in case mix. While many panelists expressed concern that
without risk adjustment indicator results would be misconstrued as due to poor quality of
care, some panelists also expressed that blaming high rates on case mix differences may
not be appropriate. Their point of view was that adequate risk adjustment could reveal
under what circumstances high complication rates appear attributable to case mix
differences.
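As a rough illustration of how even simple case-mix adjustment can show whether a high rate
might be due to case mix, the sketch below uses indirect standardization (an observed to
expected ratio). This technique is an assumption made here for illustration; the text above
does not specify an adjustment method, and the reference rates and hospital counts are
hypothetical.

```python
def expected_events(case_mix, reference_rates):
    """Events expected if the provider had the reference rate in every stratum."""
    return sum(n * reference_rates[stratum] for stratum, n in case_mix.items())

def observed_to_expected(observed, case_mix, reference_rates):
    """Ratio above 1 means more events than the provider's case mix would predict."""
    return observed / expected_events(case_mix, reference_rates)

# Hypothetical reference complication rates per discharge, by stratum.
reference_rates = {"orthopedic": 0.05, "other": 0.01}

# Two hypothetical hospitals with 500 discharges each but different case mix.
hospital_a = {"orthopedic": 400, "other": 100}   # 21 observed events, crude rate 4.2%
hospital_b = {"orthopedic": 100, "other": 400}   # 18 observed events, crude rate 3.6%

ratio_a = observed_to_expected(21, hospital_a, reference_rates)
ratio_b = observed_to_expected(18, hospital_b, reference_rates)
print(round(ratio_a, 2), round(ratio_b, 2))
```

Hospital A has the higher crude rate, but its events match what its case mix predicts, while
hospital B has about twice the expected number. With administrative data alone the strata
would be coarse, so a ratio like this is a screening signal rather than a definitive measure.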

Understanding of Data

Throughout the structured review process, it was clear that some panelists had
sophisticated knowledge of administrative data and ICD-9-CM coding, while many
panelists were unclear about the limitations of administrative data. To remedy this
problem, we provided panelists with information on coding and administrative data.
Throughout the conference call we clarified any misconceptions regarding the available
data. Through these interventions, panelists’ understanding appeared sufficient regarding
the limited nature of administrative data. However, we did note that before this education,
panelists often assumed that administrative data were clinically rich, containing
information on physiological data or very specific diagnoses or procedures. Most
panelists were unaware of how ICD-9-CM codes are assigned, and in particular that such codes
are based on physician notes and are therefore subject to differences in physicians’
diagnosis and charting practices. Panelists were also often unaware that the precise
timing of a diagnosis or procedure was impossible to ascertain with most administrative
data. The variety of baseline knowledge regarding administrative data from which
indicators are constructed suggests potential future problems in interpretation. Physicians
and other clinicians, as well as the public and other end users may assume that the data
from which indicators are created are detailed, and therefore that indicators or risk
adjustment procedures are more clinically valid than is true. A lack of understanding of
administrative data may promote inappropriate use of indicators. Without understanding
data elements captured in an indicator specification, users of indicators may have
difficulties determining what additional data collection efforts might help explain varying
rates observed across providers. It should be noted that while some panelists appeared to
believe that administrative data were more detailed than they are, others were highly
skeptical about their use (see below).

Charting, Coding and Reporting

Panelists expressed skepticism about the quality of coding for some of the
indicators, stemming from a variety of problems ranging from incentives to chart events
to possible inexperience of coders assigning ICD-9-CM codes. Panelists noted that there
are many reasons why a physician may not chart a diagnosis or procedure. First, some of
the reviewed complications, such as "failure of sterile procedures" or "suture of
laceration" when the laceration is minor, may not be charted by some physicians because
they may not seem to be clinically significant. In these cases the "rate" of a complication
is related mostly to the detail of the physician notes, and thus may be biased. In some
cases, there may be disincentive to specifically chart a complication of questionable
clinical importance. The culture of a hospital may discourage reporting of errors, if a
physician feels that they will be punished for reporting the error. Thus, hospitals with
good reporting programs for medical error may appear to have poorer quality of care than
hospitals that do not encourage error reporting.
In some cases, the clinical significance of a complication may be very clear, and
will usually be charted. However, panelists noted that there still may be variation in
charting these complications. Since ICD-9-CM codes are assigned based on physicians’
written notes, the exact term a physician uses to describe a condition affects the code
assigned. For instance, pneumonia and atelectasis may be used by different physicians to
describe the same clinical findings, resulting in different ICD-9-CM codes. In addition,
physicians may have differing clinical thresholds and diagnostic practices when
identifying a condition. In the pneumonia example, some physicians may diagnose
pneumonia using chest x-ray findings, while others may require positive results from a
bronchoscopy before documenting the diagnosis. Again, these variations result in varying
"rates" without true variation in the rate of the actual complication. Even when the
complication is clearly defined, some indicators require that the complication be labeled
as the direct result of a procedure or medical care, or "iatrogenic". Panelists reported that
such a link is often not included in the chart. If another code is available, such as is the
case for hypotension, for instance, that code is likely to be assigned. Coders, by direction,
and because they are not physicians, do not make inferences during coding to correct
some of these variations. In fact, panelists repeatedly expressed skepticism about the
accuracy of coding from physician notes, although specific observations of inaccuracy
were not reported.

Summary

Throughout our clinical panel review process, we identified recurring themes
relating to the usefulness of indicators in a clinical setting. Panelists noted that many
problems associated with indicators might not be accurately noted when interpreting
indicators in a clinical setting, and generally expressed concern regarding the use of these
indicators as definitive quality measures or for public reporting. However, panelists did
express interest and indicated a need for such quality indicators, especially for non-
punitive internal quality monitoring and improvement.

Section 3D. Detailed Panel Results by Indicator
This section reports the results of the clinician panel’s ratings and discussion of
each indicator. Medical, procedure and obstetric related indicators were reviewed by
multi-specialty panels. A subset of indicators was then reviewed by surgical panels. The
table (Table 17) below summarizes the genealogy or history of panel reviews for each
indicator; letters in parentheses after an indicator show the final disposition of the
indicator based on panel and other findings. Rejected means that the indicator was not
retained for further evaluations, usually due to low ratings by the panelists. These
rejected indicators are in addition to ones that were not even evaluated by clinical panels.
Experimental indicates that the indicator was of some potential use as a patient safety
indicator, but had generated some reasonable concerns that would need to be explored
through chart reviews or other methods that were outside of the scope of this project.
These indicators were evaluated as an Experimental indicator set in the empirical
analysis. The final disposition, Accepted, means that an indicator as specified after panel
input was thought to be useful as a screen for potentially preventable complications of
care. These Accepted indicators were evaluated empirically in detail. In this section,
Accepted indicators are presented first, in alphabetical order; non-obstetric indicators are
followed by obstetric indicators. Next, Experimental indicators are presented, also in
alphabetical order; again, non-obstetric indicators are followed by obstetric indicators.
For explanation of the isolation of obstetric indicators see the introduction to this chapter.
The results for each Rejected indicator are found in Appendix F.
Each indicator review follows the same pattern. First, a brief description of the
indicator rationale is given followed by the final definition of the indicator. The definition
shown reflects the suggested changes made by the panel. The original definitions
presented to the panel may be found in Appendix I. The final definition is followed by
the final post-conference call ratings for each indicator. These ratings are usually based
on the definition provided. In cases where changes were made after the panel’s final
rating, an explanation is included in the narrative. Finally, two sections describe the input
of the panel. The first section, “Changes to the indicator” documents suggested and
implemented changes to the definition and the rationale for each. Definitional changes
included changes to both the complication of interest and the population at risk. The
second section, “Concerns not addressable by changes” documents any concerns raised
during the conference call and subsequent ratings about the indicator.

Table 17. Indicators Reviewed by Panel Type

Panel review columns: Multi-specialty Panel(b) (Pre Conf. Call, Post Conf. Call) and
Surgical Panel(b) (Pre Conf. Call, Post Conf. Call).

Indicator(a)                                                  Panel reviews(b) Final Designation(c)
Aspiration pneumonia                                          XXX XXX XXX XXX  Experimental
Birth trauma - injury to neonate                              XXX XXX          Accepted
CABG following PTCA                                           XXX XXX          Experimental
Complications of anesthesia(d)                                XXX XXX XXX XXX  Accepted
Death in low mortality DRGs                                   XXX XXX          Accepted
Decubitus ulcer                                               XXX XXX          Accepted
Decubitus ulcer in high-risk patient(e)                                        Experimental
Dosage complications                                          XXX XXX          Rejected
Failure to rescue(f)                                          XXX XXX          Accepted
Foreign body left in during procedure                         XXX XXX XXX XXX  Accepted
Iatrogenic hypotension                                        XXX XXX          Rejected
Iatrogenic pneumothorax                                       XXX XXX          Accepted
Infection due to medical care                                 XXX XXX          Accepted
In-hospital fractures possibly related to falls(g)            XXX              Experimental
Intestinal infection due to Clostridium difficile             XXX XXX          Rejected
Intraoperative nerve compression injuries(i)                  XXX XXX XXX      Experimental
Malignant hyperthermia(j)                                     XXX XXX XXX      Experimental
Obstetric thrombosis or embolism                              XXX XXX          Rejected
Reviewed in two panel columns as "Obstetric trauma"(k):
  Obstetric trauma-cesarean section                                            Accepted
  Obstetric trauma-vaginal with instrument                                     Accepted
  Obstetric trauma-vaginal without instrument                                  Accepted
Reviewed in one panel column as "Obstetric Wound Complications"(l):
  Obstetric wound complications-cesarean section delivery     XXX              Experimental
  Obstetric wound complications-vaginal delivery              XXX              Experimental
Other obstetric complications                                 XXX XXX          Experimental
Postoperative acute myocardial infarction                     XXX XXX XXX XXX  Experimental
Postoperative hemorrhage or hematoma                          XXX XXX XXX XXX  Accepted
Reviewed in two panel columns as "Postoperative iatrogenic complications"(m):
  Postoperative iatrogenic complications-cardiac system                        Experimental
  Postoperative iatrogenic complications-digestive                             Rejected
  Postoperative iatrogenic complications-nervous                               Experimental
  Postoperative iatrogenic complications-respiratory                           Rejected
  Postoperative iatrogenic complications-urinary                               Rejected
  Postoperative iatrogenic complications-vascular                              Rejected
Postoperative hip fracture(h)                                 XXX              Accepted
Postoperative physiologic and metabolic derangements          XXX XXX XXX XXX  Accepted
Postoperative pneumonia                                       XXX XXX XXX XXX  Rejected
Postoperative respiratory failure                             XXX XXX XXX XXX  Accepted
Postoperative pulmonary embolism or deep venous thrombosis    XXX XXX XXX XXX  Accepted
Postoperative sepsis                                          XXX XXX          Accepted
Postoperative wound dehiscence                                XXX XXX XXX XXX  Accepted
Post-partum UTI                                               XXX              Experimental
Puerperal infection                                           XXX XXX          Rejected
Reopening of surgical site                                    XXX XXX XXX XXX  Experimental
Suture of laceration                                          XXX XXX XXX XXX  Experimental
Technical difficulty with procedure                           XXX XXX          Accepted
Transfusion reaction                                          XXX XXX XXX XXX  Accepted
Unexpected LOS/Conditional LOS(n)                             XXX XXX          Rejected
Uterine Rupture(o)                                                             Experimental

(a) Obstetric and non-obstetric indicators are included in this table for ease of finding indicators on table.
(b) XXX denotes indicator was reviewed.
(c) Accepted and experimental indicators were empirically evaluated; rejected indicators were not.
(d) Multi-specialty panel suggested that this indicator be dropped and suggested two indicators (minor peri-operative physical injuries
and malignant hyperthermia) in lieu of indicator. Surgical panel reviewed and revised original indicator.
(e) Indicator was created after clinical panel reviews based on panel suggestion, underwent empirical evaluation only.
(f) Clinicians on multi-specialty panel evaluated 2 failure to rescue indicators with different definitions. Both definitions were
combined into the single "Failure to rescue" indicator following the conference call.
(g) Original indicator was titled "Postoperative hip fracture and fall" prior to conference call; the new indicator reflects suggested
change of panel.
(h) Indicator was accepted in lieu of the suggested indicator due to difficulty operationalizing the suggested indicator “in-hospital
fractures, possibly due to falls”.
(i) Original indicator was titled "Minor-perioperative physical injury." Indicator name changed to "Intraoperative nerve compression
injury" when corneal abrasion and lip laceration were eliminated from the definition.
(j) Indicator was created based on panel suggestion following discussion of “Complications of Anesthesia” indicator.
(k) Indicator was stratified according to delivery type following final rating due to panelist suggestions.
(l) Indicator was stratified according to delivery type following initial rating due to panelist suggestions.
(m) Indicator was split into 5 indicators, reflecting the individual complication codes included in the indicator. For the final rating,
panelists were informed of the intention to split the indicator, but panelists provided only one rating.
(n) Multi-specialty panel reviewed 2 definitions, selecting “Unexpected LOS” for further consideration.
(o) Indicator was created after clinical panels reviewed the “Other obstetric complications” indicator.

The review of each indicator includes the indicator name, description with rationale,
definition, panel ratings and a summary of panel comments. More detailed specifications of
indicators are documented in Appendix E. The six questions about aspects of the indicator (e.g.,
how preventable the complication is) were rated by panelists on a scale from 1 to 9, with the
higher numbers relating to better patient safety measures, with one exception. In the case of the
question related to how subject an indicator might be to bias (e.g., effects of case mix), a lower
rating corresponds to a better patient safety indicator. Each rating table shows the panel median
score, as well as the level of agreement, where “agreement” corresponds to little dispersion of
opinion, “indeterminate” means that the opinion ranged but did not reach the point of clear
“disagreement”, the final category where there were panelists with diametrically different
opinions. Section 2D. Clinician Panel Review Methods provides details on agreement
categorization. The indicators are organized according to final designation as accepted or
experimental, with non-obstetric indicators preceding obstetric indicators. Indicators that were
reviewed, but ultimately rejected can be found in Appendix F.

Accepted Indicators

Complications of Anesthesia

This indicator is intended to flag cases of specific complications due to anesthesia that
can be clearly identified using administrative data. Specifically, the final definition captures
cases flagged by External Cause-of-Injury Codes (E-Codes) and complications codes for adverse
effects from the administration of therapeutic drugs, and the overdose of anesthetic agents used
primarily in therapeutic settings.
Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM diagnosis codes for [anesthesia complications] in
any secondary diagnosis field per 100 discharges.
Denominator All [surgical] discharges.

Exclude patients with codes for poisoning due to anesthetics [E855.1, 968.1-4,
968.7] AND any diagnosis code for [active drug dependence], [active
nondependent abuse of drugs], or [self-inflicted injury].
Post-Conference Call Panel Ratings(a)
Question                        Median (MS)  Agreement status (MS)  Median (S)  Agreement status (S)
Overall rating                  Not Rated                           7           Indeterminate
Not present on admission        Not Rated                           5.3         Indeterminate
Preventability                  Not Rated                           7.5         Indeterminate
Due to medical error            Not Rated                           7.3         Indeterminate
Charting by physicians          Not Rated                           5.3         Indeterminate
Bias (lower rating favorable)   Not Rated                           6.8         Disagreement
(a) Multi-specialty Panel (MS) – Surgical Complications 3; Surgical Panel (S) – Surgical Complications 3

Multi-specialty Panel Results

This panel agreed that this indicator should be dropped as originally defined. They
suggested the creation of two alternate indicators related to complications of anesthesia:
“Malignant hyperthermia” and “Minor perioperative injuries”. Thus, this indicator was not rated
after discussion by this panel.
Concerns not addressable by changes. This panel felt strongly that shock due to
anesthesia was too nebulous a diagnosis. This diagnosis varies widely depending on the
charting and judgment, and this diagnosis may represent many varied physiological states. In
addition, there was concern that shock was expected in certain situations, such as major
abscesses. Finally, in many instances shock may not be clearly attributable to anesthesia, as it
may have arisen from a variety of causes. The panel suggested this code be omitted.
The panel also expressed concern regarding the code for incorrect placement of
endotracheal tube. Panelists were unsure what events would be assigned this code. They noted
that in surgery, misplacement would be corrected immediately, and likely would not be charted.
If the tube could not be placed correctly, the patient would be awakened. They noted that these
few cases do not represent medical error. They acknowledged that true misplacement that resulted
in harm to the patient does represent medical error, but they expressed skepticism over whether
or not this code would be limited to those situations.
Panelists suggested several additional situations that could be monitored. A few
situations, such as anoxic brain damage, did not have specific ICD-9-CM codes. Air embolism
was included in another indicator. Suggestions for monitoring malignant hyperthermia and lip
lacerations were included in new indicators.

Surgical Panel Results

Changes to the indicator. The surgical panel also expressed concern about the code for
shock due to anesthesia. In addition to the concerns expressed by the multi-specialty panel, this
panel specifically noted that shock may be labeled as hypotension instead of shock. They also
noted that shock due to anesthesia is not always preventable. For these reasons, they suggested
removing the code.
The panel suggested instead adding a variety of additional codes that may be used for
reactions to and overdose of anesthetics. These codes include so-called “E-codes” for adverse
effects of the administration of therapeutic drugs. Panelists did express concern that E-codes are
not consistently coded, but agreed that they should be tracked nonetheless. Other codes included
a series of codes representing accidental poisoning by anesthetics, limited to anesthetics that are
not commonly used as recreational drugs, with specific exclusions to reduce the chance that
poisoning was present on admission.
Concerns not addressable by changes. No other concerns were added.

Summary Across Panels

The two panels suggested different, almost entirely new, indicators, rejecting the original
definition for this indicator. As a result all ratings were considered separately. The multi-
specialty panel created two indicators that were rated separately. The surgical panels revised the
definition of this indicator, and rated its overall usefulness as relatively favorable. As such, this
indicator was retained in the Accepted provider level indicator set.
Panelists had concerns about the frequency of coding of these complications, especially
since the use of E-codes is considered voluntary and appears to vary widely between providers.
Plausibly a “reaction” may be described without attributing it to anesthetic. Another concern is
that some of these cases would be present on admission (e.g., due to recreational drug use).
Ideally, this indicator would be used with a coding designation that distinguishes conditions
present on admission from those that develop in-hospital. However, this is not available in the
administrative data used to define this indicator, and so this concern was addressed by
eliminating codes for drugs that are commonly used as recreational drugs. While this does not
eliminate the chance that these codes represent intentional or accidental overdose on the part of
the patient, it should eliminate many of these cases.

Death in Low Mortality DRGs

This indicator is intended to identify in-hospital deaths in patients unlikely to die during
hospitalization. The underlying assumption is that when patients admitted for an extremely low-
mortality condition or procedure die, a health care error is more likely to be responsible. Patients
experiencing trauma, or having an immunocompromised state or cancer are excluded, as these
patients have higher non-preventable mortality.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator All discharges with disposition of "deceased" per 100 population at risk.
Denominator Patients in DRGs with less than 0.5% mortality rate, based on NIS 1997 [low
mortality DRG]. If a DRG is divided into "without/with complications" both
DRGs must have mortality rates below 0.5% to qualify for inclusion.

Exclude patients with any code for [trauma], [immunocompromised] state, or
[cancer].
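
The denominator rule above can be sketched as follows. This is a minimal illustration under
stated assumptions: the DRG labels, mortality rates, pairing table and the precomputed
"excluded" flag (standing in for the [trauma], [immunocompromised] and [cancer] code sets,
which are not reproduced here) are all hypothetical.

```python
def low_mortality_drgs(mortality, paired_with, threshold=0.005):
    """Select DRGs with mortality below the threshold.

    mortality:   {drg: in-hospital death rate}
    paired_with: {drg: partner drg} for DRGs split into with/without complications;
                 both members of a pair must fall below the threshold to qualify.
    """
    selected = set()
    for drg, rate in mortality.items():
        if rate >= threshold:
            continue
        partner = paired_with.get(drg)
        if partner is not None and mortality[partner] >= threshold:
            continue
        selected.add(drg)
    return selected

def deaths_per_100(discharges, low_drgs):
    """Deaths per 100 discharges at risk; None when nobody is at risk."""
    at_risk = [d for d in discharges if d["drg"] in low_drgs and not d["excluded"]]
    if not at_risk:
        return None
    return 100.0 * sum(1 for d in at_risk if d["died"]) / len(at_risk)

# Hypothetical DRG labels and mortality rates.
mortality = {"X": 0.001, "Y_cc": 0.008, "Y_no_cc": 0.002, "Z_cc": 0.004, "Z_no_cc": 0.001}
paired_with = {"Y_cc": "Y_no_cc", "Y_no_cc": "Y_cc", "Z_cc": "Z_no_cc", "Z_no_cc": "Z_cc"}
low_drgs = low_mortality_drgs(mortality, paired_with)

discharges = (
    [{"drg": "X", "died": True, "excluded": False}]
    + [{"drg": "X", "died": False, "excluded": False}] * 199
    + [{"drg": "Y_no_cc", "died": True, "excluded": False}]   # pair fails the rule
    + [{"drg": "Z_cc", "died": True, "excluded": True}]       # excluded patient
)
rate = deaths_per_100(discharges, low_drgs)
print(sorted(low_drgs), rate)
```

Note that "Y_no_cc" is dropped even though its own rate is below the threshold, because its
"with complications" partner is not.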
Post-Conference Call Panel Ratings(a)
Question                          Median          Agreement status
Overall rating                    7.5             Agreement
Not present on admission          Not applicable  Not applicable
Preventability                    6               Indeterminate agreement
Due to medical error              6               Indeterminate agreement
Charting by physicians            9               Agreement
Bias (lower rating is favorable)  4.5             Indeterminate agreement
(a) Medical Complications 2 Multi-specialty Panel

Changes to the indicator. Panelists suggested no changes to this indicator.


Concerns not addressable through changes. Panelists expressed some concern
regarding bias inherent in this indicator. Specifically, panelists noted that hospital case-mix may
affect the rate of death in low mortality DRGs. Patients referred from skilled nursing facilities,
those with certain comorbidities and older patients may be at higher risk of dying. Risk
adjustment for comorbidities and age was highly advocated. Panelists also suggested that social
factors play a role, with socio-economic status being correlated with many other risk factors that
may affect the health and healing of the patient. Some panelists advocated for stratification by
insurance status. Finally, panelists noted that some hospitals accept transfers from other
hospitals. At times, these transfers are very appropriate, but sometimes the transfer occurs too
late for the receiving hospital to prevent death. If these scenarios occur systematically, this
indicator could be biased against referral centers. Panelists also expressed that hospital size may
be a factor. Since deaths in these DRGs are rare, hospitals that have very few patients may be
more affected by random variation.
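The effect of hospital size on a rare-event rate can be illustrated with a short binomial
calculation. This is a sketch only: the underlying death rate and hospital sizes are
hypothetical, and deaths are assumed to be independent events with the same probability for
every patient. It requires Python 3.8 or later for math.comb.

```python
from math import ceil, comb

def prob_rate_at_least(n, p, rate):
    """P(observed deaths / n >= rate) when each of n discharges dies with probability p."""
    k_min = ceil(rate * n - 1e-9)  # smallest death count that reaches the rate
    below = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min))
    return 1.0 - below

true_rate = 0.002  # hypothetical underlying death rate (0.2 per 100)

# Chance of observing at least double the underlying rate purely by chance.
small_hospital = prob_rate_at_least(100, true_rate, 2 * true_rate)
large_hospital = prob_rate_at_least(5000, true_rate, 2 * true_rate)
print(round(small_hospital, 3), round(large_hospital, 4))
```

Under these assumptions a single death in the small hospital is enough to double its
observed rate, so an apparently high rate there is far more likely to reflect chance than
in the large hospital.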
Despite the concerns expressed regarding bias in the low mortality DRG indicator,
panelists noted that this indicator was of great interest. Panelists noted that although many deaths
in these DRGs are likely to be non-preventable and not due to medical error, all deaths in
low mortality DRGs should be subject to internal review, and that high rates may indicate a
quality problem. However, panelists were quick to emphasize use of this indicator as a screening
tool for internal quality improvement efforts. Given potential bias and questions about the extent
of preventability, panelists advocated that this indicator not be subject to public reporting.

Summary

The overall usefulness of this indicator was rated as favorable by panelists, and as such it
was retained in the Accepted provider level indicator set. To standardize the indicator, since the
denominator of this indicator includes many heterogeneous patients cared for by different
services, this indicator should be stratified by DRG type (i.e., medical, surgical, psychiatric,
obstetric, pediatric) when used as an indicator of quality.

Decubitus Ulcer

This indicator is intended to flag cases of in-hospital decubitus ulcers. It is related to a
complications indicator developed as part of the Complications Screening Program,7 although it
omits several of the original codes for cellulitis. In order to better screen out cases of decubitus
ulcer that are present on admission, this indicator limits its definition of decubitus ulcer to
secondary diagnoses (meaning decubitus ulcer was not labeled as the principal diagnosis). In
addition, this indicator excludes patients that have a length of stay less than 4 days, as it is
unlikely that a decubitus ulcer would develop within this period of time. Finally, this indicator
excludes patients who are particularly susceptible to decubitus ulcer, namely patients with major
skin disorders (MDC 9) and paralysis.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM code of 707.0 in any secondary diagnosis field per
100 discharges.
Denominator All [medical] and [surgical] discharges.

Include only patients with a length of stay of more than 4 days.

Exclude patients in MDC 9 or patients with any diagnosis of [hemiplegia,
paraplegia, or quadriplegia].

Exclude patients admitted from a [long term care facility].
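
The final definition above can be sketched as a small filter over discharge records. The
record layout and field names are hypothetical, and the paralysis and long-term care flags
stand in for the bracketed code sets, which are not reproduced here. The length-of-stay
cutoff follows the definition table ("more than 4 days") and is left as a parameter.

```python
def decubitus_ulcer_rate(discharges, min_los_exclusive=4):
    """Decubitus ulcer events per 100 discharges at risk; None when nobody is at risk."""
    at_risk = [
        d for d in discharges
        if d["type"] in ("medical", "surgical")
        and d["los_days"] > min_los_exclusive   # length of stay of more than 4 days
        and d["mdc"] != 9                       # exclude MDC 9
        and not d["paralysis"]                  # hemiplegia, paraplegia, quadriplegia
        and not d["from_ltc"]                   # admitted from long term care facility
    ]
    if not at_risk:
        return None
    # ICD-9-CM 707.0 in any secondary diagnosis field counts as an event.
    events = sum(1 for d in at_risk if "707.0" in d["secondary_dx"])
    return 100.0 * events / len(at_risk)

def record(**overrides):
    """Build a hypothetical discharge record with at-risk defaults."""
    base = {"type": "medical", "los_days": 6, "mdc": 5, "paralysis": False,
            "from_ltc": False, "secondary_dx": []}
    base.update(overrides)
    return base

discharges = [
    record(secondary_dx=["707.0"]),                   # at risk, event
    record(type="surgical", los_days=10),             # at risk, no event
    record(), record(),                               # at risk, no event
    record(los_days=3, secondary_dx=["707.0"]),       # short stay, not at risk
    record(mdc=9, secondary_dx=["707.0"]),            # MDC 9, not at risk
    record(paralysis=True, secondary_dx=["707.0"]),   # paralysis, not at risk
    record(from_ltc=True, secondary_dx=["707.0"]),    # long term care admission
]
ulcer_rate = decubitus_ulcer_rate(discharges)
print(ulcer_rate)
```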

Post-Conference Call Panel Ratings(a)
Question                          Median          Agreement status
Overall rating                    8               Agreement
Not present on admission          8               Agreement
Preventability                    8               Agreement
Due to medical error              8               Agreement
Charting by physicians            7               Indeterminate agreement
Bias (lower rating is favorable)  3               Indeterminate agreement
(a) Medical Complications 1 Multi-specialty Panel

Changes to the indicator. The original definition of this indicator was based on the
Complications Screening Program.7 This included an exclusion for patients older than 80 years
of age, since these patients may be more likely to have pre-existing decubiti. Panelists felt that
this exclusion was undesirable, as it eliminates patients who should be monitored. Panelists
instead suggested that patients admitted from a long-term care facility be excluded, as these
patients may have an increased risk of having decubiti present on admission.
The original definition included only patients with a length of stay of 10 days or more, to
better ensure that the decubiti developed within the admission in question. Panelists agreed that
this length of stay was too long, limiting the indicator to only the most ill patients. Instead,
panelists agreed to limit the indicator to patients with a length of stay of 4 days or more, a
limitation utilized for this indicator in a study by Needleman et al.137
Concerns not addressable through changes. Most panelists had few concerns
regarding this indicator. In general panelists felt that this complication was preventable, and in
many cases reflects medical error, although a small number of cases may not be preventable.
One panelist suggested that little published evidence exists regarding practices that providers
may adopt to reduce decubitus ulcer rates.
Some panelists had minimal concern that reporting of decubiti may vary by providers.
Specifically, staging of decubitus ulcers affects the charting of the complication, with earlier
stage ulcers reported more variably than later stage ulcers. Nurses were noted to be more vigilant
than physicians in reporting ulcers; however, nursing notes are not considered when assigning
ICD-9-CM diagnosis codes. In addition, some facilities routinely screen for decubitus ulcers as
part of quality improvement programs, while other facilities do not. Hospitals that screen would
have an artificially high rate of ulcers compared with other hospitals. If this concern is
borne out in practice, then this indicator may be somewhat biased.
A final source of potential bias is case mix. Panelists noted that very ill patients may be at
higher risk for developing decubiti, and therefore hospitals that care for sicker patients may have
higher rates of this complication. In addition, one panelist noted that because patients admitted from
long-term care facilities are excluded, hospitals admitting more patients from these facilities
may appear better than other facilities.
Although panelists chose to retain the exclusion of high risk patients, many panelists
expressed interest in tracking decubiti in a higher risk population. It was felt that bias may result
from adding these patients to the population at risk. On the other hand, the high risk population is
one for which vigilance of the treatment team should be high and may have a substantial effect.
Panelists suggested that, if possible in the future, high risk patients also be tracked separately.
An indicator for this purpose was added to the Experimental set because it has face validity
but needs further testing.

Summary
The overall usefulness of this indicator was rated as very favorable by panelists. Although
panelists felt that this complication most often reflected medical error, concerns regarding the
systematic screening for ulcers and the reliability of coding, especially for early stage ulcers, called
that assertion into question. Thus, this indicator appears to be best used as a rate based indicator,
despite its high rating on the medical error question. This indicator was retained in the Accepted
provider level indicator set.
This indicator includes pediatric patients. Pressure sores are very unusual in children,
except among the most critically ill children (who may be paralyzed to improve ventilator
management) and children with chronic neurologic problems.

Failure To Rescue

This indicator is intended to identify patients that die following the development of a
complication. The underlying assumption is that good hospitals may not be able to prevent
complications, but they identify these complications quickly and treat them aggressively to
prevent adverse sequelae, such as death. The original definition of this indicator was developed
by Silber et al.31 and was based on clinical data, focusing on complications of cardiac surgery
that were serious and often non-preventable. Jack Needleman and colleagues, in a recent study,
operationalized failure to rescue using administrative data only, across a wide range of surgical
and medical patients.137 Needleman’s list of complications was closely related to the
complications defined in the Complications Screening Program.7 These complications include
exclusions designed to avoid counting patients with the complication present on admission. In
this definition, Needleman used patients identified under his modified definition as having a
serious iatrogenic complication as the population at risk. Patients transferred to or from
another hospital are excluded. Patients admitted from a long-term care facility are also excluded.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator All discharges with disposition of "deceased" per 100 population at risk.
Denominator Discharges with potential complications of care listed in [failure to rescue]
definition (i.e., pneumonia, DVT/PE, sepsis, acute renal failure, shock/cardiac
arrest, or GI hemorrhage/acute ulcer). Exclusion criteria specific to each
diagnosis.

Exclude patients [transferred to acute care facility].

Exclude patients [transferred from acute care facility]

Exclude patients admitted from a [long-term care facility].
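As a rough sketch of how the definition above could be computed, the function below counts deaths among discharges that meet a complication definition and survive the three exclusions. All field names are hypothetical, and the complication flag is assumed to have been derived elsewhere from the bracketed code lists, which are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Discharge:
    # All field names are hypothetical; real discharge abstracts differ by data source.
    has_ftr_complication: bool  # met one of the [failure to rescue] complication definitions
    died: bool                  # disposition of "deceased"
    transferred_in: bool        # transferred from an acute care facility
    transferred_out: bool       # transferred to an acute care facility
    from_long_term_care: bool   # admitted from a long-term care facility


def failure_to_rescue_rate(discharges):
    """Deaths per 100 discharges in the population at risk (None if it is empty)."""
    at_risk = [
        d for d in discharges
        if d.has_ftr_complication
        and not (d.transferred_in or d.transferred_out or d.from_long_term_care)
    ]
    if not at_risk:
        return None
    deaths = sum(d.died for d in at_risk)
    return 100.0 * deaths / len(at_risk)


# Illustrative records only; these are not data from the report.
example = [
    Discharge(True, True, False, False, False),   # at risk, died
    Discharge(True, False, False, False, False),  # at risk, survived
    Discharge(True, False, False, False, False),  # at risk, survived
    Discharge(True, False, False, False, False),  # at risk, survived
    Discharge(True, True, True, False, False),    # excluded: transferred in
    Discharge(False, True, False, False, False),  # excluded: no qualifying complication
]
example_rate = failure_to_rescue_rate(example)
```

Note that a death only enters the numerator if the discharge is also in the denominator, which is why the last two example records do not affect the rate.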

Post-Conference Call Panel Ratings (a)
Question Median Agreement status
Overall rating 7 Agreement
Not present on admission 7 Indeterminate agreement
Preventability 5 Agreement
Due to medical error 5 Indeterminate agreement
Charting by physicians 8 Agreement
Bias (lower rating is favorable) 4 Disagreement
(a) Medical Complications 2 Multi-specialty Panel

Changes to the indicator. Panelists were asked for additional suggestions of
complications to be included in the denominator of this indicator. Panelists unanimously
suggested that acute renal failure be added.
Panelists expressed concern regarding patients with “do not resuscitate” (DNR) status. In
cases where this DNR status is not a direct result of poor quality of care, it would be contrary to
patient desire and poor quality of care to rescue a patient. In addition, very old patients, or
patients with advanced cancer or human immunodeficiency virus (HIV), may not desire rescue or may
be particularly difficult to rescue from these complications. As a result, several changes were
suggested for this indicator. These changes include the stratification of this indicator by age, such
that patients over 75 years may be examined separately from younger patients. In addition,
panelists suggested the exclusion of patients admitted from long term care facilities. Although
these changes neither directly nor completely address panelist concerns, they may improve the ability
to interpret results.
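The age stratification suggested by panelists can be sketched as follows. This is a minimal illustration that assumes the population at risk has already been reduced to (age, died) pairs; the function name and stratum labels are invented for the example.

```python
from collections import defaultdict


def stratified_death_rates(records, age_cutoff=75):
    """Deaths per 100 discharges, reported separately above and at-or-below the cutoff.

    records: iterable of (age_in_years, died) pairs for the population at risk.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [deaths, discharges]
    for age, died in records:
        stratum = f"over_{age_cutoff}" if age > age_cutoff else f"{age_cutoff}_or_younger"
        counts[stratum][0] += int(died)
        counts[stratum][1] += 1
    return {s: 100.0 * deaths / total for s, (deaths, total) in counts.items()}


# Illustrative records only; these are not data from the report.
rates = stratified_death_rates(
    [(80, True), (82, False), (60, False), (75, False), (40, True), (50, False)]
)
```

Reporting the strata side by side lets a reader see whether a hospital's overall rate is driven mainly by its oldest patients.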
Panelists also noted that transfer practices may play a role in this indicator. Because patients
who develop some complications may be transferred to more specialized hospitals, referral
centers may not always be able to rescue these patients, particularly if the transfer occurs too late.
In this case the referral care center would appear to have poorer quality than the hospital in
which the complication arose in the first place. Thus, patients who have been transferred to or
from another acute care facility are also excluded from this indicator.
Concerns not addressable through changes. Panelists expressed some concern over
the validity of this indicator, although it was eventually accepted by panelists for inclusion. Some
panelists wanted to see additional validity work on the concept that failure to rescue is a valid
marker of quality of care. Others were concerned that, although the concept may be valid, it
would be very difficult to operationalize this indicator well, with varied definitions of
complications, difficulty ascertaining whether the complication occurred in-hospital, and the lack
of adjustment for the many factors that influence the ability and appropriateness of the hospital
to rescue a patient from these complications.
Panelists noted that several adverse incentives may be introduced by implementing this
indicator. In particular, since some type of adjustment may be desirable, this indicator may
encourage the upcoding of complications and comorbidities to inflate the denominator or
manipulate risk adjustment. Others noted that this indicator could encourage irresponsible
resource use and allocation, although this is likely to be a controversial idea. Finally, panelists
emphasized that this indicator should be used internally by hospitals, as it is not validated for
public reporting.

Summary
The overall usefulness of this indicator was rated favorably and as such it is included in
the Accepted provider level indicator set. However, this indicator may be fundamentally
different from other indicators reviewed in this report, as it may reflect different aspects of
quality of care (effectiveness in rescuing a patient from a complication versus preventing a
complication). For this reason, this indicator has been considered separately from other
indicators in this report.
This indicator includes children. It is important to note that children beyond the neonatal
period inherently recover better from physiological stress and thus may have a higher rescue rate.

Foreign Body Left in During Procedure

This indicator is intended to flag cases of a foreign body accidentally left in the body during
a procedure. It is based on an indicator developed as part of the Complications Screening
Program,7 although all codes are considered sentinel events in that system. The indicator is
defined both on the area level by including all cases, and on the hospital level by restricting cases
to those flagged by a secondary diagnosis or procedure code.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [foreign body left in during procedure]
in any secondary diagnosis field per 100 surgical discharges.
Denominator All [medical] and [surgical] discharges.

Post-Conference Call Panel Ratings (a)
Question Median (MS) Agreement status (MS) Median (S) Agreement status (S)
Overall rating 8 Agreement 7 Indeterminate
Not present on admission 8 Agreement 7 Agreement
Preventability 8 Agreement 7.5 Agreement
Due to medical error 8 Agreement 7 Indeterminate
Charting by physicians 7 Agreement 8 Indeterminate
Bias (lower rating favorable) 3.5 Indeterminate 4 Indeterminate
(a) MS: Multi-specialty Panel – Surgical Complications 2; S: Surgical Panel – Surgical Complications 2

Multi-specialty Panel Results

Changes to the indicator. Panelists were queried regarding the addition of the code for
the removal of foreign body from the peritoneal cavity. This code may include some foreign
bodies accidentally left in during abdominal surgery when the physician has not specified that
the foreign body was accidentally left in, or the coder chooses to use this code instead of the
998 code. This procedure code was included in Iezzoni’s CSP.7 Panelists agreed that this code
would also pick up some important events, although this code does not specify that the foreign
body must be left in accidentally.
Concerns not addressable by changes. Panelists noted that each case of foreign body
left in during procedure needed examination. Some automated systems do report this
complication when a foreign body is actually left in intentionally. In addition, other cases may
require a foreign body to remain. As some codes do not specify that the foreign body must
accidentally be left in the body during procedure, some of these foreign bodies may be left in the
patient intentionally. This code can be used when a granuloma occurs from a suture accidentally
left in the body. Panelists agreed that such granulomas are substantially different in terms of
morbidity from other foreign bodies accidentally left in during a procedure. They recommended
that the percentage of suture granulomas be ascertained when using this indicator.
Some patients seem to be more likely to have foreign bodies left in during a procedure.
Panelists agreed that these patients (e.g., trauma patients) should not be excluded, except in the
case of removal of a foreign body from the abdominal cavity (e.g., possible gunshot wounds), but
suggested that users of this indicator examine these cases closely. Panelists also suggested that this
indicator be adjusted for emergency surgery or type of procedure.

Surgical Panel Results

Changes to the indicator. Panelists suggested no changes to this indicator.


Concerns not addressable by changes. Panelists, especially orthopedic surgeons, noted
that some foreign bodies are left in on purpose. This occurs frequently, such as when a K-wire or
a drill bit breaks off during a procedure. Removing the foreign body may cause more damage
than leaving it in. In this case, surgeons felt that the foreign body did not reflect a medical error.
The panelists felt that this indicator should be stratified or risk adjusted for the type of procedure.
Panelists were concerned about the coding of this indicator. Specifically, this coding requires the
physician to note that the foreign body was accidentally left in. There was concern that this
additional information would not always be reported. Physicians who do not specify that a
foreign body was left in accidentally would not be flagged by this indicator, so some physicians
would appear to have a higher rate than others. Panelists also noted that some foreign
bodies left in do not cause substantial morbidity, although the foreign body may be removed,
resulting in a diagnosis code or an E-code. Some foreign bodies do not represent a clinically
significant complication.
Panelists noted that the population at risk included both medical and surgical patients, but
not all of these patients are at risk. The panelists felt that limiting to surgical patients would
decrease the sensitivity of this indicator substantially. However, it should be made clear that not
all patients in the denominator are actually at risk. Therefore, some hospitals may appear to have
a lower rate if they have fewer medical patients who have undergone invasive procedures.
The surgical panel was also queried about removing the code related to removal of
foreign body from peritoneal cavity. However, this panel felt that the category was too broad,
and could easily include a number of cases where no foreign body was left in. For this reason,
they suggested that this code not be included.

Summary Across Panels


Both panels believed that this indicator was useful in identifying cases of a foreign body
left in during a procedure. They suggested that, since this indicator was likely to yield few cases,
each case identified be examined carefully by the hospital. Because the two panels did not both agree to
add the code for removal of foreign bodies in the peritoneal cavity, this code was not included.
Given the favorable rating of the overall usefulness of this indicator, it is included in the
Accepted provider level indicator set. An area level analog of this indicator was included in the
Accepted area level indicator set.

Iatrogenic Pneumothorax

This indicator is intended to flag cases of pneumothorax caused by medical care. The
area level indicator is intended to capture all cases of iatrogenic pneumothorax, not only those
occurring in-hospital. The provider level indicator is restricted to secondary diagnosis of
iatrogenic pneumothorax, and is intended to flag cases occurring during the hospitalization. To
exclude patients that may be more susceptible to non-preventable iatrogenic pneumothorax, or
patients with miscoded traumatic pneumothorax, this indicator excludes all trauma patients.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM code of 512.1 in any diagnosis field per 100
discharges.
Denominator All discharges.

Exclude patients with any diagnosis of [trauma].

Exclude patients with any code indicating [thoracic surgery] or [lung or pleural
biopsy] or assigned to [cardiac surgery].
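The distinction drawn above between the area level indicator (any diagnosis field) and the provider level indicator (secondary diagnoses only) can be sketched as follows. The code 512.1 comes from the definition; the list layout, with the principal diagnosis stored first, and the `level` parameter are assumptions made for this illustration.

```python
IATROGENIC_PNEUMOTHORAX = "512.1"  # ICD-9-CM code given in the definition above


def has_event(dx_codes, level):
    """Return True if the discharge counts as an event at the given level.

    dx_codes: list of diagnosis codes with the principal diagnosis first
              (an assumed layout; real abstracts may store it separately).
    level:    "area" searches every diagnosis field;
              "provider" searches secondary diagnosis fields only.
    """
    if level == "area":
        searched = dx_codes
    elif level == "provider":
        searched = dx_codes[1:]  # skip the principal diagnosis
    else:
        raise ValueError(f"unknown level: {level!r}")
    return IATROGENIC_PNEUMOTHORAX in searched


# "OTHER_DX" is a placeholder label, not a real ICD-9-CM code.
admitted_for_event = ["512.1", "OTHER_DX"]
event_during_stay = ["OTHER_DX", "512.1"]
```

A patient admitted because of an iatrogenic pneumothorax is counted at the area level but not attributed to the admitting hospital at the provider level.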

Post-Conference Call Panel Ratings (a)
Question Median Agreement status
Overall rating 7.5 Agreement
Not present on admission 8 Agreement
Preventability 8 Agreement
Due to medical error 8 Agreement
Charting by physicians 7 Indeterminate agreement
Bias (lower rating is favorable) 3 Indeterminate agreement
(a) Procedural Complications 1 Multi-specialty Panel

Changes to the indicator. The original definition of this indicator included all patients,
surgical and medical. Panelists noted that pneumothorax can arise from different causes,
primarily as a result of a procedure, or from barotrauma in ventilated patients. They noted that
although ventilator management matters, pneumothorax arising from barotrauma is much less
straightforward than that arising from procedures such as central line placement. Thus, panelists
suggested that the indicator would better reflect quality of care if it were restricted to patients
receiving a central line, Swan-Ganz catheter, or thoracentesis (see summary paragraph below, as
this change was ultimately removed).
Pneumothorax is an expected complication of some procedures, namely thoracic surgery
and pleural or lung biopsy. Panelists felt that these patients should be excluded, since
pneumothorax may not be preventable in those patients.
Concerns not addressable through changes. Panelists noted that pneumothorax is a
good marker of operator skill. In particular, panelists postulated a clear “July effect” of increased
rates when new residents begin performing such procedures.
A few panelists noted that it would be helpful to know the exact procedure associated
with the pneumothorax, specifically the approach of the central line placement (e.g., subclavian,
jugular). Panelists did express concern that some patients with a recorded central line placement
may also be ventilated. In this case it would be impossible to tell from administrative data
whether the complication arose from the central line placement procedure or from barotrauma.
Finally, it should be noted that this indicator includes Peripherally Inserted Central
Catheter (PICC) line placement as well as central line placement, due to coding constraints.
Panelists felt that this was not of concern. They noted that implementing this indicator might lead,
to some degree, to an appropriate substitution of PICC lines for central line access.

Summary
Panelists rated the overall usefulness of this indicator favorably, although the definition
rated included the suggested denominator, limited to patients receiving a central line, Swan-Ganz
catheter or thoracentesis. However, exploratory empirical analyses found that this denominator
was not reliably defined using administrative data, as these procedures appeared to be under-
reported. Thus, the ratings reported reflect a definition that could not be operationalized, and
must be considered in that context. Although the panelists noted that this complication, given the
definition rated, reflected medical error, the actual final definition of this indicator includes cases
which may be less reflective of medical error. Specifically, this indicator includes patients in
whom a pneumothorax resulted from barotrauma, including patients with acute respiratory
distress syndrome. Thus, this indicator may not detect medical error as clearly as suggested by
the panel ratings.
Panelists expressed concern that some approaches of placing a central line (e.g.,
subclavian) may be more likely to result in pneumothorax than other approaches (e.g., internal
jugular). However, other complications, such as complications of the carotid artery, would be
more common with internal jugular approaches. Thus, if providers simply change approach they
may have a decrease in pneumothorax, but an increase in other unmeasured complications.
This indicator includes children, which was not discussed by panelists. It should be noted
that the smaller anatomy of children may increase the technical complexity of these procedures
in this population (especially among neonates). However, these procedures are less likely to be
performed in this population in unmonitored settings.
Given the high overall rating of the indicator, and the great interest in identifying this
complication, this indicator was included in the Accepted provider level indicator set. An area
level analog of this indicator was included in the Accepted area level indicator set.

Infection Due to Medical Care

This indicator is intended to flag cases of infection due to medical care, specifically those
related to IV lines and catheters. As an area indicator, it is intended to capture all cases of such
infection, not only those that occur in-hospital. Defined as a hospital level indicator, it captures
cases based on secondary diagnosis, and is therefore limited to those infections associated with
the same hospitalization. This indicator excludes patients with potential immunocompromised
states (e.g., AIDS, cancer, transplant), as they may be more susceptible to such infection.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM code of 999.3 or 996.62 in any diagnosis field per
100 discharges.
Denominator All [medical] and [surgical] discharges.

Exclude patients with any diagnosis code for [immunocompromised] state or [cancer].

Post-Conference Call Panel Ratings (a)
Question Median Agreement status
Overall rating 8 Indeterminate agreement
Not present on admission 7 Indeterminate agreement
Preventability 7 Indeterminate agreement
Due to medical error 6 Indeterminate agreement
Charting by physicians 7 Agreement
Bias (lower rating is favorable) 3.5 Indeterminate agreement
(a) Medical Complications 1 Multi-specialty Panel

Changes to the indicator. The original definition of this indicator included several ICD-
9-CM codes representing infections that may arise as a result of medical care, including
intravenous (IV) and catheter infections and infection due to contaminated or infected blood or
other substance. Panelists felt that these two codes identified two very different complications
and should not be combined. They felt that the former code, which focused on IV and catheter
infections, was most useful for quality improvement, while the latter code is likely to be very
rare and poorly reported. For this reason, panelists agreed that this indicator should only include
the code for "other infection due to medical care," focusing on IV and catheter infections. A
second code was added after consultation with a coding specialist, as this code also is used to
denote catheter infections.
Panelists expressed that the existing exclusion criteria for this indicator needed revision.
The original definition excluded trauma patients, as these patients may be at a higher risk for
these types of infection. The panel agreed unanimously that these patients should be tracked and
therefore included in the population at risk. Panelists did feel that immunocompromised patients
were at a higher risk of developing these complications, and that these infections may be less
preventable in this population. Therefore, the panel agreed to exclude immunocompromised
patients from the population at risk.
Concerns not addressable through changes. Panelists noted that while many of these
infections are preventable, even with the best of care, there is a normal underlying rate of these
infections. Panelists also expressed concern over the charting of this indicator. Panelists noted
that charting of these infections is likely to be varied, and reflect differences in documenting
clinically less significant infections, or the aggressiveness of treating such infections. Despite the
potential of bias due to charting or under-reporting, panelists for the most part felt that these
complications were important to track. Finally, as with other indicators tracking infections,
concern regarding the potential overuse of prophylactic antibiotics remains.

Summary
Panelists rated the overall usefulness of this indicator favorably, and they expressed
particular interest in tracking IV and catheter related infections. This indicator was retained in
the Accepted provider level indicator set. An area level analog of this indicator was included in
the Accepted area level indicator set.
This indicator includes children and neonates, which was not specifically discussed by
panelists. It should be noted that high-risk neonates are at particularly high risk for catheter-
related infections.

Postoperative Hemorrhage and Hematoma

This indicator is intended to flag cases of hemorrhage or hematoma following a surgical
procedure. It is based on an indicator developed as part of the Complications Screening
Program.7 This indicator limits hemorrhage and hematoma codes to secondary procedure and
diagnosis codes in order to isolate those hemorrhages that can truly be linked to a surgical
procedure. For the same reason, this indicator eliminates all procedures to control hemorrhages
that take place before the principal procedure. To ensure that the reported hematoma or
hemorrhage is a clinically significant complication, such diagnoses must be accompanied by a
procedure code, indicating clinical intervention.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [postoperative hemorrhage] or
[postoperative hematoma] in any secondary diagnosis field AND code for
postoperative [control of hemorrhage] or [drainage of hematoma]
(respectively) in any secondary procedure code field per 100 surgical discharges.

Procedure code for postoperative control of hemorrhage or hematoma must occur
on the same day or after the principal procedure.
Denominator All [surgical] discharges.

Exclude all obstetric admissions (MDC 14 and 15).
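The numerator logic above combines three conditions: a qualifying secondary diagnosis, a matching secondary procedure ("respectively"), and a procedure date on or after the principal procedure. A minimal sketch follows. The record layout is hypothetical, and the code sets are placeholder labels rather than real ICD-9-CM codes, since the bracketed lists are defined elsewhere in the report.

```python
def pairing_met(discharge, dx_codes, proc_codes):
    """Check one diagnosis/procedure pairing for a single discharge.

    discharge is a dict with hypothetical keys:
      'secondary_dx'       list of secondary diagnosis codes
      'principal_proc_day' day of stay on which the principal procedure occurred
      'secondary_procs'    list of (procedure_code, day_of_stay) tuples
    """
    has_dx = any(code in dx_codes for code in discharge["secondary_dx"])
    has_proc = any(
        code in proc_codes and day >= discharge["principal_proc_day"]
        for code, day in discharge["secondary_procs"]
    )
    return has_dx and has_proc


def is_flagged(discharge, pairings):
    """A discharge is flagged if any (diagnosis set, procedure set) pairing is met."""
    return any(pairing_met(discharge, dx, proc) for dx, proc in pairings)


# Placeholder labels standing in for the bracketed code lists; not real codes.
PAIRINGS = [
    ({"DX_HEMORRHAGE"}, {"PR_CONTROL_HEMORRHAGE"}),
    ({"DX_HEMATOMA"}, {"PR_DRAIN_HEMATOMA"}),
]
```

Keeping the pairings separate means a hemorrhage diagnosis accompanied only by a drainage procedure is not flagged, which is one reading of "respectively" in the definition.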

Post-Conference Call Panel Ratings (a)
Question Median (MS) Agreement status (MS) Median (S) Agreement status (S)
Overall rating 7 Indeterminate 7 Agreement
Not present on admission 8 Agreement 8 Agreement
Preventability 8 Agreement 6 Indeterminate
Due to medical error 4.5 Indeterminate 5 Agreement
Charting by physicians 7 Agreement 8 Agreement
Bias (lower rating favorable) 5 Disagreement 3 Disagreement
(a) MS: Multi-specialty Panel – Surgical Complications 1; S: Surgical Panel – Surgical Complications 1

Multi-specialty Panel Results

Changes to the indicator. Panelists did not suggest any changes to this indicator to
address concerns.
Concerns not addressable through changes. Panelists noted that risk of developing
postoperative hemorrhage or hematoma differs in complicated and uncomplicated cases. They
suggested that an exclusion be added for patients with coagulopathies or for those on
anticoagulant medication. However, this exclusion cannot be adequately implemented using
administrative data. They suggested that this indicator be risk adjusted, rather than using
exclusions of complicated cases. This panel felt that examining the overall rate followed by
further investigations would be more useful than creating a homogeneous denominator of
uncomplicated cases. This panel noted that postoperative hemorrhage and severe hematoma are
captured frequently because they require a return to the operating room. However, some
panelists expressed that during the re-operative procedure, it is often difficult to find the source
of the hemorrhage. They questioned whether or not surgical technique influenced the rate of
postoperative hemorrhage or hematoma. Overall, this panel deferred to the surgical specialists in
reviewing this indicator.

Surgical Panel Results

Changes to the indicator. The panelists noted that seromas are often clinically
insignificant complications. They expressed that this complication is not of interest and should
be removed from the indicator. The panel also noted that some hematomas may be insignificant,
but that those requiring a procedure are highly significant and should be tracked. The panelists
expressed the desire to have any diagnosis code linked to a procedure for drainage of hematoma.
The procedure for drainage of hematoma is not specific to hematoma but may also include
draining of other fluids, including abscesses or seromas. Because of this non-specificity of
procedure codes, all procedure codes must be paired with a diagnosis code for hemorrhage or
hematoma in order to be included in this indicator. Panelists felt that this specification would
limit the flagged complications to those reflecting higher morbidity of patients.
Concerns not addressable through changes. Surgical panelists noted that hemorrhage
or hematoma also occurs in non-surgical patients undergoing invasive procedures such as
PTCA or cardiac catheterization. They noted that this is an important
population that is not covered by this indicator. They also noted that additional patients would be
missed if they were admitted for hematoma after an outpatient surgery or if they were discharged
before the hemorrhage or hematoma occurred and then readmitted to the hospital. Panelists felt
that these patients were particularly important to track. However, the administrative data used in this
project do not allow for tracking readmissions, or admissions after outpatient surgery. Panelists
noted that some patients may be at higher risk for developing a postoperative hemorrhage or
hematoma. Specifically, like the multi-specialty panel, the surgical panel was concerned about
patients with coagulopathies, and those on anticoagulants. They suggested that where possible,
this indicator be stratified for patients with underlying clotting differences. They also noted that
patients admitted for trauma may be at a higher risk for developing postoperative hemorrhage or
may have a hemorrhage diagnosed that occurred during the trauma. They also suggested that this
indicator be stratified for trauma and non-trauma patients.

Summary Across Panels


Because the multi-specialty panelists suggested further surgical input for this indicator,
the changes to definitions suggested by the surgical panel were implemented. The ratings of the
surgical panelists were considered more valid, and resulted in the indicator being included in the
Accepted provider level indicator set.

Postoperative Hip Fracture


In-Hospital Fractures Possibly Related To Falls
(Initially reviewed: “In-hospital hip fracture and fall”; see Summary below)

This indicator is intended to flag cases of in-hospital fracture, specifically hip fractures
for one version of the indicator, and a broader group of fractures possibly related to falls for
another version of the indicator. It is related to an indicator developed as part of the
Complications Screening Program.7 This indicator limits diagnosis codes to secondary diagnosis
codes in order to eliminate fractures that were present on admission. It further excludes patients
in MDC 8 (musculoskeletal disorders) and patients with indications for trauma or cancer, or
principal diagnoses of seizure, syncope, stroke, coma, cardiac arrest, or poisoning, as these
patients may have a fracture present on admission.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM code for [fracture] in any secondary diagnosis field
per 100 surgical discharges.
Denominator All [surgical] discharges.

Exclude all patients with diseases and disorders of the musculoskeletal system
and connective tissue (MDC 8).

Exclude patients with principal diagnosis codes for [seizure], [syncope],
[stroke], [coma], [cardiac arrest], [anoxic brain injury], [poisoning],
[delirium or other psychoses], [trauma], [minor trauma and/or physical
abuse], indication of [alcohol or drug abuse], or [self-inflicted injury].

Exclude patients with any diagnosis of [metastatic cancer], [lymphoid
malignancy] or [bone malignancy].

Exclude patients 17 years of age or younger.

Post-Conference Call Panel Ratings (a)
Question Median Agreement status
Overall rating 8 Agreement
Not present on admission 7 Indeterminate agreement
Preventability 8 Agreement
Due to medical error 7 Indeterminate agreement
Charting by physicians 8 Agreement
Bias (lower rating is favorable) 3 Indeterminate agreement
(a) Medical Complications 1 Multi-specialty Panel

Changes to the indicator. Panelists noted the following:


In-hospital falls. Panelists expressed concern that physicians would variably report in-hospital
falls. Therefore, providers who record falls less often would appear to have higher quality,
without actually having lower rates of falls. In addition, panelists were concerned that the
definitions of "fall" may vary. Although coding conventions require that any recorded fall result
in a medical intervention or injury, that intervention could be screening x-rays or other
procedures. Panelists were concerned that some clinically insignificant falls would be variably
reported. Overall, panelists agreed unanimously that falls should not be tracked in this indicator,
and these codes were removed.
Expansion of tracked fractures. Panelists agreed that in-hospital hip fractures were severe
complications that increase patient morbidity and resource consumption. Panelists also reported
that many preventable falls and injuries in hospitals do not result in hip fractures, but other types
of fractures, including other extremity fractures. Panelists agreed that all fractures occurring in
the hospital setting were important to track. This indicator specification was expanded to include
all types of fractures. (However, empirical testing of this specification revealed a
disproportionate number of fractures in younger men, raising the concern that the administrative
data exclusions were not adequately limiting the population at risk, as these fractures seemed
more likely to occur as a result of trauma rather than in-hospital falls. Thus, it was felt that this
change could not be implemented. As a result, the panel ratings, which were clearly based on the
indicator measuring in-hospital fractures, would be more applicable to the “In-hospital fracture
possibly related to falls” Experimental indicator which shows increasing prevalence with
increasing patient age, as expected.)
Addition of exclusions. In response to the final questionnaire, panelists suggested that
patients with delirium may be at higher risk for having fractures present on admission. In
response, patients with a principal diagnosis of delirium were excluded from the population at
risk. In addition, panelists noted that patients with lymphoma or bone cancer are at a higher risk
for non-preventable fractures in-hospital. These patients were also excluded from the population
at risk for both of the empirically tested indicator definitions (i.e., in-hospital hip fracture on the
accepted indicator set, and in-hospital fractures possibly related to falls on the experimental
indicator set).

Concerns not addressable through changes. After implementing the changes listed
above, a few relatively minor concerns remained. Panelists rated this indicator very well, despite
these concerns. Several panelists expressed a desire to expand the population at risk to medical
patients in addition to surgical patients. This change was not implemented based on data reported
by Iezzoni et al.15 in relation to their "In-hospital hip fracture and fall" indicator. They reported
that only 11% of "flagged" cases of in-hospital hip fracture in medical patients actually
represented true cases of this complication, with most of the "false positives" representing
fractures that were present on admission. On the other hand, 51%-71% of "flagged" cases in
surgical patients represented true occurrences of in-hospital hip fractures and falls. To minimize
the number of "false positive" cases, we chose to limit this indicator to surgical patients, who are
less likely to have such a fracture present on admission (given our exclusions to the population at
risk).
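The effect of this restriction can be made concrete with a little arithmetic. The sketch below applies the confirmation rates quoted above to a hypothetical batch of 100 flagged cases; the batch size and the function name are illustrative assumptions, not figures from Iezzoni et al.

```python
def expected_true_cases(flagged: int, ppv: float) -> float:
    """Expected number of flagged cases that are true in-hospital events."""
    return flagged * ppv


FLAGGED = 100  # hypothetical batch size, chosen only for illustration

medical_true = expected_true_cases(FLAGGED, 0.11)        # medical patients
surgical_true_low = expected_true_cases(FLAGGED, 0.51)   # surgical, lower bound
surgical_true_high = expected_true_cases(FLAGGED, 0.71)  # surgical, upper bound

print(f"Medical:  ~{medical_true:.0f} true, ~{FLAGGED - medical_true:.0f} false positive")
print(f"Surgical: ~{surgical_true_low:.0f} to ~{surgical_true_high:.0f} true")
```

Under these assumptions, roughly nine in ten flagged medical cases would be false positives, against roughly three to five in ten flagged surgical cases.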
Panelists did note that some in-hospital fractures may not be preventable even with
good-quality care. Fractures may be more likely in the aged
and frail population, who have weaker bones, and are more vulnerable to falls. This may result in
some slight bias for this indicator for hospitals that care for more of these patients. Finally, in the
effort to prevent some falls, adverse effects may occur. One panelist expressed concern that
deconditioning may be a particularly dangerous side effect of efforts to reduce fractures by
decreasing the mobilization of elderly patients.

Summary
Although this indicator was initially presented as "In-hospital hip fracture and fall,"
panelists unanimously suggested that falls should be eliminated from this indicator and that all
in-hospital fractures should be included. The resulting indicator implemented both of these
changes, and was termed "In-hospital fracture possibly related to falls." The exclusion of
children was added after empirical analysis revealed that children did not have a substantial
number of cases in the numerator. Ratings are reported for this specification. However, the “In-
hospital hip fracture” indicator was selected for inclusion in the Accepted provider level
indicator set, as a subset of the preferred specification of a broader group of fractures related to
in-hospital falls. The more inclusive fracture indicator was retained on the Experimental
indicator set because of both its potential usefulness and its need for further validation to assure
restriction to the intended group of patients who likely experienced an in-hospital fall.

Postoperative Physiologic and Metabolic Derangements

This indicator is intended to flag cases of selected postoperative metabolic or physiologic
complications. It is based on an indicator developed as part of the Complications Screening
Program.7 The population at risk is limited to elective surgical patients, as patients undergoing
non-elective surgery may develop less preventable derangements. In addition, each diagnosis has
specific exclusions, designed to reduce the number of flagged cases in which the diagnosis was
present on admission or was more likely to be non-preventable.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [physiologic and metabolic
derangements] in any secondary diagnosis field per 100 surgical discharges.

Discharges with acute renal failure (subgroup of physiologic and metabolic
derangements) must be accompanied by a procedure code for dialysis (39.95,
54.98).
Denominator All [elective] [surgical] discharges.

Exclude patients with both a diagnosis code of ketoacidosis, hyperosmolarity or
other coma (subgroups of physiologic and metabolic derangements coding) AND
a principal diagnosis of [diabetes].

Exclude patients with both a secondary diagnosis code for acute renal failure
(subgroup of physiologic and metabolic derangements coding) AND a principal
diagnosis of [acute myocardial infarction], [cardiac arrhythmia], [cardiac
arrest], [shock], [hemorrhage] or [gastrointestinal hemorrhage].

Exclude all obstetric admissions (MDC 14 and 15).
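The pairing rule and the conditional exclusions in this definition can be sketched in code. This is a minimal illustration, not the indicator software: the `Discharge` record and its field names are hypothetical, the bracketed code groups are represented by invented placeholder labels because the report names them without listing codes here, and only two numerator subgroups are modeled. The only real codes used are the dialysis procedure codes given in the definition. The diabetic exclusion is checked against secondary diagnoses, which is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional

# Dialysis procedure codes named in the definition above.
DIALYSIS_PROCEDURES = {"39.95", "54.98"}

# PLACEHOLDER labels standing in for the bracketed code groups; these are
# invented for illustration and are not real ICD-9-CM codes.
ACUTE_RENAL_FAILURE = {"ARF"}
DIABETIC_DERANGEMENTS = {"KETOACIDOSIS", "HYPEROSMOLARITY", "OTHER_COMA"}
DIABETES_PRINCIPAL = {"DIABETES"}
ARF_EXCLUDING_PRINCIPAL = {
    "AMI", "ARRHYTHMIA", "CARDIAC_ARREST", "SHOCK", "HEMORRHAGE", "GI_HEMORRHAGE",
}


@dataclass
class Discharge:  # hypothetical record layout
    principal_dx: str
    secondary_dx: set = field(default_factory=set)
    procedures: set = field(default_factory=set)
    elective_surgical: bool = True
    mdc: int = 0


def in_population_at_risk(d: Discharge) -> bool:
    """Apply the denominator restrictions and exclusions."""
    if not d.elective_surgical or d.mdc in (14, 15):
        return False
    if d.principal_dx in DIABETES_PRINCIPAL and d.secondary_dx & DIABETIC_DERANGEMENTS:
        return False
    if d.principal_dx in ARF_EXCLUDING_PRINCIPAL and d.secondary_dx & ACUTE_RENAL_FAILURE:
        return False
    return True


def is_event(d: Discharge) -> bool:
    """Acute renal failure counts only when paired with a dialysis procedure."""
    if d.secondary_dx & DIABETIC_DERANGEMENTS:
        return True
    has_arf = bool(d.secondary_dx & ACUTE_RENAL_FAILURE)
    has_dialysis = bool(d.procedures & DIALYSIS_PROCEDURES)
    return has_arf and has_dialysis


def rate_per_100(discharges) -> Optional[float]:
    """Events per 100 discharges at risk; None if the denominator is empty."""
    at_risk = [d for d in discharges if in_population_at_risk(d)]
    if not at_risk:
        return None
    return 100.0 * sum(is_event(d) for d in at_risk) / len(at_risk)
```

Requiring the dialysis code means that a coded but mild episode of renal failure stays in the denominator without counting as an event.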


Post-Conference Call Panel Ratings (a)

Question                       Median (MS)  Agreement status (MS)  Median (S)  Agreement status (S)
Overall rating                 8            Indeterminate          6.8         Indeterminate
Not present on admission       7.5          Indeterminate          7           Indeterminate
Preventability                 7            Indeterminate          6           Disagreement
Due to medical error           6            Indeterminate          5.3         Disagreement
Charting by physicians         7            Indeterminate          7           Indeterminate
Bias (lower rating favorable)  6            Indeterminate          3.5         Indeterminate

(a) Multi-specialty panel – Surgical Complications 3
    Surgical panel – Surgical Complications 3

Multi-specialty Panel Results

Changes to the indicator. The multi-specialty panel suggested several changes to this
indicator. First, they agreed that diabetic comas be added in addition to diabetic ketoacidosis.
They noted that hyperosmolar coma is less clearly attributable to medical error than hypoglycemic coma, but
that both should be tracked. They also supported the addition of hyponatremia to the indicator,
suggesting that appropriate fluid management should prevent this complication when it is
clinically severe. They conceded that both minor and major hyponatremia would be caught by
this indicator, and noted that further investigation would be needed to examine only the severe
cases. Finally, this panel supported the removal of shock from this indicator, noting that this
diagnosis is nebulous and subject to interpretation. Thus, it is impossible to know exactly
what physiological state this code represents.
In addition to changes in the numerator, this panel supported the limitation of the
population at risk to elective surgery patients. This panel felt that only these patients could be
appropriately screened and managed preoperatively in an effort to prevent these complications.
Patients admitted emergently or urgently may not have the same opportunity for assessment, and
thus complications in these patients may be less preventable.

Concerns not addressable through changes. Panelists noted that the coding of some
metabolic and physiologic complications may be lacking. Specifically they noted that if the
episode is relatively transient, such as in some cases of diabetic ketoacidosis, then the physician
may not code the episode. In other cases, some physicians may be quite vigilant in recording
small physiologic disturbances, such as minor oliguria, so that the indicator captures clinically
insignificant events. Similarly, they noted that acute renal failure is a vague
diagnosis, and that use of specific creatinine levels would be a better indicator of renal failure.

Surgical Panel Results

Changes to the indicator. The surgical panel suggested most of the same changes
supported by the multi-specialty panel, for similar reasons, and some additional changes.
Panelists supported the removal of shock and addition of diabetic comas, as well as the limitation
of the population at risk to elective surgical patients. However, the panel did not support the
addition of hyponatremia. They noted that most hyponatremia is clinically insignificant, and does
not constitute a serious adverse event. They further argued that a diagnosis of hyponatremia
represents a variety of severities and that it was impossible to distinguish easily which events
were clinically significant.
Panelists raised the same concerns about oliguria and anuria as about
hyponatremia. They noted that oliguria is difficult to define and, in many patients, difficult to
prevent. The varied preventability and definitions introduce extreme bias to this indicator. For
this reason, they argued that these codes be dropped from the indicator. Acute renal failure also
suffers from the problem of varied definitions. What one doctor calls acute renal failure, another
may not. In addition, the inclusion of this code may help to shift patients to a higher paying
DRG, increasing its use artificially. To ensure that the only renal failure cases that are picked up
are those that are clinically severe, this panel suggested that acute renal failure be included only
when it is paired with a procedure code for dialysis.
Finally, panelists questioned the exclusion of MDC 8. This exclusion was intended to
remove patients on hemodialysis, who are at increased risk of developing acute renal failure
that is not due to medical error. However, panelists felt that this exclusion was too broad and
did not really identify patients who were at increased risk for acute renal failure after surgery
which is not due to medical error.
Concerns not addressable through changes. No additional concerns were discussed
during the conference call.

Summary Across Panels


The indicators proposed by the two panels differed substantially in their definitions. For
this reason it was necessary to select a single definition. The inclusion of hyponatremia could not
adequately be specified, as it was difficult to exclude patients that are at a high risk of developing
this complication. The multi-specialty panel also expressed similar concerns over oliguria and
acute renal failure as the surgical panel, although they did not feel as strongly about these
concerns. Because these concerns were expressed by both panels, we chose the most
conservative indicator, that proposed by the surgical panel. This indicator is included in the
Accepted provider level indicator set, given the high overall rating of the indicator.
This indicator includes children, which was not specifically discussed by the panel. It
should be noted that the incidence of these complications is a function of the underlying
prevalence of diabetes and renal impairment, which are less common among children than among
adults.

Postoperative Respiratory Failure
(formerly Postoperative pulmonary compromise)

This indicator is intended to flag cases of postoperative respiratory failure. It is based on
an indicator developed as part of the Complications Screening
Program.7 This indicator limits the code for respiratory failure to secondary diagnosis codes in
order to eliminate respiratory failure that was present on admission. It further excludes patients
who have major respiratory or circulatory disorders, as these patients may have respiratory
failure present on admission, or may be more likely to develop such compromise after surgical
procedures. This indicator also limits the population at risk to elective surgery patients, as these
patients were judged to be at a lower risk for non-preventable complications.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for acute respiratory failure (518.81) in any
secondary diagnosis field per 100 surgical discharges.
Denominator All [elective] [surgical] discharges.

Exclude patients with respiratory or circulatory diseases (MDC 4 and MDC 5).

Exclude all obstetric admissions (MDC 14 and 15)
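Because this definition names its codes explicitly, it can be expressed almost directly. The following is a minimal sketch, assuming each discharge is a dictionary with hypothetical keys `elective`, `surgical`, `mdc`, and `secondary_dx`; the record layout is an assumption for illustration, while the diagnosis code and MDC numbers come from the definition above.

```python
from typing import Optional

RESPIRATORY_FAILURE = "518.81"   # acute respiratory failure (from the definition)
EXCLUDED_MDCS = {4, 5, 14, 15}   # respiratory, circulatory, and obstetric MDCs


def respiratory_failure_rate(discharges) -> Optional[float]:
    """Events per 100 elective surgical discharges at risk.

    Only secondary diagnosis fields are searched, so a principal diagnosis
    of respiratory failure (likely present on admission) is never counted.
    Returns None when no discharge falls in the population at risk.
    """
    at_risk = [
        d for d in discharges
        if d["elective"] and d["surgical"] and d["mdc"] not in EXCLUDED_MDCS
    ]
    if not at_risk:
        return None
    events = sum(RESPIRATORY_FAILURE in d["secondary_dx"] for d in at_risk)
    return 100.0 * events / len(at_risk)
```

Discharges in an excluded MDC are dropped from both the numerator and the denominator, so a hospital's rate is not diluted by patients who could never be flagged.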

Post-Conference Call Panel Ratings (a)

Question                       Median (MS)  Agreement status (MS)  Median (S)  Agreement status (S)
Overall rating                 6.5          Indeterminate          7           Indeterminate
Not present on admission       6.5          Indeterminate          8           Agreement
Preventability                 6            Indeterminate          6           Indeterminate
Due to medical error           4.5          Agreement              4           Agreement
Charting by physicians         6            Indeterminate          8           Agreement
Bias (lower rating favorable)  6            Indeterminate          6           Indeterminate

(a) Multi-specialty panel – Surgical Complications 2
    Surgical panel – Surgical Complications 2

Multi-specialty Panel Results

Changes to the indicator. The panel suggested that only acute respiratory failure and
acute edema of lung, unspecified be used. These complications were felt to be the only
complications from the original definitions that are more likely to be preventable, and for which
variations in rates might be meaningful in reference to the quality of care.
Panelists felt that the population at risk should be limited to patients undergoing elective
surgical procedures, as complications in these patients were felt to be more preventable
compared with non-elective surgery cases. In addition, panelists suggested that trauma patients
should be excluded, as some pulmonary complications are expected in the course of treatment
for trauma.
Concerns not addressable by changes. Panelists noted that this indicator is “messy,” in
that even with the more conservative definition, preventability of these complications in some
patients is dubious. Further, panelists expressed concern that the clinical definition of these
complications may vary from provider to provider.

Surgical Panel Results

Changes to the indicator. Panelists felt that only acute respiratory failure should be
retained in this indicator. They noted that this is a clinically significant event that is at least
partially preventable. ICD-9-CM coding guidelines state "Respiratory failure is a life-threatening
disorder that requires close patient monitoring and evaluation, with aggressive management
usually requiring placement of the patient in a monitored bed, aggressive respiratory therapy,
and/or mechanical ventilation."166
Panelists felt that mechanical ventilation is a hard clinical endpoint, and thus, there
would be less variation in the severity of the conditions captured by this indicator. All other
codes in the original indicator definition were considered to be either less preventable or
nebulous as to their clinical significance, and thus were eliminated.
The surgical panel agreed that the population at risk should be limited to elective surgical
patients for similar reasons as the multi-specialty panel.
Concerns not addressable by changes. Panelists expressed concern that acute
respiratory failure is affected by case mix and type of surgery. For instance, patients undergoing
hepatic resections or patients that are immunocompromised or malnourished may be more likely
to develop these complications. As a result, this indicator may be subject to some bias.

Summary Across Panels


Both panels rated the overall usefulness of this indicator as relatively favorable. The
surgical panel proposed a more conservative indicator than the multi-specialty panel. Since it
was beyond the scope of our study to inquire of the multi-specialty panel regarding the more
conservative definition, the more conservative definition was retained as an Accepted provider
level indicator.

Postoperative Pulmonary Embolism or Deep Venous Thrombosis

This indicator is intended to flag cases of postoperative venous thrombosis and
embolism, specifically pulmonary embolism (PE) and deep venous thrombosis (DVT). It is
closely related to an indicator developed as part of the Complications Screening Program.7 This
indicator limits vascular complications codes to secondary diagnosis codes in order to eliminate
complications that were present on admission. It further excludes patients who have principal
diagnosis of DVT, as these patients are likely to have had PE/DVT present on admission.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [deep vein thrombosis] or [pulmonary
embolism] in any secondary diagnosis field per 100 surgical discharges.
Denominator All [surgical] discharges.

Exclude patients with a principal diagnosis of [deep vein thrombosis].

Exclude all obstetric admissions (MDC 14 and 15).

Exclude patients with secondary procedure code 38.7 (interruption of the vena cava) when
this procedure occurs on or before the day of the principal procedure.

Panelists suggested that this indicator be reported for PE and DVT separately. Thus, this
indicator would be reported by the software as three rates - the overall thromboembolism rate,
the PE rate, and the DVT rate (all other codes). Panelists felt that the reporting of PE and DVT
separately would allow users to distinguish rates which may be higher than expected due to
routine postoperative screening for DVT, or other differences in diagnostic methods.
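The three-rate reporting described above can be sketched as follows. This is an illustrative outline only, not the indicator software: the `SurgicalDischarge` record, its field names, and the use of integer day offsets are hypothetical, and membership in the [pulmonary embolism] and [deep vein thrombosis] code groups is assumed to have been determined upstream. A discharge carrying both kinds of code is counted in both the PE and the DVT rate here, which is an assumption; the report does not say how such cases are assigned.

```python
from typing import NamedTuple, Optional


class SurgicalDischarge(NamedTuple):  # hypothetical record layout
    has_pe: bool                      # secondary dx in [pulmonary embolism] group
    has_dvt: bool                     # secondary dx in [deep vein thrombosis] group
    principal_dx_dvt: bool = False
    mdc: int = 0
    principal_procedure_day: Optional[int] = None
    vena_cava_interruption_day: Optional[int] = None  # day of procedure 38.7, if any


def in_population_at_risk(d: SurgicalDischarge) -> bool:
    """Apply the exclusions listed in the definition."""
    if d.principal_dx_dvt or d.mdc in (14, 15):
        return False
    # Exclude when 38.7 was performed on or before the day of the principal procedure.
    if (
        d.vena_cava_interruption_day is not None
        and d.principal_procedure_day is not None
        and d.vena_cava_interruption_day <= d.principal_procedure_day
    ):
        return False
    return True


def thromboembolism_rates(discharges) -> Optional[dict]:
    """Overall, PE, and DVT rates per 100 discharges at risk."""
    at_risk = [d for d in discharges if in_population_at_risk(d)]
    n = len(at_risk)
    if n == 0:
        return None
    return {
        "overall": 100.0 * sum(d.has_pe or d.has_dvt for d in at_risk) / n,
        "pe": 100.0 * sum(d.has_pe for d in at_risk) / n,
        "dvt": 100.0 * sum(d.has_dvt for d in at_risk) / n,
    }
```

Reporting the components side by side lets a user see whether a high overall rate is driven by DVT, which is more sensitive to screening practice, or by PE.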
Post-Conference Call Panel Ratings (a)

Question                       Median (MS)  Agreement status (MS)  Median (S)  Agreement status (S)
Overall rating                 7            Indeterminate          7           Agreement
Not present on admission       7            Indeterminate          7           Agreement
Preventability                 7            Indeterminate          6           Disagreement
Due to medical error           6            Indeterminate          3           Indeterminate
Charting by physicians         7            Indeterminate          7           Indeterminate
Bias (lower rating favorable)  5            Indeterminate          6.5         Indeterminate

(a) Multi-specialty panel – Surgical Complications 1
    Surgical Panel – Surgical Complications 1

Multi-specialty Panel Results

Changes to the indicator. Panelists expressed concern about the code for venous
embolism and thrombosis of the vena cava. Panelists felt that these complications were not
preventable through the same mechanisms as the other diagnoses included in the definition (e.g.,
pulmonary embolism, phlebitis and thrombophlebitis, femoral vein or other deep vessels, etc.).
Although some vena cava thromboses may result from inferior vena cava (IVC) filters, the panel
was concerned that the pathophysiology of thrombosis in this setting is quite different, and that
the decision to place an IVC filter involves a difficult balancing of risks and benefits. For this reason
the code for venous embolism and thrombosis of the vena cava was removed from the definition
of this indicator.
Concerns not addressable through changes. No additional concerns
regarding this indicator were expressed during the conference call.

Surgical Panel Results
Changes to the indicator. This panel expressed concerns regarding the code for
venous embolism and thrombosis of the vena cava. They felt that the data on IVC
filters were still inconclusive and that venous embolism and thrombosis of the vena cava
represented a different type of complication than the other codes. They recommended that the
code for venous embolism and thrombosis of the vena cava be deleted from the indicator
definition.
Panelists were concerned that reporting pulmonary embolism and deep venous
thrombosis together may be misleading. Panelists noted that, although in many cases pulmonary
embolism and deep venous thrombosis are simply different manifestations of the same
complication, deep vein thrombosis is reported more variably. Several panelists noted that some
hospitals routinely screen patients for deep vein thrombosis, while others do not. In addition,
deep vein thrombosis is diagnosed by various methods. While some providers require ultrasound
verification, others require clinical symptoms in order to diagnose deep vein thrombosis. These
differences in diagnosis may lead to bias for this indicator. For this reason, panelists suggested
that this indicator include reporting of three rates: the overall thromboembolism rate (deep vein
thrombosis and pulmonary embolism together), the pulmonary embolism rate alone, and the deep vein
thrombosis rate alone. This suggestion will be incorporated into the final software for
this indicator.
Concerns not addressable through changes. It is widely documented that the risk for
DVT/PE varies greatly according to the type of procedure performed. As clotting is more
common in peripheral orthopedic procedures, these surgeries have a higher postoperative
vascular complication rate than other types of surgeries. Panelists noted that, because of this
difference in underlying risk for deep vein thrombosis or pulmonary embolism, this indicator
should be adjusted or stratified according to surgical procedure type. Panelists also noted that,
despite the varying causes of DVT/PE, preventative techniques currently exist and
the proper use of these techniques should reduce the rate of venous thrombosis or pulmonary
embolism. Panelists did note that the literature surrounding preventative techniques is limited to
deep vein thrombosis and may or may not be generalized to pulmonary embolism.

Summary Across Panels


Both panels rated the overall usefulness of this indicator relatively highly as compared to
other indicators. Panelists expressed interest in tracking DVT/PE in surgical patients.
They noted that preventative techniques should decrease the rate of this indicator. Both
recommended the same changes to the indicator. The surgical panel also suggested reporting of
pulmonary embolism and deep vein thrombosis separately in the software. This indicator was
retained in the Accepted provider level indicator set.
This indicator includes children, which was not specifically discussed by our panelists. It
should be noted that in the absence of specific thrombophilic disorders, postoperative
thromboembolic complications in children are most likely to be secondary to venous catheters
rather than venous stasis in the lower extremities.

Postoperative Sepsis

This indicator is intended to flag cases of nosocomial postoperative sepsis. It is closely
related to a complications indicator developed as part of the Complications Screening Program.7
In order to better screen out cases of sepsis that are present on admission this indicator limits its
definition of sepsis to secondary diagnoses (meaning sepsis was not labeled as the principal
diagnosis). In addition this indicator excludes patients that have principal diagnoses of infection,
as it is likely that these patients developed sepsis due to these infections, and patients
who had a length of stay of less than 3 days, as nosocomial sepsis is unlikely to have
developed in such a short time. This indicator limits the population at risk to patients only with
certain medical conditions, as these patients are not at as high a risk for sepsis as other patients
(e.g., patients that have undergone procedures of a contaminated structure). Finally, this indicator
excludes patients who are particularly susceptible to non-preventable sepsis, namely patients
with potential immunocompromised states (e.g., Acquired Immune Deficiency Syndrome
(AIDS), cancer, transplant).
Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM code for [sepsis] in any secondary diagnosis field per
100 discharges in the population at risk.
Denominator All [elective] [surgical] discharges.

Exclude patients with a principal diagnosis of [infection], or any code for
[immunocompromised] state, or [cancer].

Include only patients with a length of stay of more than three days.

Exclude all obstetric admissions (MDC 14 and 15).
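The length-of-stay restriction is the one criterion in this definition that depends on dates rather than codes. A minimal sketch follows, assuming length of stay is counted as the number of calendar days between admission and discharge; that counting convention and the function names are assumptions, and the threshold follows the wording of the definition table ("more than three days").

```python
from datetime import date

LOS_THRESHOLD_DAYS = 3  # stays must be strictly longer than this to be included


def length_of_stay(admitted: date, discharged: date) -> int:
    """Length of stay as the difference in calendar days (an assumed convention)."""
    return (discharged - admitted).days


def meets_length_of_stay_criterion(admitted: date, discharged: date) -> bool:
    """True when the stay is long enough for the discharge to be included."""
    return length_of_stay(admitted, discharged) > LOS_THRESHOLD_DAYS
```

A stay admitted on 1 January and discharged on 4 January has a length of three days under this convention and would not be included, while a discharge on 5 January would.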


Post-Conference Call Panel Ratings (a)

Question                          Median  Agreement status
Overall rating                    8       Indeterminate agreement
Not present on admission          8       Agreement
Preventability                    6.5     Agreement
Due to medical error              6       Indeterminate agreement
Charting by physicians            8       Agreement
Bias (lower rating is favorable)  3       Indeterminate agreement

(a) Medical Complications 1 Multi-specialty Panel

Changes to the indicator. The original definition of this indicator, based on Iezzoni et
al.’s CSP,7 limited the population at risk to patients in certain MDCs and DRGs for which it was
judged that sepsis would be a potentially preventable complication. Panelists felt that this
population at risk was too broad and might include patients who either had sepsis present on
admission or had conditions predisposing them to sepsis. In addition, this definition
excluded some patients for whom sepsis would be preventable. Panelists agreed that limiting this
indicator to patients undergoing elective surgery was a better way to capture patients
for whom sepsis is a potentially preventable complication, primarily through pre-surgical
screening and appropriate prophylactic therapy.
Concerns not addressable through changes. Panelists expressed few additional
concerns regarding this indicator during the conference call and the subsequent evaluation. Some
concern was expressed over the varying clinical definitions of "sepsis." Providers may have
different thresholds and methods of diagnosing a patient as septic, leading to some bias for this
indicator. Some panelists also expressed that this complication was less of a concern than other
complications rated, and that it would be very rare in the population at risk. Finally, two panelists
expressed concern about increased inappropriate antibiotic use resulting from the implementation
of this indicator.

Summary
Panelists rated the overall usefulness of this indicator favorably, although they were less
sure that this complication was reflective of medical error. Given the overall rating, this indicator
was retained in the Accepted provider level indicator set.
This indicator includes children, which was not specifically discussed by the panel. It
should be noted that high-risk neonates are at particularly high risk for catheter-related
infections.

Postoperative Wound Dehiscence in Abdominopelvic Surgical Patients

This indicator is intended to flag cases of wound dehiscence in patients who have
undergone abdominal and pelvic surgery. The area level indicator is intended to capture all cases
of wound dehiscence, not only those occurring in-hospital. The hospital level indicator is
restricted to secondary diagnoses, and is intended to capture cases occurring during the same
hospitalization.
Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for reclosure of postoperative disruption of
abdominal wall (54.61) in any secondary procedure field per 100 discharges.
Denominator All [abdominopelvic] surgical discharges.

Exclude all obstetric admissions (MDC 14 and 15).
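Unlike the preceding indicators, this one is triggered by a procedure code rather than a diagnosis code. The sketch below illustrates the hospital level version under stated assumptions: each discharge is a dictionary with hypothetical keys, the `procedures` list is assumed to be ordered with the principal procedure first, and membership in the [abdominopelvic] group is assumed to be flagged upstream. Only the reclosure code 54.61 and the MDC numbers come from the definition.

```python
from typing import Optional

RECLOSURE = "54.61"  # reclosure of postoperative disruption of abdominal wall


def is_dehiscence_event(procedures) -> bool:
    """True when 54.61 appears in a secondary procedure field.

    `procedures` is assumed to list the principal procedure first, so a
    reclosure recorded as the principal procedure is not counted.
    """
    return RECLOSURE in procedures[1:]


def wound_dehiscence_rate(discharges) -> Optional[float]:
    """Events per 100 abdominopelvic surgical discharges at risk."""
    at_risk = [
        d for d in discharges
        if d["abdominopelvic"] and d["mdc"] not in (14, 15)
    ]
    if not at_risk:
        return None
    events = sum(is_dehiscence_event(d["procedures"]) for d in at_risk)
    return 100.0 * events / len(at_risk)
```

Searching only the secondary fields reflects the stated intent of capturing dehiscence that occurs during the same hospitalization as the original operation.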


Post-Conference Call Panel Ratings (a)

Question                       Median (MS)  Agreement status (MS)  Median (S)  Agreement status (S)
Overall rating                 7.5          Indeterminate          7           Indeterminate
Not present on admission       7.5          Indeterminate          8           Agreement
Preventability                 6            Agreement              7           Indeterminate
Due to medical error           6            Agreement              5           Indeterminate
Charting by physicians         7            Agreement              8           Indeterminate
Bias (lower rating favorable)  4            Indeterminate          7           Indeterminate

(a) Multi-specialty panel – Surgical Complications 2
    Surgical panel – Surgical Complications 2

Multi-specialty Panel Results
Changes to the indicator. Panelists felt that the diagnosis code for postoperative wound
disruption would include both minor and severe wound dehiscence, without a means of
distinguishing between the two. Panelists felt that a majority would be clinically insignificant
minor dehiscences, and preferred to limit the indicator to cases in which a procedure was
performed.
Panelists felt that cancer patients should not be excluded, as most of these patients are not
at significantly increased risk of developing non-preventable wound dehiscence.
Concerns not addressable by changes. Panelists reported that the risk of developing
wound dehiscence varies with patient factors such as age and comorbidities. If these factors
varied systematically by institution, this indicator could be subject to some bias.

Surgical Panel Results

Changes to the indicator. Panelists suggested the removal of the diagnosis code for
postoperative wound disruption for similar reasons as the multi-specialty panel. As a result, the
only code left was limited to abdominal and pelvic surgical patients, and the population at risk
was modified to reflect this.
The surgical panel suggested that trauma, cancer, and immunocompromised patients be
included, as they were interested in tracking these patients and felt that these patients would not
add enough false positives to raise concern. These groups could be examined more
closely on further evaluation of this indicator.
Concerns not addressable by changes. Like the multi-specialty panel, the surgical
panel noted that patient health is an important factor underlying the risk of developing
postoperative wound dehiscence. Patients with comorbidities and older patients may be at higher
risk.

Summary Across Panels


Both panels suggested similar indicators, although the surgical panel suggested that the
indicator include trauma, cancer, and immunocompromised patients. The surgical panel
definition was retained in the Accepted provider level indicator set. An area level analog of this
indicator was included in the Accepted area level indicator set.

Technical Difficulty With Procedure

This indicator is intended to flag cases of complications that arise due to technical
difficulties in medical care, specifically those involving an accidental puncture or laceration. It is
based on an indicator developed as part of the Complications Screening Program.7
Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM code denoting [technical difficulty] (e.g., accidental
cut, puncture, perforation or laceration during a procedure) in any secondary
diagnosis field per 100 discharges.
Denominator All [medical] and [surgical] discharges.

Exclude all obstetric admissions (MDC 14 and 15).
Post-Conference Call Panel Ratings (a)

Question                          Median  Agreement status
Overall rating                    7       Agreement
Not present on admission          8       Agreement
Preventability                    7       Agreement
Due to medical error              6       Indeterminate agreement
Charting by physicians            6       Indeterminate agreement
Bias (lower rating is favorable)  5       Indeterminate agreement

(a) Procedural Complications 1 Multi-specialty Panel

Changes to the indicator. The original definition of this indicator included several
complications that could arise from difficulty in performing a procedure, including failure of
sterile precautions, performance of an inappropriate operation, emphysema arising from a
procedure, cataract fragments in the eye following cataract surgery, and air embolism. However,
panelists felt that most of these codes were of questionable clinical significance, variably
reported, and not of interest for inclusion in this indicator. As a result, panelists suggested
retaining only the two codes for accidental puncture, cut, perforation or hemorrhage during a
procedure.
Concerns not addressable through changes. Panelists noted that even with the retained
codes, reporting is likely to be variable. Some panelists felt that only major situations are likely
to be coded, and that this may be appropriate. However, it is unclear how the culture of quality
improvement in a hospital would affect the coding of this complication. Some physicians may be
reluctant to record the occurrence of this complication for fear of punishment. Panelists also
noted that some of these occurrences are not preventable. However, panelists noted that a high
rate may be indicative of poor quality of care.

Summary
Panelists rated the overall usefulness of this indicator favorably, although they were less
sure that this complication was reflective of medical error. Given the overall rating, this indicator
was retained in the Accepted provider level indicator set.
This indicator includes children, which was not specifically discussed by the panel. It
should be noted that the smaller anatomy of children may increase the technical complexity of
procedures.

Transfusion Reaction

This indicator is intended to flag cases of major reactions due to transfusions (ABO and
Rh). The area level indicator is intended to capture all cases of transfusion reactions, not only
those occurring in-hospital. The hospital level indicator is restricted to patients who have a
secondary diagnosis of transfusion reaction, as it is intended to flag cases occurring during
hospitalization.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [transfusion reaction] in any secondary
diagnosis field per 100 discharges.
Denominator All [medical] and [surgical] discharges.

Post-Conference Call Panel Ratings (a)

Question                        Median (MS)   Agreement status (MS)   Median (S)   Agreement status (S)
Overall rating                  8             Agreement               7.8          Agreement
Not present on admission        7             Agreement               7.5          Agreement
Preventability                  7             Disagreement            8            Indeterminate
Due to medical error            7             Indeterminate           5.3          Disagreement
Charting by physicians          8             Indeterminate           7.5          Agreement
Bias (lower rating favorable)   6             Disagreement            2.5          Agreement
(a) Multi-specialty Panel (MS) – Surgical Complications 3
    Surgical Panel (S) – Surgical Complications 3

Multi-specialty Panel Results


Changes to the indicator. Panelists expressed concern that the code 999.8, “other
transfusion reaction,” was nebulous and may include reactions caused by minor antigens in
patients with complex hematologic histories who may have been sensitized by multiple prior
transfusions. These complications were seen as less preventable than Rh or ABO incompatibility
reactions, and clinically different. For this reason this panel suggested that this code be removed
from this indicator.
Panelists also noted that while trauma patients may be at higher risk for developing
transfusion reactions, as it may be occasionally appropriate to use blood without cross-matching,
reactions in these patients should be monitored and may be preventable. For this reason panelists
suggested that trauma patients be added to the population at risk, but that this subgroup should be
examined closely.
Concerns not addressable through changes. No other concerns were reported by this
panel.

Surgical Panel Results


Changes to the indicator. The surgical panel suggested the same changes to this
indicator as the multi-specialty panel for similar reasons.
Concerns not addressable through changes. No other concerns were reported by this
panel.

Summary Across Panels


Both panels rated the overall usefulness of this indicator highly and suggested similar
changes to the definition. The indicator is part of the Accepted provider level indicator set. An
area level analog of this indicator was included in the Accepted area level indicator set.

This indicator only includes those events which actually result in additional medical care.
Thus, near misses and errors in which no harm or little harm results are not included in this
indicator. Some minor reactions may be missed, although the panel suggested that these minor
reactions are less clearly due to medical error than the Rh or ABO reactions included in the
indicator.

Accepted Obstetric Indicators


Birth Trauma – Injury to Neonate

This indicator is intended to flag cases of birth trauma for infants born alive in a hospital.
It excludes patients born pre-term, as birth trauma in these patients may be less preventable than
for full-term infants.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [birth trauma] in any diagnosis
field per 100 liveborn births.
Denominator All [liveborn] infants.

Exclude infants with a subdural or cerebral hemorrhage (subgroup of birth
trauma coding) AND any diagnosis code of [pre-term infant] (denoting a
birth weight of less than 2,500 g and less than 37 weeks gestation).

Exclude infants with injury to skeleton (767.3, 767.4) AND any diagnosis
code of osteogenesis imperfecta (756.51).
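
The two exclusions above are paired conditions: an infant is dropped only when both halves of a pair are present. A minimal Python sketch of that logic follows. The codes 767.3, 767.4 and 756.51 come from the definition; the hemorrhage, pre-term and birth trauma code sets are passed in as parameters because their contents are not listed here, and removing excluded infants from both numerator and denominator is an assumed reading of "exclude".

```python
from typing import List, Optional, Set

SKELETAL_INJURY = {"767.3", "767.4"}   # from the definition above
OSTEOGENESIS_IMPERFECTA = {"756.51"}   # from the definition above


def is_excluded(dx: Set[str], hemorrhage_codes: Set[str], preterm_codes: Set[str]) -> bool:
    """Each exclusion applies only when BOTH of its conditions hold."""
    hemorrhage_in_preterm = bool(dx & hemorrhage_codes) and bool(dx & preterm_codes)
    skeletal_with_oi = bool(dx & SKELETAL_INJURY) and bool(dx & OSTEOGENESIS_IMPERFECTA)
    return hemorrhage_in_preterm or skeletal_with_oi


def birth_trauma_rate(
    liveborn: List[Set[str]],      # one set of diagnosis codes per liveborn infant
    trauma_codes: Set[str],        # stand-in for the [birth trauma] code set
    hemorrhage_codes: Set[str],    # stand-in for subdural/cerebral hemorrhage codes
    preterm_codes: Set[str],       # stand-in for the [pre-term infant] code set
) -> Optional[float]:
    at_risk = [dx for dx in liveborn if not is_excluded(dx, hemorrhage_codes, preterm_codes)]
    if not at_risk:
        return None
    events = sum(1 for dx in at_risk if dx & trauma_codes)
    return 100.0 * events / len(at_risk)
```

Under this reading, a pre-term infant without hemorrhage stays in the denominator, as does a full-term infant with hemorrhage.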

Post-Conference Call Panel Ratings (a)

Question                           Median   Agreement status
Overall rating                     8        Agreement
Not present on admission           8        Agreement
Preventability                     7        Indeterminate agreement
Due to medical error               6        Disagreement
Charting by physicians             7        Indeterminate agreement
Bias (lower rating is favorable)   4        Indeterminate agreement
(a) Obstetric Complications of Delivery 1 Panel

Changes to the indicator. Panelists felt that injury to the brachial plexus often includes
injuries that are transient and minor, and therefore may be reported variably. Thus, they
suggested removing this code.
Panelists suggested two specific exclusions. First, they suggested that pre-term infants
with low birth weight be excluded from the population at risk for intracranial hemorrhage, due to
concern that some of these injuries would not be preventable in pre-term infants, who have very
fragile bridging veins and may also be at risk for hypoxic injury. Second, they suggested that
infants with osteogenesis imperfecta be excluded from the population at risk for injury to
skeleton, as these complications are not preventable in these infants.

Concerns not addressable through changes. Panelists noted that some infants are
prone to birth injuries, such as babies with shoulder dystocia or large babies. Panelists suggested
that predicting these types of deliveries is difficult, and such complications in these babies are
often not preventable. Panelists also felt that patients with no or little prenatal care should be
treated differently than those with prenatal care. However, these patients cannot be accurately
identified using administrative data.
Summary
Panelists felt that this indicator was very useful. Although it may not indicate medical
error, it does capture potentially preventable complications. It should be noted that panelists were
particularly conflicted about the ability of this indicator to detect medical error, with some
panelists feeling that it clearly does and others that it clearly does not. Given the relatively high
overall rating, this indicator was retained as part of the Accepted provider level indicator set.
Obstetric Trauma (All Delivery Types Reviewed in One Indicator)
This indicator is intended to flag cases of potentially preventable trauma during delivery
in women delivering during the index hospitalization.
Final Definition: Obstetric Trauma - Vaginal With Instrument
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [obstetric trauma] in any diagnosis or procedure
field per 100 instrument assisted vaginal deliveries.
Denominator All [vaginal delivery] discharges with any procedure code for [instrument assisted
delivery].
Final Definition: Obstetric Trauma - Vaginal Without Instrument
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [obstetric trauma] in any diagnosis or
procedure field per 100 vaginal deliveries without instrument assistance.
Denominator All [vaginal delivery] discharges.

Exclude [instrument assisted delivery].


Final Definition: Obstetric Trauma - Cesarean Section
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [obstetric trauma] in any diagnosis or
procedure field per 100 cesarean deliveries.
Denominator All [cesarean delivery] discharges.
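
The three definitions above partition delivery discharges into strata and compute one rate per stratum. The Python sketch below shows one way this could be done. The `Delivery` flags are hypothetical and assumed to be derived upstream from the bracketed code sets. Classifying every cesarean discharge as cesarean, regardless of any instrument flag, is also an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Delivery:
    # Hypothetical flags, assumed to be derived from the bracketed code sets.
    cesarean: bool
    instrument_assisted: bool
    obstetric_trauma: bool


def stratum(d: Delivery) -> str:
    """Assign each delivery to exactly one of the three indicators."""
    if d.cesarean:
        return "cesarean"
    if d.instrument_assisted:
        return "vaginal_with_instrument"
    return "vaginal_without_instrument"


def trauma_rates(deliveries: Iterable[Delivery]) -> Dict[str, float]:
    """Obstetric trauma events per 100 deliveries, reported per stratum."""
    events: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for d in deliveries:
        s = stratum(d)
        totals[s] += 1
        events[s] += int(d.obstetric_trauma)
    # Strata with no deliveries are omitted rather than reported as zero.
    return {s: 100.0 * events[s] / totals[s] for s in totals}
```

Because each discharge falls in exactly one stratum, a hospital's delivery mix changes the size of each stratum but does not blend the three rates.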

Post-Conference Call Panel Ratings (a)
Question                           Median           Agreement status
Overall rating                     7                Indeterminate agreement
Not present on admission           Not applicable   Not applicable
Preventability                     7                Agreement
Due to medical error               5                Disagreement
Charting by physicians             8                Agreement
Bias (lower rating is favorable)   4                Indeterminate agreement
(a) Obstetric Complications of Delivery 1 Panel

Changes to the indicator. The original definition of this indicator included both 3rd and
4th degree lacerations. Panelists, citing some evidence, felt that 3rd degree lacerations are variably
reported, and thus rates would be more reflective of reporting than of the actual rate. If reporting
were standardized, panelists were interested in retaining 3rd degree lacerations, but as
standardization cannot be guaranteed with administrative data, this indicator was limited to 4th
degree lacerations as well as other major lacerations.
Panelists noted that the risk of trauma varies substantially by delivery type, and that
indications for different modes of delivery may vary systematically between hospitals. Thus,
panelists suggested that this indicator be split into 3 different indicators – vaginal delivery
without instrument, instrument assisted delivery, and cesarean section.
Concerns not addressable through changes. Panelists noted that while this indicator is
of use (with one panelist dissenting), it is not a pure indicator of medical error. Many cases of
trauma will not be preventable, but an unusually high rate would be worth investigating for
potential quality problems. Specifically, panelists noted that overuse of episiotomy may be
associated with high rates of obstetrical trauma.
Panelists noted that the obstetrical trauma rate is best interpreted in the context of
additional data. Notably, since providers may shift more patients to cesarean sections rather than
perform instrument assisted deliveries, which may increase trauma rates, a provider’s cesarean
section rate should be monitored simultaneously. In addition, providers may want to interpret
this indicator in the context of epidural anesthesia rate and perinatal mortality.

Summary
Panelists rated the overall usefulness of this indicator favorably, although they suggested
that this indicator be stratified. Panelists rated this indicator as one entity, although it was
eventually split into three indicators: vaginal delivery with instrument, vaginal delivery without
instrument, and cesarean section. Given the high overall rating, all three indicators were retained
as part of the Accepted provider level indicator set. Also, a JCAHO 3rd and 4th degree laceration
indicator was tested in the empirical analyses as part of the Experimental indicator set.

Experimental Indicators

Aspiration Pneumonia

This indicator is intended to flag cases of perioperative aspiration pneumonia. It is based on an
indicator developed as part of the Complications Screening Program,7 although this indicator
adds two “E-codes”. This indicator limits aspiration pneumonia codes to secondary diagnosis
codes in order to eliminate aspiration pneumonia that was present on admission. It further
excludes patients with a primary diagnosis of seizure, trauma, drug overdose or poisoning, as
these patients may have aspiration pneumonia or a precursor condition present on admission.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [aspiration pneumonia] in any secondary
diagnosis field per 100 surgical discharges.
Denominator All [elective] [surgical] discharges.

Exclude patients with a principal diagnosis of [seizure], [trauma], [drug
overdose], or [poisoning].

Exclude all obstetric admissions (MDC 14 and 15).
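
The definition above treats the principal and secondary diagnosis fields differently: the principal diagnosis drives exclusions, while only secondary diagnoses can raise an event. A hedged Python sketch of that distinction follows. The `Discharge` layout and the placeholder labels in the usage note are assumptions, and the bracketed code sets are passed in rather than spelled out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

OBSTETRIC_MDCS = {14, 15}  # excluded per the definition above


@dataclass
class Discharge:
    # Hypothetical record layout, for illustration only.
    principal_dx: str
    secondary_dx: List[str] = field(default_factory=list)
    elective: bool = False
    surgical: bool = False
    mdc: int = 0


def at_risk(d: Discharge, excluded_principal: Set[str]) -> bool:
    """Elective surgical, non-obstetric, and no excluded principal diagnosis."""
    return (
        d.elective
        and d.surgical
        and d.mdc not in OBSTETRIC_MDCS
        and d.principal_dx not in excluded_principal
    )


def aspiration_rate(
    discharges: List[Discharge],
    aspiration_codes: Set[str],
    excluded_principal: Set[str],
) -> Optional[float]:
    population = [d for d in discharges if at_risk(d, excluded_principal)]
    if not population:
        return None
    # Only secondary fields are searched, so a principal diagnosis of
    # aspiration pneumonia (likely present on admission) never counts.
    events = sum(1 for d in population if aspiration_codes.intersection(d.secondary_dx))
    return 100.0 * events / len(population)
```

For example, `excluded_principal` could hold the combined [seizure], [trauma], [drug overdose] and [poisoning] code sets.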

Post-Conference Call Panel Ratings (a)

Question                        Median (MS)   Agreement status (MS)   Median (S)   Agreement status (S)
Overall rating                  6             Indeterminate           6.5          Indeterminate
Not present on admission        7             Agreement               8            Indeterminate
Preventability                  6             Indeterminate           6            Indeterminate
Due to medical error            6             Disagreement            5.3          Indeterminate
Charting by physicians          7             Indeterminate           5.3          Agreement
Bias (lower rating favorable)   5             Indeterminate           3            Indeterminate
(a) Multi-specialty panel (MS) – Surgical Complications 3
    Surgical panel (S) – Surgical Complications 3

Multi-specialty Panel Results

Changes to the indicator. The panel suggested that the population at risk may be too
broad: before emergent or urgent surgery there may not be adequate time to screen patients
for risk factors, including having food matter in the stomach, so these patients are more
susceptible to aspirating perioperatively. For this reason, this panel suggested
the population at risk be limited to patients undergoing elective surgery only.
Concerns not addressable through changes. Panelists expressed concern about the
diagnosis of this complication. Different physicians diagnose pneumonia differently, with some
relying on clinical factors such as chest x-ray and sputum analysis, and others requiring
bronchoscopy to verify the diagnosis. In addition, some physicians may not label the pneumonia
as due to “aspiration” but simply as pneumonia. Panelists noted that such differences may lead to
bias for this indicator.
Panelists also noted that the preventability of aspiration pneumonia varies depending on
the timing of the aspiration. Aspirations occurring during surgery and in the recovery room are
often preventable using preoperative interventions. Pneumonia resulting from these aspirations
may be further preventable through administration of medications peri-operatively. However,
aspirations that occur later in a hospitalization, for instance in an intensive care unit while a
patient is intubated, are less preventable. Because it is impossible to distinguish the timing of the
complication using administrative data, this concern cannot be addressed through changes to the
indicator definition.

Surgical Panel Results

Changes to the indicator. The surgical panel suggested limiting the population at risk to
patients undergoing elective surgery, for reasons similar to those of the multi-specialty panel. They also
added that, even with the exclusion of trauma, seizure, drug overdose and poisoning patients,
it is impossible to tell whether patients admitted emergently or urgently aspirated before
admission or perioperatively.
Concerns not addressable through changes. The surgical panel also expressed concern
regarding the diagnosis of aspiration pneumonia for similar reasons as the multi-specialty panel.
Also like the multi-specialty panel, the surgical panel expressed concern about the varied
preventability of this complication. They suggested, in addition, that the timing of the aspiration
be tracked carefully, if at all possible. They added that elderly and highly medicated patients
are more likely to aspirate later in a hospitalization.

Summary Across Panels


Both panels expressed equivocation about this indicator. While the idea of tracking
preventable aspiration pneumonia was of interest, the panels expressed skepticism about whether
or not it can be done with administrative data. Both panels suggested the same revisions to this
indicator, which are incorporated in the definition of this indicator. The overall rating of this
indicator did not meet criteria for full acceptance, and thus this indicator was retained only in the
Experimental indicator set.

CABG Following PTCA

This indicator is intended to flag cases where CABG follows a PTCA in the same
hospitalization, presumably due to complications of that procedure. This indicator was adapted
from several published studies, which used CABG after PTCA to examine operator proficiency
in relation to procedure volume. 127-134, 160
Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [CABG] in any procedure field per 100
discharges with PTCA in any procedure field.

CABG must occur on the same day or the day after the PTCA procedure.
Denominator All discharges with ICD-9-CM code for [PTCA] in any procedure field.
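The timing rule in the numerator (CABG on the same day as, or the day after, the PTCA) is the only part of this definition that depends on procedure dates. The Python sketch below illustrates it under stated assumptions: procedures are given as hypothetical `(label, date)` pairs already mapped from the [PTCA] and [CABG] code sets, and dates are available only at day resolution, so a same-day CABG is counted without knowing the order of the two procedures.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

Procedure = Tuple[str, date]  # (hypothetical label "PTCA" or "CABG", procedure date)


def cabg_follows_ptca(procedures: List[Procedure]) -> bool:
    """True if any CABG falls on the day of, or the day after, any PTCA."""
    ptca_dates = [day for label, day in procedures if label == "PTCA"]
    cabg_dates = [day for label, day in procedures if label == "CABG"]
    return any(
        timedelta(0) <= cabg - ptca <= timedelta(days=1)
        for ptca in ptca_dates
        for cabg in cabg_dates
    )


def cabg_after_ptca_rate(discharges: List[List[Procedure]]) -> Optional[float]:
    """Events per 100 discharges with PTCA in any procedure field."""
    with_ptca = [p for p in discharges if any(label == "PTCA" for label, _ in p)]
    if not with_ptca:
        return None
    events = sum(1 for p in with_ptca if cabg_follows_ptca(p))
    return 100.0 * events / len(with_ptca)
```

A CABG performed two or more days after the PTCA, or before it, leaves the discharge in the denominator but not in the numerator.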
Post-Conference Call Panel Ratings (a)
Question                           Median         Agreement status
Overall rating                     7              Agreement
Not present on admission           Not reported   Not reported
Preventability                     Not reported   Not reported
Due to medical error               Not reported   Not reported
Charting by physicians             Not reported   Not reported
Bias (lower rating is favorable)   Not reported   Not reported
(a) Procedural Complication 1 Multi-specialty Panel
Summary
Overall this indicator was rated as useful, although the panelists were interested in having
more cardiologists consulted. The only cardiologist on the panel rated the indicator as very poor.
As the other panelists do not perform PTCA or care for PTCA patients, and since we were unable to
review this indicator with a panel of cardiologists, we assigned this indicator to the
Experimental indicator set, requiring further review. The remaining results from the
multi-specialty panel are not reported due to panelists’ concerns about rating this indicator.
The denominator for this indicator includes children who receive PTCA; however, this is
rare, except in the setting of underlying coronary artery anomalies or cardiac transplantation.

Decubitus Ulcer in High Risk Patients


(See “Decubitus ulcer” in Accepted indicators section. This Experimental indicator was not rated
by panelists.)

In-Hospital Fractures Possibly Related to Falls

(See “In-hospital hip fracture” in Accepted indicators section.)

Intraoperative Physical Injuries

(Renamed “Intraoperative nerve compression injuries” after exclusion of corneal abrasions
and lip lacerations)

This indicator is intended to flag cases of minor physical trauma caused by the handling
of patients in the peri-operative period, particularly the unconscious and/or anesthetized patient.
Trauma patients are excluded as these patients may have such complications on admission.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM code for [nerve compression injuries] AND a
diagnosis code of 997.09 in any secondary diagnosis field per 100 surgical
discharges.
Denominator All [surgical] discharges.

Exclude patients with a principal diagnosis of [trauma].

Exclude patients with a principal diagnosis of [disorders of the peripheral
nervous system] or [dorsopathies].
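
Unlike most indicators in this section, the numerator above requires two codes together: a [nerve compression injuries] code AND 997.09. A short Python sketch of that pairing follows. The nerve compression code set is passed in as a parameter because its contents are not listed here, and requiring both codes to appear in secondary diagnosis fields is an assumed reading of the definition.

```python
from typing import List, Set

PAIRED_CODE = "997.09"  # named in the definition above


def is_nerve_compression_event(secondary_dx: List[str], nerve_codes: Set[str]) -> bool:
    """True only when a nerve compression code and 997.09 are both present."""
    dx = set(secondary_dx)
    return bool(dx & nerve_codes) and PAIRED_CODE in dx
```

Either code on its own does not raise an event under this reading, which narrows the indicator to nerve injuries that are also coded as complications of care.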

Post-Conference Call Panel Ratings (a)

Question                        Median (MS)   Agreement status (MS)   Median (S)   Agreement status (S)
Overall rating                  8             Agreement               8            Agreement
Not present on admission        7             Agreement               8            Agreement
Preventability                  8             Agreement               8            Agreement
Due to medical error            7             Agreement               5            Disagreement
Charting by physicians          7             Agreement               5            Indeterminate
Bias (lower rating favorable)   5             Disagreement            4            Indeterminate
(a) Multi-specialty panel (MS) – Surgical Complications 3
    Surgical panel (S) – Surgical Complications 1

Multi-specialty Panel Results

This indicator was suggested by the multi-specialty panel in lieu of the complications of
anesthesia. It was not rated in the initial evaluation, and was briefly discussed for
operationalization reasons during the conference call. The panelists suggested that lip
lacerations, corneal abrasions and brachial plexopathy be used as complications of surgery.

Surgical Panel Results

Changes to the indicator. The surgical panel felt that superficial injuries to the cornea
were not of interest to track, as they are temporary and clinically less significant injuries. In
addition, this panel suggested that potentially minor lip lacerations be eliminated, leading to the
elimination of the code for uncomplicated open wound to the lip.
The surgical panel suggested that additional nerve compression injuries, such as injuries
to the ulnar nerve, be added, as they felt that these injuries are important to track as well.
Concerns not addressable through changes. Panelists felt that if these injuries could
be accurately detected, they would be of great interest to track. They noted that these injuries, while
they often resolve, are distressing to patients, and rather preventable. Panelists did suggest,
however, that some of these injuries would not be reliably charted by the physician.

Summary Across Panels

Both panels agreed that the indicator captured complications that affected the patient, and
that were likely to be preventable with careful patient handling. The indicator was slated for the
Accepted indicator set, but further information about specification based on coding input raised
concerns. For example, lip laceration could not be reliably detected through administrative data,
leading to the renaming of this indicator to better reflect the remaining codes, nerve compression
injuries. In addition, corneal abrasions were included in the specification rated by the panelists,
but ophthalmology specialists would need to be consulted to assess the face validity of including
this complication. Concerns about charting from the panelists, along with coding conventions
related to a relatively new pertinent code used in the indicator (997.09) resulted in demoting the
indicator to the Experimental indicator set.
Recent evidence has suggested that patient factors, such as previous subclinical nerve
dysfunction, may play a large role in nerve compression injuries.167 In exploring this indicator,
attention should be paid to the potential preventability of these complications. In addition, these
conditions are much less common among children than among adults.

Malignant Hyperthermia

This indicator is intended to flag cases of malignant hyperthermia. Cases of trauma are
excluded, as these patients may be more susceptible to complications.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for malignant hyperthermia (995.86) in any
diagnosis field per 100 surgical discharges.
Denominator All [surgical] discharges.

Exclude all obstetric admissions (MDC 14 and 15).


Post-Conference Call Panel Ratings (a)

Question                        Median (MS)   Agreement status (MS)   Median (S)   Agreement status (S)
Overall rating                  7             Agreement               7.5          Indeterminate
Not present on admission        8             Agreement               8.8          Agreement
Preventability                  7             Indeterminate           5.5          Indeterminate
Due to medical error            6             Disagreement            3.3          Indeterminate
Charting by physicians          8             Agreement               8.5          Agreement
Bias (lower rating favorable)   2             Agreement               1.5          Agreement
(a) Multi-specialty panel (MS) – Surgical Complications 3
    Surgical panel (S) – Surgical Complications 3

Multi-specialty Panel Results


Changes to the indicator. No changes were suggested for this indicator.
Concerns not addressable through changes. This indicator was created by the panel
during the conference call. As a result panelists only commented on this indicator through
written comments. Some panelists noted that this complication is only preventable if a family or
personal history of malignant hyperthermia is detected preoperatively. If the question is not
asked, or the history ignored, then the complication is undoubtedly due to medical error.
However, when the family history is not known or reported by the patient when asked, then the
complication is not preventable. Therefore, this rare complication would need to be examined on
a case by case basis.

Surgical Panel Results


Changes to the indicator. No changes were suggested for this indicator.
Concerns not addressable through changes. Panelists expressed a similar concern about
two opposing aspects of this indicator: depending on prior knowledge of family history, the
complication is either almost entirely preventable or impossible to prevent. They also noted that this rare
complication must be considered on a case by case basis.
Panelists also noted that a more appropriate denominator would be all procedures in
which anesthesia is used. However, it is impossible to define the denominator as all procedures
with anesthesia using administrative data. Thus some complications may be missed, as a result of
limiting the population at risk to surgical cases.

Summary Across Panels


The overall usefulness of this indicator was rated relatively highly by both panels, with
the caveat that some cases are not entirely preventable. Panelists appeared to have conflicting
opinions about this indicator, although the final rating did not reflect disagreement. Most
panelists agreed that when a family history is known and proper screening and/or preventative
measures are not taken, this is a clearly preventable complication. However, the frequency
of this complication occurring under those circumstances is likely to be low. More frequently, a
family history is unknown or unclear, and in these cases there is no link to quality of care. It has
been suggested that death due to malignant hyperthermia may be a better measure than malignant
hyperthermia alone; however, this idea was neither reviewed by the panels nor empirically
examined. This code was implemented in 1998, and thus this indicator could not be analyzed
empirically using available data. For this reason this indicator was assigned to the Experimental
indicator set.

Postoperative Acute Myocardial Infarction (AMI)

This indicator is intended to flag cases of postoperative AMI. It is similar to an indicator
developed as part of the Complications Screening Program.7 Codes denoting a “subsequent
episode of care” for AMI are not included. This indicator limits AMI codes to secondary
diagnosis codes in order to eliminate AMIs that were present on admission. It includes only
patients undergoing elective surgery, and excludes patients who are undergoing cardiac surgery,
as these patients may be more likely to develop an AMI perioperatively.
Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [Acute Myocardial Infarction] in any
secondary diagnosis field per 100 non-cardiac surgical discharges.

Denominator [Elective], [surgical] discharges.

Exclude patients undergoing [cardiac surgery].

Exclude all obstetric admissions (MDC 14 and 15).


Post-Conference Call Panel Ratings (a)
Question                        Median (MS)   Agreement status (MS)   Median (S)   Agreement status (S)
Overall rating                  4             Indeterminate           7            Indeterminate
Not present on admission        7             Indeterminate           8            Agreement
Preventability                  5             Indeterminate           6            Disagreement
Due to medical error            4             Indeterminate           5            Indeterminate
Charting by physicians          7             Indeterminate           8            Agreement
Bias (lower rating favorable)   5             Disagreement            6            Indeterminate
(a) Multi-specialty panel (MS) – Surgical Complications 1
    Surgical panel (S) – Surgical Complications 1

Multi-specialty Panel Results


Changes to the indicator. Panelists felt that the risk of acute myocardial infarction
varies greatly depending on the comorbidities of the patient, the type of procedure, and the
urgency of the procedure. While preventative interventions (e.g., use of beta-blockers in high
risk patients) may decrease the postoperative AMI rate, these interventions may be impossible to
implement for urgent cases, where there is not adequate time for appropriate screening and risk
stratification. In addition, beta-blockers may be inappropriate for trauma patients. Due to these
concerns, the panel felt it was best to limit the population at risk to elective surgical patients,
who could be appropriately assessed before surgery.
Concerns not addressable through changes. Panelists expressed concerns over the
preventability of this complication in some patients. Some patients may be appropriately
screened, and assessed, but may have some risk factors. However, the benefits of surgery may
outweigh the risk of AMI. Panelists advocated that some established algorithms of AMI risk,
such as that adopted by the American Society of Anesthesiologists, may be helpful in
appropriately risk adjusting this indicator. However, the clinical detail required for these
algorithms is not available in administrative data. As a result, this panel strongly encouraged the
use of this indicator only for internal reporting, noting the caveat that many AMIs may not have
been preventable. Some panelists felt that examining the appropriate use of beta-blockers
directly would be a more appropriate indicator.
In addition to the known risk factors in patients, unknown coronary artery disease may
predispose a patient to having a non-preventable postoperative AMI.

Surgical Panel Results

Changes to the indicator. The surgical panel questioned the exclusion of MDC 5, as
this MDC included vascular surgery patients. Unlike patients undergoing cardiac surgery, for
whom it is difficult to establish whether or not an AMI actually occurred, AMI in vascular
patients can be established. Panelists felt that vascular surgery patients were an important
population at risk for this complication, and thus should not be excluded. The exclusion of MDC
5 was removed, and cardiac surgery patients were excluded using the existing exclusion criteria
based on DRGs and ICD-9-CM codes.
The surgical panel advocated for the limitation of the population at risk to elective
surgery for similar reasons as the multi-specialty panel. However, they noted that many of the
AMIs in this risk group would not be preventable, since they would be unexpected.
Concerns not addressable through changes. The surgical panel also expressed concern
over the variable preventability of this complication. They noted that the preventability of this
complication depends on the risk factors of the patient. Interventions exist to reduce the chance
of AMI in patients with known coronary artery disease. However, some patients may have
unknown disease, or other unknown risk factors. These patients could not receive preventative
interventions. In addition, the panel noted that older patients are at higher risk, and advocated for
stratification of older patients.

Summary Across Panels

The two panels reached different conclusions regarding the usefulness of this indicator
(i.e., rejected by multi-specialty panel, accepted by surgical panel). Neither panel was considered
to carry more weight because of their unique knowledge of the complication. As a result, the
panel scoring was combined, which resulted in this indicator being assigned to the Experimental
indicator set. In addition, the multi-specialty panel did not discuss the removal of the exclusion
of MDC 5. However, the surgical panel's objection to the exclusion appeared clinically sound. For
this reason the removal of the exclusion was retained in the final definition.
Many patients experiencing postoperative AMI have pre-existing subclinical or clinical
coronary artery disease. These diseases are rare in children.

Postoperative Iatrogenic Complications


(All complications reviewed in one indicator)

This indicator is intended to flag cases of postoperative iatrogenic complications. It is
closely related to an indicator developed as part of the Complications Screening Program.7 This
indicator limits complication codes to secondary diagnosis codes in order to eliminate
complications that were present on admission.

Final Definition: Postoperative Iatrogenic Complications - Nervous System Complications


Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes of [iatrogenic nervous system complications]
in any secondary diagnosis field per 100 surgical discharges.
Denominator All [surgical] discharges.

Exclude all obstetric admissions (MDC 14 and 15).

Final Definition: Postoperative Iatrogenic Complications - Cardiac Complications


Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes of 997.1 in any secondary diagnosis field per
100 surgical discharges.
Denominator All [surgical] discharges.

Exclude all obstetric admissions (MDC 14 and 15).
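
The definition above can be read as a filter-and-count over discharge records. The following is a minimal sketch, assuming a hypothetical record layout (`is_surgical`, `mdc`, `secondary_dx`) that is not part of this report. Only the code 997.1, the exclusion of MDC 14 and 15, and the scaling to events per 100 discharges are taken from the definition.

```python
from dataclasses import dataclass, field
from typing import List, Optional

CARDIAC_IATROGENIC_CODE = "997.1"  # from the definition above
OBSTETRIC_MDCS = {14, 15}          # excluded from the population at risk


@dataclass
class Discharge:
    # Hypothetical record layout, for illustration only.
    is_surgical: bool
    mdc: int
    secondary_dx: List[str] = field(default_factory=list)


def cardiac_complication_rate(discharges: List[Discharge]) -> Optional[float]:
    """Return events per 100 discharges at risk, or None if nobody is at risk."""
    # Denominator: surgical discharges outside the obstetric MDCs.
    at_risk = [d for d in discharges
               if d.is_surgical and d.mdc not in OBSTETRIC_MDCS]
    if not at_risk:
        return None
    # Numerator: the code appears in any secondary diagnosis field.
    events = sum(CARDIAC_IATROGENIC_CODE in d.secondary_dx for d in at_risk)
    return 100.0 * events / len(at_risk)
```

Because only secondary diagnosis fields are searched, a principal diagnosis of 997.1 would not be counted, which mirrors the intent of excluding complications present on admission.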

Final Definition: Postoperative Iatrogenic Complications – Digestive System Complications


Quality Measure Number of events per 100 discharges of population at risk
Numerator Secondary dx codes of iatrogenic complication of digestive system (997.4)
Denominator [Surgical] patients

Final Definition: Postoperative Iatrogenic Complications – Respiratory Complications


Quality Measure Number of events per 100 discharges of population at risk
Numerator Secondary dx code of iatrogenic complication of respiratory system (997.3)
Denominator [Surgical] patients

Final Definition: Postoperative Iatrogenic Complications – Urinary Complications


Quality Measure Number of events per 100 discharges of population at risk
Numerator Secondary dx code of iatrogenic complications of urinary system (997.5)
Denominator [Surgical] patients

Final Definition: Postoperative Iatrogenic Complications – Vascular Complications
Quality Measure Number of events per 100 discharges of population at risk
Numerator Secondary dx code of iatrogenic peripheral vascular complication (997.2)
Denominator [Surgical] patients

Post-Conference Call Panel Ratings (a)

Question                          Median        Agreement status
Overall rating                    Not reported  Not reported
Not present on admission          Not reported  Not reported
Preventability                    Not reported  Not reported
Due to medical error              Not reported  Not reported
Charting by physicians            Not reported  Not reported
Bias (lower rating is favorable)  Not reported  Not reported

(a) Procedural Complications 1 Multi-specialty Panel

After the panelists rated this indicator, the project team received additional pertinent
details about coding conventions for iatrogenic complications coded with 997.xx. These
conventions would have been important to the discussion of the indicator and would likely have
influenced the panelists’ ratings. As a result, the actual ratings are not reported. The indicator
also included six distinct clinical areas that could be defined separately: urinary, digestive,
respiratory, vascular, cardiac, and nervous system. Empirical analysis of patients who receive
these codes indicated that four of the six areas were capturing clinically minor
complications that may not be of interest to track. The remaining two areas, cardiac and nervous
system, appeared to identify cases of potentially serious clinical complications and were
therefore retained for further empirical evaluation. However, it would not have been appropriate
to include these two indicators in the Accepted indicator set, since a clinical panel did not fully
assess their face validity. Thus, these two indicators were assigned to the Experimental set, and
all others were not considered further.

Reopening of Surgical Site

This indicator is intended to flag cases where a surgical site is reopened. It is closely
related to an indicator developed as part of the Complications Screening Program.7 This indicator
limits reopening codes to secondary procedure codes in order to eliminate scheduled reopening
of surgical sites. To further ensure that the reopening of a surgical site is associated with a
principal procedure, the reopening must occur at least one day after the principal procedure.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [reopening of a surgical site] in any
secondary procedure field per 100 surgical discharges.

Reopening of surgical site must occur at least one day after the principal
procedure.

Revision of vascular procedure 39.49 must occur within 24 hours of principal
procedure.
Denominator All [surgical] discharges.

Post-Conference Call Panel Ratings (a)

Question                       Median (MS)  Agreement status (MS)  Median (S)  Agreement status (S)
Overall rating                 6            Indeterminate          7           Indeterminate
Not present on admission       7            Agreement              7           Indeterminate
Preventability                 7            Indeterminate          7           Indeterminate
Due to medical error           6            Indeterminate          6           Indeterminate
Charting by physicians         7.5          Agreement                          Agreement
Bias (lower rating favorable)  3.5          Agreement              5           Indeterminate

(a) Multi-specialty panel (MS) – Surgical Complications 2
    Surgical panel (S) – Surgical Complications 2

Multi-specialty Panel Results

Changes to the indicator. Panelists felt the codes for revision of the heart or a vascular
procedure were inherently different from other reopening of surgical site codes. Therefore these
codes were removed from the definition. Panelists also felt that trauma patients may undergo
reopening of surgical sites as a planned procedure. For this reason they suggested that trauma
patients be excluded from this indicator. Finally, this panel felt that immunocompromised
patients may undergo reopening of surgical site that is not preventable due to wound infection or
other complications. Therefore these patients were excluded.
Concerns not addressable by changes. Panelists felt that the preventability of this
indicator depends on the reason for reopening. In addition, panelists felt that patient factors such
as comorbidities or immunocompromised state may increase the likelihood that a patient would
develop this complication.

Surgical Panel Results

Changes to the indicator. Panelists suggested the removal of the code for a correction
procedure on the heart, for similar reasons as the multi-specialty panel. However, they rejected
the removal of the code for revision of vascular procedure, instead opting for the limitation to
procedures occurring within 24 hours of the principal procedure. It was felt that these early
complications are most likely preventable, due to poor technique or poor patient selection.
Concerns not addressable by changes. Panelists noted that some procedures are
purposely staged procedures, and that these procedures should be removed. However, it is
impossible to remove all staged procedures using ICD-9-CM codes. In addition, some patients
may be at higher risk of reopening, such as when a patient undergoes the removal of failed
hardware after an orthopedic surgery.

Summary Across Panels
The definition of this indicator relies on ICD-9-CM codes that, by definition, cover only
reopenings that cannot be described using another ICD-9-CM code. Thus, reopenings that result in
a more complicated procedure than simply a reopening of the surgical site would not be captured
by this indicator. Panelists were not aware of this caveat when rating this indicator, and it was
felt that their ratings did not truly reflect the actual nature of this indicator. In addition,
panelists requested that planned reopenings such as staged procedures be excluded. The
operationalization of this suggestion was beyond the scope of this study, as it would have
required a full review of ICD-9-CM procedure codes. Thus, this indicator was retained only in
the Experimental indicator set.

Suture of Laceration

This indicator is intended to flag cases of lacerations during a surgical procedure, which
result in a suturing procedure. It is closely related to an indicator developed as part of the
Complications Screening Program,7 although it does add codes for the suture of laceration of
diaphragm, blood vessel, small intestine, and anus. This indicator limits suture of laceration
codes to secondary procedure codes in order to isolate those lacerations that can truly be linked
to a surgical procedure. For the same reason, this indicator eliminates all sutures of lacerations
that take place before the principal procedure.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [suture of laceration] in any secondary
procedure field per 100 surgical discharges.

Suture of laceration must occur on the same day or after the principal procedure.
Denominator All [surgical] discharges.

Exclude patients with any diagnosis code for [foreign body] or [trauma].

Exclude all obstetric admissions (MDC 14 and 15).
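
The timing rule in this definition (the suture must occur on the same day as or after the principal procedure) can be sketched as a small predicate. This is an illustration only: the code set `SUTURE_CODES` holds placeholder values, since the actual [suture of laceration] code list is not reproduced in this section, and representing procedure timing as days from admission is an assumption about the data layout.

```python
from typing import List, Optional, Tuple

# Placeholder code set: the actual [suture of laceration] ICD-9-CM list is
# defined elsewhere in the report and is NOT reproduced here.
SUTURE_CODES = {"XX.X1", "XX.X2"}  # hypothetical values


def flags_suture_of_laceration(
    principal_day: Optional[int],
    secondary_procedures: List[Tuple[str, Optional[int]]],
) -> bool:
    """True if any secondary suture code falls on or after the principal procedure day.

    Days are counted from admission. Procedures with a missing day are not
    counted, which is an assumption of this sketch.
    """
    if principal_day is None:
        return False
    return any(
        code in SUTURE_CODES and day is not None and day >= principal_day
        for code, day in secondary_procedures
    )
```

The stricter variant considered by the multi-specialty panel, which limits the indicator to sutures occurring the day after the principal procedure or later, would correspond to replacing `>=` with `>`.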

Post-Conference Call Panel Ratings (a)

Question                       Median (MS)  Agreement status (MS)  Median (S)  Agreement status (S)
Overall rating                 8            Agreement              5           Indeterminate
Not present on admission       7            Agreement              7           Agreement
Preventability                 8            Agreement              6           Indeterminate
Due to medical error           7            Indeterminate          6           Indeterminate
Charting by physicians         8            Indeterminate          6           Indeterminate
Bias (lower rating favorable)  4            Indeterminate          5           Indeterminate

(a) Multi-specialty panel (MS) – Surgical Complications 2
    Surgical panel (S) – Surgical Complications 2

Multi-specialty Panel Results

Changes to the indicator. Panelists expressed concern that lacerations vary in
morbidity. Some lacerations, minor in nature, would be considered routine during a procedure,
and may not be reported, depending on the detail of the surgical notes. Some surgeons, however,
may report these minor lacerations, leading to bias in the reporting of lacerations. Panelists agreed
that some more serious lacerations are important complications to track. To ensure that
lacerations are consistently reported and are of sufficient morbidity to cause concern, this panel
suggested that the indicator be limited to lacerations that require a return to the operating room.
Administrative data do not allow for tracking returns to the operating room that occur on the
same day as the principal procedure. The only option to implement the suggestion would be to
limit suture of laceration codes to those occurring the day following the procedure or later.
Concerns not addressable by changes. No additional concerns were raised during the
conference call of surgical panels.

Surgical Panel Results

Changes to the indicator. Unlike the multi-specialty panel, the surgical panel disagreed
with the exclusion requiring a return to the operating room, because this required that the suture
of laceration occur at least one day after the principal procedure. They felt that this exclusion
would limit the number of flagged complications to a very small number, making the indicator less
useful.
The panel noted that the listed lacerations do not include lacerations that may occur
during all procedures. As a result, they suggested several types of lacerations that should be
included in the indicator, including obstetric and gynecological lacerations. Obstetric lacerations
are included in another indicator. For this reason these codes were not added. However,
gynecological lacerations were added, as were urological and nerve suture of laceration codes.
Concerns not addressable by changes. The surgical panel also noted that many
lacerations occurring during surgery are trivial in nature. They thought that these lacerations are
less likely to be recorded by the physician, and are less important to track. Many panelists felt
that the exclusion of the trivial lacerations from this indicator would be desirable, as this
restriction would limit complications to those causing significant morbidity for the patient.
Panelists noted that patient characteristics and procedure type greatly affect risk of a
laceration occurring. Lacerations may occur as an expected complication of the procedure,
during complex procedures on complicated structures, such as some types of hand surgery. It
was also noted that re-surgery or repeat surgery is the major risk factor for suture of laceration,
due to a build up of scar tissue. They noted that this case-mix difference is not addressable by
limiting the indicator to elective surgery. Since re-surgery cannot be adjusted for using
administrative data, panelists recommended that re-surgery rates be examined when using this
indicator.

Summary Across Panels


The two panels arrived at slightly different definitions. The first panel required a return to
the operating room, which was rejected by the second, all-surgeon panel. Empirical analysis
revealed that this restriction significantly lowers the number of cases. Since the second panel had
more expertise, the surgical panel’s definition was retained for further analysis. The surgical
panel rated the overall usefulness of this indicator relatively low and the multi-specialty panel
rated this definition very highly, so this indicator was assigned to the Experimental indicator set.

Experimental Obstetric Indicators


Obstetric Wound Complications - Cesarean Section Delivery

This indicator is intended to flag cases of potentially preventable delivery wound
complications in women delivering by cesarean section during the index hospitalization.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [cesarean wound complications] in any
diagnosis field per 100 deliveries.
Denominator All [cesarean delivery] discharges.

Post-Conference Call Panel Ratings (a)

Question                          Median  Agreement status
Overall rating                    7.5     Agreement
Not present on admission          8.5     Agreement
Preventability                    6.5     Indeterminate agreement
Due to medical error              2.5     Indeterminate agreement
Charting by physicians            7       Indeterminate agreement
Bias (lower rating is favorable)  5       Agreement

(a) Obstetric Complications 2 Panel

Changes to the indicator. This indicator was originally presented as a combined
indicator of all obstetric wound complications (cesarean and vaginal). Panelists felt that wound
complications of cesarean delivery differed substantially from those of vaginal delivery in both
cause and preventability. For this reason they suggested that these complications be split into two
separate indicators, and that the more useful indicator would be limited to cesarean deliveries.
Concerns not addressable through changes. Panelists expressed concern that the
severity and layer of the wound dehiscence could not be determined using this indicator. Thus
both superficial disruptions and deep fascial disruptions are combined into one indicator. If
possible, panelists felt that the deeper wound disruptions should be tracked more closely than
superficial disruptions. However, this is not possible with the current coding conventions.
Panelists noted that wound complications are less preventable in some subgroups, such as
patients with overall poor tissue health, diabetics, and those having had a prior c-section, and that
these risk factors are more common in patients with lower socioeconomic status. Thus, panelists
expressed concern that some bias may be present for this indicator based on patient case mix.

Finally, some panelists felt that the use of this indicator could lead to the inappropriate
overuse of antibiotics.

Summary
Panelists rated the overall usefulness of this indicator favorably. However, they rated the
extent to which this indicator reflected medical error as very poor. Because these indicators are
intended to identify potential patient safety problems, and given the lack of literature supporting
this indicator and the panel’s equivocal assessment of it, this indicator was assigned to the
Experimental indicator set.

Obstetric Wound Complications - Vaginal Delivery

This indicator is intended to flag cases of potentially preventable delivery wound
complications in women delivering during the index hospitalization.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [perineal wound complications] in any
diagnosis field per 100 deliveries.
Denominator All [vaginal delivery DRGs].

Post-Conference Call Panel Ratings (a)

Question                          Median  Agreement status
Overall rating                    6.5     Indeterminate agreement
Not present on admission          8       Agreement
Preventability                    4       Indeterminate agreement
Due to medical error              3       Indeterminate agreement
Charting by physicians            6       Indeterminate agreement
Bias (lower rating is favorable)  5       Indeterminate agreement

(a) Obstetric Complications 2 Panel

Changes to the indicator. This indicator was originally presented as a combined
indicator of all obstetric wound complications (cesarean and vaginal). Panelists felt that wound
complications of cesarean delivery differed substantially from those of vaginal delivery in both
cause and preventability. For this reason they suggested that these complications be split into two
separate indicators. For patients who deliver vaginally, panelists agreed that diagnosis codes for
vulval and perineal hematoma should be added as they felt that these complications may be
preventable.
Concerns not addressable through changes. Panelists felt that some case mix bias may
result from differing preventability of this complication. Patients having poor tissue health, poor
nutrition, underlying conditions such as diabetes, or undergoing operative vaginal delivery would
be more susceptible to this complication. Panelists also noted that many perineal wound
disruptions are not apparent until after hospital discharge. Thus a large percentage of these
wound disruptions would be missed using inpatient administrative data. Finally, panelists
expressed concern that the use of this indicator may lead to a higher cesarean section rate, as
physicians avoid operative delivery or episiotomies.

Summary
Panelists were uncertain about the usefulness of this indicator and they clearly noted that
this complication is not reflective of medical error. Because of this ambiguity,
the indicator was retained in the Experimental indicator set for further investigation.

Other Obstetric Complications


Uterine Rupture

This “other obstetric complications” indicator is intended to flag cases of potentially
preventable delivery complications in women delivering during the index hospitalization. The
“Uterine rupture” indicator became a separate indicator based on panel input, and is intended to
flag cases of uterine rupture in women who have undergone a trial of labor.

Final Definition: Other Obstetric Complications


Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [other obstetrical complications] in any
diagnosis field per 100 deliveries.
Denominator All [deliveries].

Final Definition: Uterine rupture


Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM codes for [rupture of uterus during or after labor]
in any diagnosis field per 100 deliveries with trial of labor.
Denominator All deliveries with a [trial of labor].

Post-Conference Call Panel Ratings (a)

Question                          Median  Agreement status
Overall rating                    6.5     Indeterminate Agreement
Not present on admission          8       Agreement
Preventability                    5       Indeterminate Agreement
Due to medical error              5       Indeterminate Agreement
Charting by physicians            8       Agreement
Bias (lower rating is favorable)  5       Indeterminate Agreement

(a) Obstetric Complications 2 Panel

Changes to the indicator. Panelists suggested that the rate of uterine rupture be adjusted
for vaginal birth after cesarean section (VBAC) rate, as these patients are well documented to be
at higher risk of uterine rupture. To address the intent of this suggestion, a separate indicator was
specified to measure the rate of uterine rupture only for patients who have a trial of labor.
Panelists rated the “Other obstetric complications” indicator, with uterine rupture included, but
adjusted for VBAC rate. The implementation of the “Uterine rupture” indicator occurred after
the panelists’ final evaluation.
Concerns not addressable through changes. Panelists expressed concern that the
preventability of these heterogeneous and relatively rare complications varies by the
complication. They noted that a majority of these complications are not easily preventable,
although some are minimized if a diagnosis is made and treatment promptly started. They noted
that patient comorbidities and factors influence some of these complications, and that referral
centers receive more of these patients than other centers.
Panelists were concerned that differences in coding may affect this indicator. For
instance, some benign uterine ruptures, so-called uterine windows, may be coded even when they are
clinically insignificant. Panelists were not interested in tracking these minor complications, but
the restrictions of administrative data make tracking only severe complications impossible.

Summary
Panelists were uncertain about the usefulness of this indicator and they clearly noted that
this complication is not reflective of medical error. Because of this ambiguity,
the indicator was retained in the Experimental indicator set for further investigation. Also stemming
from this indicator was a separate uterine rupture indicator. Although panelists requested that
uterine rupture be combined with other complications, such that this currently widely discussed
complication would not be singled out, the requested risk adjustment for trial of labor after
cesarean was not easily operationalized when uterine rupture was combined with other
complications for which this risk adjustment was inappropriate. The uterine rupture indicator
was also retained in the Experimental indicator set.

Post-partum Urinary Tract Infection (UTI)

This indicator is intended to flag cases of potentially preventable puerperal urinary tract
infections in women delivering during the index hospitalization. This indicator excludes patients
with infection of the amniotic cavity, as infection in these patients is more likely to be present on
admission or non-preventable. This indicator was suggested by one of the obstetric complication
panels.

Final Definition
Quality Measure Number of events per 100 discharges of population at risk
Numerator Discharges with ICD-9-CM code of 646.62 or 646.64 in any diagnosis field per 100
deliveries.
Denominator All [cesarean delivery] and [vaginal delivery] discharges

Post-Conference Call Panel Ratings (a)

Question                          Median  Agreement status
Overall rating                    7       Indeterminate agreement
Not present on admission          5       Indeterminate agreement
Preventability                    7       Indeterminate agreement
Due to medical error              3.5     Indeterminate agreement
Charting by physicians            7       Indeterminate agreement
Bias (lower rating is favorable)  3.5     Indeterminate agreement

(a) Obstetric Complications 2 Panel

Changes to the indicator. This indicator was suggested and created by the panel, due to
the interest in tracking post-partum urinary tract infections.
Concerns not addressable through changes. Several concerns about this indicator
were raised, although most panelists remained interested in tracking this complication, since its
use may decrease unnecessary catheterization. Panelists felt that some hospitals may have a
higher rate of these complications due to patient case mix. Specifically, they noted that patients
with other infections or overall poor health are more likely to develop these complications. These
factors vary systematically with socioeconomic status. Also, patients that undergo operative
delivery or regional anesthesia may be at higher risk of developing post-partum UTI. Further,
they noted that many of these complications develop after discharge. Thus, there may be
significant underreporting resulting from the exclusive use of inpatient data. Finally, panelists
expressed concern that the use of this indicator would lead to the inappropriate overuse of
antibiotics.

Summary
Panelists rated the overall usefulness of this indicator favorably. However, they rated the
extent to which this indicator reflected medical error as very poor. Because these indicators are
intended to identify potential patient safety problems, and given the lack of literature supporting
this indicator and the panel’s equivocal assessment of it, this indicator was assigned to the
Experimental indicator set.

Third or Fourth Degree Obstetric Laceration


(This indicator was not reviewed. See “Obstetric trauma” in Accepted indicators section for
discussion.)

Uterine Rupture
(See “Other obstetric complications.”)

Section 3E. Comparative Empirical Results


Extensive empirical analyses were conducted on indicators accepted by the clinical
panels as having met minimum criteria for face validity (i.e., Accepted Hospital Level Indicators,
Accepted Area Level Indicators). These analyses were intended to provide additional
information about indicators, rather than to serve as decision-making tools regarding the validity
of these indicators. Additional research exploring the validity of these indicators is discussed in
Chapter 4. The analyses included in this report are intended to provide guidance for future research
and use of these indicators, and include statistical measures of reliability, bias, relatedness of
indicators and persistence over time, in addition to adjusting for demographics, DRG and
comorbidities. MSX methods, correlation analysis and factor models investigated relationships
among the set of accepted indicators in order to identify potential underlying constructs (e.g.,
processes of care or structural characteristics) common to some or all of the indicators.1
Less extensive empirical analyses were conducted on the Experimental Hospital Level
Indicators, including statistical measures of reliability and bias, with adjustments for
demographics, DRG and comorbidities. Because there was no a priori reason to suspect an
underlying construct common to these heterogeneous measures, no attempt was made to identify
one. Therefore each of the experimental indicators is meant to be evaluated separately and
subjected to further investigation and refinement. Although there are exceptions, in general the
experimental indicators tend to have less systematic hospital level variation than the accepted
indicators, but do not appear to be more or less biased.
All of the findings on bias reflect the level of information available for risk adjustment
using HCUP SID data, and may therefore not apply to data sets that have more clinically detailed
data elements. The presence of “high bias” mentioned in this section suggests that risk
adjustment, using administrative data elements, is necessary to interpret hospital level
differences in the rates of these indicators. However, for all indicators, the risk adjustment that is
possible using HCUP data may or may not be adequate to correct potential bias.
The text in this section makes reference to numbered tables that can be found in
Appendix G. The figures and tables contained in this section graphically or categorically
summarize the numerical results in the Appendix G tables.
The empirical evidence presented here is intended to guide future use and development of
these PSIs. As such, the relevance of any particular piece of empirical evidence will depend on
the purpose of the analysis being conducted. However, among the accepted non-obstetric
hospital level indicators, five of the measures that appear to perform well on several different
dimensions, including reliability, bias, relatedness of indicators, and persistence over time, are
the following: “Complications of anesthesia,” “Postoperative wound dehiscence,” “Postoperative
hemorrhage or hematoma,” “Death in low mortality DRGs,” and “Postoperative hip fracture.”
The other 11 non-obstetric accepted indicators often perform well, and provide useful
information for their intended purpose. The obstetric indicators (“Birth trauma,” “Obstetric
trauma - vaginal delivery with instrumentation,” “Obstetric trauma - vaginal delivery without
instrumentation,” “Obstetric trauma – cesarean section”) also tend to perform well, though
partly because of the higher rates and consequently large amount of variation among providers in
these indicators; and partly because only age and gender risk adjustment was applied, so that the
indicators showed little apparent bias.

Accepted Hospital Level Indicators

An analysis of the overall rates of PSIs in the National SID found that the least frequent
PSI is “Transfusion reaction,” with only 16 cases in Florida and 129 cases in the National SID in
1997. The most frequent PSIs are “Obstetric trauma – vaginal delivery without instrumentation”
and “Failure to rescue,” with 120,858 and 135,085 cases in the National SID, respectively. The
total number of adverse events (numerator), the total number of patients at risk (denominator),
and the overall rate in Florida and the National SID for each accepted patient safety indicator can
be found in Appendix G Table 1. The rates for the Florida SID, used for initial testing, and the
National SID were generally similar.

1
The empirical analyses reported, except for raw rates, reflect a version of the indicator definitions
(e.g., specified software) prior to that specified in Appendices D and E. The prior version of the
software used in this report differed in three ways. First, for the indicator “Postoperative
hemorrhage or hematoma,” procedure codes for control of hemorrhage and hematoma were combined into a
single category, applied to either diagnosis, resulting in a 20% increase in this indicator’s rate
compared to the final definition. Second, “Postoperative hip fracture” included pediatric patients,
a group seldom experiencing this condition. Third, in the comorbidity software, when fifth digits
specified the presence of more than one comorbidity, only one comorbidity was assigned (renal
failure, if present, or congestive heart failure, if renal failure was not present). It is
anticipated that these minor changes would not affect the overall results of these analyses.

The mean hospital rates for each indicator in the National SID are depicted in Figure 1
below. A comparison of the National SID mean hospital rates and the Florida SID shows that
these rates are similar (see Appendix G Table 2), although the standard deviation and skew
statistic (which is a measure of the symmetry of the hospital level distribution) are greater in the
National SID than in Florida, especially for the relatively rare PSIs. This is likely true for most
individual states; the greater number of the hospitals in the National SID increases the detection
of occurrence for infrequent events. Also noteworthy in this analysis is that some indicators have
a substantial number of hospitals that do not have any discharges in the denominator. For the
obstetric indicators in particular, about one-fourth of hospitals have no deliveries at risk.

Figure 1. Summary of Mean Hospital Level Rates

[Figure not reproduced. Y-axis: Mean Rate (0.00 to 0.25). X-axis: PSI.]

The rates vary considerably across measures, from a high of 20.3% for “Obstetric trauma
– vaginal delivery with instrumentation” to a low of 0.001% for “Transfusion reaction” (which
represents 129 cases in the National SID in 1997). “Obstetric trauma – vaginal delivery without
instrumentation” and “Failure to rescue” also have much higher rates than the other PSIs, which
are generally 2% or less.
The apparent standard deviations (unadjusted for risk or reliability), shown in Figure 2,
also vary considerably among the measures, from a high of 14.2 percentage points for “Obstetric
trauma - vaginal delivery with instrumentation” (relative to a mean of 20.3 percentage points) to
a low of less than 0.1 percentage points for “Iatrogenic pneumothorax,” “Transfusion reaction”
and “Foreign body left during procedure.” The non-obstetric measures with the greatest amount
of hospital level variation in absolute magnitude are “Failure to rescue,” “Postoperative sepsis”
and “Decubitus ulcer.” Among the obstetric indicators, “Obstetric trauma (with and without
instrumentation)” has the most variance. Relative to the mean hospital level rate, the measures
with the greatest hospital level variation are “Postoperative physiological and metabolic
derangement,” and “Death in low mortality DRGs.” In other words, some of these measures
have low rates of occurrence, so the absolute magnitude of the variance is small, but the degree
of spread in the rates is relatively large.
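
The distinction between absolute and relative variation can be made concrete with the coefficient of variation (standard deviation divided by mean). The sketch below uses the mean and standard deviation quoted in the text for “Obstetric trauma - vaginal delivery with instrumentation”; the second, low-rate indicator is a hypothetical example and not a value from this report.

```python
def coefficient_of_variation(mean_rate, sd_rate):
    """Relative spread: standard deviation expressed as a fraction of the mean."""
    if mean_rate <= 0:
        raise ValueError("mean rate must be positive")
    return sd_rate / mean_rate

# Values quoted in the text (percentage points).
ob_trauma_cv = coefficient_of_variation(20.3, 14.2)

# Hypothetical low-rate indicator: small absolute spread, large relative spread.
rare_cv = coefficient_of_variation(0.05, 0.15)

print(f"OB trauma with instrumentation: CV = {ob_trauma_cv:.2f}")
print(f"Hypothetical rare indicator:    CV = {rare_cv:.2f}")
```

A measure can therefore rank low on absolute variation and high on relative variation at the same time.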

Figure 2. Summary of Standard Deviations in Hospital Level Rates

[Bar chart. Vertical axis: Standard Deviation (0.00 to 0.25). Horizontal axis: PSI.]

The hospital level variation tends to be skewed toward the right, meaning that there is a
long right-hand tail of hospitals with higher rates (see Appendix G, Table 3). The most highly
skewed measures are “Complications of anesthesia,” “Postoperative physiological and metabolic
derangement,” and “Death in low mortality DRGs,” with a median skew statistic for all
indicators of 10.0. Examples of the distributions may be found in Appendix G, Figures 1 and 2.
These figures show the distribution of hospital level rates for “Decubitus ulcer” (with a median
rate of 1.6%, a mean rate of 2.1% and skew statistic of 3.57) and “Birth trauma” (with a median
rate of 0.25%, a mean rate of 0.94% and a skew statistic of 11.85). Hospitals with zero rates,
which comprise 10% and 25% of hospitals for “Decubitus ulcer” and “Birth trauma,” respectively,
are excluded from the figures.
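
To illustrate what a positive skew statistic means, the sketch below computes a moment-based skewness for a set of hypothetical hospital-level rates. The formula used here (third central moment divided by the cubed standard deviation) is an assumption, since this section does not state which skew statistic was calculated, and the rates are invented for illustration.

```python
from statistics import mean, median

def skewness(values):
    """Moment-based skewness: third central moment over the cubed population SD."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5

# Hypothetical hospital-level rates (%): most hospitals are low, a few are very high.
rates = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 2.5, 6.0]

rate_mean = mean(rates)
rate_median = median(rates)
rate_skew = skewness(rates)

print(f"mean={rate_mean:.2f}  median={rate_median:.2f}  skew={rate_skew:.2f}")
```

As in the “Decubitus ulcer” and “Birth trauma” examples, a long right-hand tail pulls the mean above the median.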

Risk Adjustment

Three levels of risk adjustment were applied to the measures using a logistic model.
First, the hospital level measures were adjusted for age, gender and age-gender interactions. The
age groups are the standard age categories used by the National Center for Health Statistics
(NCHS) in their descriptive statistics, namely 0, 1-4, 5-14, 15-24, 25-34, 35-44, 45-54, 55-64,
65-74, 75-84 and 85+. Next, the measures were adjusted for age, gender, and modified DRG
category. The categories were modified to combine separate DRGs with and without
complications, and to exclude the super-MDC DRGs (e.g., Tracheostomies). Finally, the
measures were adjusted for age, gender, DRG and comorbidity, using a modified version of the
AHRQ comorbidity software. Details are provided in Section 2E Empirical Methods.
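
One way to picture the first level of adjustment is indirect standardization over age-gender cells. The sketch below is an assumption made for illustration and is not the report's model (Section 2E gives the actual method): with a full set of age-gender interactions and no other covariates, the fitted probabilities of a logistic model equal the observed cell rates, so expected events can be summed from reference cell rates. The function names, reference rates and discharges are all hypothetical; only the age categories come from the text.

```python
from bisect import bisect_right

# Lower bounds of the NCHS age categories listed above: 0, 1-4, 5-14, ..., 85+.
AGE_LOWER_BOUNDS = [0, 1, 5, 15, 25, 35, 45, 55, 65, 75, 85]
AGE_LABELS = ["0", "1-4", "5-14", "15-24", "25-34", "35-44",
              "45-54", "55-64", "65-74", "75-84", "85+"]

def age_group(age):
    """Map an age in whole years to its NCHS age category label."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return AGE_LABELS[bisect_right(AGE_LOWER_BOUNDS, age) - 1]

def risk_adjusted_rate(discharges, reference_rates, overall_rate):
    """Indirectly standardized rate: (observed / expected) * overall reference rate.

    discharges: list of (age, gender, had_event) tuples for one hospital.
    reference_rates: {(age_label, gender): event rate} in the reference population.
    """
    observed = sum(1 for _, _, had_event in discharges if had_event)
    expected = sum(reference_rates[(age_group(age), gender)]
                   for age, gender, _ in discharges)
    return observed / expected * overall_rate

# Hypothetical reference rates and a hypothetical hospital with an older case mix.
reference = {("65-74", "F"): 0.02, ("75-84", "F"): 0.04}
hospital = ([(70, "F", False)] * 49 + [(70, "F", True)]
            + [(80, "F", False)] * 47 + [(80, "F", True)] * 3)

raw_rate = sum(1 for _, _, e in hospital if e) / len(hospital)
adjusted = risk_adjusted_rate(hospital, reference, overall_rate=0.02)
print(f"raw rate = {raw_rate:.3f}, age-gender adjusted rate = {adjusted:.4f}")
```

In this example the hospital's raw rate is high partly because its patients are older; the adjusted rate is lower once the expected number of events for that case mix is taken into account.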
Overall, age-gender risk adjustment tended to increase the level of apparent hospital level
variation by about 2% (see Appendix G, Table 3). Given the low rates of occurrence,
“Transfusion reaction” and “Foreign body left in during procedure” were not risk adjusted for
technical reasons, although there may be conceptual reasons to risk adjust these indicators. The
impact was greatest on “Postoperative respiratory failure,” “Postoperative hemorrhage or
hematoma,” “Postoperative wound dehiscence,” and “Death in low mortality DRGs,” and
minimal on most other indicators. For most measures the rates are slightly more skewed after
adjustment, meaning that differences in the age-gender mix were masking differences in rates,
but several measures are slightly less skewed, meaning that some of the higher rates could be
accounted for by differences in the age-gender mix of the population at risk.
In addition to age-gender risk adjustment, DRG and comorbidity risk adjustment was
performed (see Appendix G Table 4). The obstetric measures are not adjusted for DRG. The
“Death in low mortality DRGs” indicator is also not adjusted for DRG. Rather, the indicator is
stratified by DRG group, namely medical (adult and pediatric), surgical (adult and pediatric),
neonatal, obstetric and psychiatric (See Appendix G, Table 1). Relative to age-gender
adjustment, the overall impact of DRG adjustment was greater, decreasing hospital level
variation by 4.1%. Comorbidity adjustment decreased variation by 1.6%. Most of the variation
among hospitals explained by the risk adjustment was accounted for by DRG, with incremental
amounts accounted for by the comorbidity categories, although comorbidity adjustment was
relatively more important for some indicators. DRG risk adjustment had the biggest impact on
“Technical difficulty with procedure,” “Failure to rescue,” “Infection due to medical care,” and
“Postoperative PE or DVT.” Comorbidity risk adjustment had the biggest impact on
“Postoperative respiratory failure,” “Infection due to medical care,” “Decubitus ulcer,” and
“Postoperative sepsis.” Variation in “Postoperative hemorrhage or hematoma” and “Death in
low mortality DRGs” actually increased slightly.

Reliability Adjustment

The effect of the reliability adjustment was examined by the statistics on the signal
standard deviation, signal share and signal ratio (see Appendix G, Table 5). Hospitals with fewer
than three patients in the denominator were not included in the reliability adjustment.
Multivariate methods (taking into account correlations among indicators in order to extract
additional 'signal') were applied to most of the accepted indicators. The exceptions were “Death
in low mortality DRGs” and “Failure to rescue.” Only univariate smoothing methods were
applied to these two indicators. Overall, the reliability adjustment reduced the hospital level
variation dramatically. On average, over one-half of the apparent hospital level variation, even
after risk adjustment, was estimated to be attributable to noise. The measures that were affected
the most by reliability adjustment in terms of reduction in the hospital level standard deviation
were “Postoperative physiological and metabolic derangement,” “Postoperative sepsis,” and
“Postoperative hemorrhage or hematoma.” The measures that were affected the least were
“Birth trauma,” “Iatrogenic pneumothorax” and “Technical difficulty with procedure.” (For
examples of the distribution of indicators see Appendix G, Figures 3 and 4.) These figures show
the distribution of hospital rates for “Decubitus ulcer” and “Birth trauma” after risk and
reliability adjustment.
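
The idea behind univariate smoothing can be illustrated with a standard shrinkage estimator, in which each hospital's observed rate is pulled toward the overall mean in proportion to how noisy it is. This is a generic textbook sketch under assumed values, not the report's MSX implementation (see Section 2E Empirical Methods); the overall rate, the signal variance and the two hospitals are hypothetical.

```python
def noise_variance(rate, n):
    """Binomial sampling variance of an observed rate based on n patients."""
    return rate * (1.0 - rate) / n

def shrink(observed_rate, n, overall_rate, signal_variance):
    """Reliability-weighted average of a hospital's observed rate and the overall rate."""
    noise = noise_variance(overall_rate, n)
    reliability = signal_variance / (signal_variance + noise)
    return overall_rate + reliability * (observed_rate - overall_rate)

OVERALL_RATE = 0.02        # hypothetical mean rate
SIGNAL_VARIANCE = 0.0001   # hypothetical systematic variance (signal SD of 1 point)

# Two hypothetical hospitals with the same observed rate but different volumes.
small = shrink(0.06, n=50, overall_rate=OVERALL_RATE, signal_variance=SIGNAL_VARIANCE)
large = shrink(0.06, n=5000, overall_rate=OVERALL_RATE, signal_variance=SIGNAL_VARIANCE)

print(f"small hospital: 0.060 -> {small:.4f}")
print(f"large hospital: 0.060 -> {large:.4f}")
```

The low-volume hospital's estimate moves most of the way back to the mean, which is why reliability adjustment reduces the apparent hospital level variation.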

MSX Statistics

The MSX statistics give estimates of the degree of total hospital level variation accounted
for by signal and noise, and the degree of total variation (hospital and patient) accounted for by
signal. Signal standard deviation is an estimate of the systematic variation (‘signal’) among
hospitals (See Figure 3). The higher the signal standard deviation, the greater the opportunity to
identify hospital characteristics associated with higher (or lower) rates. The non-obstetric
measures with the most signal are “Failure to rescue,” “Decubitus ulcer” and “Postoperative PE
or DVT.” Among the obstetric measures, “Obstetric trauma - vaginal delivery (with and without
instrumentation)” and “Birth trauma” have the most signal. For “Decubitus ulcer,” the signal
variance represents a difference of 60 adverse events (20 to 80 with a mean of 50) per hospital
between the bottom and top hospitals in the middle two-thirds of the distribution. The measures
with the least signal are “Postoperative hemorrhage or hematoma,” “Infection due to medical
care” and “Iatrogenic pneumothorax.” The measures “Transfusion reaction” and “Foreign body
left during procedure” have no signal, meaning no detectable systematic hospital level variation.
The signal share (see Figure 4) is a measure of the share of total variation (hospital and
patient) accounted for by the signal (hospital). The higher the share, the more
important the hospital is in accounting for the rate. The lower the share, the less important the
hospital, and the more important other potential factors (e.g., patient characteristics). The
non-obstetric measures with the highest signal share are “Death in low mortality DRGs,” “Decubitus
ulcer” and “Failure to rescue.” “Birth trauma” and “Obstetric trauma - vaginal delivery (with
and without instrumentation)” have the highest share among the obstetric indicators. The overall
low share of total variation accounted for by hospitals indicates that there are
many other factors that influence these rates besides the hospital.
Finally, signal ratio is a measure of how much of the observed variation is signal and how
much is noise (see Figure 5). The ratio is affected both by the amount of signal and by the
amount of noise. In other words, the signal ratio will be high even in the absence of much signal,
if the amount of noise is also low. For the PSIs, the ratios tend to be high even with little signal
because the hospital sample sizes are very large for most of the indicators, which makes the
hospital estimates precise (i.e., low noise). The higher the signal ratio, the more likely that
observed differences in risk adjusted rates reflect true differences in hospital performance. The
lower the signal ratio, the more likely that observed differences in risk adjusted rates reflect a
large degree of noise. Non-obstetric indicators with the highest signal ratio are “Death in low
mortality DRGs,” “Decubitus ulcer” and “Iatrogenic pneumothorax.” Among the obstetric
indicators, “Birth trauma - injury to neonate” and “Obstetric trauma - vaginal delivery without
instrumentation” have the highest ratio. Indicators with the lowest signal ratio are “Postoperative
hemorrhage or hematoma,” “Postoperative sepsis” and “Postoperative wound dehiscence.”
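
The relationship among the three statistics can be sketched with a simple method-of-moments decomposition. This is an assumed simplification for illustration (the report's estimates come from the MSX method described in Section 2E): signal variance is taken as the observed variance of hospital rates minus the average binomial sampling variance, and patient-level variance is approximated by p(1 - p). The hospital rates and sizes are hypothetical.

```python
from statistics import mean, pvariance

def signal_statistics(rates, sizes):
    """Return (signal SD, signal share, signal ratio) for hospital-level rates.

    rates: observed rate at each hospital; sizes: patients at risk at each hospital.
    """
    overall = sum(r * n for r, n in zip(rates, sizes)) / sum(sizes)
    observed_var = pvariance(rates)
    # Average sampling (noise) variance of a hospital-level rate.
    noise_var = mean(overall * (1 - overall) / n for n in sizes)
    # Signal cannot be negative; zero means no detectable systematic variation.
    signal_var = max(observed_var - noise_var, 0.0)
    patient_var = overall * (1 - overall)
    signal_sd = signal_var ** 0.5
    signal_share = signal_var / (signal_var + patient_var)
    signal_ratio = signal_var / (signal_var + noise_var)
    return signal_sd, signal_share, signal_ratio

# Hypothetical hospitals: large denominators keep the noise low.
rates = [0.010, 0.015, 0.020, 0.025, 0.030]
sizes = [4000, 5000, 6000, 5000, 4000]
signal_sd, signal_share, signal_ratio = signal_statistics(rates, sizes)

print(f"signal SD    = {signal_sd:.4f}")
print(f"signal share = {signal_share:.4f}")
print(f"signal ratio = {signal_ratio:.2f}")
```

With these made-up numbers the signal share is well under 1% while the signal ratio exceeds 0.9, the same pattern described above: hospitals account for little of the total variation, yet large sample sizes make the hospital-level estimates precise.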

Figure 3. Summary of Signal Standard Deviation in Hospital Level Rates

[Bar chart. Vertical axis: Signal Standard Deviation (0.00 to 0.10). Horizontal axis: PSI.]

Minimum Bias

The effect of age, gender, DRG and comorbidity risk adjustment on the relative ranking
of hospitals, compared to no risk adjustment, was assessed using five measures of impact. Both
the unadjusted and risk adjusted measures were adjusted for reliability, in order to remove the
impact of noise on the assessment of potential bias. Also, even if risk adjustment reduces the
apparent level of hospital level variation, the relative rank may not be affected if the distribution
of the adjusters does not vary systematically across hospitals. A large impact on the relative
ranking means that the measures are biased based on the patient characteristics we observe from
the administrative data. Minimal or no impact means that the measures are not biased based on
the characteristics we observe (although there might be characteristics that we do not observe
using administrative data that are related to the patient’s risk of experiencing an adverse event).
The first measure is a relative rank correlation statistic (a measure of the impact of
adjustment on the assessment of relative hospital performance). The second measure is the
average absolute magnitude of the change in unadjusted – adjusted rate for each hospital (a
measure of the relative importance of adjustment). The third and fourth measures are the
percentage of hospitals that remain in the top (or bottom) 10% of the distribution after
adjustment (measures of the impact on the highest and lowest hospitals). The last measure is the
percentage of hospitals that change more than two deciles in the distribution after adjustment (a
measure of the impact throughout the distribution). According to the rank correlation, the
indicators most affected in terms of the relative ranking of hospitals are “Failure to rescue,”
“Decubitus ulcer,” “Technical difficulty with procedure,” “Postoperative PE or DVT,” “Death in
low mortality DRGs,” “Iatrogenic pneumothorax,” “Postoperative sepsis” and “Postoperative
respiratory failure.” The least affected indicators are “Birth trauma - injury to neonate,”
“Obstetric trauma - vaginal delivery without instrumentation” and “Complications of
anesthesia.” DRG risk adjustment could not be applied to the obstetric indicators, because
obstetric DRGs are divided only by the mode of delivery and the presence or absence of
complications or comorbidities. Also, comorbidity adjustment may not be as applicable to the
obstetric population, and in some specific instances (see Appendix D) could not be applied to
obstetric indicators, as applicable ICD-9-CM codes were not available.
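
The five measures of impact described above can be sketched as follows. This is a minimal illustration with assumed conventions (average ranks for ties, deciles assigned from ranks); the function names and the twenty hypothetical hospital rates are not taken from the report.

```python
def ranks(values):
    """Rank values from 1 (lowest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def decile(rank, n):
    """Decile (0 = lowest, 9 = highest) of a 1-based rank among n hospitals."""
    return min(int((rank - 1) * 10 / n), 9)

def bias_measures(unadjusted, adjusted):
    """Compute the five impact measures for one indicator."""
    n = len(unadjusted)
    ru, ra = ranks(unadjusted), ranks(adjusted)
    du = [decile(r, n) for r in ru]
    da = [decile(r, n) for r in ra]
    top = [i for i in range(n) if du[i] == 9]
    bottom = [i for i in range(n) if du[i] == 0]
    return {
        "rank_correlation": pearson(ru, ra),  # Spearman = Pearson on ranks
        "mean_abs_change": sum(abs(u - a) for u, a in zip(unadjusted, adjusted)) / n,
        "pct_stay_top_10": 100 * sum(da[i] == 9 for i in top) / len(top),
        "pct_stay_bottom_10": 100 * sum(da[i] == 0 for i in bottom) / len(bottom),
        "pct_move_over_2_deciles": 100 * sum(abs(du[i] - da[i]) > 2 for i in range(n)) / n,
    }

# Hypothetical rates for 20 hospitals; adjustment swaps the rates of two of them.
unadjusted = [i / 100 for i in range(1, 21)]
adjusted = list(unadjusted)
adjusted[0], adjusted[10] = adjusted[10], adjusted[0]
measures = bias_measures(unadjusted, adjusted)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

A rank correlation near 1 with few hospitals changing decile would indicate little bias from the observed patient characteristics; lower values indicate that unadjusted rankings are misleading.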

Figure 4. Summary of Signal Share in Hospital Level Rates

[Bar chart. Vertical axis: Signal Share (0.00 to 0.15). Horizontal axis: PSI.]

Figure 5. Summary of Signal Ratio in Hospital Level Rates

[Bar chart. Vertical axis: Signal Ratio (0.00 to 1.00). Horizontal axis: PSI.]

In terms of absolute magnitude of the change in adjusted rate, the impact is greatest for
“Failure to rescue,” “Technical difficulty with procedure,” and “Death in low mortality DRGs.”
Along with “Decubitus ulcer,” “Failure to rescue,” “Technical difficulty with procedure” and
“Death in low mortality DRGs” also have the greatest impact at the upper tail of the distribution,
meaning that accounting for these patient characteristics accounts for the very high rates of these
indicators for some hospitals.
Overall, if one were to create a simple score based on the five measures of potential bias
(e.g., ranking the indicators 1 to 20 for each bias measure, and summing the ranks), the most
biased measures would be “Failure to rescue,” “Technical difficulty with procedure,” “Decubitus
ulcer” and “Postoperative PE or DVT.” The least biased measures would be “Postoperative
hemorrhage or hematoma” and “Complications of anesthesia.” This is summarized in Table 18.
Obstetric measures in general also demonstrate little bias, although these indicators were
subjected to less risk adjustment than the other indicators. However, these categories are not
definitive. Each bias measure stands on its own as a measure of performance, depending on the
purpose of the analysis. Also, as mentioned in the introduction, more clinically detailed
information than is available in the HCUP SID may yield different conclusions. What is certain
is that unadjusted rates for the ‘high’ bias measures are likely to be misleading.
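
The simple rank-sum score mentioned above can be sketched as follows. The indicator names and bias values are hypothetical placeholders, each value is assumed to be oriented so that a larger number means more bias, and ties are broken by input order in this sketch.

```python
def rank_sum_scores(bias_table):
    """Sum, over bias measures, each indicator's rank (1 = least biased)."""
    indicators = list(bias_table)
    n_measures = len(next(iter(bias_table.values())))
    totals = {name: 0 for name in indicators}
    for m in range(n_measures):
        ordered = sorted(indicators, key=lambda name: bias_table[name][m])
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return totals

# Hypothetical values for five bias measures (larger = more affected by adjustment).
bias_table = {
    "Indicator A": [0.9, 0.8, 0.7, 0.9, 0.6],
    "Indicator B": [0.5, 0.4, 0.8, 0.3, 0.5],
    "Indicator C": [0.1, 0.2, 0.1, 0.2, 0.1],
}
scores = rank_sum_scores(bias_table)
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score}")
```

Because the score weights all five measures equally, it is only a summary; as noted above, each bias measure stands on its own depending on the purpose of the analysis.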

Table 18. Summary of Minimum Bias in Hospital Level Rates

High Bias                             Medium Bias                                             Low Bias
Failure to rescue                     Postoperative hip fracture                              Postoperative hemorrhage or hematoma
Technical difficulty with procedure   Iatrogenic pneumothorax                                 Complications of anesthesia
Decubitus ulcer                       Postoperative physiological and metabolic derangement
Postoperative PE or DVT               Infection due to medical care
Death in low mortality DRGs           Postoperative wound dehiscence
Postoperative sepsis
Postoperative respiratory failure

Relatedness of Indicators

To investigate the relationship between indicators, we examine the hospital level
Spearman correlations among the measures, and conduct a factor analysis using principal factor
analysis based on the Spearman correlations (with a varimax rotation in order to maximize the
loadings on each factor). The correlations between the measures can be found in Appendix G
Table 7. If a measure is valid, it should be correlated with related measures that reflect similar
aspects of hospital performance or hospital characteristics. For example, “Obstetric trauma –
vaginal delivery without instrumentation” is correlated with “Obstetric trauma – vaginal delivery
with instrumentation” (a correlation of 0.545, p<.0001). For the most part the measures are
positively correlated (p<.05), with the exception of “Postoperative respiratory failure” and
“Failure to rescue,” which are negatively correlated with several other indicators. “Technical
difficulty with procedure” is positively correlated with several other measures, including
“Infection due to medical care” (0.306, p<.0001) and “Iatrogenic pneumothorax” (0.318,
p<.0001). It is not expected that all indicators would be strongly correlated with each other, as
different aspects of quality may be reflected by each indicator.
Two factor analyses were conducted to examine the relationship and possible underlying
“factors.” The first analysis combined obstetric and non-obstetric indicators. This factor analysis
reflects the correlation results and suggests that there are two “factors” or underlying constructs
common among all the PSI. Appendix G, Table 8 shows the factor loadings and share of
variation explained for each factor and for each PSI. There are two factors that explain almost
all of the systematic variation among the PSIs (the remaining, unexplained variation is unique to
each PSI). The first factor tends to be associated with the obstetric indicators and the surgical
indicators, while the second factor tends to be associated with medical indicators, although two
post-operative PSIs are included. The indicators with the highest loadings on the first factor,
which explains about 10-20% of the variation for those PSIs and over one-half of the systematic
variation among all PSIs, include “Infection due to medical care,” “Technical difficulty with
procedure,” and “Obstetric trauma – vaginal delivery (with and without instrumentation).” The
“Decubitus ulcer,” “Postoperative respiratory failure,” and “Postoperative sepsis” indicators
load most heavily on the second factor, which explains about one-third of the systematic
variation. A second factor analysis was conducted, removing the obstetric indicators. The
removal of the obstetric indicators did not result in an obvious change to the factor results.
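
Because a varimax rotation keeps the factors orthogonal, the share of an indicator's variation explained by a factor is the square of its loading on that factor. The loadings below are hypothetical, chosen only to show how a loading of about 0.4 corresponds to the 10-20% range mentioned above; they are not values from Appendix G, Table 8.

```python
def variance_explained(loadings):
    """Per-factor share of variance, and uniqueness, for one standardized indicator."""
    shares = [loading ** 2 for loading in loadings]
    uniqueness = 1.0 - sum(shares)
    return shares, uniqueness

# Hypothetical loadings of one indicator on factor 1 and factor 2.
shares, uniqueness = variance_explained([0.40, 0.10])

print(f"factor 1 explains {shares[0]:.0%}, factor 2 explains {shares[1]:.0%}")
print(f"unique to the indicator: {uniqueness:.0%}")
```

Even an indicator that loads clearly on one factor can have most of its variation left unexplained by the common factors.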

Overall, there is significant hospital level variation common among the patient safety
indicators, and that variation is concentrated into two independent dimensions. Some underlying
construct is potentially identifiable. However, most of the variation is unique to each PSI,
meaning that to a large degree the indicators each measure an independent dimension of
performance.

Persistence of Rates Over Time

Persistence was examined using the Florida SID from 1995-1997 (See Appendix G,
Table 8). Two important points emerged from this examination. First, the rates are consistent
from year to year, suggesting that at least for the years considered no fundamental changes in
coding or practice confound comparison across years. The exception is “Postoperative
hemorrhage or hematoma” which relies on ICD-9-CM codes adopted in October, 1996. Second,
hospital performance is consistent from year to year for many of the indicators. “Decubitus
ulcer,” “Technical difficulty with procedure,” “Obstetric trauma - vaginal delivery without
instrumentation,” and “Infection due to medical care,” all have year to year correlations in excess
of 0.70 for 1995-96 and 1996-97. “Decubitus ulcer” and “Technical difficulty with procedure”
have correlations across a two year time period in excess of 0.70. Most of the other indicators
are also correlated from year to year, meaning that hospitals that are above average tend to
remain above average, at least over a three year period.
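
Year-to-year persistence of this kind can be checked by correlating each hospital's rate in one year with its rate in a later year. The sketch below uses a Pearson correlation, which is an assumption (the section does not say which correlation was used), and the rates are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical risk-adjusted rates (%) for the same five hospitals in three years.
rates = {
    1995: [1.0, 2.0, 3.0, 4.0, 5.0],
    1996: [1.2, 1.8, 3.3, 3.9, 5.1],
    1997: [1.5, 1.7, 2.9, 4.4, 4.8],
}
one_year = pearson(rates[1995], rates[1996])
two_year = pearson(rates[1995], rates[1997])

print(f"1995-96 correlation: {one_year:.2f}")
print(f"1995-97 correlation: {two_year:.2f}")
```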

Experimental Hospital Level Indicators

Analyses of the experimental indicators show that the least frequent PSI is “Intra-
operative nerve compression injury,” with only 7 cases in Florida and 102 cases in the National
SID in 1997. The most frequent PSIs are “Postoperative iatrogenic complication – cardiac,” and
“3rd or 4th degree obstetric laceration,” with 83,502 and 99,383 cases in the National SID,
respectively. The total number of adverse events (numerator), the total number of patients at risk
(denominator), and the overall rate in Florida and the National SID for each experimental PSI
can be found in Appendix G Table 9. The rates vary considerably across measures, from a high
of 6.1% for “Decubitus ulcer in high risk patients” to a low of 0.001% for “Intra-operative nerve
compression injury” (which represents 102 cases in the National SID in 1997). Like the accepted
PSIs, the rates in the Florida and National SID are similar.
The apparent standard deviations (unadjusted for reliability) also vary considerably
among the measures, from a high of 6.5 percentage points for “Decubitus ulcer in high risk
patients” (relative to a mean of 6.2 percentage points) to a low of less than 0.37 percentage
points for “Uterine rupture” and “Intra-operative nerve compression injury.” “Malignant
Hyperthermia,” which relies on an ICD-9-CM code that was not in use in 1997, was not assessed.
The measures with the greatest amount of hospital level variation in absolute magnitude are
“Decubitus ulcer in high risk patients,” “3rd or 4th degree obstetric laceration” and “In-hospital
fractures related to falls.”
Also like the accepted PSIs, the hospital level variation tends to be skewed toward the
right, meaning that most hospitals are slightly less than the mean, with a long right-hand tail of
hospitals with higher rates. The most highly skewed measures are “In-hospital fractures possibly
related to falls,” “Wound complication of vaginal delivery,” “Uterine rupture,” and “Aspiration
pneumonia.” The median skew statistic among all indicators is 9.2, which primarily reflects the
low rates of occurrence: most providers have rates near zero, giving little latitude
for a left-hand tail to the distribution.

Risk Adjustment

Overall, age-gender risk adjustment tended to reduce the level of apparent hospital level
variation by about 0.4% (see Appendix G, Table 11). Given the low rate of occurrence, “Intra-
operative nerve compression injury” was not included in the risk adjustment. The impact was
greatest on “Postoperative iatrogenic complication – nervous system” and “Reopening of a
surgical site,” and least on “Post-Operative AMI.” The rates tend to be slightly more skewed,
meaning that differences in the age-gender mix of the population at-risk masked some of the
difference in rates.
Relative to age-gender adjustment, the overall impact of DRG adjustment on the hospital
level variation was much greater, reducing variation by about 3.8% (see Appendix G, Table 12).
Comorbidity adjustment decreased the apparent variation among hospitals by 1.1%. DRG risk
adjustment had the biggest impact on “Postoperative iatrogenic complications – cardiac,”
“Decubitus ulcer in high risk patients” and “Reopening of a surgical site.” Comorbidity risk
adjustment had the biggest impact on “Decubitus ulcer in high risk patients,” “Other obstetric
complications” and “Reopening of a surgical site.”

Reliability Adjustment

The effect of the reliability adjustment, based only on univariate smoothing methods, was
examined along with the statistics on the signal standard deviation, signal share and signal ratio
(See Appendix G, Table 13). Hospitals with fewer than three patients in the denominator were
not included in the reliability adjustment. Overall, the reliability adjustment reduced the hospital
level variation dramatically. On average, one-half of the apparent hospital level variation, even
after risk adjustment, was estimated to be attributable to noise. The measures that were affected
the most by reliability adjustment were “Uterine rupture,” “In-hospital fractures possibly related
to falls” and “Wound complication of vaginal delivery.” “Aspiration pneumonia,”
“Postoperative AMI” and “Intra-operative nerve compression injury” had no signal, meaning no
systematic hospital level variation. The measures that were impacted the least were “3rd or 4th
degree obstetric laceration,” “Other obstetric complications” and “Postoperative iatrogenic
complication – cardiac.”

Univariate Smoothing Statistics

Like the MSX statistics, the univariate smoothing statistics give estimates of the degree
of total hospital level variation accounted for by signal and noise, and the degree of total
variation (hospital and patient) accounted for by signal. Signal standard deviation is an estimate
of the systematic variation (‘signal’) among hospitals. The measures with the most signal are
“Decubitus ulcer in high risk patients,” “3rd or 4th degree obstetric laceration” and
“Postoperative iatrogenic complications - cardiac.” The measures with the least signal are
“Uterine rupture” and “Wound complication of vaginal delivery,” in addition to “Aspiration
pneumonia,” “Postoperative AMI” and “Intra-operative nerve compression injury” which had no
signal.
The signal share is a measure of the share of total variation (hospital and patient)
accounted for by the signal. The measures with the highest signal share are “3rd or 4th degree
obstetric laceration,” “Decubitus ulcer in high risk patients” and “Postoperative iatrogenic
complications - cardiac.” The overall low level of the share of total variation accounted for by
hospitals is an indication that there are many other factors that influence these rates besides the
hospital.
Finally, signal ratio is a measure of how much of the observed variation is signal and how
much is noise. The higher the signal ratio, the more likely that observed differences in risk
adjusted rates reflect true differences in hospital performance. Indicators with the highest signal
ratio are “3rd or 4th degree obstetric laceration,” “Postoperative iatrogenic complication –
cardiac” and “Other obstetric complication.” Indicators with the lowest signal ratio are “Uterine
rupture,” “Wound complication of vaginal delivery” and “CABG after PTCA.”

Minimum Bias

Bias was measured using the same techniques as were used in the analyses of the
accepted indicators (See Appendix G, Table 14). The same caveats apply to the experimental
indicators as the accepted indicators. According to the rank correlation, the indicators most
affected in terms of relative rank are “Postoperative iatrogenic complications – cardiac,”
“Decubitus ulcer in high risk patients” and “Reopening of a surgical site.” The least affected
indicators are “CABG after PTCA” and “3rd or 4th degree obstetric laceration,” which was not
included in the DRG risk adjustment, because obstetric DRGs are divided only by the mode of
delivery and the presence or absence of complications or comorbidities. “CABG after PTCA” is
similar.
Overall, if one were to create a simple score based on the five measures of potential bias
(ranking each indicator 1 to 17, and summing the ranks), the most biased measures are
“Postoperative iatrogenic complications – cardiac,” “Decubitus ulcer in high risk patients,”
“Reopening of a surgical site” and “Postoperative iatrogenic complication - nervous system.”
The least biased measures are “CABG after PTCA” and “3rd or 4th degree obstetric laceration.”
Similar to the accepted indicators, caveats about interpretation of bias are necessary. In addition,
the experimental indicators are not considered a related set, so comparisons across indicators are
not as appropriate as in the case of accepted indicators where they are at least related based on
their more likely detection of potentially preventable adverse events.

Accepted Area Indicators

Unadjusted and adjusted area level rates were also calculated for the area level indicators
(see Appendix G, Table 15). The unit of analysis is the MSA or county (in rural areas). These six
indicators are accepted patient safety indicators that were modified into area indicators to assess
the total incidence of the adverse event within geographic areas. The modification generally was
to use principal rather than secondary diagnosis codes, and to use the area population as the
denominator. The number of additional adverse events identified using the area definition is
listed in Table 19.

Table 19. Additional Cases Identified by Area Level Indicators

                                         Number of adverse events
Indicator                                Hospital Definition   Area Definition   % Increase
Iatrogenic pneumothorax                               16,815            19,892        16.8%
Transfusion reaction                                     131               142         8.1%
Infection due to medical care                         27,457            49,419        58.8%
Wound dehiscence                                       2,401             2,609         8.3%
Foreign body left in during procedure                  1,631             1,943        17.5%
Technical difficulty with procedure                   46,707            50,659         8.1%
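
The “% Increase” column in Table 19 is not the simple percentage change between the two counts. As a check on the table's arithmetic, the sketch below computes both the simple percent change and the natural-log difference, 100 x ln(area / hospital); the published figures agree with the log difference to one decimal place. That reading is an observation drawn from the numbers themselves, not a definition stated in the report.

```python
from math import log

# (hospital definition, area definition, published % increase) from Table 19.
TABLE_19 = {
    "Iatrogenic pneumothorax": (16815, 19892, 16.8),
    "Transfusion reaction": (131, 142, 8.1),
    "Infection due to medical care": (27457, 49419, 58.8),
    "Wound dehiscence": (2401, 2609, 8.3),
    "Foreign body left in during procedure": (1631, 1943, 17.5),
    "Technical difficulty with procedure": (46707, 50659, 8.1),
}

def percent_change(hospital, area):
    """Simple percentage change from the hospital count to the area count."""
    return 100 * (area - hospital) / hospital

def log_difference(hospital, area):
    """Natural-log difference between the two counts, expressed in percent."""
    return 100 * log(area / hospital)

for name, (hospital, area, published) in TABLE_19.items():
    print(f"{name:40s} published={published:5.1f}  "
          f"simple={percent_change(hospital, area):5.1f}  "
          f"log diff={log_difference(hospital, area):5.1f}")
```

For small changes the two measures nearly coincide; they diverge for “Infection due to medical care,” where the area count is about 80% higher than the hospital count in simple terms.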

The rates vary considerably across measures, from a high of 23.5 per 100,000 population
for “Infection due to medical care” to a low of 0.08 per 100,000 for “Transfusion reaction”
(which represents 142 cases in the National SID in 1997) (See Appendix G, Table 15).
The apparent standard deviations (unadjusted for reliability) also vary considerably
among the measures, from a high of 43.7 per 100,000 for “Technical difficulty with procedure”
(relative to a mean of 23.5 per 100,000) to a low of less than 2.1 per 100,000 for “Foreign body
left in during procedure” and “Transfusion reaction.” The measures with the greatest amount of
area level variation in absolute magnitude are “Technical difficulty with procedure,” “Infection
due to medical care,” and “Iatrogenic pneumothorax.”

Risk Adjustment

Only age and gender risk adjustment, with age-gender interactions, was applied to the
area measures. The age groups are the standard age categories used by the Census Bureau in
their descriptive statistics, namely 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-
49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, 80-84, and 85+.
Overall, age-gender risk adjustment tended to increase the level of apparent area level
variation by about 8% (See Appendix G, Table 15). A similar increase was noted for all six area
level indicators. The rates tend to be slightly more skewed after adjustment for age and gender,
meaning that the age and gender distribution among the counties was obscuring some of the true
differences in rates.
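
For the area indicators, the two building blocks are the five-year Census age bands and a population-based rate. The sketch below shows both; the function names and the example county are hypothetical, and only the age bands are taken from the text.

```python
def census_age_group(age):
    """Label of the five-year Census age band: 0-4, 5-9, ..., 80-84, 85+."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= 85:
        return "85+"
    low = (age // 5) * 5
    return f"{low}-{low + 4}"

def rate_per_100k(events, population):
    """Area level rate expressed per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * 100_000

# Hypothetical county: 12 adverse events among 150,000 residents.
county_rate = rate_per_100k(12, 150_000)

print(census_age_group(37), census_age_group(84), census_age_group(90))
print(f"{county_rate:.1f} per 100,000")
```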

Chapter 4. Conclusions
This project took a four-pronged approach to the identification, development and
evaluation of PSIs. First, literature was reviewed for general background about patient safety
measures that are or could be specified from administrative data. Second, a diverse group of
clinicians assessed the face validity of potential PSIs, using an adaptation of the RAND/UCLA
Appropriateness Method. Third, professionals who abstract the medical records to assign
ICD-9-CM codes and other resources on coding were consulted for specific concerns about whether
the intent of an indicator could be implemented well based on current coding guidelines. Finally,
the most promising measures were statistically analyzed using routinely collected discharge data
from hospitals in order to determine rates, examine effects of risk and reliability adjustments, and
to make comparisons among the indicators.
When examining the results of this report, it is useful to return to the original framework
in which two types of potential indicators were discussed. The first type of indicator is that
which is likely to reflect medical error. These indicators are difficult to define using
administrative data. Few adverse events are clear cut enough for this designation, with most
having a variety of causes in addition to potential medical error leading to the adverse event,
including underlying patient health and factors that do not vary systematically. As expected,
physician panelists rated few indicators as very likely to reflect medical error. Six indicators
were rated as such by most panelists: “Decubitus ulcer,” “Iatrogenic pneumothorax,”
“Transfusion reaction,” “Complications of anesthesia,” “Foreign body accidentally left during
procedure,” and “In-hospital fracture.” However, two of these indicators could not be defined
using administrative data exactly as the panel specified in order to reduce contamination with
less preventable complications (“Iatrogenic pneumothorax,” and “In-hospital fracture”), and two
suffer from serious concerns regarding coding, presence on admission and heterogeneous
severity included within the code (“Decubitus ulcer” and “Complications of anesthesia”). Thus,
only two indicators remained that could be defined as “most likely to reflect medical error,”
those being “Transfusion reaction” and “Foreign body accidentally left during procedure.” As is expected
for indicators of this type, these indicators proved to be very rare, with fewer than 1 per 10,000
cases at risk. Application of statistical tests of precision was limited by the fact that these
indicators had no systematic variation. This confirms that these indicators are best used as case-
finding indicators, or as area indicators to examine prevalence of these errors, as the rates of
these indicators are mostly driven by non-systematic variation.
All other indicators that were rated as acceptable by panelists fall into the broader
category of indicators that do not clearly identify medical error, but may reflect some quality
concerns, including a potential for medical error. In general these indicators fall somewhere on a
spectrum of preventability, with not every case being avoidable given optimal quality of care.
Some indicators have a higher degree of preventability than others, but factors such as provider
case mix and non-systematic variation may influence the overall preventability inherent in an
indicator. For this reason it is impossible to “rank” these indicators from “more likely to reflect
medical error” to “less likely to reflect medical error,” although panelists’ ratings of
preventability may provide some guidance from one source of face validity. In addition, the
source of “error” may vary by provider and over time, reinforcing the screening use of these
indicators – some may be primarily caused by human error and others by system problems.
Because of these variations within each indicator, a single case “flagged” by any of these
indicators may or may not have been preventable through optimal care, and thus these indicators
are less efficient as case finding tools.
Despite the relative difficulty of these indicators in identifying specific cases where
medical error may have occurred, they can be rather useful when examining rates of events.
Inasmuch as rates are somewhat stable over time and represent systematic differences, these
differences are likely to reflect true differences in the occurrence of a complication in patient
populations. Individual complexities of each case influence the overall rate of a complication
much less than the specific outcome for that case, and thus, non-systematic differences in patient
complexity are more likely to be “washed out.” Systematic differences due to causes besides true
quality problems (e.g., case mix or coding practices) remain a concern for these indicators, as
such bias may cause good quality providers to appear poor. Adequate risk adjustment, or
refraining from comparing dissimilar providers, would mitigate this problem, but perfect methods
are unlikely even with the best of data. In addition, while these indicators demonstrated some
systematic variation, much of the variation between providers remains at the discharge level.
This means that small differences between providers, even with perfect risk adjustment, may not
actually reflect true differences in performance for these indicators. However, larger differences
and differences that persist over time are more likely to reflect true differences, and are useful in
identifying probable areas of concern for further investigation. Simply put, because of the nature
of these indicators, they should not be used as a metric of absolute performance (e.g., for grading
of providers or public reporting that compares providers). However, these indicators may be
particularly useful as a low cost screen for potential quality and safety problems. Where a
provider has a higher rate for a particular indicator than a benchmark, an extraction of additional
information on the patients flagged by the indicator would likely lead to either of two positive
outcomes: (1) reassurance that there is not a quality problem, but a data gathering inadequacy
that perhaps could be improved at the local or national level to improve the ability to detect
quality problems, or (2) identification of the source of the high rate that requires improvement in
processes or systems of care, which would benefit the quality of care for future patients.
During the course of the study, it became apparent that the obstetric indicators should be
viewed differently than the non-obstetric indicators. In general, the obstetric indicators had a
higher rate, more variation, and thus higher precision. Risk adjustment available for these
indicators was minimal, and thus, systematic bias related to case mix could not be assessed.
Finally, examination of the panel results and comparison of decisions made by non-obstetric
panels with those made by the obstetric panels suggested that the obstetric indicators included
complications expressly rejected by the other panels. The complications may have less
association with medical error or process failures, although this assertion cannot be verified with
this study.
For the best-performing subset of PSIs, this project has demonstrated that rates of adverse
events differ substantially and significantly across hospitals. The literature review and the
findings from the clinical panels provide evidence to suggest that a number of discharge-based
PSIs may be useful screens for organizations, purchasers, and policymakers to identify potential
safety problems at the hospital level, as well as to document systematic area level differences in
potential patient safety problems.

Potential Uses of PSIs

At the national or state level, these indicators could be used to monitor the frequency of
potential patient safety problems, to determine whether the rates are increasing or decreasing
over time, and to explore large variations among settings of care. As noted by panelists, not all
indicators are equally poised to identify potential patient safety problems. This report was
intended to provide evidence on the development and face validity of these indicators, and the
evidence available does not allow fine-tuned distinctions between indicators that are very
likely to detect patient safety problems and those that are less likely. Future research will
provide additional evidence that will inform the best uses of these indicators.
While the indicators were primarily developed at the hospital level, some were also
implemented to provide an analogous area level measure. Analyses show that the area level
measures do identify additional cases in which care was received at one institution and the
potentially iatrogenic complication was addressed at another hospital. Clearly, the locus of control and the
ability to study the potential underlying causes for an adverse event is simpler in the case of the
hospital level PSIs. However, trends over time in area rates, as well as aggregations of the
hospital level rates are likely to reveal points of leverage outside of individual institutions. No
measure is ideally suited to every purpose. Methods of aggregating across groups of PSIs still
need to be tested. This report provides the background for “safe” use of a tool that has the
potential to guide prevention of medical error, reductions of potentially preventable
complications, and quality improvement in general. Table 20 summarizes additional information
on uses of the PSIs.
Because the PSIs are intended for use as an initial, efficient screen to target areas for
further data exploration, the primary goal is to find indicators that guide those interested in
quality improvement and patient safety to areas where there are systematic differences between
hospitals or geographic areas. These systematic differences may relate to underlying processes or
structures that an organization could change to improve patient care and safety. The underlying
problems may be attributable to human error on the part of physicians or nurses, to system deficiencies, or to both.
On the other hand, the systematic differences will sometimes correspond to coding practices,
patient characteristics not captured by administrative data, or other factors. These will be dead
ends to some degree. In the application of these PSIs, users will have an opportunity to
determine how well patient safety problems are identified at the level of groups of patients.
Sharing experiences with these PSIs, researchers and health care practitioners will have a chance
to build on the information highlighted in this report about each indicator, as well as the set of
PSIs.
Thus, application of these indicators to a variety of settings and additional data gathering
will accomplish two vital next steps for patient safety. First, these attempts will shed light on
which PSIs provide useful information, and under what circumstances. Second, in
those cases where potentially preventable errors are identified with relative ease through these
tools, health care providers and managers will have an opportunity to implement potential
preventative strategies ranging from technologies to processes to new ways of organizing care.
The effectiveness of these strategies can be assessed at many levels, including the effects on the
PSI rates.

Table 20. Use of Patient Safety Indicators

Case-finding indicators

User: Provider
Inappropriate use scenario: A hospital uses the transfusion reaction indicator to punish a physician involved in the incident.
PROBLEM: Flagging of the case does not necessarily guarantee that a medical error has occurred at the physician or system level. Further, such punishment may reduce voluntary reporting of errors.
Appropriate use scenario: A hospital identifies a case of transfusion reaction occurring in-hospital. They undertake a root-cause analysis to highlight potential problems that may be resolved in order to prevent future events.
Potential uses: Identification of events for further investigation.

User: Public Health
Inappropriate use scenario: A public health organization uses provider level indicators for use in formal evaluation of providers in area.
PROBLEM: Flagging of cases does not ensure medical error and such use may decrease reporting.
Appropriate use scenario: A state health department uses the area level indicator for foreign body to survey the incidence of such events in that state.
Potential uses: Surveillance of events.

User: Research
Inappropriate use scenario: Researchers compare rates of case-based indicators to identify providers with more medical error to those with less.
PROBLEM: Lack of signal between providers makes such comparisons unreliable.
Appropriate use scenario: Researchers use these indicators to identify cases in a large database where events related to medical error may have occurred. They examine the characteristics of patients flagged compared to matched patients not flagged.
Potential uses: Flagging of cases for use in research studies.

Rate-based indicators

User: Provider
Inappropriate use scenario: A hospital uses an indicator to identify differences in rates between physicians within the hospital.
PROBLEM: The number of cases by physician is likely to be zero or very small. Even if such rates are used for purely quality improvement initiatives, physician level rates for most indicators are likely to be unreliable.
Appropriate use scenario: A teaching hospital observes that their rate of decubitus ulcer is consistently higher than the peer group average for other teaching hospitals in their region. After ruling out such explanations as differences in coding or screening practices, and assuring that case mix is comparable to other teaching hospitals, the hospital uses resources such as peer-reviewed literature and government reports to identify processes of care or system failures that may account for the higher rate.
Potential uses: Surveillance of rates for internal quality improvement investigations.

User: Public Health
Inappropriate use scenario: A state health department publishes the rate for each indicator by provider in a report to highlight quality concerns by provider.
PROBLEM: These indicators are not designed to be used for public reporting by provider, and such use may lead to incorrect conclusions about provider quality.
Appropriate use scenario: A state health department uses the area level infection due to medical care indicator to examine the overall rate of this indicator in the state. They compare the result of the area level indicator to the provider level indicator to determine how many of these complications occur post-discharge or on an outpatient basis, and are serious enough to require hospitalization later.
Potential uses: Surveillance of rates. Examination of area rates over time, by region, by hospital type.

User: Research
Inappropriate use scenario: Researchers use quality indicators as a definitive measurement of quality.
PROBLEM: Many factors besides quality may contribute to rate differences.
Appropriate use scenario: Researchers use quality indicators to examine the relationship between high rates on PSIs with high rates on other quality measures, such as mortality measures.
Potential uses: Use with other measures of quality to determine relationships of PSIs with structural, process or other aspects of care.

Relationship of This Project to Other Quality Initiatives

This report is one of many efforts to clarify the problem of patient safety in the
national health care system. Together these efforts are likely to provide a more complete
picture of medical error. Other indicator or measurement sets have been developed, some
of which were used in the development of this measure set. Table 21 describes these
measures and their relationship to the PSIs.
Another UCSF-Stanford Evidence-based Practice Center report evaluated the
practices that may improve patient safety in a hospital setting. Some practices evaluated
in the report are designed to reduce the events measured in some indicators. Table 22
outlines the overlap between these reports. As users of the PSIs identify potential safety
problems, references to scientific evaluations such as Making Health Care Safer: A
Critical Analysis of Patient Safety Practices2 will be vital in determining appropriate
interventions and potential failures in processes.

151
Table 21. Relationship of PSIs to Other Indicator Sets

Indicator set: VA National Surgical Quality Improvement Program (NSQIP)148
Description: An ongoing QI program by VA since 1994. Standardized data collection on adverse events following surgery.
Relationship to PSIs: Data collection utilizes standardized definitions which include clinical criteria in some cases. Although definitions differ, some indicators are similar to the PSIs. Adverse events have been added over the years. Data on postoperative pneumonia, AMI, neurologic deficit, renal failure, DVT, PE, wound dehiscence, and systemic sepsis capture some of the same complications as potential PSIs, but operationalizations are vastly different.

Indicator set: Miller et al PSIs (published in Health Services Research)17
Description: A set of 12 PSIs and a summary measure designed to maximize potential of identifying medical error through administrative data.
Relationship to PSIs: PSIs were designed as case finding tools for the most part. PSIs were used as a starting point for the PSIs in this report, although final definitions differ between the two sets. Some PSIs were rejected by the panels. Details are available in Appendix H.

Indicator set: Complications Screening Program7
Description: A set of indicators designed to flag complications that occur in-hospital (e.g., in-hospital hip fracture, post-operative pneumonia). This set has been validated and studied widely.
Relationship to PSIs: The CSP indicators that have been shown to be adequate in identifying in hospital complications were used as a starting point for the PSIs in this report, although final definitions differ between the two sets. Some CSP indicators were rejected by the panel. Details are available in Appendix H.

Indicator set: National Quality Forum’s (NQF) reportable events5
Description: A set of case-finding tools designed to flag cases of potential medical error. These events are defined to be serious adverse events resulting in death or disability (e.g., wrong site surgery, serious medication error).
Relationship to PSIs: The NQF’s reportable events are based on detailed clinical information, unlike the PSIs. Most of the reportable events are not identifiable using administrative data. Definitions of foreign body accidentally left during a procedure, transfusion reaction, and decubitus ulcer are included, but differ from PSI definitions.

Indicator set: National Quality Report (NQR)168
Description: A Congressionally mandated report outlining the nationwide state of healthcare quality. This report will not compare providers. The first set of indicators and the accompanying report are due in 2003.
Relationship to PSIs: The NQR is separate from the PSIs, although some PSIs are likely to be considered for the report. The report will cover additional topics besides patient safety, and will utilize a variety of data sources.

Table 22. Indicator Level Practices Included in Making Health Care Safer (a)

Indicator name | Corresponding chapter in practices report | Practices reviewed
Complications of anesthesia | None | None
Death in low mortality DRGs | None | None
Decubitus ulcer | Prevention of Pressure Ulcers in Older Patients (Chapter 27) | Pressure relieving devices
Failure to rescue | None | None
Foreign body accidentally left during procedure | The Retained Surgical Sponge (Chapter 22) | Sponge and instrument counts
Iatrogenic pneumothorax | Ultrasound Guidance of Central Vein Catheterization (Chapter 21) | Ultrasound guidance of central vein catheterization
Infection due to medical care | Prevention of Intravascular Catheter-Associated Infections (Chapter 16) | Maximum barrier precautions during central venous catheter insertion, use of central venous catheters coated with antibacterial or antiseptic agents, use of chlorhexidine gluconate at the central venous catheter insertion site, other practices.
Postoperative hip fracture | Prevention of Falls in Hospitalized or Institutionalized Older People (Chapter 26) | ID bracelets for high-risk patients, interventions that decrease the use of physical restraints, bed alarms, special floor materials to reduce injuries, hip protectors.
Postoperative hemorrhage or hematoma | None | None
Postoperative physiological and metabolic derangement | None | None
Postoperative respiratory failure | None | None
Postoperative pulmonary embolism or deep venous thrombosis | Prevention of Venous Thromboembolism (Chapter 31) | Graduated elastic stockings, intermittent pneumatic compression, low dose unfractionated heparin, low molecular weight heparin, warfarin and aspirin.
Postoperative wound dehiscence | Prevention of Surgical Site Infections (Chapter 20) | (Wound dehiscence only accounts for some of the outcomes considered in this chapter.) Prophylactic antibiotics, perioperative normothermia, supplemental perioperative oxygen, perioperative glucose control.
Postoperative sepsis | None | None
Technical difficulty with procedure | None | None
Transfusion reaction | None (Mentioned in context of Chapter 43. Prevention of Misidentifications, a major cause of transfusion reactions) | None
Birth trauma – injury to neonate | None | None
Obstetric trauma (all delivery types) | None | None
Obstetric wound complications – c-section | Prevention of Surgical Site Infections (Chapter 20) | Reviewed in the context of all surgical wounds. See notation for wound dehiscence.
Post-partum urinary tract infection | Prevention of Nosocomial Urinary Tract Infections (Chapter 15) | Reviewed in the context of all hospitalized patients.

(a) This table outlines practices reviewed in the EPC Evidence Report, Making Health Care Safer: A Critical Analysis of Patient Safety Practices.2 This report was written independently of indicator development, therefore chapters listed may only briefly address the adverse event described by the indicator, and may not examine practices for the entire population at risk.

Limitations and Future Research

The methodology of this report included several key choices that led to some
limitations. The goal of this study was to identify and evaluate indicators that could be
constructed using administrative data, because these data are readily available and less
costly than more detailed clinical data. We chose to limit our search to indicators that
could be operationalized currently, instead of identifying indicators which have the
potential for being operationalized with administrative data in the future. As a result,
those patient safety concerns addressed in this indicator set are only a subset of the most
prevalent, important or preventable problems. Many important concerns cannot currently
be monitored well using administrative data (e.g., adverse drug events). As administrative
data improves, many more important and potentially more useful indicators are likely to
emerge.
Just as administrative data limited specific indicators chosen, the use of
administrative data tends to favor specific types of indicators. The PSIs evaluated in this
report contain a large proportion of surgical indicators, rather than medical or psychiatric.
This is not to imply that patient safety is not a concern outside of surgery; rather, these
indicators tend to be more feasible to define using administrative data for surgical
populations. Medical complications are often difficult to distinguish from comorbidities
that are present on admission.13 In addition medical populations tend to be more
heterogeneous than surgical, especially elective surgical populations, making it difficult
to account for case-mix. Panelists often felt that indicators were more likely to reflect
preventable events when limited to elective surgical admissions. As data become better,
the addition of patient safety indicators for the medical and psychiatric populations will
be critical.
The intended purpose of these indicators guided the choices made in specifying
them. Specifically, tradeoffs between specificity (i.e., the likelihood that the indicator
will not flag cases that do not qualify as a patient safety event) and sensitivity (i.e., the
likelihood that the indicator will flag cases that do qualify as a patient safety event) were
considered in conjunction with the use or misuse of these indicators as they move into the
public sector. Many complications included in these indicators are more likely in some
specified subpopulation. For instance, decubitus ulcers are more likely in patients with
paralysis. Since they are more likely to occur, complications in these populations may
also be less preventable or be more likely to be present on admission. Nonetheless,
interventions to prevent complications may be particularly important in these high risk
groups – it is these very patients for whom providers need to be particularly vigilant in
preventing that complication from occurring. The inclusion of high risk patients, given
the limitations of these indicators, would ultimately mean a decrease in the specificity of
these indicators, or the ability to have a high yield of patients in whom true safety
problems are present. However, to exclude these patients, as was done for many
indicators, would sacrifice the sensitivity of these indicators, or the ability to identify as
many patients as possible for whom true safety problems may be present.
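
The sensitivity and specificity tradeoff described above can be made concrete with a short sketch. It assumes a hypothetical validation exercise in which each discharge carries two labels: whether the indicator flagged it, and whether a reference standard (for example, chart review) judged a true safety event to be present. The function name and inputs are illustrative and are not part of the report's analysis.

```python
def sensitivity_specificity(flagged, true_event):
    """Return (sensitivity, specificity) for an indicator against a reference standard.

    flagged: sequence of booleans, True where the indicator flagged the case.
    true_event: sequence of booleans, True where the reference standard found an event.
    """
    if len(flagged) != len(true_event):
        raise ValueError("flagged and true_event must have the same length")
    tp = fp = tn = fn = 0
    for is_flagged, is_event in zip(flagged, true_event):
        if is_flagged and is_event:
            tp += 1  # flagged, and a true event
        elif is_flagged:
            fp += 1  # flagged, but no true event
        elif is_event:
            fn += 1  # true event missed by the indicator
        else:
            tn += 1  # correctly not flagged
    # Sensitivity: share of true events that the indicator flags.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of non-events that the indicator leaves unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Running the function once with high risk patients included and once with them excluded would show the tradeoff directly: exclusion tends to remove false positives (raising specificity) while also removing some true events (lowering sensitivity).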
The evaluation of indicators included in this report reflects only part of the
validity testing needed. The structured panel review was intended to assess the face
validity of the indicators. However, limitations of such a review should be noted. Several
panels were utilized in the review of the indicators; thus panel level differences may be
present, leading to differences in the evaluation of indicators. Further, panelists were not
required to support their opinions with empirical evidence from the literature; thus the panelists’
review represents the opinions of these clinicians. Also, panelists may have interpreted
the questions about characteristics of the indicators differently, which is particularly
problematic for small sample sizes. Finally, although children were included in the
population at risk for most indicators, clinicians that care for children were not included
in the non-obstetric panels. Team members that specialize in pediatrics (PSR, MM)
advised regarding the applicability of these indicators along the way. However, further
panelist review and research into the applicability of these indicators to children is
necessary. The empirical analyses were intended to demonstrate the precision and bias of
the indicator; these tests are more descriptive than evaluative in nature. The tests of
precision are affected by the frequency of an event; thus higher frequency indicators tend
to have higher precision. This does not imply that these indicators are in fact superior to
other indicators. In addition, bias tests were not intended to rule out all potential bias, as
indicators that are not affected by risk adjustment may be biased in a way that is not
captured by the limited risk adjustment utilized in this study. This is a particular problem
for obstetric indicators, where risk adjustment often only accounted for the age of the
mother, as other appropriate risk adjustment factors were generally not available in the
data.
These initial evaluations of these indicators demonstrated that they are promising,
both in terms of face validity and relative precision. Further research should continue to
explore the validity of these indicators, such as the construct validity of these indicators.
This research should validate the indicators using other data, such as detailed chart data.
Validation should focus on the sensitivity and specificity of these indicators in detecting
the occurrence of a complication, the extent to which failures in processes of care at the
system or individual level are captured using these indicators, the relationship of these
indicators with other measures of quality, such as mortality, and explorations of bias and
risk adjustment. A recent study examined the relationship between ICD-9-CM identified
complications and those identified through standardized clinical data collection.148
Similar efforts, comparing these PSIs with other measures of patient safety using other
data sources will shed additional light on the comparative validity of these indicators.
Research may also utilize additional data elements, such as “present on admission”
coding available in some states, to identify the ability of these indicators to detect
complications occurring in-hospital. All validity research must include thoughtful
deliberations about the standard of validity for these types of indicators. Given that these
indicators are intended for screening purposes, a lower standard of construct validity (the
ability of these indicators to detect patient safety problems) may be appropriate than for
indicators intended as definitive measures.
In addition to research aimed at validating these PSIs, future research should
focus on the appropriate and practical application of these indicators. Effort should be put
forth in establishing appropriate and potentially flexible benchmarks for the PSIs, such as
means, medians, modes, or points of inflection (i.e., point where the slope of the
distribution changes) of peer group, regional or statewide providers. Careful attention
should also be paid to the understanding of these indicators by clinicians and other end
users to ensure that data are appropriately interpreted and fully utilized.
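
As one illustration of the benchmark idea above, the sketch below computes a peer-group mean or median and lists the providers whose rate exceeds it. The provider labels, rates and function name are hypothetical, and a provider appearing in the list is only a candidate for further investigation, in keeping with the screening use described in this report.

```python
from statistics import mean, median


def flag_above_benchmark(rates, benchmark="median"):
    """Return (threshold, providers above it) for a peer group.

    rates: dict mapping a provider label to its indicator rate
           (any consistent unit, e.g. events per 1,000 discharges).
    benchmark: "median" or "mean" of the peer-group rates.
    """
    if not rates:
        raise ValueError("rates must not be empty")
    values = list(rates.values())
    if benchmark == "median":
        threshold = median(values)
    elif benchmark == "mean":
        threshold = mean(values)
    else:
        raise ValueError("benchmark must be 'median' or 'mean'")
    # Providers strictly above the benchmark, sorted for stable output.
    above = sorted(name for name, rate in rates.items() if rate > threshold)
    return threshold, above
```

The choice between mean and median matters when the rate distribution is skewed: a few very high rates pull the mean upward, so fewer providers exceed it than exceed the median.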

The future of patient safety measurement depends in part on the improvement of
administrative data. The addition of timing variables may prove particularly useful. In
identifying complications it is necessary to determine whether or not a complication was
present on admission, or occurred during the hospitalization. While some of the
complications that are present on admission may indeed reflect adverse events of care in
a previous hospitalization or outpatient care, many may reflect comorbidities instead of
complications. Some states have included a “sixth digit,” present on admission
designation. These are promising for use in quality indicators. Additional timing
distinctions were mentioned during the panel discussions. Specifically, for some
complications, occurring in close temporal proximity to surgery or admission was more
or less desirable than timing that was more remote. For instance, panelists suggested that
aspirations leading to pneumonia that occurred during or immediately after surgery were
potentially preventable complications, but that aspirations that occur later in the
hospitalization were less preventable. Thus, while administrative data do not currently
contain such distinctions, the timing of an adverse event may prove to be a useful data
element.
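
A minimal sketch of how a present on admission designation could be used, assuming each secondary diagnosis on a discharge record carries a boolean flag. The record layout, function name and code values are placeholders for illustration; they are not actual ICD-9-CM codes or any state's data format.

```python
def in_hospital_flags(diagnoses, indicator_codes):
    """Return indicator codes on one discharge that were not present on admission.

    diagnoses: list of (code, present_on_admission) pairs for the discharge.
    indicator_codes: set of diagnosis codes that define the indicator.
    """
    return [
        code
        for code, present_on_admission in diagnoses
        # Keep only indicator diagnoses that arose during the hospitalization.
        if code in indicator_codes and not present_on_admission
    ]
```

With such a flag, a diagnosis that matches the indicator definition but was already present at admission would no longer count against the hospital where it was merely recorded.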
The second area of data improvement would be to allow the linking of hospital
data over time and with outpatient data. Many complications may not occur or be
diagnosed until after discharge, especially when lengths of stay are relatively short.
Presumably these complications either result in another admission, or are diagnosed and
treated on an outpatient basis. For example, the area-level indicator “Infection due to
medical care” identified almost twice as many complications as the provider-level
indicator, suggesting that many infections occur after discharge or following outpatient
care and eventually result in hospitalization. Currently, these complications are not
detected by the provider-level PSIs, potentially producing misleading results. The
inclusion of complications that occur after discharge would increase the sensitivity of the
PSIs.
As highlighted during the structured panel review, it is essential that users
understand the limitations and benefits of these indicators in practical use. Clarification
about data, vigilance in ensuring the proper use of these indicators, updating indicators to
reflect new evidence and practices, and continuous, open communication between
clinicians, medical coders and users of these indicators will be essential for their
continued success.
The current development and evaluation effort will best be augmented by a
continuous communication loop between users of these measures, researchers interested
in improving these measures, and policy makers with influence over the resources aimed
at data collection. Surely, some indicators will be more useful than others, based on
further information and research about them. The conclusions of the companion technical
report on quality indicators from the EPC, published by AHRQ
[http://www.achq.gov/data/hcup/qirefine.htm], offer further pertinent detail about future
research and activities aimed at improvements in the ability to measure the consequences
– intended and unintended – of medical care.

References

1. Kohn L, Corrigan J, Donaldson M, Committee on Quality of Health Care in America IoM, editors. To Err Is Human: Building a Safer Health System. Washington, D.C.: National Academy Press; 1999.

2. Shojania KG, Duncan BW, McDonald KM, Wachter RM. Making Health Care Safer: A Critical Analysis of Patient Safety Practices. Evidence Report/Technology Assessment No. 43 (Prepared by the University of California at San Francisco-Stanford Evidence-based Practice Center under Contract No. 290-97-0013). Rockville, MD: Agency for Healthcare Research and Quality; 2001. Report No.: AHRQ Publication No. 01-E058.

3. Davies S, Geppert J, McClellan M, McDonald KM, Romano PS, Shojania KG. Refinement of the HCUP Quality Indicators. Technical Review Number 4 (Prepared by UCSF-Stanford Evidence-based Practice Center under Contract No. 290-97-0013). Rockville, MD: Agency for Healthcare Research and Quality; 2001. Report No.: 01-0035.

4. Measuring the Quality of Health Care: A statement of the National Roundtable on Healthcare Quality Division of Healthcare Services: National Academy Press; 1999.

5. Envisioning the National Health Care Quality Report. Washington D.C.: Institute of Medicine; 2001.

9. MEDLINE [database online]. Bethesda (MD): National Library of Medicine.

10. Iezzoni LI, Daley J, Heeren T, Foley SM, Fisher ES, Duncan C, et al. Identifying complications of care using administrative data. Med Care 1994;32(7):700-15.

11. Iezzoni LI, Daley J, Heeren T, Foley SM, Hughes JS, Fisher ES, et al. Using administrative data to screen hospitals for high complication rates. Inquiry 1994;31(1):40-55.

12. Kalish RL, Daley J, Duncan CC, Davis RB, Coffman GA, Iezzoni LI. Costs of potential complications of care for major surgery patients. Am J Med Qual 1995;10(1):48-54.

13. Lawthers A, McCarthy E, Davis R, Peterson L, Palmer R, Iezzoni L. Identification of in-hospital complications from claims data: is it valid? Med Care 2000;38(8):785-795.

14. McCarthy EP, Iezzoni LI, Davis RB, Palmer RH, Cahalane M, Hamel MB, et al. Does clinical evidence support ICD-9-CM diagnosis coding of complications? Med Care 2000;38(8):868-876.

15. Weingart SN, Iezzoni LI, Davis RB, Palmer RH, Cahalane M, Hamel MB, et al. Use of administrative data to find substandard care: validation of the complications screening program. Med Care 2000;38(8):796-806.

16. Iezzoni LI, Davis RB, Palmer RH, Cahalane M, Hamel MB, Mukamal K, et al. Does the Complications Screening Program flag
cases with process of care problems? Using
6. Brennan TA, Leape LL, Laird NM, explicit criteria to judge processes. Int J Qual
Hebert L, Localio AR, Lawthers AG, et al. Health Care 1999;11(2):107-18.
Incidence of adverse events and negligence in
hospitalized patients. Results of the Harvard 17. Miller M, Elixhauser A, Zhan C, Meyer
Medical Practice Study I. N Engl J Med G. Patient Safety Indicators: Using
1991;324(6):370-6. Administrative Data to Identify Potential Patient
Safety Concerns. Health Services Research
7. Iezzoni LI, Foley SM, Heeren T, Daley 2001;36(6 Part II):110-132.
J, Duncan CC, Fisher ES, et al. A method for
screening the quality of hospital care using 18. Green L, Lewis F. Measurement and
administrative data: preliminary validation Evaluation in Health Education and Health
results. QRB Qual Rev Bull 1992;18(11):361-71. Promotion. Mountain View, CA: Mayfield
Publishing Company; 1998.
8. EMBASE. In. The Netherlands:
Elsevier Science Publishers B.V.

157
19. Fitch K, Bernstein SJ, Aguilar MD, 29. Andrews LB, Stocking C, Krizek T,
Burnand B, LaCalle JR, Lazaro P, et al. The Gottlieb L, Krizek C, Vargish T, et al. An
RAND/UCLA Appropriateness Method User's alternative strategy for studying adverse events
Manual.: RAND; 2001. in medical care. Lancet 1997;349(9048):309-13.

20. Campbell SM, Roland MO, Shekelle 30. Rosen AK, Geraci JM, Ash AS, McNiff
PG, Cantrill JA, Buetow SA, Cragg DK. KJ, Moskowitz MA. Postoperative adverse
Development of review criteria for assessing the events of common surgical procedures in the
quality of management of stable angina, adult Medicare population. Med Care 1992;30(9):753-
asthma, and non-insulin dependent diabetes 65.
mellitus in general practice. Qual Health Care
1999;8(1):6-15. 31. Silber JH, Williams SV, Krakauer H,
Schwartz JS. Hospital and patient characteristics
21. Campbell SM, Roland MO, Quayle JA, associated with death after surgery. A study of
Buetow SA, Shekelle PG. Quality indicators for adverse occurrence and failure to rescue. Med
general practice: which ones can general Care 1992;30(7):615-29.
practitioners and health authority managers agree
are important and how useful are they? J Public 32. Silber JH, Rosenbaum PR, Schwartz JS,
Health Med 1998;20(4):414-21. Ross RN, Williams SV. Evaluation of the
complication rate as a measure of quality of care
22. Sanderson C, Dixon J. Conditions for in coronary artery bypass graft surgery. JAMA
which onset or hospital admission is potentially 1995;274(4):317-23.
preventable by timely and effective ambulatory
care. J Health Serv Res Policy 2000;5(4):222-30. 33. Rosen A, Ash A, McNiff K, Moskowitz
M. The importance of severity of illness
23. Hofer TP, Hayward RA, Greenfield S, adjustment in predicting adverse outcomes in the
Wagner EH, Kaplan SH, Manning WG. The Medicare population. J Clin Epidemiol
unreliability of individual physician "report 1995;48:631-643.
cards" for assessing the costs and quality of care
of a chronic disease JAMA 1999;281(22):2098- 34. Geraci JM, Ashton CM, Kuykendall
105. DH, Johnson ML, Wu L. In-hospital
complications among survivors of admission for
24. Christiansen CL, Morris CN. Improving congestive heart failure, chronic obstructive
the statistical approach to health care provider pulmonary disease, or diabetes mellitus. J Gen
profiling. Ann Intern Med 1997;127(8 Pt 2):764- Intern Med 1995;10(6):307-14.
8.
35. Hartz AJ, Kuhn EM, Kayser KL, Pryor
25. Thomas EJ, Orav EJ, Brennan TA. DP, Green R, Rimm AA. Assessing providers of
Hospital ownership and preventable adverse coronary revascularization: a method for peer
events. J Gen Intern Med 2000;15(4):211-9. review organizations. Am J Public Health
1992;82(12):1631-40.
26. Thomas EJ, Brennan TA. Incidence and
types of preventable adverse events in elderly 36. Geraci JM, Ashton CM, Kuykendall
patients: population based review of medical DH, Johnson ML, Souchek J, del Junco D, et al.
records. BMJ 2000;320(7237):741-4. The association of quality of care and occurrence
of in-hospital, treatment-related complications.
27. Mills D, editor. Report on the Medical Med Care 1999;37(2):140-8.
Insurance Feasibility Study. San Francisco, CA:
California Medical Association; 1977. 37. Daley J, Khuri SF, Henderson W, Hur
K, Gibbs JO, Barbour G, et al. Risk adjustment
28. Hiatt HH, Barnes BA, Brennan TA, of the postoperative morbidity rate for the
Laird NM, Lawthers AG, Leape LL, et al. A comparative assessment of the quality of surgical
study of medical injury and medical malpractice. care: results of the National Veterans Affairs
N Engl J Med 1989;321(7):480-4. Surgical Risk Study. J Am Coll Surg
1997;185(4):328-40.

158
38. Bates DW, Cullen DJ, Laird N, Petersen 48. DiPiro JT, Martindale RG, Bakst A,
LA, Small SD, Servi D, et al. Incidence of Vacani PF, Watson P, Miller MT. Infection in
adverse drug events and potential adverse drug surgical patients: effects on mortality,
events. Implications for prevention. ADE hospitalization, and postdischarge care. Am J
Prevention Study Group. Jama 1995;274(1):29- Health Syst Pharm 1998;55(8):777-81.
34.
49. HCIA-Sachs LLC. Hospital Compliance
39. Classen DC, Pestotnik SL, Evans RS, and DRG Upcoding. 2000.
Lloyd JF, Burke JP. Adverse drug events in
hospitalized patients. Excess length of stay, extra 50. 100 Top Hospitals: Benchmarks for
costs, and attributable mortality [see comments]. Success. Baltimore, MD: HCIA Sachs, LLC;
Jama 1997;277(4):301-6. 2000.

40. Mitchell JB, Ballard DJ, Whisnant JP, 51. Keeler E, Kahn K, Bentow S. Assessing
Ammering CJ, Matchar DB, Samsa GP. Using quality of care for hospitalized Medicare patients
physician claims to identify postoperative with hip fracture using coded diagnoses from the
complications of carotid endarterectomy. Health Medicare Provider Analysis and Review File.
Serv Res 1996;31(2):141-52. Springfield, VA: NTIS; 1991.

41. Myers ER, Steege JF. Risk adjustment 52. Brennan TA, Hebert LE, Laird NM,
for complications of hysterectomy: limitations of Lawthers A, Thorpe KE, Leape LL, et al.
routinely collected administrative data. Am J Hospital characteristics associated with adverse
Obstet Gynecol 1999;181(3):567-75. events and substandard care. JAMA
1991;265(24):3265-9.
42. Romano PS, Campa DR, Rainwater JA.
Elective cervical discectomy in California: 53. Allison J, Kiefe C, Weissman N, Person
postoperative in-hospital complications and their S, Rouscult M, Canto J, et al. Relationship of
risk factors. Spine 1997;22(22):2677-92. Hospital Teaching Status With Quality of Care
and Mortality for Medicare Patients with Acute
43. Ghali WA, Hall RE, Ash AS, Rosen MI. JAMA 2000;284(10):1256 - 1262.
AK, Moskowitz MA. Evaluation of complication
rates after coronary artery bypass surgery using 54. Kahn KL, Rogers WH, Rubenstein LV,
administrative data. Methods Inf Med Sherwood MJ, Reinisch EJ, Keeler EB, et al.
1998;37(2):192-200. Measuring quality of care with explicit process
criteria before and after implementation of the
44. Iezzoni LI, Heeren T, Foley SM, Daley DRG-based prospective payment system. JAMA
J, Hughes J, Coffman GA. Chronic conditions 1990;264(15):1969-73.
and risk of in-hospital death. Health Serv Res
1994;29(4):435-60. 55. Rubenstein LV, Kahn KL, Reinisch EJ,
Sherwood MJ, Rogers WH, Kamberg C, et al.
45. DesHarnais SI, McMahon LF, Jr., Changes in quality of care for five diseases
Wroblewski RT, Hogan AJ. Measuring hospital measured by implicit review, 1981 to 1986.
performance. The development and validation of JAMA 1990;264(15):1974-9.
risk-adjusted indexes of mortality, readmissions,
and complications. Med Care 1990;28(12):1127- 56. Meehan TP, Fine MJ, Krumholz HM,
41. Scinto JD, Galusha DH, Mockalis JT, et al.
Quality of care, process, and outcomes in elderly
46. DesHarnais S, McMahon LF, Jr., patients with pneumonia. JAMA
Wroblewski R. Measuring outcomes of hospital 1997;278(23):2080-4.
care using multiple risk-adjusted indexes. Health
Serv Res 1991;26(4):425-45. 57. Meehan TP, Hennen J, Radford MJ,
Petrillo MK, Elstein P, Ballard DJ. Process and
47. Brailer DJ, Kroch E, Pauly MV, Huang outcome of care for acute myocardial infarction
J. Comorbidity-adjusted complication risk: a new among Medicare beneficiaries in Connecticut: a
outcome quality measure. Med Care quality improvement demonstration project. Ann
1996;34(5):490-505. Intern Med 1995;122(12):928-36.

159
58. Krumholz HM, Radford MJ, Wang Y,
Chen J, Heiat A, Marciniak TA. National use and
effectiveness of beta-blockers for the treatment
of elderly patients after acute myocardial 67. Hartz AJ, Kuhn EM, Pryor DB,
infarction: National Cooperative Cardiovascular Krakauer H, Young M, Heudebert G, et al.
Project [Published erratum appears in JAMA Mortality after coronary angioplasty and
1999;281(1):37]. JAMA 1998;280(7):623-9. coronary artery bypass surgery (the national
Medicare experience). Am J Cardiol
59. Krumholz HM, Radford MJ, Ellerbeck 1992;70(2):179-85.
EF, Hennen J, Meehan TP, Petrillo M, et al.
Aspirin in the treatment of acute myocardial 68. Silber J, Rosenbaum P, Ross R.
infarction in elderly Medicare beneficiaries. Comparing the contributions of groups of
Patterns of use and outcomes. Circulation predictors: Which outcomes vary with hospital
1995;92(10):2841-7. rather than patient characteristics? J Am Stat
Assoc 1995;90:7-18.
60. Daley J, Forbes MG, Young GJ, Charns
MP, Gibbs JO, Hur K, et al. Validating risk- 69. Silber JH, Rosenbaum PR. A spurious
adjusted surgical outcomes: site visit assessment correlation between hospital mortality and
of process and structure. National VA Surgical complication rates: the importance of severity
Risk Study. J Am Coll Surg 1997;185(4):341-51. adjustment. Med Care 1997;35(10 Suppl):OS77-
92.
61. Krumholz HM, Rathore SS, Chen J,
Wang Y, Radford MJ. Evaluation of a consumer- 70. Khuri S, Najjar S, Daley J, Krasnicka B,
oriented internet health care report card: the risk Hossain M, Henderson W, et al. Comparison of
of quality ratings based on mortality data. Jama Surgical Outcomes Between Teaching and
2002;287(10):1277-87. Nonteaching Hospitals in the Department of
Veterans Affairs. Annals of Surgery
62. Thomas JW, Holloway JJ, Guire KE. 2001;234(3):370-383.
Validating risk-adjusted mortality as an indicator
for quality of care. Inquiry 1993;30(1):6-22. 71. Goldman L, Caldera DL, Nussbaum SR,
Southwick FS, Krogstad D, Murray B, et al.
63. Ashton CM, Kuykendall DH, Johnson Multifactorial index of cardiac risk in noncardiac
ML, Wray NP, Wu L. The association between surgical procedures. N Engl J Med
the quality of inpatient care and early 1977;297(16):845-50.
readmission. Ann Intern Med 1995;122(6):415-
21. 72. Detsky AS, Abrams HB, Forbath N,
Scott JG, Hilliard JR. Cardiac assessment for
64. Benbassat J, Taragin M. Hospital patients undergoing noncardiac surgery. A
readmissions as a measure of quality of health multifactorial clinical risk index. Arch Intern
care: advantages and limitations. Arch Intern Med 1986;146(11):2131-4.
Med 2000;160(8):1074-81.
73. Wong T, Detsky AS. Preoperative
65. Weissman JS, Ayanian JZ, Chasan- cardiac risk assessment for patients having
Taber S, Sherwood MJ, Roth C, Epstein AM. peripheral vascular surgery. Ann Intern Med
Hospital readmissions and quality of care. Med 1992;116(9):743-53.
Care 1999;37(5):490-501.
74. Forrest JB, Rehder K, Cahalan MK,
66. Hartz AJ, Kuhn EM. Comparing Goldsmith CH. Multicenter study of general
hospitals that perform coronary artery bypass anesthesia. III. Predictors of severe perioperative
surgery: the effect of outcome measures and data adverse outcomes. Anesthesiology 1992;76(1):3-
sources. Am J Public Health 1994;84(10):1609- 15.
14.
75. Klotz HP, Candinas D, Platz A, Horvath
A, Dindo D, Schlumpf R, et al. Preoperative risk
assessment in elective general surgery. Br J Surg
1996;83(12):1788-91.

160
76. Smetana GW. Preoperative pulmonary 86. Hannan EL, Racz MJ, Jollis JG,
evaluation. N Engl J Med 1999;340(12):937-44. Peterson ED. Using Medicare claims data to
assess provider quality for CABG surgery: does
77. Lee TH, Marcantonio ER, Mangione it work well enough? Health Serv Res
CM, Thomas EJ, Polanczyk CA, Cook EF, et al. 1997;31(6):659-78.
Derivation and prospective validation of a simple
index for prediction of cardiac risk of major 87. Pine M, Norusis M, Jones B, Rosenthal
noncardiac surgery [In Process Citation]. GE. Predictions of hospital mortality rates: a
Circulation 1999;100(10):1043-9. comparison of data sources. Ann Intern Med
1997;126(5):347-54.
78. Keita-Perse O, Gaynes RP. Severity of
illness scoring systems to adjust nosocomial 88. Pine M, al e. International Journal for
infection rates: a review and commentary. Am J Quality in Health Care 1998;10(6):491-501.
Infect Control 1996;24(6):429-34.
89. Smith DW, Pine M, Bailey RC, Jones
79. Magovern JA, Sakert T, Magovern GJ, B, Brewster A, Krakauer H. Using clinical
Benckart DH, Burkholder JA, Liebler GA, et al. variables to estimate the risk of patient mortality.
A model that predicts morbidity and mortality Med Care 1991;29(11):1108-29.
after coronary artery bypass graft surgery. J Am
Coll Cardiol 1996;28(5):1147-53. 90. Green J, Wintfeld N, Sharkey P,
Passman LJ. The importance of severity of
80. Brook RH, Park RE, Chassin MR, illness in assessing hospital mortality. JAMA
Kosecoff J, Keesey J, Solomon DH. Carotid 1990;263(2):241-6.
endarterectomy for elderly patients: predicting
complications. Ann Intern Med 91. Hartz AJ, Guse C, Sigmann P, Krakauer
1990;113(10):747-53. H, Goldman RS, Hagen TC. Severity of illness
measures derived from the Uniform Clinical
81. Goldstein LB, McCrory DC, Landsman Data Set (UCDSS). Med Care 1994;32(9):881-
PB, Samsa GP, Ancukiewicz M, Oddone EZ, et 901.
al. Multicenter review of preoperative risk
factors for carotid endarterectomy in patients 92. Coding Clinic for ICD-9-CM
with ipsilateral symptoms. Stroke 1990;7(2):24.
1994;25(6):1116-21.
93. Romano P. Can Administrative Data be
82. McCrory DC, Goldstein LB, Samsa GP, Used to Ascertain Clinically Significant
Oddone EZ, Landsman PB, Moore WS, et al. Postoperative Complications. American Journal
Predicting complications of carotid of Medical Quality Press.
endarterectomy. Stroke 1993;24(9):1285-91.
94. Clagett GP, Anderson FA, Jr., Geerts
83. Musser DJ, Nicholas GG, Reed JF, 3rd. W, Heit JA, Knudson M, Lieberman JR, et al.
Death and adverse cardiac events after carotid Prevention of venous thromboembolism. Chest
endarterectomy. J Vasc Surg 1994;19(4):615-22. 1998;114(5 Suppl):531S-560S.

84. Block PC, Peterson ED, Krone R, 95. Hooker JA, Lachiewicz PF, Kelley SS.
Kesler K, Hannan E, O'Connor GT, et al. Efficacy of prophylaxis against
Identification of variables needed to risk adjust thromboembolism with intermittent pneumatic
outcomes of coronary interventions: evidence- compression after primary and revision total hip
based guidelines for efficient data collection. J arthroplasty. J Bone Joint Surg Am
Am Coll Cardiol 1998;32(1):275-82. 1999;81(5):690-6.

85. Hannan EL, Kilburn H, Jr., Lindsey 96. Maxwell GL, Myers ER, Clarke-
ML, Lewis R. Clinical versus administrative data Pearson DL. Cost-effectiveness of deep venous
bases for CABG surgery. Does it matter? Med thrombosis prophylaxis in gynecologic oncology
Care 1992;30(10):892-907. surgery. Obstet Gynecol 2000;95(2):206-14.

161
97. O'Brien BJ, Anderson DR, Goeree R. 107. Sawaya GF, Grady D, Kerlikowske K,
Cost-effectiveness of enoxaparin versus warfarin Grimes DA. Antibiotics at the time of induced
prophylaxis against deep-vein thrombosis after abortion: the case for universal prophylaxis
total hip replacement. CMAJ 1994;150(7):1083- based on a meta-analysis. Obstet Gynecol
90. 1996;87(5 Pt 2):884-90.

98. Palmer AJ, Koppenhagen K, Kirchhof 108. Song F, Glenny AM. Antimicrobial
B, Weber U, Bergemann R. Efficacy and safety prophylaxis in colorectal surgery: a systematic
of low molecular weight heparin, unfractionated review of randomized controlled trials. Br J Surg
heparin and warfarin for thrombo-embolism 1998;85(9):1232-41. Published erratum appears
prophylaxis in orthopaedic surgery: a meta- in Br J Surg 1999;86(2):280.
analysis of randomised clinical trials.
Haemostasis 1997;27(2):75-84. 109. Gillespie WJ, Walenkamp G. Antibiotic
prophylaxis for surgery for proximal femoral and
99. Koch A, Bouges S, Ziegler S, Dinkel H, other closed long bone fractures (Cochrane
Daures JP, Victor N. Low molecular weight Review). In: The Cochrane Library, Issue 2,
heparin and unfractionated heparin in thrombosis 2000. Oxford: Update Software; 2000.
prophylaxis after major surgical intervention:
update of previous meta-analyses. Br J Surg 110. Smaill F, Hofmeyr GJ. Antibiotic
1997;84(6):750-9. prophylaxis for cesarean section (Cochrane
Review). In: The Cochrane Library, Issue 2.
100. Morrison RS, Chassin MR, Siu AL. The Oxford: Update Software.; 2000.
medical consultant's role in caring for patients
with hip fracture. Ann Intern Med 1998;128(12 111. Perioperative total parenteral nutrition
Pt 1):1010-20. Published erratum appears in Ann in surgical patients. The Veterans Affairs Total
Intern Med 1998;129(9):755. Parenteral Nutrition Cooperative Study Group. N
Engl J Med 1991;325(8):525-32.
101. Planes A, Vochelle N, Fafola M.
Venous thromboembolic prophylaxis in 112. Campos AC, Meguid MM. A critical
orthopedic surgery: knee surgery. Semin Thromb appraisal of the usefulness of perioperative
Hemost 1999;25(Suppl 3):73-7. nutritional support. Am J Clin Nutr
1992;55(1):117-30.
102. Lassen MR, Backs S, Borris LC,
Kaltoft-Sorenson M, Coff-Ganes H, Jeppesen E. 113. Bastow MD, Rawlings J, Allison SP.
Deep-vein thrombosis prophylaxis in orthopedic Benefits of supplementary tube feeding after
surgery: hip surgery. Semin Thromb Hemost fractured neck of femur: a randomised controlled
1999;25(Suppl 3):79-82. trial. Br Med J (Clin Res Ed)
1983;287(6405):1589-92.
103. Barker FG, 2nd. Efficacy of
prophylactic antibiotics for craniotomy: a meta- 114. Ferguson TB, Jr., Coombs LP, Peterson
analysis. Neurosurgery 1994;35(3):484-92. ED. Preoperative beta-blocker use and mortality
and morbidity following CABG surgery in North
104. Tanos V, Rojansky N. Prophylactic America. Jama 2002;287(17):2221-7.
antibiotics in abdominal hysterectomy. J Am
Coll Surg 1994;179(5):593-600. 115. Mangano DT, Layug EL, Wallace A,
Tateo I. Effect of atenolol on mortality and
105. Waddell TK, Rotstein OD. cardiovascular morbidity after noncardiac
Antimicrobial prophylaxis in surgery. Committee surgery. Multicenter Study of Perioperative
on Antimicrobial Agents, Canadian Infectious Ischemia Research Group [published erratum
Disease Society. CMAJ 1994;151(7):925-31. appears in N Engl J Med 1997 Apr
106. Deacon JM, Pagliaro AJ, Zelicof SB, 3;336(14):1039]. N Engl J Med
Horowitz HW. Prophylactic use of antibiotics for 1996;335(23):1713-20.
procedures after total joint replacement J Bone
Joint Surg Am 1996;78(11):1755-70.

162
116. Poldermans D, Boersma E, Bax JJ, 125. Hofer TP, Bernstein SJ, DeMonner S,
Thomson IR, van de Ven LL, Blankensteijn JD, Hayward RA. Discussion between reviewers
et al. The effect of bisoprolol on perioperative does not improve reliability of peer review of
mortality and myocardial infarction in high-risk hospital quality. Med Care 2000;38(2):152-61.
patients undergoing vascular surgery. Dutch
Echocardiographic Cardiac Risk Evaluation 126. Kovner C, Gergen PJ. Nurse staffing
Applying Stress Echocardiography Study Group. levels and adverse events following surgery in
N Engl J Med 1999;341(24):1789-94. U.S. hospitals. Image J Nurs Sch
1998;30(4):315-21.
117. Bertrand M, Legrand V, Boland J, al. e.
Randomized multicenter comparison of 127. Ritchie JL, Phillips KA, Luft HS.
conventional anticoagulation versus antiplatelet Coronary angioplasty. Statewide experience in
therapy in unplanned and elective coronary California. Circulation 1993;88(6):2735-43.
stenting: the Full Anticoagulation Versus Aspirin
and Ticlopidine (FANTASTIC) Study. 128. Ritchie JL, Maynard C, Chapko MK,
Circulation 1998;98:1597-1603. Every NR, Martin DC. Association between
percutaneous transluminal coronary angioplasty
118. Leon M, Baim D, Popma J, al. e. A volumes and outcomes in the Healthcare Cost
clinical trial comparing three anti-thrombotic and Utilization Project 1993-1994. Am J Cardiol
regimens following coronary artery stenting. N 1999;83(4):493-7.
Engl J Med. 1998;339:1665-1671.
129. Ho V. Evolution of the volume-outcome
119. Ashton CM, Kuykendall DH, Johnson relation for hospitals performing coronary
ML, Wray NP. An empirical assessment of the angioplasty. Circulation 2000;101(15):1806-11.
validity of explicit and implicit process-of-care
criteria for quality assessment. Med Care 130. Maynard C, Every NR, Chapko MK,
1999;37(8):798-808. Ritchie JL. Institutional volumes and coronary
angioplasty outcomes before and after the
120. Camacho LA, Rubin HR. Assessment introduction of stenting. Eff Clin Pract
of the validity and reliability of three systems of 1999;2(3):108-13.
medical record screening for quality of care
assessment. Med Care 1998;36(5):748-51. 131. Jollis JG, Peterson ED, DeLong ER,
Mark DB, Collins SR, Muhlbaier LH, et al. The
121. Gibbs J, Clark K, Khuri S, Henderson relation between the volume of coronary
W, Hur K, Daley J. Validating risk-adjusted angioplasty procedures at hospitals treating
surgical outcomes: chart review of process of Medicare beneficiaries and short-term mortality.
care. Int J Qual Health Care 2001;13(3):187-96. N Engl J Med 1994;331(24):1625-9.

122. Dubois RW, Rogers WH, Moxley JHD, 132. Jollis JG, Peterson ED, Nelson CL,
Draper D, Brook RH. Hospital inpatient Stafford JA, DeLong ER, Muhlbaier LH, et al.
mortality. Is it a predictor of quality? N Engl J Relationship between physician and hospital
Med 1987;317(26):1674-80. coronary angioplasty volume and outcome in
elderly patients. Circulation 1997;95(11):2485-
123. Bates DW, O'Neil AC, Petersen LA, 91.
Lee TH, Brennan TA. Evaluation of screening
criteria for adverse events in medical patients. 133. Hannan EL, Racz M, Ryan TJ,
Med Care 1995;33(5):452-62. McCallister BD, Johnson LW, Arani DT, et al.
Coronary angioplasty volume-outcome
124. Thomas EJ, Studdert DM, Brennan TA. relationships for hospitals and cardiologists.
The reliability of medical record review for JAMA 1997;277(11):892-8.
estimating adverse event rates. Ann Intern Med
2002;136(11):812-6.

163
134. McGrath PD, Wennberg DE, Malenka 143. Silber JH, Rosenbaum PR, Williams
DJ, Kellett MA, Jr., Ryan TJ, Jr., JR OM, et al. SV, Ross RN, Schwartz JS. The relationship
Operator volume and outcomes in 12,998 between choice of outcome measure and hospital
percutaneous coronary interventions. Northern rank in general surgical procedures: implications
New England Cardiovascular Disease Study for quality assessment. Int J Qual Health Care
Group. J Am Coll Cardiol 1998;31(3):570-6. 1997;9(3):193-200.

135. McGrath PD, Wennberg DE, Dickens 144. Johantgen M, Elixhauser A, Bali JK,
JD, Jr., Siewers AE, Lucas FL, Malenka DJ, et Goldfarb M, Harris DR. Quality indicators using
al. Relation between operator and hospital hospital discharge data: state and national
volume and outcomes following percutaneous applications. Jt Comm J Qual Improv
coronary interventions in the era of the coronary 1998;24(2):88-105. Published erratum appears in
stent. Jama 2000;284(24):3139-44. Jt Comm J Qual Improv 1998;24(6):341.

136. Cebul RD, Snow RJ, Pine R, Hertzer 145. Iezzoni L, Lawthers A, Petersen L,
NR, Norris DG. Indications, outcomes, and McCarthy E, Palmer R, Cahalane M, et al.
provider volumes for carotid endarterectomy. Project to validate the Complications Screening
JAMA 1998;279(16):1282-7. Program: Health Care Financing Administration;
1998 March 31. Report No.: HCFA Contract
137. Needleman J, Buerhaus PI, Mattke S, 500-94-0055.
Stewart M, Zelevinsky K. Nurse Staffing and
Patient Outcomes in Hospitals. Boston, MA: 146. Hawker GA, Coyte PC, Wright JG, Paul
Health Resources Services Administration; 2001 JE, Bombardier C. Accuracy of administrative
February 28. Report No.: 230-99-0021. data for assessing outcomes after knee
replacement surgery. J Clin Epidemiol
138. Lichtig LK, Knauf RA, Milholland DK. 1997;50(3):265-73.
Some impacts of nursing on acute care hospital
outcomes. J Nurs Adm 1999;29(2):25-33. 147. Faciszewski T, Johnson L, Noren C,
Smith MD. Administrative databases'
139. Hannan EL, Bernard HR, O'Donnell JF, complication coding in anterior spinal fusion
Kilburn H, Jr. A methodology for targeting procedures. What does it mean? Spine
hospital cases for quality of care record reviews. 1995;20(16):1783-8.
Am J Public Health 1989;79(4):430-6.
148. Best W, Khuri S, Phelan M, Hur K,
140. Nursing-Sensitive Quality Indicators for Henderson W, Demakis J, et al. Identifying
Acute Care Settings and ANA's Safety & Quality Patient Preoperative Risk Factors and
Initiative. In: American Nurses Association; Postoperative Adverse Events in Administrative
1999. Databases: Results from the Department of
Veterans Affairs National Surgical Quality
141. Geraci JM, Ashton CM, Kuykendall Improvement Program. J Am Coll Surg
DH, Johnson ML, Wu L. International 2002;194(3):257-266.
Classification of Diseases, 9th Revision, Clinical
Modification codes in discharge abstracts are 149. Dicker R, Han L, Macone J. Quality of
poor measures of complication occurrence in Care Surveillance Using Administrative Data,
medical inpatients. Med Care 1997;35(6):589- 1996. In: Quality Resume No. 2. Baltimore, MD:
602. HCFA; 1998.

142. Berlowitz D, Brand H, Perkins C. 150. White RH, Romano PS, Zhou H,
Geriatric Syndromes as Outcome Measures of Rodrigo J, Bargar W. Incidence and time course
Hospital Care: Can Administrative Data Be of thromboembolic outcomes following total hip
Used? JAGS 1999;47:692-696. or knee arthroplasty. Arch Intern Med
1998;158(14):1525-31.

164
151. Barbour GL. Usefulness of a discharge 160. Malenka DJ, McGrath PD, Wennberg
diagnosis of sepsis in detecting iatrogenic DE, Ryan TJ, Jr., Kellett MA, Jr., Shubrooks SJ,
infection and quality of care problems. Am J Jr., et al. The relationship between operator
Med Qual 1993;8(1):2-5. volume and outcomes after percutaneous
coronary interventions in high volume hospitals
152. Massanari RM, Wilkerson K, Streed in 1994-1996: the northern New England
SA, Hierholzer WJ, Jr. Reliability of reporting experience. Northern New England
nosocomial infections in the discharge abstract Cardiovascular Disease Study Group. J Am Coll
and implications for receipt of revenues under Cardiol 1999;34(5):1471-80.
prospective reimbursement. Am J Public Health
1987;77(5):561-4. 161. Fisher ES, Whaley FS, Krushat WM,
Malenka DJ, Fleming C, Baron JA, et al. The
153. Belio-Blasco C, Torres-Fernandez-Gil accuracy of Medicare's hospital claims data:
MA, Echeverria-Echarri JL, Gomez-Lopez LI. progress has been made, but problems remain.
Evaluation of two retrospective active Am J Public Health 1992;82(2):243-8.
surveillance methods for the detection of
nosocomial infection in surgical patients. Infect 162. Weiss J, Nannini A, Fogerty S, al e. Use
Control Hosp Epidemiol 2000;21(1):24-7. of hospital discharge data to monitor uterine
rupture - Massachusetts, 1990-1997. MMWR
154. Taylor B. Common bile duct injury 2000;49(12):245-248.
during laparoscopic cholecystectomy in Ontario:
does ICD-9 coding indicate true incidence? 163. Romano P. Unpublished report to the
CMAJ 1998;158(4):481-5. California Office of Statewide Health Planning
and Development; 2001 March.
155. Valinsky LJ, Hockey RL, Hobbs MS,
Fletcher DR, Pikora TJ, Parsons RW, et al. 164. Lydon-Rochelle M, Holt V, Easterling
Finding bile duct injuries using record linkage: a T, Martin D. Risk of uterine rupture during labor
validated study of complications following among women with a prior cesarean delivery. N
cholecystectomy. J Clin Epidemiol Engl J Med 2001;345:3-8.
1999;52(9):893-901.
165. Gregory KD, Korst LM, Cane P, Platt
156. Hughes C, Harley E, Milmoe G, Bala R, LD, Kahn K. Vaginal birth after cesarean and
Martorella A. Birth trauma in the head and neck. uterine rupture rates in California. Obstet
Arch Otolaryngol Head Neck Surg Gynecol 1999;94(6):985-9.
1999;125:193-199.
166. Coding Clinic for ICD-9-CM
157. Towner D, Castro MA, Eby-Wilkens E, 1990;7(3):13.
Gilbert WM. Effect of mode of delivery in
nulliparous women on neonatal intracranial 167. Kroll DA, Caplan RA, Posner K, Ward
injury. N Engl J Med 1999;341(23):1709-14. RJ, Cheney FW. Nerve injury associated with
anesthesia. Anesthesiology 1990;73(2):202-7.
158. Handa VL, Danielsen BH, Gilbert WM.
Obstetric anal sphincter lacerations. Obstet 168. Serious Reportable Events in
Gynecol 2001;98(2):225-30. Healthcare. Consensus Report. Washington
D.C.: National Forum for Health Care Quality
159. Needleman J, Buerhaus P, Mattke S, Measurement and Reporting; 2002.
Stewart M, Zelevinsky K. Nurse-staffing levels
and the quality of care in hospitals. N Engl J
Med 2002;346(22):1715-22.
