Does a Pre-Intervention Functional Assessment Increase Intervention Effectiveness? A Meta-Analysis of Within-Subject Interrupted Time-Series Studies
PII: S0272-7358(15)30146-X
DOI: 10.1016/[Link].2016.05.003
Reference: CPR 1524
Please cite this article as: Hurl, K., Wightman, J., Virues-Ortega, J. & Haynes, S.N.,
Does a pre-intervention functional assessment increase intervention effectiveness? A
meta-analysis of within-subject interrupted time-series studies, Clinical Psychology Re-
view (2016), doi: 10.1016/[Link].2016.05.003
BEHAVIOR FUNCTION AND INTERVENTION EFFECTS
Does a Pre-Intervention Functional Assessment Increase Intervention Effectiveness? A Meta-Analysis of Within-Subject Interrupted Time-Series Studies
Kylee Hurl
Jade Wightman
University of Manitoba
Javier Virues-Ortega
Stephen N. Haynes
School of Psychology, The University of Auckland, Tamaki Campus, Private Bag 92019, 261
Acknowledgements: Dr. William Shadish (UC Merced) provided assistance with the
SPSS® macro d-Hedges-Pustejovsky-Shadish version 1.0 (DHPS). Mr. Xiaoshan Wang and Mr.
Canadian Institutes of Health Research Synthesis Grant awarded to Dr. Javier Virues-Ortega
(KRS-132038).
Abstract
pre-intervention FBA. We examined 19 studies that included a direct comparison between the effects of FBA- and non-FBA-based interventions with the same participants. A random effects meta-analysis of effect sizes indicated that FBA-based interventions were associated with large reductions in problem behaviors when using non-FBA-based interventions as a reference intervention (effect size = 0.85, 95% CI [0.42, 1.27], p < .001). In addition, non-FBA-based interventions had no effect on problem behavior when compared to no intervention (0.06, 95% CI [-0.21, 0.33], p = .664). Interestingly, both FBA-based and non-FBA-based interventions had significant effects on appropriate behavior relative to no intervention, albeit the overall effect size was much larger for FBA-based interventions (FBA-based: 1.27, 95% CI [0.89, 1.66], p < .001 vs. non-FBA-based: 0.35, 95% CI [0.14, 0.56], p = .001). In spite of the evidence in favor
of FBA-based interventions, the limited number of studies meeting current methodological standards underlines the need for further comparisons of FBA-based versus non-FBA-based interventions.
Problem behaviors are clinically important learned performances that can cause harm or are deemed undesirable by the client’s social and legal milieu (Donovan, 2005). A large body of literature has shown that problem behaviors often show important functional relations with social and nonsocial environmental variables (Hanley, Iwata, & McCord, 2003; Beavers, Iwata, & Lerman, 2013). Thus, it is important to identify the variables that influence problem behavior in order to design effective targeted interventions. Functional behavioral assessment (FBA) is a set
Most studies that have implemented a pre-intervention FBA have occurred within the
these populations include self-injury, property destruction, aggression, and stereotypy (e.g., Beavers et al., 2013; Myrbakk & von Tetzchner, 2008; Qureshi & Alborz, 1992). Nonetheless, the functional analysis, conceptualized as the identification of important, controllable, and causal
formulation, guided by the results of FBA, that is being gradually transferred to an ever growing
Buchanan & Fisher, 2002; Dixon et al., 2004; Haynes et al., 2011; Moore, Gilles, Mccomas, & Symons, 2010; Reitman & Passeri, 2008; Sturmey, 2008; Wilder, Masauda, O’Conner, &
identifying functional relations that may be unique for an individual (Haynes, Mumma, & Pinson, 2009). It parallels a trend in medicine that increasingly tailors medical interventions to specific causes of diseases with great symptomatic variability or biologic complexity across persons. For example, a personalized approach to assessment and treatment is now considered critical in the treatment of certain forms of cancer, AIDS, chronic pain, and neurological disorders (e.g., Garman, Nevins, & Potti, 2007). In the current review we evaluate the effect of FBA-based interventions relative to no intervention, and the incremental effectiveness of interventions based on a pre-intervention FBA relative to that of non-FBA-based interventions, in studies that have directly compared the two strategies using the same participants in within-subject designs.
behavioral assessment is an idiographic approach to assessment in that it presumes that there can be differences across individuals in the variables that control a particular behavior (Haynes, O’Brien, & Kaholokula, 2011). Unlike most clinical diagnostic systems that are based on patterns of covariation among behaviors, FBA attempts to identify the functional relations that
Individualized interventions that target the functional relations that may be maintaining the problem behavior are often based on the findings of a pre-intervention functional analysis (Iwata & Worsdell, 2005). Given that a pre-intervention FBA requires time and delays the onset
effectiveness relative to interventions that are not based on the results of a pre-intervention FBA.
identification of antecedent and consequent events that influence problem behavior: (a) indirect methods, (b) descriptive analysis, and (c) experimental functional analysis. Indirect methods include interviews and checklists that inquire about the antecedents, consequences, and the contexts associated with the problem behavior. Two well-known examples are the Motivation Assessment Scale (Durand & Crimmins, 1988) and the Functional Analysis Screening Tool (Iwata, DeLeon, & Roscoe, 2013). Indirect methods gather self- and proxy-reported information about the behavior and associated functional relations in a cost-efficient manner and have been found to be moderately valid when experimental approaches to assessment are used as the reference for comparison (e.g., Hall, 2005; Iwata et al., 2013). According to a recent review, indirect assessments and experimental functional analyses were concordant in 65% of 97 published cases that reported both an experimental and indirect assessment of the same problem behavior (Wightman, Julio, & Virues-Ortega, 2014). However, indirect methods often rely on the report of
A descriptive analysis involves direct observation of the behavior and the events of which it might be a function as they occur in the natural environment (e.g., Haynes, et al., 2011; Thompson & Iwata, 2007). The events preceding and following the behavior are recorded across time, for days or weeks, in order to evaluate their relationship with the target behavior, often by
informs specific hypotheses about the likely functional relations associated with the problem behavior, including the context in which it occurs. Descriptive analyses avoid some of the errors associated with indirect methods in that they rely on direct observation. However, events are observed as they naturally occur and, as with indirect methods, it is not possible to draw causal inferences from coincidental relations between events. Moreover, Wightman et al. (2014) reported that descriptive and experimental functional analyses produced concordant results in only 11% of the 27 published cases reporting both a descriptive and an experimental functional analysis.
antecedents and consequences to the target behavior, usually in a single-subject reversal or replication design, in order to identify social and non-social factors that may be influencing the target behavior. The condition that results in the greatest change in the target behavior for a particular client is presumed to also influence the target behavior in the natural environment. Experimental functional analyses often incorporate test conditions for social positive reinforcement in the form of attention, social negative reinforcement in the form of escape from demands, and the manipulation of antecedent conditions and contexts such as various social or demand situations (Iwata, Dorsey, Slifer, Bauman, & Richman, 1982, 1994).
In the test condition for social positive reinforcement, social attention (e.g., hugs,
compliments, high fives) immediately follows the occurrence of problem behavior. If the rate of
behavior is distinctively higher in this condition compared to the control condition, it is likely
that the problem behavior is at least partially maintained by positive social attention. In the test
condition for social negative reinforcement, the experimenter instructs the individual to engage
in low-preference tasks. The individual is removed from task demands for a brief period of time
following the occurrence of the problem behavior. If the rate of problem behavior is distinctively
higher during this condition, it is likely that the problem behavior is at least partially maintained by escape from demands. Finally, the test for automatic reinforcement often involves observing the individual in a barren environment. On occasions when the behavior occurs when no other stimuli are present or fails to show distinctive changes across other conditions, it is assumed that the sensory feedback produced by the behavior itself may be the primary maintaining factor.
Not all experimental functional analysis strategies produce differentiated results (e.g., Finkel, Derby, Weber, & McLaughlin, 2003; Iwata et al., 1994; Hagopian, Rooker, Jessel, & DeLeon, 2013). If results from the initial assessment are unclear, it is possible to incorporate further manipulations in order to assess unusual and idiosyncratic but potentially influential variables. For example, Roscoe, Kindle, and Pence (2010) conducted a series of tests for specific forms of attention in a client with an undifferentiated EFA and found that access to preferred conversation topics maintained the participant’s problem behavior (see also Schlichenmeyer, Roscoe, Rooker, Wheeler, & Dube, 2013). The experimental functional analysis is considered the highest standard of FBA assessment of problem behavior because it involves direct
environment, often using interrupted time-series designs, and is consistent with covariation and
processes influencing problem behavior. Even if a behavior is similar in form across clients, the factors influencing the behavior may differ across individuals. For example, Iwata et al. (1994) examined 152 functional analyses of self-injurious behavior. They found that 26% of the
addressing the individual causes of problem behavior, FBA-based interventions may be more ecologically valid, thereby facilitating generalization across settings. Moreover, the closer the setting of the FBA approximates the environment where the FBA-based intervention will be implemented, the greater the likelihood that the assessment results are accurate and lead to an effective FBA-based intervention (Lang, Sigafoos, Lancioni, Didden, & Rispoli, 2010; Martens, Gertz, de Lacy Werder, & Rymanowski, 2010).
One of the circumstances that prompted the development of FBA was the inconsistency of effects often reported in non-FBA-based interventions (see for example Iwata et al.,
problem behaviors influenced by a narrow set of the putative causal variables addressed by the intervention. Moreover, two interventions designed to address two different sets of causal variables may have incompatible or counteractive effects. For example, extinction in the form of withdrawal of social attention may be effective for a problem behavior maintained by social attention but may have iatrogenic effects for the same problem behavior maintained by escape
(Beavers et al., 2013; Hanley, Iwata, & McCord, 2003), few have examined the incremental
experimental situations. Didden, Korzilius, Oorsouw, and Sturmey (2006) conducted a single-
treatments for individuals with a mild intellectual disability. They found that overall these
interventions were effective in reducing challenging behaviors. Their results also indicated that
Miller and Lee (2013) reviewed 82 single-subject design studies that used either an FBA-based or non-FBA-based intervention for ADHD. The authors reported that FBA-based interventions had more favorable effects than non-FBA-based interventions (effect sizes calculated as standard mean differences: 3.9 versus 2.6, for FBA-based and non-FBA-based interventions, respectively). They also found that interventions that used a pre-intervention
about comparative effects from this study because Miller and Lee compared results from studies that used FBA-based interventions with results from different studies that used non-FBA-based interventions. Thus, the findings of differential effectiveness may have reflected differences in the two sets of studies in their design, measurement procedures, type and characteristics of problem
This review adopts a more conservative approach to evaluation than these two previous
FBA-based intervention relative to non-FBA-based interventions using studies with within-subject and single-subject experimental designs. The goal of the present review is to examine the effectiveness of interventions not based on a pre-intervention FBA when both are instituted with
Method
Literature Search
We selected studies that reported the effects of an FBA-based and a non-FBA-based intervention within the same individuals. Two doctoral-level graduate students with training in meta-analysis (KH, JKW) conducted the literature search using PsycINFO (CSA search engine), Medline (PubMed), and the Cochrane Central Register of Controlled Trials. Core search terms used across all databases were: “intervention,” “treatment,” and “behavioral assessment.” The addition of the terms “functional analysis” and “functional behavioral assessment-based” resulted in a more sensitive search in the Medline and Cochrane searches. The literature search was dated February 27, 2014 and was updated before this manuscript was submitted for publication. Additional studies were identified by screening the reference lists of eligible studies.
Studies that met the following inclusion criteria were selected: (a) studies published as peer-reviewed articles, Master’s theses, and doctoral dissertations, and (b) empirical studies with a direct comparison of an FBA-based and a non-FBA-based intervention, both presented to the same participants. FBA-based interventions were defined as
interventions consistent with the results of a pre-intervention FBA (indirect method, descriptive
analysis, experimental functional analysis). For example, Newcomer and Lewis (2004)
conducted an FBA of Matthew’s verbal aggression. The assessment suggested that Matthew
engaged in aggression in order to escape from peers. As part of the multicomponent intervention
that followed, teachers refrained from grouping Matthew with peers he disliked, as this was an
appropriate replacement behaviors and directions on how to use the replacement behaviors to
communicate his needs. Matthew was also given lessons about how to manage teasing and challenging situations with peers, as this was a frequent antecedent to his aggression. Non-FBA-based interventions were defined as interventions that were not based on a pre-treatment FBA.
reinforcement system wherein he earned tokens for working and playing cooperatively with peers (Newcomer & Lewis, 2004). Receiving tokens for working cooperatively is unrelated to the environmental factors controlling Matthew’s problem behavior (i.e., presence of peers, escaping from peers). Therefore, in this example, the token system is a non-FBA-based intervention.
We followed the search, data extraction, analysis, and reporting standards of the Meta-Analysis Reporting Standards (MARS; American Psychological Association, 2009, pp. 251–252) and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses standards (PRISMA; Moher, Liberati, Tetzlaff, & Altman, 2009). A total of 1763 unique studies were retrieved from the database searches and 19 studies met inclusion criteria and were included in the systematic review and meta-analysis. Figure 1 presents the flow diagram of the study selection process.
Data Extraction
A doctoral level graduate student (KH) extracted the data from the selected studies. All
datasets were reported as separate XY session-by-session time-series graphs for each participant.
The data were retrieved from the studies using a computer application for the extraction of time-
series data reported in a Cartesian space (Rohatgi, 2013). Information on the subjects (age, sex,
of intervention) were also extracted. The methods to assess behavior function included both observation-based methods (functional analysis, descriptive analysis) as well as indirect methods (questionnaires, interviews). When not directly reported, assessment duration was estimated as 15 min for indirect assessments using questionnaires (e.g., Questions About Behavior Function) and 30 min for indirect assessments using interviews (e.g., Functional Analysis Interview). Assessment time was computed as a preliminary approach to explore time-efficiency and cost-
methodological standards of the studies included in the meta-analysis and to evaluate quality
Kratochwill et al. (2010, pp. 14-17) to evaluate the methodological quality of the single-subject studies included in the review. Studies were evaluated on the degree to which they met standards in terms of (a) design and (b) effect demonstration. The design of a study was evaluated on the degree to which it met standards for systematic intervention, systematic measurement, and inter-rater reliability, and the strength of the demonstration (e.g., number of data points, number of reversals). Effect demonstration was evaluated in terms of the consistency of the time-series trend within each intervention phase and across phases. Two raters with graduate training in single-subject experimental design (KH, JKW) independently evaluated the quality standards of
Raters assigned an ordinal score to each study according to the following categories.
Design: (a) met evidence standards, (b) met evidence standards with reservations, and (c) did not meet evidence standards.
Effect demonstration: (a) strong evidence, (b) moderate evidence, and (c) no evidence.
In order to produce a six-point ordinal quality index for each study, these abovementioned classifications were summarized as follows: (a) studies that met evidence standards and presented strong evidence (6 points), (b) studies that met evidence standards and presented moderate evidence (5 points), (c) studies that met evidence standards with reservations and presented strong evidence (4 points), (d) studies that met evidence standards with reservations and presented moderate evidence (3 points), (e) studies that met evidence standards and presented no evidence (2 points), (f) studies that met evidence standards with reservations and presented no evidence (1 point), and (g) studies that did not meet evidence standards (0 points).
The most common concern with the studies included in the meta-analysis was the limited number of data points per study phase. The most common effect demonstration concern was the large within-phase variability of the target behaviors, resulting in overlap in the time series of adjacent phases. All studies provided sufficient information to complete the quality assessment. For one of the studies it was unclear if inter-rater reliability was collected for at least 20% of the data points and it was scored as 0 on that item (Repp, Felce, & Barton, 1988). The mean quality index across studies was 3.2 (SD = 1.9; range, 0 to 5). Table 2 presents a summary of the quality assessment.
Inter-rater reliability. In order to assess the reliability of the literature search process,
two of the authors (KH, JKW) independently implemented the inclusion criteria in 25% of the
studies initially identified (n = 440). Inter-rater reliability was computed as the percentage of
times that both raters evaluated a publication as either eligible or non-eligible for inclusion.
We also computed inter-rater reliability of the data extraction process. A doctoral level student (KH) extracted data from all of the studies and a second doctoral level student (JKW) independently extracted data from 32% of the studies. The inter-rater reliability of the data extraction process was calculated as the average ratio of the smallest value of a given data point extracted by either of the two raters divided by the largest value extracted by either of the two raters. The average inter-rater reliability was 99.36% (range, 80% to 100%).
Inter-rater reliability was also calculated for the extraction of subjects’ information and methodological characteristics for 51% (n = 29) of the subjects. Inter-rater reliability was 100% for these variables. Inter-rater reliability for the methodological quality index was calculated as the percentage of included studies for which both coders assigned the same score. The inter-rater reliability was obtained for 100% of studies and there was agreement for 18 of the 19 studies.
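As an illustration of the two agreement indices described above, the Python sketch below computes the smaller/larger ratio for extracted data points and simple percent agreement for categorical codes. The function names and the handling of pairs of zero values are our own assumptions, not part of the original analysis.

import numpy as np

def extraction_reliability(rater_a, rater_b):
    """Average smaller/larger ratio across data points extracted by two raters."""
    a, b = np.asarray(rater_a, float), np.asarray(rater_b, float)
    hi = np.maximum(a, b)
    # Assumption: a pair of identical zero values counts as perfect agreement.
    ratios = np.where(hi == 0, 1.0, np.minimum(a, b) / np.where(hi == 0, 1.0, hi))
    return 100.0 * ratios.mean()

def percent_agreement(codes_a, codes_b):
    """Share of items (e.g., eligibility decisions, quality scores) coded identically."""
    a, b = np.asarray(codes_a), np.asarray(codes_b)
    return 100.0 * np.mean(a == b)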
Data Analysis
We calculated Hedges g effect sizes for each study in order to conduct a series of random
effects meta-analyses. Traditional non-parametric effect sizes were also calculated: non-
computed the Hedges g non-parametric effect size estimator for all single-subject experimental
studies that reported three or more participants (Hedges, Pustejovsky, & Shadish, 2012, 2013).
According to Hedges et al., a baseline observation of a target behavior, Yij, is a function of the group or parameter level of behavior across participants, μC, the individual level of behavior, ηi, and the level of change across observations for an individual, εij, where i denotes each participant and j denotes each observation:

Yij = μC + ηi + εij

Likewise, the statistical model for an observation during the treatment period is:

Yij = μT + ηi + εij

where μT denotes the group level of behavior during treatment. The model assumes that observations are normally distributed and without time trend. Thus, the average level of change within a phase and an individual, εij, is 0, with only first-order autocorrelation. One further assumption suggests that parameter variance is composed of the variance of observations within each individual, ζ², and the variance of observations between individuals, η². Therefore, the effect size parameter can be defined as the standardized mean difference δ:

δ = (μT − μC) / √(ζ² + η²)
The basic approach expressed above is one of the strongest analytic strategies for interrupted time-series designs, but the outcome is susceptible to several sources of error that would be expected to be randomly distributed across studies: (a) higher-order autocorrelation; (b) within-phase trend; (c) relative weight of studies eligible for meta-analysis due to the number of participants; (d) time-series asymmetry across baseline, treatment, and participants; and (e) design-specific asymmetries (e.g., ABAB vs. multiple-baseline design). The extended model for
detrending procedures, and design-specific formulations to account for the biases indicated
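To make the preceding definitions concrete, here is a minimal Python sketch of the unadjusted standardized mean difference δ. It is not the DHPS macro: it omits the autocorrelation, trend, small-sample, and design-specific corrections listed above, and the function name and the use of baseline means to estimate between-case variance are our own assumptions.

import numpy as np

def unadjusted_smd(baseline, treatment):
    """delta = (mu_T - mu_C) / sqrt(zeta^2 + eta^2), with zeta^2 the pooled
    within-case variance and eta^2 the between-case variance.
    `baseline` and `treatment` hold one sequence of session values per case."""
    baseline = [np.asarray(b, float) for b in baseline]
    treatment = [np.asarray(t, float) for t in treatment]
    mu_c = np.mean([b.mean() for b in baseline])    # pooled baseline level
    mu_t = np.mean([t.mean() for t in treatment])   # pooled treatment level
    # Within-case variance: observations centered on their own phase mean
    within = np.concatenate([x - x.mean() for x in baseline + treatment])
    zeta_sq = within.var(ddof=1)
    # Between-case variance: variability of case-level baseline means (assumption)
    eta_sq = np.var([b.mean() for b in baseline], ddof=1)
    return (mu_t - mu_c) / np.sqrt(zeta_sq + eta_sq)

# e.g., two participants, three baseline and three treatment sessions each:
# unadjusted_smd(baseline=[[6, 7, 5], [4, 5, 4]], treatment=[[2, 1, 2], [1, 0, 1]])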
We used the SPSS® macro d-Hedges-Pustejovsky-Shadish version 1.0 (DHPS; Marso & Shadish, 2013) for the purposes of effect size (g) and effect size variance (Var[g]) calculation according to this model. In order to compute effect size estimators, we extracted all individual data points in the fully reported session-by-session datasets of all studies selected for the meta-analysis. All outcomes were obtained through behavioral observation and reported either as responses per minute or percentage of session duration/intervals with problem behavior.
Random effects meta-analysis. The effect sizes obtained from the Hedges-Pustejovsky-Shadish model are comparable to standardized mean differences in a meta-analysis of between-subjects studies (Hedges et al., 2012). We used Cohen’s (1988) suggestions for effect size interpretation, whereby effect sizes below 0.44 are considered small, effect sizes from 0.45 to 0.79 are considered moderate, and effect sizes of 0.80 or above are considered large. While Cohen is often cited in this context, these thresholds provide only a judgment guide, as there is no empirical basis to classify a continuous effect size into categories to be used across a variety of populations and methods. Effect size interpretation will focus primarily on the relative differences across types of intervention.
We obtained pooled effect sizes and 95% confidence intervals for problem behavior and
random-effects approach to meta-analysis was appropriate to the current study given the method, treatment design, measurement strategies, and baseline and treatment durations (Cottrell, Drew, Gibson, Holroyd, & O'Donnell, 2007). The resulting effect size estimators and estimator variances were then included in an inverse variance weighted random effects meta-analysis. Meta-analysis calculations were computed with Stata v. 11 (Stata Corporation, College Station, TX). Heterogeneity was assessed with the I² statistic; namely, it evaluates the extent to which an outcome varies across studies (Higgins & Thompson, 2002). Variability can be interpreted either as the result of chance or heterogeneity (i.e., methodological differences across studies). I² is expressed as a percent (low heterogeneity: I² = 25%; moderate heterogeneity: I² = 50%; high heterogeneity: I² = 75%; Huedo-Medina, Sanchez-Meca, Marin-Martinez, & Botella, 2006). Finally, we computed Egger’s test for publication bias (Egger, Smith, Schneider, & Minder, 1997). This test can also identify if outliers or small sample studies have an excessive influence over the pooled outcome
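A minimal Python stand-in for the inverse variance weighted random-effects model and the I² statistic described above is sketched below (DerSimonian-Laird estimator). This is an illustration only, not the Stata routine used for the analyses, and the function name is ours.

import numpy as np
from scipy.stats import norm

def random_effects_meta(g, var_g):
    """Pool effect sizes with DerSimonian-Laird random-effects weights."""
    y, v = np.asarray(g, float), np.asarray(var_g, float)
    w = 1.0 / v                                   # fixed-effect (inverse-variance) weights
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)              # Cochran's Q
    df = len(y) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau_sq = max(0.0, (q - df) / c)               # between-study variance
    i_sq = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0   # heterogeneity (%)
    w_re = 1.0 / (v + tau_sq)                     # random-effects weights
    pooled = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    p = 2.0 * (1.0 - norm.cdf(abs(pooled) / se))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), p, i_sq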
the highest standard for behavior function ascertainment (see for example Iwata, DeLeon, & Roscoe, 2013; Wightman et al., 2014). In order to examine this potential bias, we conducted a
analysis was or was not used as the main means of assessment. In addition, we conducted random effects meta-regression analyses using age and the Kratochwill et al. quality index as continuous moderators, and major clinical diagnosis (ASD vs. other diagnosis) as a categorical moderator.
There were three separate comparisons made for the single-subject meta-analysis: (a)
baseline versus non-FBA-based intervention; (b) baseline versus FBA-based intervention and;
(c) FBA-based versus non-FBA based intervention. These comparisons were computed both for
problem and appropriate behavior. However, appropriate behavior was not reported in all the
Based on a preliminary review of studies, we expected to identify at least five studies per outcome meeting inclusion criteria with effect sizes of 0.5 or higher. We conducted an a priori power analysis for a random-effects meta-analysis according to the methods proposed by Hedges and Pigott (2001) and the SAS® macro proposed by Cafri and Kromrey (2009). The analysis indicated that a meta-analysis of five studies with an average effect size of 0.5, moderate heterogeneity (50%), and an alpha value of 0.05, would result in an estimated power of 0.98 (one-tailed test) and 0.95 (two-tailed test). In sum, the prospective power analysis established the feasibility of the proposed study.
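For orientation, the sketch below approximates the power of the z-test for the mean effect in a random-effects meta-analysis in the spirit of Hedges and Pigott (2001). It is not the SAS® macro used by the authors: it assumes a single typical within-study variance, backs out τ² from the assumed I², and its parameter names are ours, so it should not be expected to reproduce the exact values reported above.

import numpy as np
from scipy.stats import norm

def re_meta_power(mu, within_var, k, i_sq=0.50, alpha=0.05, two_tailed=True):
    """Approximate power for the pooled-effect z-test under a random-effects model."""
    tau_sq = within_var * i_sq / (1.0 - i_sq)   # assumes I^2 = tau^2 / (tau^2 + v)
    se = np.sqrt((within_var + tau_sq) / k)     # SE of the pooled effect across k studies
    lam = mu / se                               # non-centrality of the test statistic
    if two_tailed:
        crit = norm.ppf(1 - alpha / 2)
        return (1 - norm.cdf(crit - lam)) + norm.cdf(-crit - lam)
    crit = norm.ppf(1 - alpha)
    return 1 - norm.cdf(crit - lam)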
that did not restrict eligible studies by sample size. Specifically, we computed effect size
treatment evaluations for one or two participants. The effect size metrics used for this ancillary
individual level. We chose Hedges g, the percentage of non-overlapping data points (PND), and percentage of all-zero data (PZD). These estimators capture the various facets of behavioral intervention effects. First, we used Hedges g in order to characterize the effects of the intervention in the same metric used in the meta-analysis. Hedges g for individual cases was computed using the R package SSD for R developed by Auerbach and Zeitlin (2014). For the purposes of presenting Hedges g values more succinctly we reported both individual effect sizes and mean effect sizes and 95% confidence intervals across individuals. Hedges g was not computed in two individual datasets where standard deviation equaled zero in one or more study
phases (Don’s problem behavior, reported by Wilder et al., 2006; and Lisa’s attending behavior, reported by Kodak et al., 2011). PND is the percentage of treatment sessions that yield data that are above the maximum baseline data point for interventions intended to increase the target behavior. Conversely, it is the percentage of treatment sessions that yield data that are below the lowest baseline data point for interventions intended to decrease the target behavior (Scruggs, Mastropieri, & Casto, 1987). Although it provides little information about the overall magnitude of an intervention, PND evaluates the extent to which an intervention is associated with a change in behavior beyond the expectation set on the basis of the variability observed during baseline. Thus, PND was used to determine the extent to which the FBA-based treatment produced a change in level beyond the variability exhibited in the non-FBA-based intervention. However, PND has been shown to systematically underestimate treatment effects as the number of treatment observations increases (Allison & Gorman, 1994). PZD is the percentage of intervention sessions without the target behavior (Scotti, Evans, Meyer, & Walker, 1991). This is a measure of the extent to which an intervention was associated with a decrease in the occurrence of the behavior to “0”. Thus, PZD was used to examine the extent to which the problem behavior was completely eliminated. However, it can produce biased values if the intervention is delayed or does not completely eliminate the behavior. PZD was only used for the problem behavior comparison. PND and PZD were selected for being frequently reported effect size estimators in
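Both non-overlap indices can be computed directly from the session-by-session series. The Python sketch below follows the definitions given above (function names are our own): PND is computed against the reference phase in the direction of the intended behavior change, and PZD from the first zero data point onward.

import numpy as np

def pnd(reference, treatment, decrease=True):
    """Percentage of non-overlapping data (Scruggs, Mastropieri, & Casto, 1987)."""
    ref, tx = np.asarray(reference, float), np.asarray(treatment, float)
    if decrease:   # intervention intended to reduce the target behavior
        return 100.0 * np.mean(tx < ref.min())
    return 100.0 * np.mean(tx > ref.max())   # intended to increase the behavior

def pzd(treatment):
    """Percentage of zero data (Scotti, Evans, Meyer, & Walker, 1991)."""
    tx = np.asarray(treatment, float)
    zeros = np.flatnonzero(tx == 0)
    if zeros.size == 0:
        return 0.0   # the behavior never reached zero
    return 100.0 * np.mean(tx[zeros[0]:] == 0)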
Results
A total of 19 studies were selected for further analysis. Of these, 13 studies were included in the meta-analysis. A total of 57 participants (80.7% male) and 52 participants (89% male) participated in the meta-analysis and non-parametric effect size analyses, respectively. The pool of participants was composed of preschool children (15.8%), school-aged children (73.7%), and adolescents and young adults (10.5%). Most participants had a diagnosis of ADHD, ASD, or intellectual disability (50%). The most common target behavior was disruptive behavior (81.1%). Tables 1 and 2 present the personal characteristics of study participants.
The methodological quality of the studies varied greatly. According to the Kratochwill et al. quality index, the median study met evidence standards with reservations and presented strong evidence (Me = 4, range, 0 to 5). The most common design concern involved reporting limited samples of behavior (i.e., less than 4 data points per study phase). The most common effect demonstration concerns involved high within-phase variability and high data point overlap
Devlin, Healy, Leader, and Hughes (2011) used sensory integration therapy, and Kearney and Silverman (1999) used cognitive therapy. However, the majority of the studies used empirically supported and commonly applied behavioral interventions that were not informed by a pre-intervention FBA: time-out (Taylor & Miller, 1997), self-monitoring (Vance, 2008), attention in the form of verbal prompts and reprimands (Payne, Scott, & Conroy, 2007), and advanced notice for problem behavior associated with transitions (Wilder, Chen, Atwell, Pritchard, & Weinstein, 2006).
Order effects of the presentation of the FBA-based and non-FBA-based interventions were considered in several of the included studies. Six of the studies used multi-element designs, thus the two interventions were alternated in rapid succession (Bellone, 2013; Devlin et al., 2011; Hawkins & Axelrod, 2008; Kodak, Fisher, Clements, Paden, & Dickes, 2011; Repp et al., 1988; Vance, 2008). Out of the studies that used AB or alternating treatment designs, there were seven studies that altered the order of presentation of the FBA-based and non-FBA-based interventions across the participants (Filter & Horner, 2009; Ingram, Lewis-Palmer, & Sugai, 2005; Murphy, 2010; Mustian, 2011) and six that did not (Ellingson, Miltenberger, Stricker, Galensky, & Garlinghouse, 2000; Carter & Horner, 2007; Carter & Horner, 2009; Kearney & Silverman, 1999; Newcomer & Lewis, 2004; Taylor & Miller, 1997).
Single-Subject Meta-Analysis
Figures 2 and 3 present the forest plots of the single-subject random effects meta-analysis for problem and appropriate behavior, respectively. Figure 4 presents a summary of overall effect sizes for problem and appropriate behavior across the three comparisons (baseline vs. non-FBA-based intervention, baseline vs. FBA-based intervention, and FBA-based vs. non-FBA-based intervention).
Problem behaviors. The nine studies that compared FBA-based and non-FBA-based interventions for problem behaviors had an overall effect size of 0.85, 95% CI [0.42, 1.27], p < .001 in favor of the FBA-based interventions. There was moderate evidence of heterogeneity, I² = 60%, and there was no evidence of publication bias or small study bias, p = 0.139.
The seven studies that compared FBA-based interventions to baseline for problem behaviors had a pooled effect size of 0.92, 95% CI [0.56, 1.27], p < .001 in favor of the FBA-based interventions. There was no evidence of heterogeneity across studies, I² = 0, and there was no evidence of publication bias or small study bias.
The seven studies that compared non-FBA-based interventions to baseline for problem behaviors had a pooled effect size of 0.06, 95% CI [-0.21, 0.33], p = 0.664. There was a low level of heterogeneity across studies (I² = 4%) and no evidence of publication bias or small study bias was established (p = 0.830).
Appropriate behaviors. For appropriate behaviors, the eight studies that compared
0.27, 1.79], p = 0.146 in favor of the FBA-based interventions. There was no evidence of heterogeneity across studies, I² = 0, and there was no evidence of publication bias or small study effect, p = 0.582.
The six studies that compared FBA-based interventions to baseline had a pooled effect size of 1.27, 95% CI [0.89, 1.66], p < 0.001 in favor of the FBA-based intervention. There was a low level of heterogeneity across studies, I² = 1, and there was no evidence of publication bias or
The six studies that compared non-FBA-based interventions to baseline had an overall weighted effect size of 0.35, 95% CI [0.14, 0.56], p = 0.001 in favor of the non-FBA-based intervention. There was evidence of moderate heterogeneity across studies, I² = 53. No evidence
chronological age of the participants. Specifically, there was a significantly larger effect for younger children (9 years old and younger). The meta-regression analyses also yielded significant differences across all four comparisons for the use of experimental strategies to control for sequence effects. However, the direction of the effect was not consistent across outcomes and comparisons. Moreover, the magnitude of the differences was small to moderate.
There was a trend for interventions that used an experimental functional analysis to have a larger reduction in problem behavior than those that used a non-experimental functional analysis when compared to a non-FBA-based intervention. While effect sizes for interventions based on experimental functional analysis were numerically higher across all comparisons and outcomes, the meta-regression analyses for appropriate behaviors were non-significant. Main diagnosis (ASD/ID vs. other) and study quality index were non-significant moderators for all outcomes (Table 3).
Problem behaviors. Table 2 presents the effect sizes of all individual participants. The overall mean Hedges g effect size across individual cases (FBA-based vs. non-FBA-based interventions) was 2.86, SD = 3.01, 95% CI [1.88, 3.84]. The twelve individuals who
of 1.68, SD = 0.85, 95% CI [1.14, 2.23]. The nine participants with descriptive analyses had an overall mean Hedges g of 1.31, SD = 0.70, 95% CI [0.78, 1.85]. Finally, the 19 participants with an EFA showed an overall Hedges g of 4.42, SD = 3.85, 95% CI [2.51, 6.33]. Fourteen participants (34.15%) had no overlapping data points across FBA-based and non-FBA-based interventions. Namely, they had a PND of 100%. Four participants (9.76%) had a data point in their non-FBA-based intervention that was just as low or lower than the data points for problem behavior in the FBA-based intervention (i.e., PND = 0%). Finally, according to the PZD estimators, five participants (12.19%) continued to remain at zero after the first zero data point in
the FBA-based intervention. Approximately half of the participants did not have any zero data
non-FBA-based comparison of individual participants, Table 2 displays the PND and Hedges g effect size data sorted by assessment procedure. The mean Hedges g value across all participants was 2.71, SD = 2.05, 95% CI [1.93, 3.49]. The mean Hedges g value for indirect FBA procedures was based on the results of 4 participants, M = 5.18, SD = 4.29, 95% CI [-1.65, 12.00]. Individuals with descriptive analyses (n = 14) had a mean Hedges g of 2.17, SD = 1.08, 95% CI [1.57, 2.77]. Finally, the 11 participants with an EFA had a mean Hedges g of 2.54, SD = 1.36, 95% CI [1.56, 3.52]. For the PND data, 39.4% (n = 13) of the participants had a PND of 100%; no FBA-based intervention data points overlapped with non-FBA-based intervention data points. Second, 27.27% (n = 9) of the participants had a PND of 0%. Namely, the highest data point in the non-FBA-based intervention overlapped with all data points in the FBA-based intervention.
Cost-Benefit Analysis
The time-efficiency and cost-effectiveness analysis was based on the results of nine studies (n = 21) with problem behavior as outcome and three studies (n = 6) with appropriate behavior as outcome. For problem behaviors the mean Hedges g for the FBA-based versus non-FBA-based intervention comparison was 1.99, SD = 1.20, 95% CI [1.44, 2.53] and the mean FBA duration in hours was 2.35. Therefore, every hour of assessment resulted in a decrement of problem behavior by 0.85 effect size units. For appropriate behaviors, the mean Hedges g for the FBA-based versus non-FBA-based intervention comparison for this analysis was 4.27, SD = 3.63, 95% CI [0.43, 8.08] and the mean duration of the FBA in hours was 2.28. Thus, on average, every hour of assessment was associated with an increment of appropriate behavior of 1.87 effect size units. It should be noted that most of the studies included in this review targeted low-severity behaviors. The clinical and social significance of small reductions in severe behavior (e.g., self-injury, aggression) may be more beneficial than moderate reductions in less severe behaviors (e.g., disruptive behavior).
Discussion
Overall, the results indicated that FBA-based interventions were associated with large reductions in problem behavior and large increments in appropriate behavior, relative to non-FBA-based interventions. Interestingly, the meta-analysis showed no significant differences between non-FBA interventions and baseline for problem behavior and appropriate behavior. Publication bias and small study effects were largely non-significant for the studies meta-analyzed. Supplementary random effects meta-regression and sensitivity analyses suggested that the methodological quality of the single-subject experimental designs included in the meta-analysis and the participants’ diagnosis and age did not moderate the magnitude of the
that relied on an experimental functional analysis were associated with greater intervention
assessment. In addition, the presence of strategies to experimentally control for sequence effects might have slightly affected intervention outcomes. However, controlling for sequence effects induced only a small effect difference and did not produce a consistent bias across target
experimental controls for sequence effects produced less favorable changes in appropriate behavior in FBA-based interventions, whereas the effect on problem behavior was more
The results of the meta-analysis were consistent with the more inclusive effect size analysis including all individual cases without sample size restrictions. First, the range of Hedges g values for all the individual cases included in the review were in line with the findings of the meta-analysis. For example, as suggested by the meta-regression analysis, individual Hedges g values for FBA-based interventions targeting problem behaviors were much larger in studies using experimental functional analysis relative to FBA-based interventions using other methods.
very few cases achieved the maximum PZD (12.9%). Thus, for most individuals, problem behavior was not immediately and completely eliminated by FBA-based interventions. The analysis of PND suggested large intervention effects in over one third of participants for whom PND equaled 100%. Although the effects were the same qualitatively, overall, effect size values were relatively higher in the inclusive analysis compared to the meta-analysis.
The findings from the current meta-analysis are consistent with the view that reducing problem behavior may facilitate engagement in more socially appropriate behaviors. This may be because alternative appropriate behaviors were directly targeted as part of treatment (i.e.,
the 57 cases reviewed), or simply because lower levels of problem behavior free up time for the
individual to engage in other activities (see for example Virues-Ortega, Iwata, Fahmie, &
Harper, 2013). The magnitude of problem behavior decrements mirrored the increments in
appropriate behavior. All treatment studies that reported effects on both problem and appropriate
behavior reported collectively a high correlation across the two outcomes (r > .80). Only one
individual failed to show the latter pattern (Calvin, reported by Starosta et al., 2010). Of note,
Calvin’s treatment evaluation did not show a reduction in problem behavior after an intervention
included in the treatment phase of the study. The results of Calvin are not completely
unexpected, for descriptive analyses often fail to identify the function of problem behavior. For
example, a recent review indicated that the outcome of a descriptive analysis was concordant
with an experimental functional analysis only in 11% of published cases (Wightman et al.,
2014).
The unexpected finding that non-FBA-based interventions met with negligible results is inconsistent with the findings reported in other meta-analyses (e.g., Miller & Lee, 2013). However, none of the meta-analyses in the literature are comparable to ours in meta-analytic methods and target populations. A number of factors may have contributed to this finding. First, some of the non-FBA-based interventions were non-evidence-based treatments (e.g., sensory integration therapy). Thus, a small overall effect was to be expected. Second, two circumstances may drive favorable effects: (a) the non-FBA-based intervention alters the function of the behavior by chance and (b) the non-FBA-based intervention introduces a non-functional, albeit powerful, arbitrary process with reductive effects upon problem behavior. The probability of the first circumstance is relatively low. For example, if we assume even probabilities for the five
intervention would be 0.20. Yet, this is a false argument, for some interventions are often non-functional in essence (e.g., token economy for an arbitrary alternative behavior). As per the second mechanism, only arbitrary non-FBA-based interventions of certain intensity may be able to outweigh naturally occurring behavior functions. This is the case for interventions that
none of the non-FBA-based interventions in the studies meta-analyzed made use of extremely
powerful reinforcers or punishment-based interventions. More mundane reasons might have also been at work. For example, a client might have been more likely to be referred to the study after
have been less intense, or implemented with poorer procedural integrity. While there are parsimonious explanations for the non-effect of interventions that ignore behavior function, we cannot explore them further within the current dataset. Future studies may shed some light on this issue by providing additional detail on the process of admitting a participant; the choice for non-FBA-based interventions; and the duration, intensity, and integrity with which an intervention is delivered.
The effect size variability for appropriate behaviors was larger than the variability for problem behavior. Five individual cases showed negative effect sizes suggesting that for these cases the non-FBA-based intervention was more effective. Variability in effect size across cases may be in part attributed to the nature of the FBA-based intervention. Specifically, not all interventions aim to establish an alternative response to problem behavior. For example, studies using extinction as a form of FBA-based intervention are less likely to report increments in
treatment and other interventions that rely on the acquisition of an alternative response (e.g., Repp et al., 1988; Taylor & Miller, 1997). Appropriate behaviors were rarely the target behavior of the FBA. For example, the target behaviors in Bellone (2013) were being away from the work area, engaging in off-task behavior, and inappropriate vocalizations. The author monitored both disruptive and on-task behavior. Kodak et al. (2011) produced the only study where the FBA
The benefits of FBA-based interventions seem to outweigh the additional time and
efficiency of FBA-based interventions was established on the basis of a pool of studies that seldom used experimental approaches to functional assessment, which is the highest standard for the assessment of behavior functions. According to an ample consecutive case series composed
in over 87% of cases. This value was raised to 93% among the cases that allowed up to two procedural modifications in order to make analogue conditions in the functional analysis more relevant to the individual (Hagopian et al., 2013). It might be possible to establish an even higher cost-effectiveness for FBA-based interventions when the current findings are replicated with experimental functional analyses alone, and with time-efficient variations of the experimental functional analysis in particular. For example, savings in assessment time of up to 80% have
analysis (Kahng & Iwata, 1999), latency-based functional analysis (Thomasson-Sassi, Iwata, Neidert, & Roscoe, 2011), screening functional analysis (Querim et al., 2013), and single-
A potential set of limitations to the current analysis results from the heterogeneity of
participants and methods in the studies meta-analyzed. For example, participants varied in age
and diagnoses. The most prevalent diagnoses among the cases included were developmental
disability or autism, and intellectual disability. Approximately half of the participants had no
formal diagnosis reported. A range of other diagnoses and conditions were reported, including
ADHD, anxiety, learning disability, and oppositional defiant disorder. The meta-regression did
not reveal a moderating role of diagnosis upon treatment effects (ASD/ID vs. other diagnoses). The small number of studies within each diagnostic category precluded a detailed analysis.
The methods used for behavior function ascertainment varied within and across studies. For example, Carter and Horner (2009) used two semi-structured interviews with adults who knew the child and three 20 min direct observations, while Devlin et al. (2011) used EFA for two participants and FBA questionnaires for the other two. Second, while all intervention methods could be reliably classified as function- and non-function-based, the specific interventions used varied greatly across studies. Finally, the methodological quality of the studies included in the analysis also varied greatly. A sensitivity analysis limited to the studies with greater methodological quality produced numerically more conservative effect size estimates relative to studies with a lower methodological quality index. However, these differences were not statistically significant.
A strength of the present study is that it is the first review and meta-analysis to directly compare the effects of FBA-based and non-FBA-based interventions within the same individuals. All the studies retrieved were single-subject
experimental designs. These studies can also be characterized as within-subject interrupted time-series studies (Haynes et al., 2011a). Given the lack of large-n controlled trials, time-series studies such as the ones reported here can convey strong evidence with a relatively small group of individuals (Hedges & Pigott, 2001). The resource-intensive nature of behavioral assessment and intervention for problem behavior in children with disabilities makes it almost impossible to use randomized controlled trials in this particular research niche (see for example Keenan &
own control, taking advantage of the multiple time samples within each phase, thereby minimizing the high levels of between-subject variability that small-n controlled studies often encounter. In sum, the present review provides a model for future meta-analysis of single-subject controlled studies in psychology and medicine.
The current analysis may have some practical implications for practitioners. The overall effect of FBA-based interventions reported is based upon applied studies conducted in “real world” settings: most studies took place in classrooms. Assessment and intervention methods were implemented in the client’s natural environment. Furthermore, our analysis suggests that FBA-based interventions based on any functional assessment method were superior to non-FBA interventions in general. In addition, experimental methods for the assessment of behavior functions such as EFA were used among the studies where FBA-based intervention had more favorable outcomes. These findings are consistent with the view that EFA provides the highest standard for behavior function ascertainment (Beavers et al., 2013) and underlines the need for
Conclusions
The current analyses suggest that (a) FBA-based interventions, relative to non-FBA-based interventions, had incremental clinically significant effects upon both problem and appropriate behavior among children and youth with and without developmental and intellectual disabilities, (b) FBA-based interventions had clearly superior effects upon problem behavior relative to no intervention, (c) non-FBA-based interventions had negligible effects upon problem behavior and small-magnitude effects on appropriate behaviors, (d) FBA-based interventions following an experimental functional analysis were associated with larger effects than interventions based on any other approach to functional assessment, (e) the time and resources needed to conduct a pre-intervention functional analysis were outweighed by the outcome gains that often followed FBA-based interventions, and (f) the sensitivity and meta-regression analyses did not provide evidence suggesting that any of the findings in the current analysis might be attributable to low methodological quality, publication bias, or small-study effects.
Appendix 1
SEARCH STRATEGY
1. PsycINFO
Date: Feb 24, 2014
Search strategy: ([Link]("behavioral assessment")) AND ([Link](treatment OR intervention)) and limited to dissertations and journal articles (books excluded)
2. PubMed
(treatment or intervention)
References
References marked with an asterisk indicate studies that were included in the quantitative review
Allison, D. B., & Gorman, B. S. (1994). “Making things as simple as possible, but no simpler”.
A rejoinder to Scruggs and Mastropieri. Behaviour Research and Therapy, 32, 885-890.
doi:10.1016/0005-7967(94)90170-8
American Psychological Association (2009). Publication Manual of the American Psychological
Auerbach, C. & Zeitlin, W. (2014). SSD for R: An R package for analyzing single-subject data.
Oxford: Oxford University Press.
Beavers, G. A., Iwata, B. A., & Lerman, D. C. (2013). Thirty years of research on the functional
doi:10.1002/jaba.30
Buchanan, J. A., & Fisher, J. E. (2002). Functional assessment and noncontingent reinforcement
Cafri, G., Kromrey, J. D., & Brannick, M. T. (2009). A SAS® macro for statistical power
doi:10.3758/BRM.41.1.35.
*Carter, D. R., & Horner, R. H. (2007). Adding functional behavioral assessment to first step to
doi:10.1177/10983007070090040501
*Carter, D. R., & Horner, R. H. (2009). Adding FBA-based behavioral support to first step to
success: Integrating individualized and manualized practices. Journal of Positive
Cohen, J. (1977). Statistical power analysis for the behavioral sciences (rev. ed.). New York,
NY: Academic Press.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:
Erlbaum.
*Devlin, S., Healy, O., Leader, G., & Hughes, B. (2011). Comparison of behavior intervention
Didden, R. Korzilius, H., van Oorsouw, W., & Sturmey, P. (2006). Behavioral treatment of
doi:10.1352/0895-8017(2006)111[290:BTOCBI][Link];2
Dixon, M. R., Guercio, J., Falcomata, T., Horner, M. J., Root, S., Newell, C., & Zlomke, K.
(2004). Exploring the utility of functional analysis methodology to assess and treat
Durand V. M., & Crimmins D. B. (1988). Identifying the variables maintaining self-injurious
Donovan, J. (2005). Problem behavior theory. In C.B. Fisher and R.M. Lerner (Eds.),
California: Sage.
*Ellingson, S. A., Miltenberger, R. G., Stricker, J., Galensky, T. L., & Garlinghouse, M. (2000).
Functional assessment and intervention for challenging behaviors in the classroom by
doi:10.1177/109830070000200202
*Filter, K. J., & Horner, R. H. (2009). FBA-based academic interventions for problem behavior.
Garman, K. S., Nevins, J. R., & Potti, A. (2007). Genomic strategies for personalized cancer
10.1016/[Link].2004.11.004
Hagopian, L., Rooker, G. W., Jessel, J., & DeLeon, I. G. (2013). Initial functional analysis
Hanley, G., Iwata, B., & McCord, B. (2003). Functional analysis of problem behavior: A review.
*Hawkins, R. O., & Axelrod, M. I. (2008). Increasing the on-task homework behavior of youth
Haynes, S. N., Mumma, G. H., & Pinson, C. (2009). Idiographic assessment: Conceptual and
psychometric foundations of individualized behavioral assessment. Clinical Psychology
Haynes, S. N., & O’Brien, W. O. (1990). The functional analysis in behavior therapy. Clinical
Psychology Review, 10, 649–668. doi:10.1016/0272-7358(90)90074-K
Haynes, S., O’Brien, W., & Kaholokula, J. (2011a). Behavioral Assessment and Case
Formulation. Hoboken, New Jersey: John Wiley and Sons.
Haynes, S. N., Smith, G. T., & Hunsley, J. D. (2011b). Scientific Foundations of Clinical
Hedges, L. V., & Pigott, T. D. (2001). The power of statistical tests in meta-analysis.
Hedges, L. V., Pustejovsky, J. E., & Shadish, W. R. (2012). A standardized mean difference
effect size for single case designs. Research Synthesis Methods, 3, 224-239.
doi:10.1002/jrsm.1052
Hedges, L. V., Pustejovsky, J. E., & Shadish, W. R. (2013). A standardized mean difference
effect size for multiple baseline designs across individuals. Research Synthesis Methods,
4, 324-341. doi:10.1002/jrsm.1086
*Ingram, K., Lewis-Palmer, T., & Sugai, G. (2005). Function-based intervention planning:
doi:10.1177/10983007050070040401
Iwata, B.A., DeLeon, I.G., & Roscoe, E.M. (2013). Reliability and validity of the functional
analysis screening tool. Journal of Applied Behavior Analysis, 46, 271-284.
doi:10.1002/jaba.31
Iwata, B. A., Dorsey, M. F., Slifer, K. J., Bauman, K. E., & Richman, G. S. (1994). Toward a
functional analysis of self-injury. Journal of Applied Behavior Analysis, 27, 197-209.
doi:10.1901/jaba.1994.27-197 (Original published 1982)
Iwata, B. A., Pace, G. M., Dorsey, M. F., Zarcone, J. R., Vollmer, T. R., Smith, R. G. …Wills,
D
TE
doi:10.1901/jaba.1994.27-215
CE
Iwata, B.A., & Worsdell, A.S. (2005). Implications of functional analysis methodology for the
AC
nonprescriptive treatment for children and adolescents with school refusal behavior.
Kahng, S., & Iwata, B.A. (1999). Correspondence between outcomes of brief and extended
Keenan, M., & Dillenburger, K. (2011). When all you have is a hammer: RCTs and Hegemony
doi:10.1016/[Link].2010.02.003
*Kodak, T., Fisher, W. W., Clements, A., Paden, A. R., & Dickes, N. R. (2011). Functional assessment of instructional variables: Linking assessment and treatment. Research in
Kratochwill, T. R., Hitchcock, J., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M., & Shadish, W. R. (2010). Single-case designs technical documentation. Retrieved from
LaRue, R. H., Lenard, K., Weiss, M. J., Bamond, M., Palmieri, M., & Kelley, M. E. (2010). doi:10.1016/[Link].2009.10.020
doi:10.1080/09362835.2011.565725
Martens, B. K., Gertz, L. E., de Lacy Werder, S. C., & Rymanowski, J. L. (2010). Agreement 9110-9
Matson, J. L., Horovitz, M., Kozlowski, A. M., Sipes, M., Worley, J. A., & Shoemaker, M. E. (2011). Person characteristics of individuals in functional assessment research. Research in Developmental Disabilities, 32, 621-624. doi:10.1016/[Link].2010.12.012
effectiveness for students diagnosed with ADHD? A single-subject meta-analysis. Journal of Behavioral Education, 22, 253-282. doi:10.1007/s10864-013-9174-4
Moher, D., Liberati, A., Tetzlaff, J., & Altman, D. G. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Annals of Internal Medicine.
Moore, T. R., Gilles, E., McComas, J. J., & Symons, F. J. (2010). Functional analysis and treatment of self-injurious behavior in a young child with a traumatic brain injury. Brain
Myrbakk, E., & von Tetzchner, S. (2008). The prevalence of behavior problems among doi:10.1080/19315860802115607
assessment reliability and effectiveness of FBA-based interventions. Journal of Emotional and Behavioral Disorders, 12, 168-181. doi:10.1177/10634266040120030401
*Payne, L. D., Scott, T. M., & Conroy, M. (2007). A school-based examination of the efficacy of FBA-based intervention. Journal of Behavioral Disorders, 32, 158-174.
Querim, A. C., Iwata, B. A., Roscoe, E., Schlichenmeyer, K. J., Virues Ortega, J., & Hurl, K. E.
Qureshi, H., & Alborz, A. (1992). Epidemiology of challenging behavior. Mental Handicap
Reitman, D., & Passeri, C. (2008). Stimulus fading and functional assessment to treat pill refusal doi:10.1177/1534650107307476
*Repp, A., Felce, D., & Barton, L. (1988). Basing the treatment of stereotypic and self-
Roscoe, E. M., Kindle, A. E., & Pence, S. T. (2010). Functional analysis and treatment of
Schlichenmeyer, K. J., Roscoe, E. M., Rooker, G. W., Wheeler, E. E., & Dube, W. V. (2013).
Scotti, J. R., Evans, I. M., Meyer, L. H., & Walker, P. (1991). A meta-analysis of intervention research with problem behavior: Treatment validity and standards of practice. American
Scruggs, T. E., Mastropieri, M. A., & Casto, G. (1987). The quantitative synthesis of single-subject research: Methodology and validation. Remedial and Special Education, 8, 24-33. doi:10.1177/074193258700800206
*Starosta, K. M. (2010). A comparison of functional assessment- and non-functional assessment-based interventions for students with emotional and behavioral disorders (Doctoral dissertation).
Sturmey, P. (Ed.). (2008). Behavioral case formulation and intervention. New York: Wiley.
*Taylor, J., & Miller, M. (1997). When timeout works some of the time: The importance of treatment integrity and functional assessment. School Psychology Quarterly, 12, 4-22. doi:10.1037/h0088943
Thomason-Sassi, J. L., Iwata, B. A., Neidert, P. L., & Roscoe, E. M. (2011). Response latency as an
Thompson, R. H., & Iwata, B. A. (2007). A comparison of outcomes from descriptive and functional analyses of problem behavior. Journal of Applied Behavior Analysis, 40, 333-
University, Louisiana.
Virues-Ortega, J., Iwata, B. A., Fahmie, T. A., & Harper, J. M. (2013). Effects of alternative responses on behavior exposed to noncontingent reinforcement. Journal of Applied Behavior Analysis.
Wightman, J. K., Julio, F. J., & Virues-Ortega, J. (2014). Advances in the indirect, descriptive and experimental approaches to the functional analysis of problem behavior. Psicothema,
Wilder, D. A., Masuda, A., O’Conner, C., & Baham, M. (2001). Brief functional analysis and
Wilder, D. A., White, H., & Yu, M. L. (2003). Functional analysis and treatment of bizarre
Yates, B. T., & Taub, J. (2003). Assessing the costs, benefits, cost-effectiveness, and cost-benefit
Footnote
(1) Definitions of terms vary across disciplines and across authors within disciplines. Definitions of terms used in this article include:
Functional analysis: the identification of important, controllable, causal, and noncausal functional relations applicable to specified behaviors for an individual.
Functional behavioral assessment (FBA): the process of identifying contiguous antecedent and consequent variables associated with a specific behavior problem using indirect (questionnaires, interviews), naturalistic observation (descriptive analysis), or experimental methods.
Experimental functional analysis (EFA): the systematic manipulation of independent variables in order to evaluate their effects on one or more problem behaviors.
Figure 1. Study selection flow diagram. Records identified: PsycINFO (n = 1449), PubMed (n = 280), Cochrane Library (n = 96), manual search (n = 5); total distinct records = 1763. Records excluded at preliminary screening because an FBA- vs. non-FBA-based comparison was absent (n = 1506). Full-text articles assessed for eligibility (n = 257); articles excluded at full-text screening because an FBA- vs. non-FBA-based comparison was absent (n = 238). Studies included in the quantitative review (n = 19).
Figure 2. Results of the random-effects meta-analysis for problem behavior. Effect sizes (Shadish G) and 95% confidence intervals are reported for every study and two-term comparison across study phases (left to right: baseline vs. FBA-based, non-FBA-based vs. FBA-based, and baseline vs. non-FBA-based). All values above zero denote a favorable outcome for the second term of the comparison.
Bellone (2013) 3: 1.65 (0.11, 3.19); 2.32 (1.38, 3.26); 0.42 (-0.28, 1.11)
Carter & Horner (2009) 4: 1.08 (0.44, 1.72)
Devlin et al. (2011) 3: 0.64 (-0.12, 1.41); 0.71 (0.12, 1.30); 0.04 (-0.43, 0.50)
0.33 (-0.66, 1.32)
Ellingson et al. (2000) 4: 1.13 (-0.09, 2.36); 0.07 (-0.45, 0.59)
Kearney & Silverman (1999) 3: 0.99 (0.03, 1.95)
Newcomer & Lewis (2004) 3: 0.91 (0.34, 1.47); 0.97 (0.11, 1.83); -0.38 (-0.94, 0.17)
Payne et al. (2007) 3: 1.66 (-1.40, 4.72); 0.57 (<-9.99, >9.99); -0.56 (-0.59, 1.70)
Repp et al. (1988) 4: 0.98 (-0.05, 2.01); 0.80 (-0.10, 1.69); 0.10 (-4.87, 5.06)
Starosta (2010) 3: 0.81 (-0.24, 1.87); 0.38 (-0.29, 1.05); 0.42 (-0.18, 1.02)
Overall: 0.92 (0.56, 1.27); 0.85 (0.41, 1.29); 0.06 (-0.21, 0.33)
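The pooled estimates and intervals in Figures 2-4 follow standard random-effects logic. As a minimal sketch, assuming the conventional DerSimonian-Laird estimator (the estimator used for the reported analyses may differ in detail), each study effect size $g_i$ with sampling variance $v_i$ is pooled as follows:

$$ w_i = \frac{1}{v_i}, \qquad Q = \sum_{i=1}^{k} w_i \left( g_i - \frac{\sum_j w_j g_j}{\sum_j w_j} \right)^2, \qquad \hat{\tau}^2 = \max\!\left( 0,\; \frac{Q - (k - 1)}{\sum_i w_i - \sum_i w_i^2 / \sum_i w_i} \right), $$

$$ w_i^{*} = \frac{1}{v_i + \hat{\tau}^2}, \qquad \bar{g}_{RE} = \frac{\sum_i w_i^{*} g_i}{\sum_i w_i^{*}}, \qquad 95\%\ \mathrm{CI} = \bar{g}_{RE} \pm 1.96 \Big/ \sqrt{\textstyle\sum_i w_i^{*}}, $$

where $k$ is the number of studies and $\hat{\tau}^2$ is the estimated between-study variance added to each study's sampling variance before weighting.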
Figure 3. Results of the random-effects meta-analysis for appropriate behavior. Effect sizes (Shadish G) and 95% confidence intervals are reported for every study and two-term comparison across study phases (left to right: baseline vs. FBA-based, non-FBA-based vs. FBA-based, and baseline vs. non-FBA-based). For Kodak et al. (2011), both correct unprompted responses (CU) and percentage of trials attending (%A) are reported; the former was included in the pooled effect size. All values above zero denote a favorable outcome for the second term of the comparison.
Bellone (2013) 4: 4.45 (1.42, 7.48); 5.13 (3.49, 6.76); 0.66 (-0.05, 1.36)
Carter & Horner (2009) 3: 1.14 (0.55, 1.73)
Hawkins & Axelrod (2008) 3: 1.16 (0.60, 1.71); 0.72 (0.00, 1.45); 0.54 (0.03, 1.04)
Kodak et al. (2011) (%A) 4: 0.72 (0.15, 1.29); 0.67 (0.20, 1.14); 0.25 (0.03, 0.25)
Kodak et al. (2011) (CU) 4: 1.40 (0.77, 2.02); 0.83 (0.28, 1.39); 0.26 (-0.11, 0.64)
Murphy (2010) 4: 1.75 (>-9.99, <9.99); 0.52 (-0.29, 1.34); 0.21 (-0.14, 0.56)
Starosta (2010) 3: 0.90 (-0.25, 2.05); 0.41 (-0.35, 1.17); 0.53 (-0.13, 1.18)
Vance (2008) 3: 0.81 (<-9.99, >9.99); -2.65 (-3.65, -1.65); 2.77 (<-9.99, >9.99)
Overall: 1.27 (0.89, 1.66); 0.76 (-0.27, 1.79); 0.35 (0.14, 0.56)
Figure 4. Summary of the random-effects meta-analysis for problem behavior and appropriate behavior. Pooled effects are shown for the baseline vs. FBA-based, non-FBA-based vs. FBA-based, and baseline vs. non-FBA-based comparisons. Effects above 0 favor the second term of the comparison.
Table 1

                                    Studies in Meta-Analysis    All Studies
                                    n (%)                       n (%)
Gender
  Female                            10 (21.7)                   11 (19.3)
Diagnosis
Age
Setting
  Unknown                           4 (8.7)                     4 (7.0)
Topography of Problem Behaviors(a)
  Disruptive/Off task               26 (61.9)                   43 (81.1)
  School refusal                    4 (9.5)                     4 (7.5)
FBA-based Intervention(a)
  Antecedent Manipulation           0 (0.0)                     2 (3.8)
  Differential Reinforcement        22 (52.4)                   22 (41.5)
  Multiple                          15 (35.7)                   22 (41.5)

Note. For functional assessment method, studies that included an EFA were classified as EFA, studies that included a descriptive functional assessment but no EFA were classified as descriptive functional assessment, and studies that used only an indirect method were classified as indirect assessment. (a) Does not include Kodak et al. (2011), as the target behaviors in their study were appropriate behaviors.
Table 2
Study Characteristics and Effect Sizes for the FBA-based Versus Non-FBA-based Intervention Comparison

Participant rows list: participant, problem behavior, function, age (y), diagnosis, FBA method, FBA-based intervention; problem behavior PND (%), PZD (%), and Hedges g; appropriate behavior PND (%) and Hedges g. The quality index is given in parentheses after each study name.

Bellone (2013) (Quality Index 0)
Jackson  D/OB  Attention  4  None  EFA  DR  50.0  22.2  6.9  100.0  1.4
Percy  D/OB  Attention  3  None  EFA  DR  100.0  50.0  5.0  100.0  3.2
Derrick  D/OB  Attention  4  None  EFA  DR  70.0  10.0  4.4  100.0  1.9
Marcus  D/OB  Attention  4  None  EFA  DR  100.0  0.0  2.1  100.0  5.0
Carter & Horner (2007) (Quality Index 5)
Noah  Multiple  Attention  6  None  IND  Antecedent  36.8  18.8  1.1  57.9  0.4

Carter & Horner (2009) (Quality Index 5)
Gabriel  Multiple  Attention  6  None  IND  DR  85.7  0.0  1.5  85.7  2.8
Jonas  Multiple  Attention  7  None  IND  DR  28.6  0.0  0.9  0.0  9.5
Patrick  Multiple  Attention  5  None  IND  DR  73.9  0.0  1.4  26.1  8.0

Devlin et al. (2011) (Quality Index 3)

Ellingson et al. (2000)
Christine  D/OB  Attention  19  DD/ID  DES  DR  50.0  100.0  1.7
Derek  AGG  Attention  18  DD/ID  DES  DR  50.0  50.0  0.6
Kurt  D/OB  Attention  12  DD/ID  DES  DR  80.0  33.3  2.0  80.0  1.0

Filter & Horner (2009) (Quality Index 4)
Brett  D/OB  Escape  10c  LD  EFA  Antecedent  2.0  87.5  1.8
Dylan  D/OB  Escape  10c  None  EFA  Multiple  100.0  0.0  2.0  100.0  3.1

Hawkins & Axelrod (2008) (Quality Index 0)
James  D/OB  Escape  12  ADHD/BD  DES  DR  75.0  1.9
Rob  D/OB  Escape  11  ADHD/BD  DES  DR  100.0  5.2
Tom  D/OB  Escape  16  ADHD/LD  DES  DR  75.0  2.0
Sean  D/OB  Escape  16  ODD  DES  DR  0.0  1.6

Ingram et al. (2005) (Quality Index 4)
Carter  D/OB  Escape  12c  None  IND  Multiple  100.0  0.0  3.2
Bryce  D/OB  Escape  12c  None  IND  Multiple  88.9  0.0  1.7

Kearney & Silverman (1999) (Quality Index 0)

Kodak et al. (2011)b (Quality Index 5)
Kevin  4  ASD  EFA  N/A  100.0  2.8
Kevin  4  ASD  EFA  N/A  100.0  N/A
Linda  7  ASD  EFA  N/A  0.0  0.8
Linda  7  ASD  EFA  N/A  0.0  0.1
Bobby  4  ASD  EFA  N/A  52.9  1.1
Bobby  4  ASD  EFA  N/A  58.8  2.4
Hal  4  ASD  EFA  N/A  62.1  1.5
Hal  4  ASD  EFA  N/A  0.0  1.3

Murphy (2010) (Quality Index 5)
Kenny  D/OB  Multiple  9  None  DES  Multiple  80.0  1.7
Sharon  D/OB  Multiple  9  ADHD  DES  Multiple  100.0  2.9
Dennis  D/OB  Multiple  6  None  DES  Multiple  100.0  2.7
Stephen  D/OB  Multiple  6  None  DES  Multiple  100.0  2.8

Mustian (2010) (Quality Index 4)
Todd  D/OB  Escape  11  None  EFA  Multiple  100.0  0.0  14.2

Newcomer & Lewis (2004) (Quality Index 5)
Jerrod  D/OB  Escape  11  None  EFA  Multiple  5.3  0.0  2.8
Repp et al. (1988)
P3  STYP  Automatic  6  DD/ID  DES  Extinction  80.0  0.0  1.8

Starosta (2010) (Quality Index 3)
Harry  D/OB  Attention  10  None  DES  Multiple  60.0  0.0  1.5  60.0  1.6
Calvin  D/OB  Escape  9  ADHD  DES  Multiple  0.0  0.0  0.0  100.0  2.1
Eduardo  D/OB  Attention  12  None  DES  DR  25.0  0.0  0.7  0.0  0.4
Taylor & Miller (1997) (Quality Index 4)
Tate  D/OB  Escape  11  ASD  EFA  Extinction  94.4  100.0  5.5
Reily  D/OB  Escape  9  ASD  EFA  Extinction  100.0  0.0  4.0

Vance (2008) (Quality Index 5)
Carlos  D/OB  Attention  11a  None  DES  DR  0.0  2.8
Stacy  D/OB  Attention  11a  None  DES  DR  0.0  1.8

Wilder et al. (2006) (Quality Index 4)
Amy  Tantrums  Tangible  2  None  EFA  Multiple  100.0  66.7  1.9
Don  Tantrums  Escape  3  None  EFA  Multiple  100.0  100.0  N/A
Notes. For Kodak et al. (2011) the target behavior was an appropriate behavior; the intervention is labeled as non-applicable (N/A). All studies, with the exception of Devlin et al. (2011) (Ireland), were conducted in the US.
a Age estimated from grade level: 4th grade was coded as 10 years old, 5th grade as 11 years old, and 6th grade as 12 years old. b Appropriate behaviors for each individual included correct unprompted responses (listed first) and percentage of trials attending (listed second).
ADHD = Attention Deficit/Hyperactivity Disorder; ASD = Autism Spectrum Disorder; BD = Behavior Disorder; DD/ID = Developmental/Intellectual Disability; DES = Descriptive analysis; D/OB = Disruptive/Off-Task Behavior; DR = Differential Reinforcement; EFA = Experimental Functional Analysis; IND = Indirect assessment; LD = Learning Disability; ODD = Oppositional Defiant Disorder; PND = Percentage of Non-Overlapping Data; PZD = Percentage of All Zero Data; SIB = Self-Injurious Behavior; SR = School Refusal; STYP = Stereotypic Behavior.
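For reference, the overlap statistics reported in Table 2 are conventionally computed as follows (a sketch of the standard Scruggs et al., 1987, and Scotti et al., 1991, definitions; the therapeutic direction is reversed for problem versus appropriate behaviors):

$$ \mathrm{PND} = 100 \times \frac{\#\{\text{intervention data points more extreme, in the therapeutic direction, than every baseline data point}\}}{\#\{\text{intervention data points}\}}, $$

$$ \mathrm{PZD} = 100 \times \frac{\#\{\text{data points at zero from the first zero intervention data point onward}\}}{\#\{\text{intervention data points from the first zero onward}\}}, $$

with PZD scored as 0 when the problem behavior never reaches zero during the intervention phase.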
Table 3

Columns (n and effect size with 95% CI): problem behavior, baseline vs. FBA-based; problem behavior, non-FBA-based vs. FBA-based; appropriate behavior, baseline vs. FBA-based; appropriate behavior, non-FBA-based vs. FBA-based.

FBA method (meta-regression: ns; p = .067; ns; ns)
  Non-experimental: n = 4, 0.83 (0.53, 1.32); n = 6, 0.61 (0.27, 0.95); n = 4, 1.11 (0.61, 1.61); n = 5, 0.07 (-1.03, 1.18)
  Experimental: n = 3, 1.01 (0.49, 1.54); n = 3, 1.63 (0.46, 2.80); n = 2, 2.55 (-0.36, 5.45); n = 2, 2.91 (-1.30, 7.11)
Age (years) (meta-regression: ns; p = .007; ns; ns)
  ≤9: n = 3, 0.89 (0.32, 1.45); n = 4, 1.17 (0.54, 1.80); n = 3, 2.41 (-0.01, 4.83); n = 4, 1.63 (0.50, 2.76)
  >9: n = 4, 0.94 (0.48, …); n = 5, 0.45 (0.06, …); n = 2, 1.11 (0.61, …); n = 2, 0.57 (0.05, …)
Diagnosis (meta-regression: ns; ns; ns; ns)
  ASD, ID: n = 3, 0.84 (0.29, 1.39); n = 3, 0.46 (-0.02, 0.94); n = 1, -; n = 1, -
  Other: n = 4, …; n = 6, …; n = 5, 1.26 (0.53, …); n = 6, 0.78 (-0.52, …)
Study quality (meta-regression: ns; ns; ns; ns)
  ≤3: n = 3, 0.89 (0.32, 1.45); n = 4, 1.17 (0.54, 1.80); n = 3, 2.12 (0.38, 3.86); n = 5, 0.81 (0.63, 2.41)
  >3: n = 4, …; n = 5, …; n = 2, …; n = 2, …

Notes. P values are from random-effects meta-regressions; only values < .10 are reported. All values above zero denote a favorable outcome for the second term of the comparison. Continuous predictors (age and study quality) were dichotomized at their medians for the sensitivity analysis, which is reported as effect sizes and 95% confidence intervals. ASD = Autism spectrum disorder; FBA = Functional behavioral assessment; ID = Intellectual disability; n = number of studies; ns = non-significant.
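The moderator tests summarized in Table 3 correspond to random-effects meta-regressions. As a minimal sketch, assuming the usual single-moderator mixed-effects formulation (the exact parameterization behind the reported p values may differ):

$$ g_i = \beta_0 + \beta_1 x_i + u_i + \varepsilon_i, \qquad u_i \sim N(0, \tau^2), \qquad \varepsilon_i \sim N(0, v_i), $$

where $x_i$ is the moderator for study $i$ (e.g., experimental vs. non-experimental functional assessment, age, or study quality), $\beta_1$ is the tested moderator effect, $\tau^2$ is the residual between-study variance, and $v_i$ is the known sampling variance of $g_i$.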
HIGHLIGHTS
The effect of FBA-based interventions on appropriate behavior was four times larger than that of non-FBA-based interventions.
A pre-intervention functional analysis was associated with larger effects.