Research article, 2018. Gage et al., Journal of Positive Behavior Interventions. DOI: 10.1177/1098300718768204
Empirical Research
Journal of Positive Behavior Interventions
Quasi-Experimental Analysis
Abstract
This study explored the effects of School-Wide Positive Behavior Interventions and Supports (SWPBIS) on school
suspensions and behavioral incidents for elementary and intermediate schools in Georgia implementing with fidelity by
comparing results with a propensity score–matched comparison group of schools that never received SWPBIS training.
Significant decreases in suspensions and disciplinary exclusions were found in schools implementing with fidelity compared
with matched comparison schools. Schools implementing SWPBIS with higher fidelity had fewer out-of-school suspensions
and disciplinary incidents than schools implementing with lower levels of fidelity, but both groups had significantly fewer
suspensions and incidents than the comparison group. When converted to standard mean difference effect sizes, results
indicated medium to large effects. These findings suggest that SWPBIS is an effective model for reducing disciplinary
exclusions and disciplinary incidents and that implementing SWPBIS with fidelity can result in meaningful improvements on
student behavioral outcomes in schools.
Keywords
positive behavior support, school-wide intervention(s), disciplinary exclusion
1 University of Florida, Gainesville, USA
2 University of South Florida, Tampa, USA
Corresponding Author: Nicholas A. Gage, University of Florida, 1403 Normal Hall, P.O. Box 117050, Gainesville, FL 32611, USA.
Email: gagenicholas@gmail.com
Action Editor: Melissa Stormont
According to the most recent data from the U.S. Department of Education’s Office for Civil Rights, over three million students receive one or more in-school suspensions and just over three million students receive one or more out-of-school suspensions each year (U.S. Department of Education, 2016). Of those being suspended, a student with a disability is over twice as likely to be suspended as a student without a disability, while Black males are three and a half times more likely to be suspended than White students. Although school suspension, a widely used form of disciplinary exclusion, is a consequence for a behavioral incident in school (e.g., fighting, disrespect, theft), suspensions have concerning short- and long-term consequences. Suspensions reduce the amount of time students are in class and receiving instruction (Losen, Hodson, Keith, Morrison, & Belway, 2015), decrease student engagement in school (Kupchik & Catlaw, 2015; Mowen & Manierre, 2015) and academic achievement (Arcia, 2006; Noltemeyer, Ward, & McLoughlin, 2015), and significantly increase the odds of dropping out of school (Noltemeyer et al., 2015) and being arrested in early adulthood (Mowen & Brent, 2016).
A recent policy statement from the American Academy of Pediatrics Council on School Health (2013) highlighted the deleterious effects of suspensions on students and recommended three strategies to reduce their use in schools: (a) early intervention, (b) early identification, and (c) the implementation of School-Wide Positive Behavior Interventions and Supports (SWPBIS). Of those, SWPBIS is the only recommendation that has an established framework with clearly outlined critical components. Yet, although evidence suggests that SWPBIS is an effective framework for reducing problem behavior and, subsequently, suspensions (Horner, Sugai, & Anderson, 2010), most of the empirical evidence is based on case studies and correlational analyses. In fact, Gage, Whitford, and Katsiyannis (in press) conducted a systematic review and meta-analysis of experimental (i.e., randomized controlled trials [RCTs]) and quasi-experimental studies evaluating the impact of SWPBIS on school-level
rates of disciplinary exclusions, including in- and out-of-school suspensions and office discipline referrals (ODR). Overall, they found four experimental studies conducted with 90 schools total. No treatment effects were found for ODR, but there was a significant and large effect for school suspensions (g = −.86). Only two of the four studies included suspensions, so that effect size is based on a total sample of just 72 schools. Council for Exceptional Children (CEC) quality indicators (Gersten et al., 2005) suggested that a practice be considered promising if (a) at least two high-quality studies support the practice and (b) the weighted effect size is greater than 0. Both studies with suspension outcomes (Algozzine et al., 2012; Flannery, Fenning, Kato, & McIntosh, 2014) met at least 93% of the CEC quality indicators. Therefore, based on Gersten et al.’s criteria, SWPBIS is a promising practice, but more research is needed to determine whether SWPBIS is an evidence-based practice. Therefore, we conducted a large quasi-experimental design (QED) study on the treatment effects of SWPBIS on school-level disciplinary exclusions.
SWPBIS
SWPBIS is a systematic approach for implementing school-wide evidence-based prevention and intervention practices using a multitiered system to provide behavioral support for all students (Horner, Sugai, & Anderson, 2010). SWPBIS is not a curriculum, intervention, or program (Sugai & Horner, 2009), but a process of building a school’s capacity to (a) implement effective and preventive behavioral practices with integrity, (b) make data-based and team-based decisions, and (c) build a positive school climate and culture leading to school improvement and success (Horner, Sugai, & Anderson, 2010). SWPBIS comprises a continuum of three tiers of prevention and intervention, including primary prevention implemented across schools for all students, secondary prevention implemented for small groups of students, and tertiary intervention provided for individuals to meet the unique needs of each student. When data indicate that students are not responsive to the primary prevention efforts (e.g., positively stated behavioral expectations that are taught and reinforced school-wide, a continuum for responding to inappropriate behaviors), they receive targeted interventions, which may include evidence-based mentoring programs such as Check-In Check-Out (Crone, Hawken, & Horner, 2010) or small group social skills lessons (Mitchell, Stormont, & Gage, 2011). Students who continue to exhibit elevated levels of behavior problems following targeted intervention can be referred for intensive or tertiary intervention, which typically involves development of a functional behavior assessment and a subsequent individualized behavior intervention plan.
SWPBIS has been implemented for over 30 years in more than 25,000 schools nationally and in numerous countries (www.pbis.org). Research studies have documented the impact of SWPBIS on both student- and school-level outcomes. Empirical evidence documents improvements in disciplinary consequences, school climate, organizational health, bullying and peer victimization, and academic achievement when implemented with fidelity (Bradshaw, Koth, Thornton, & Leaf, 2009; Bradshaw, Mitchell, & Leaf, 2010; Childs, Kincaid, George, & Gage, 2016; Gage, Leite, Childs, & Kincaid, 2017; Horner, Sugai, Smolkowski, et al., 2010; Sadler & Sugai, 2009; Simonsen et al., 2012; Waasdorp, Bradshaw, & Leaf, 2012). SWPBIS provides schools with a framework to construct an efficient and effective prevention continuum of behavioral supports, ensuring evidence-based practices are accessible to all students across all settings (Sugai, Simonsen, Bradshaw, Horner, & Lewis, 2014). Therefore, studies evaluating the school-level effects of SWPBIS are aligned with questions of efficacy and effectiveness.
Empirical Evidence of School-Level Effects
A handful of studies have examined the effects of SWPBIS on student behavioral outcomes, including suspensions, by comparing schools implementing with fidelity to those not implementing with fidelity. For example, Simonsen et al. (2012) examined SWPBIS implementation in 428 Illinois schools across 7 years and found that schools implementing SWPBIS with fidelity had significantly fewer school suspensions. However, no differences were found for ODR by implementation levels. Childs et al. (2016) examined the relationship between fidelity of implementation and school suspensions in 1,122 Florida schools across 4 years. Their findings suggest schools implementing with higher fidelity had significantly fewer suspensions and ODR. Results from these studies, along with Freeman et al. (2016), highlighted the relationship between increased fidelity of implementation and reduced ODR. However, these studies only compared schools that received SWPBIS training rather than comparing results to control schools that were not trained.
Two RCT studies and two QED studies examined the school-level effects of SWPBIS on suspensions and ODR while including a business-as-usual control group. Bradshaw et al. (2010) examined the effects of SWPBIS via a 5-year longitudinal RCT conducted in 37 elementary schools in Maryland. Overall, the authors found significant reductions in suspensions in both treatment groups, but there was no significant difference between treatment and control groups (d = −0.07). The authors only collected ODR data for treatment schools; thus, no comparisons were conducted for this measure. Algozzine et al. (2012) examined the effects of multitiered academic and behavior (i.e., SWPBIS) instruction in seven urban elementary schools in the Southeastern United States. The authors only reported
ODR differences between treatment and control schools at the beginning of the study and did not report differences at the end. However, the authors reported on suspensions, finding no significant difference between the treatment and control schools. It should be noted that in one additional SWPBIS RCT, Horner, Sugai, Smolkowski, et al. (2010) reported ODR data; however, the authors did not compare ODR rates by treatment conditions.
Three QED studies examined differences in suspensions and ODR between schools implementing and not implementing SWPBIS. Nelson, Martella, and Marchand-Martella (2002) evaluated the effects of a comprehensive school-based program designed to prevent problem behaviors and maximize student learning. The program included five core elements: (a) SWPBIS, (b) one-to-one tutoring in reading, (c) conflict resolution, (d) a video-based family management program, and (e) individualized, function-based behavior intervention plans when necessary. The study included seven elementary schools in the treatment group and 28 elementary schools in the comparison group. All schools were in the same district in the Northeastern United States. Overall, the authors found significant reductions in suspensions (d = −0.85) and ODR (d = −0.94). However, the authors did not report fidelity of implementation; therefore, it is unclear how well the SWPBIS elements were implemented. Caldarella, Shatzer, Gray, Young, and Young (2011) examined the impact of SWPBIS in two middle schools in the Western United States. Overall, the authors found no significant differences in the reductions of ODR between schools at the end of 4 years. Furthermore, the study compared one school to one school (n = 1 confound) and the authors did not report fidelity of implementation. Flannery et al. (2014) examined the effects of SWPBIS in 12 high schools in the Northeastern and Midwestern United States. Overall, the authors found significant decreases in ODR across 3 years of implementing SWPBIS, and fidelity of implementation had a significant effect on ODR reductions.
Overall, results of studies examining the school-level effects of SWPBIS on suspensions and ODR are mixed. Furthermore, the small number of schools included in those studies limits generalization. The largest study only examined differences in disciplinary exclusions by fidelity levels and did not include comparisons with business-as-usual comparison groups (Childs et al., 2016). Taken together, it is clear that more research is necessary to evaluate the effect of SWPBIS on disciplinary exclusions.
Purpose
To date, few studies meeting rigorous research design standards, including RCT and group QED studies with equivalent comparison groups, have been conducted. For example, What Works Clearinghouse (WWC) sample size criteria for small or medium and large extent of evidence is n = 350 units of analysis (WWC, 2014). One way to increase the sample size of efficacy studies is to leverage existing school-level data made available by state departments of education. Therefore, to increase the experimental evidence supporting the implementation of SWPBIS with fidelity on school-level disciplinary exclusions, we conducted a secondary analysis of publicly available school-level data in Georgia using a quasi-experimental design with propensity score matching. Specific research questions were as follows:
Research Question 1: Is there a statistically significant difference in the number of disciplinary incidents and in-school and out-of-school suspensions for schools implementing SWPBIS with fidelity compared with matched comparison schools?
Research Question 2: Is there a statistically significant difference in the number of disciplinary incidents and in-school and out-of-school suspensions for schools implementing SWPBIS at different fidelity levels compared with matched comparison schools?
Method
Sample
Data from all public schools in the state of Georgia during the 2015–2016 school year were collected from the Georgia Department of Education (GaDOE) website. We restricted the data to only public elementary and intermediate schools, excluding alternative schools, vocational/technical schools, middle schools, and high schools because reported training and fidelity data on SWPBIS included only public elementary and intermediate schools. The final sample included 1,755 schools. The average school enrollment was 548.5 students (SD = 245.2), and there were roughly equal percentages of students that were White (40.3%) and Black (39.0%), followed by students that were Latino/a (13.8%). On average, 68.5% of the students were considered economically disadvantaged and almost 12% of students received special education services.
Implementation of SWPBIS in Georgia is supported by the GaDOE, which facilitates district-level planning and provides school team training, technical assistance, and ongoing coaching to SWPBIS district coordinators to build capacity and support the SWPBIS process. SWPBIS implementation occurs in districts where active support through a district leadership team, a district action plan, and a district coordinator has been established. The GaDOE website reports the names of schools receiving SWPBIS training and support, and three levels of fidelity of implementation: Installing, defined primarily as fidelity below 70% on the Benchmarks of Quality (BoQ); Emerging, defined as fidelity
Table 1. Descriptive Statistics for Full Sample and Propensity Score–Matched Samples.
Columns: Full sample (n = 1,418); Treatment schools (n = 119); PSM comparison schools (n = 119); Equivalence
Note. PSM = propensity score–matched. Equivalence is the standardized mean difference between the full sample and treatment schools (Full) and between the PSM comparison schools and the treatment schools (PSM).
between 70% and 85% on the BoQ; and Operational, defined as BoQ above 85%. In this study, we focused exclusively on schools implementing SWPBIS with fidelity (Emerging and Operational) and excluded schools (n = 218) that were Installing as they were not implementing Tier 1 prevention efforts with fidelity but had received training. A total of 119 schools implemented SWPBIS with fidelity during the 2015–2016 school year. The average enrollment for schools implementing SWPBIS with fidelity (i.e., treatment schools) was 528.9 students (SD = 170.7) and the largest share of students were White (44.1%), followed by students that were Black (32.6%) and students that were Latino/a (16.2%). On average, 68.9% of the students were economically disadvantaged and 11.2% of students received special education services. Demographic characteristics for all schools by fidelity groups are presented in Table 1.
Measures
BoQ. The BoQ (Childs, Kincaid, & George, 2011; Cohen, Kincaid, & Childs, 2007; Kincaid, Childs, & George, 2005, 2010) is a self-report measure used to assess the implementation fidelity of SWPBIS at the Tier 1/universal level. The BoQ consists of 53 items rated on a 3-point Likert-type scale (i.e., In Place, Needs Improvement, and Not in Place). Prior psychometric evidence suggests that the BoQ demonstrates strong internal consistency (overall α = .96), interrater reliability (r = .87), and test–retest reliability (r = .94). The 53 items are organized under 10 subscales reflecting the essential components of Tier 1 implementation (e.g., faculty commitment, expectations and rules developed, classroom systems). Scores for the BoQ are scaled as the percentage of points earned out of the total possible points (107), with scores 70% or above considered implementing with fidelity. The BoQ scores are collected each year in April or May for all schools implementing SWPBIS in Georgia.
School-level covariates
Student demographics. We included seven student demographic covariates in the final dataset. First, we captured the total student enrollment for each school, and the percentage of students in each school that were categorized as White, Black, or Latino/a. Next, we captured the percentage of students in each school that was categorized as economically disadvantaged, defined as the number of students eligible to receive free- or reduced-price meals (as reported to the GaDOE in the October Nutrition Count) divided by the total school enrollment (as reported by the October Full-Time Equivalency [FTE] count). Last, we collected the percentage of students with disabilities, defined as any student receiving special education services, and the percentage of students categorized as limited English proficient (LEP) by their school.
Georgia Milestones Assessment System. The Georgia Milestones Assessment System measures how well students have learned the knowledge and skills outlined in the state-adopted content standards in English Language Arts, Mathematics, Science, and Social Studies. Students in Grades 3 through 8 take an end-of-grade assessment in English Language Arts and Mathematics while students in Grades 5 and 8 are also assessed in Science and Social Studies. Students receive a scale score and an Achievement Level designation based on their total test performance in each content area. Achievement levels are as follows: Beginning Learner,
Developing Learner, Proficient Learner, and Distinguished Learner. We captured the percentage of students at the Proficient Learner and above levels for each school across the four subject tests because this level indicates grade-level performance.
Outcome variables
Behavioral incidents. The GaDOE collects the frequency of 36 different behavioral incidents that result in an office disciplinary referral for all schools in Georgia. The incidents are operationally defined by GaDOE pursuant to O.C.G.A. § 20-2-740 in a discipline matrix designed to provide guidance for schools and districts (Georgia Department of Education, 2017). During the 2015–2016 school year, 33% of all reported incidents were for incivility, defined as insubordination or disrespect to staff members or other students, which includes but is not limited to refusal to follow school staff member instructions, use of vulgar or inappropriate language, and misrepresentation of the truth. Another 21% of incidents were for disorderly conduct, defined as any act that substantially disrupts the orderly conduct of a school function, substantially disrupts the orderly learning environment, or poses a threat to the health, safety, and/or welfare of students, staff, or others (includes disruptive behaviors on school buses). We summed all 36 behavioral incidents together because (a) occurrences were infrequent across many of the incidents (e.g., gang-related incident) and (b) we were interested in the impact of SWPBIS on disciplinary incidents broadly as a proxy for ODR.
School suspensions. The GaDOE also collects frequency (i.e., number, not length) data on disciplinary outcomes, including in- and out-of-school suspension. In-school suspensions are defined by GaDOE as the temporary removal of a student from his or her regular classroom(s) for at least half a school day. The student remains under the direct supervision of school personnel (direct supervision means school personnel are physically in the same location as students under their supervision; Georgia Department of Education, 2017). Out-of-school suspension is defined simply as the removal of a student from the school for at least one school day.
Data Analysis
To address the primary research questions in this study, we conducted a QED comparing schools implementing SWPBIS with fidelity to propensity score–matched comparison schools not implementing SWPBIS.
Propensity score matching. Propensity score matching (PSM) methods are designed to reduce bias in treatment effect estimates in experimental design studies that do not have random assignment of participants to conditions (see Leite, 2017). A propensity score is defined as the conditional probability of treatment assignment based on all available covariates (Rosenbaum & Rubin, 1983) and can be used for one-to-one matching of treatment to comparison schools. The value of PSM is that a covariate-equivalent comparison group can be matched to a treatment group, meeting the standards for high-quality QED research set out in the WWC (2014) evidence standards, and that treatment estimates have been found to be as accurate as those from RCT studies (Fortson, Verbitsky-Savitz, Kopa, & Gleason, 2012).
Following procedures outlined by Leite (2017), we estimated propensity scores using logistic regression and all available school-level covariates (see Table 1). Specifically, we created a dichotomous variable for all schools. Schools that implemented SWPBIS with fidelity were coded as 1, while all other schools were coded as 0. Next, we estimated the predicted probability (p), or propensity score, that a school was in the treatment group based on the included covariates, modeled on the logit scale, log(p / [1 − p]). Then we used the estimated propensity scores to match schools using the one-to-one optimal matching method (Rosenbaum, 1989), which minimizes global propensity score distance between treatment and comparison schools. Essentially, the one-to-one matching procedure identifies a matched school for each treatment school so that the treatment and comparison schools are equivalent on all of the included covariates. In an RCT, any difference between treatment and comparison schools is assumed to be random error, while in PSM, the two groups are equivalent on the observed school characteristics, supporting the inference that any treatment results are based on the intervention rather than on those potential confounds. The one-to-one optimal matching algorithm was conducted using the matchit (Ho, Imai, King, Stuart, & Whitworth, 2017) and optmatch (Hansen, Fredrickson, Fredrickson, Rcpp, & Rcpp, 2016) packages in R (R Core Team, 2013). To confirm covariate equivalence, we calculated standardized mean difference effect sizes (d), where equivalence is defined as d < .25 standard deviations (WWC, 2014).
Estimation of treatment effects. The three primary outcome variables (i.e., behavioral incidents, in-school suspensions, and out-of-school suspensions) were all scaled as counts; therefore, modeling of treatment effects relied upon Poisson regression to accurately estimate treatment effects given their distributional characteristics. Although the PSM model, or design part of the analysis (Rubin, 2007), controlled for all available potential confounds on the treatment effect (Leite, 2017), we also controlled for covariates with standardized mean differences greater than .05 (WWC, 2014). As noted, we estimated six Poisson regression models: three treatment-on-the-treated (ToT) models and three models examining differences by fidelity levels using the following formulas:
\[
\log(\lambda) = \beta_0 + \beta_1\,\text{Treatment} + \beta_2\,\%\text{White} + \beta_3\,\%\text{Black} + \beta_4\,\%\text{Hispanic} + \beta_5\,\%\text{SPED} + \beta_6\,\text{Language} + \beta_7\,\text{Math} + \beta_8\,\text{Science}
\]
\[
\log(\lambda) = \beta_0 + \beta_1\,\text{Fidelity Levels} + \beta_2\,\%\text{White} + \beta_3\,\%\text{Black} + \beta_4\,\%\text{Hispanic} + \beta_5\,\%\text{SPED} + \beta_6\,\text{Language} + \beta_7\,\text{Math} + \beta_8\,\text{Science}
\]
counts is expected to change by the respective coefficient. Therefore, to determine the strength of the relationship relative to the scaling, the coefficient (β) becomes the exponent of e, and the result is subtracted from 1 and multiplied by 100 to identify the relative percentage change in dependent variable counts for the treatment group. This calculation was conducted in Microsoft Excel for each treatment coefficient using the following function:
| Parameter | β | SE | β | SE | β | SE |
| Intercept | 3.79*** | 0.259 | 2.86*** | 0.221 | 3.74*** | 0.149 |
| SWPBIS | −0.80*** | 0.020 | −1.60*** | 0.021 | −0.89*** | 0.011 |
| % White | 0.00 | 0.002 | 0.01*** | 0.002 | 0.01*** | 0.001 |
| % Black | 0.01*** | 0.003 | 0.01*** | 0.002 | 0.02*** | 0.001 |
| % Latino/a | 0.00 | 0.003 | 0.01* | 0.002 | 0.01*** | 0.002 |
| % SPED | 0.05*** | 0.003 | 0.09*** | 0.002 | 0.05*** | 0.002 |
| % Proficient: Math | −0.05*** | 0.002 | −0.12*** | 0.002 | −0.07*** | 0.001 |
| % Proficient: Language Arts | 0.03*** | 0.002 | 0.08*** | 0.002 | 0.04*** | 0.001 |
| % Proficient: Science | 0.01*** | 0.002 | 0.04*** | 0.001 | 0.03*** | 0.000 |

Note. SWPBIS = a dichotomous indicator for schools implementing School-Wide Positive Behavior Interventions and Supports; SPED = special education.
*p < .05. ***p < .001.
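The conversion from a Poisson coefficient to a relative percentage change, (1 − e^β) × 100, can be sketched in a few lines. This is a minimal Python illustration rather than the spreadsheet function the authors used; `percent_reduction` is a hypothetical helper name, and the outcome labels attached to the three SWPBIS coefficients are an assumption, taken to follow the same column order as the fidelity-level table (out-of-school suspension, in-school suspension, disciplinary incidents).

```python
import math


def percent_reduction(beta: float) -> float:
    """Relative percentage reduction in expected counts implied by a
    Poisson regression coefficient: (1 - e^beta) * 100."""
    return (1.0 - math.exp(beta)) * 100.0


# SWPBIS treatment coefficients copied from the table above.
# ASSUMPTION: the outcome labels follow the column order of the
# fidelity-level table; the extracted table itself does not name them.
swpbis_coefficients = {
    "out-of-school suspension": -0.80,
    "in-school suspension": -1.60,
    "disciplinary incidents": -0.89,
}

for outcome, beta in swpbis_coefficients.items():
    print(f"{outcome}: beta = {beta:+.2f}, "
          f"reduction = {percent_reduction(beta):.1f}%")
```

A negative coefficient yields a positive reduction, and a coefficient of 0 yields no change, so the helper can be applied unchanged to any of the treatment or fidelity-level coefficients.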
| Parameter | Out-of-school suspension β | SE | In-school suspension β | SE | Disciplinary incidents β | SE |
| Intercept | 3.51*** | 0.263 | 2.87*** | 0.222 | 3.53*** | 0.150 |
| SWPBIS: Emerging | −0.63*** | 0.028 | −1.62*** | 0.033 | −0.75*** | 0.016 |
| SWPBIS: Operational | −0.92*** | 0.025 | −1.59*** | 0.024 | −0.98*** | 0.014 |
| % White | 0.00 | 0.002 | 0.01*** | 0.002 | 0.01*** | 0.001 |
| % Black | 0.02*** | 0.003 | 0.01*** | 0.002 | 0.02*** | 0.001 |
| % Latino/a | 0.00 | 0.003 | 0.01* | 0.002 | 0.01*** | 0.002 |
| % SPED | 0.05*** | 0.003 | 0.09*** | 0.002 | 0.05*** | 0.002 |
| % Proficient: Math | −0.05*** | 0.002 | −0.12*** | 0.002 | −0.07*** | 0.001 |
| % Proficient: Language Arts | 0.03*** | 0.002 | 0.08*** | 0.002 | 0.04*** | 0.001 |
| % Proficient: Science | 0.01*** | 0.002 | 0.04*** | 0.001 | 0.03*** | 0.001 |

Note. SWPBIS = School-Wide Positive Behavior Interventions and Supports; fidelity level was entered as a factor so that control schools are the reference group for the Emerging and Operational groups; SPED = special education.
*p < .05. ***p < .001.
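The standardized mean difference effect sizes reported in Table 4 can be approximated from the group means and standard deviations alone. The sketch below assumes a conventional Hedges' g (pooled standard deviation with the usual small-sample correction) and 119 schools per group, as listed in the Table 1 column headings. The article does not state the exact formula it used, so this is an illustration under those assumptions, and `hedges_g` is a hypothetical helper name.

```python
import math


def hedges_g(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference (group 1 minus group 2) using the
    pooled standard deviation and the small-sample correction factor."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # ASSUMED: standard Hedges correction
    return d * correction


N_PER_GROUP = 119  # treatment and PSM comparison schools (Table 1 headings)

# (SWPBIS mean, SWPBIS SD, comparison mean, comparison SD) from Table 4.
table_4 = {
    "OSS": (29.9, 34.2, 69.2, 95.8),
    "ISS": (24.6, 55.7, 123.1, 188.2),
    "Disciplinary incident": (92.1, 97.5, 232.6, 291.2),
}

for outcome, (m_t, sd_t, m_c, sd_c) in table_4.items():
    g = hedges_g(m_t, sd_t, N_PER_GROUP, m_c, sd_c, N_PER_GROUP)
    print(f"{outcome}: g = {g:.2f}")
```

Under these assumptions the computed values should land close to the g column of Table 4; small discrepancies would suggest the authors used a different pooling or correction.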
Table 4. Standardized Mean Difference Effect Sizes for Treatment and PSM Comparison Schools.

| Outcome | SWPBIS schools M | SWPBIS schools SD | PSM comparison schools M | PSM comparison schools SD | g |
| OSS | 29.9 | 34.2 | 69.2 | 95.8 | −0.54 |
| ISS | 24.6 | 55.7 | 123.1 | 188.2 | −0.71 |
| Disciplinary incident | 92.1 | 97.5 | 232.6 | 291.2 | −0.64 |

Note. PSM = propensity score–matched; SWPBIS = a dichotomous indicator for schools implementing School-Wide Positive Behavior Interventions and Supports; OSS = out-of-school suspension; ISS = in-school suspension.

found statistically significantly fewer behavioral incidents and suspensions in schools implementing SWPBIS with fidelity. The effect sizes suggest that differences were, on average, almost 0.50 standard deviation units.
These findings extend prior research by evaluating treatment effects on a much larger sample of schools and by utilizing a QED that could meet the WWC evidence standards with reservations. Specifically, we included a total of 238 schools, a sample almost 3 times that of all prior experimental studies on suspensions (Gage et al., in press), and established baseline equivalence on relevant covariates. Perhaps the most directly comparable study to ours is the Bradshaw et al. (2010) RCT. They randomly assigned 37 elementary schools to treatment (n = 21) and comparison (n = 16) groups and trained schools in the treatment group to implement universal SWPBIS with fidelity. However, after 4 years of implementation, they found no significant difference in school suspensions between treatment and comparison groups. Two possible reasons for the difference in findings were (a) that
schools in the Bradshaw et al. study had very few suspensions and (b) the school-level sample size was not large enough to estimate a meaningful treatment effect. On average, the treatment schools suspended ~30 students, while the comparison schools suspended an average of ~17 students across the 4 years of implementation. The authors reported suspensions as the percentage of students and analyzed the differences using a nonparametric test, presumably due to the scaling of their outcome (i.e., percentage of students) and the small sample size (n = 37). In our study, we were able to (a) collect the exact number of suspensions from each school and (b) estimate an effect with a much larger sample size, resulting in a significant treatment effect.
With regard to ODRs, the most comparable study used a QED and was conducted by Nelson et al. (2002). Nelson and colleagues trained seven elementary schools to implement SWPBIS across all three tiers and found significant and large differences in the number of ODR between the treatment and comparison students, similar to the results of this study. However, Nelson et al. trained schools to implement all three tiers of prevention and intervention, while the schools in this study were only evaluated on their universal implementation as measured by the BoQ. Some of the schools may have been implementing additional tiers, but no additional information beyond universal implementation was available. The difference in effect sizes in our study may reflect the additional impact that adding Tiers 2 and 3 can have on ODR. Unfortunately, Nelson and colleagues did not report fidelity of implementation; therefore, future research is needed to evaluate any additive effect on ODR for implementing Tiers 2 and 3 with fidelity.
In addition to overall treatment effects, we examined differences by fidelity levels as defined by the GaDOE. Specifically, we examined whether higher levels of implementation fidelity, defined as 85% or greater on the BoQ, resulted in fewer behavioral incidents and suspensions. Overall, both groups reported statistically significantly fewer incidents and suspensions than the PSM comparison schools. However, for out-of-school suspensions and behavioral incidents, schools implementing with higher levels of fidelity had larger treatment effects, suggesting that fidelity has a direct relationship with incidents and out-of-school suspensions. This finding is encouraging as it suggests that the more components of Tier 1 a school implements, the greater the impact on behavioral incidents and the lower the likelihood that a student receives an out-of-school suspension. This finding corroborates the results of prior longitudinal evaluations of
Limitations
Although all efforts were made to ensure the highest quality QED study, a number of limitations necessitate mention. First, this study relied solely on administrative data, including reporting of fidelity of implementation. There is no way to independently confirm the reliability of the BoQ scores and the fidelity levels reported on the GaDOE website. Relatedly, there is no way to confirm the reliability of the outcome measures (i.e., discipline incidents and suspensions). Second, although PSM created an equivalent comparison group, there are other potential confounds we were unable to match schools on. The goal of PSM is to include all possible school covariates, but not all potential confounders were available. For example, some schools may have been under state investigation for disproportionate suspension rates or have superintendents or principals with new goals for decreasing suspensions. Future research may consider additional covariates related to school policies correlated with the likelihood of school suspensions. Third, the suspension outcome did not disaggregate suspensions by the number of days students received in- and out-of-school suspensions. Therefore, we cannot confirm whether or not there was a differential effect on suspension length. Fourth, fidelity of implementation was based on the BoQ, which is typically completed by the school and their SWPBIS coach. The addition of an external measure, such as the SET, would increase the reliability of the fidelity score. Last, this study is a QED relying on administrative data. There is no way to confirm whether or not data entry errors could have been made. Future studies should consider utilizing RCT designs in which they assign schools at random to SWPBIS and control conditions to increase the validity of the research findings.
Implications
In this study, we evaluated the efficacy of SWPBIS implemented with fidelity on behavioral incidents and school suspensions and found significant reductions for both outcomes. Our results have important implications for both researchers and practitioners. First, the significant and moderate treatment effect needs to be replicated in other states and settings, controlling for additional confounds and including additional outcomes and populations (e.g., students receiving special education services). We believe that this study provides a template for future state-level analyses
SWPBIS. Childs et al. (2016) found that as fidelity of imple- that can be replicated broadly. Second, the results suggest
mentation, as measured by the BoQ, increased in Florida that state-level initiatives, like Georgia’s, can (a) effectively
schools, both suspensions and ODR decreased. Similar train districts and schools to implement SWPBIS with fidel-
results were found by Simonsen et al. (2012) for suspen- ity in large numbers of schools and (b) that the training and
sions. Therefore, there appears to be a relation between subsequent implementation can have a meaningful impact
fidelity of implementation and ODR and suspensions, with on disciplinary incidents and suspensions. Finally, we
greater fidelity resulting in fewer exclusions. believe the study results provide evidence that fidelity of
Department of Child & Family Studies, University of South Florida, Tampa. Retrieved from http://flpbs.fmhi.usf.edu/ProceduresTools.cfm

Kincaid, D., Childs, K., & George, H. P. (2010). School-wide Benchmarks of Quality (BoQ). Unpublished instrument, Department of Child & Family Studies, University of South Florida, Tampa. Retrieved from http://flpbs.fmhi.usf.edu/ProceduresTools.cfm

Kupchik, A., & Catlaw, T. J. (2015). Discipline and participation: The long-term effects of suspension and school security on the political and civic engagement of youth. Youth & Society, 47, 95–124. doi:10.1177/0044118X14544675

Leite, W. (2017). Propensity score methods using R. Los Angeles, CA: SAGE.

Losen, D., Hodson, C., Keith, M. A., Morrison, K., & Belway, S. (2015). Are we closing the school discipline gap? UCLA Center for Civil Rights Remedies. Retrieved from https://www.civilrightsproject.ucla.edu/resources/projects/center-for-civil-rights-remedies/school-to-prison-folder/federal-reports/are-we-closing-the-school-discipline-gap/AreWeClosingTheSchoolDisciplineGap_FINAL221.pdf

Mitchell, B. S., Stormont, M., & Gage, N. A. (2011). Tier two interventions implemented within the context of a tiered prevention framework. Behavioral Disorders, 36, 241–261.

Mowen, T., & Brent, J. (2016). School discipline as a turning point: The cumulative effect of suspension on arrest. Journal of Research in Crime & Delinquency, 53, 628–653. doi:10.1177/0022427816643135

Mowen, T. J., & Manierre, M. J. (2015). School security measures and extracurricular participation: An exploratory multi-level analysis. British Journal of Sociology of Education, 38, 344–363. doi:10.1080/01425692.2015.1081091

Nelson, J. R., Martella, R. M., & Marchand-Martella, N. (2002). Maximizing student learning: The effects of a comprehensive school-based program for preventing problem behaviors. Journal of Emotional and Behavioral Disorders, 10, 136–148. doi:10.1177/10634266020100030201

Noltemeyer, A. L., Ward, R. M., & McLoughlin, C. (2015). Relationship between school suspension and student outcomes: A meta-analysis. School Psychology Review, 44, 224–240. doi:10.17105/spr-14-0008.1

R Core Team. (2013). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available from http://www.R-project.org/

Rosenbaum, P. R. (1989). Optimal matching for observational studies. Journal of the American Statistical Association, 84, 1024–1032.

Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55.

Rubin, D. B. (2007). The design versus the analysis of observational studies for causal effects: Parallels with the design of randomized trials. Statistics in Medicine, 26, 20–36.

Sadler, C., & Sugai, G. (2009). Effective behavior and instructional support: A district model for early identification and prevention of reading and behavior problems. Journal of Positive Behavior Interventions, 11, 35–46.

Simonsen, B., Eber, L., Black, A. C., Sugai, G., Lewandowski, H., Sims, B., & Meyers, D. (2012). Illinois statewide positive behavioral interventions and supports: Evolution and impact on student outcomes across years. Journal of Positive Behavior Interventions, 14, 5–16. doi:10.1177/1098300711412601

Sugai, G., & Horner, R. H. (2009). Defining and describing school-wide positive behavior support. In W. Sailor, G. Dunlap, G. Sugai, & R. Horner (Eds.), Handbook of positive behavior support (pp. 307–326). New York, NY: Springer.

Sugai, G., Simonsen, B., Bradshaw, C., Horner, R., & Lewis, T. J. (2014). Delivering high quality school wide positive behavior support in inclusive schools. In J. McLeskey, N. L. Waldron, F. Spooner, & B. Algozzine (Eds.), Handbook of effective inclusive schools: Research and practice (pp. 306–321). New York, NY: Routledge.

U.S. Department of Education. (2016). 2013-2014 civil rights data collection: A first look. Washington, DC: U.S. Department of Education Office for Civil Rights. Retrieved from https://www2.ed.gov/about/offices/list/ocr/docs/2013-14-first-look.pdf

Waasdorp, T. E., Bradshaw, C. P., & Leaf, P. J. (2012). The impact of schoolwide positive behavioral interventions and supports on bullying and peer rejection: A randomized controlled effectiveness trial. Archives of Pediatrics & Adolescent Medicine, 166, 149–156.

What Works Clearinghouse. (2014). What Works Clearinghouse procedures and standards handbook (Version 3.0). Author: Washington, DC. Retrieved from https://ies.ed.gov/ncee/wwc/Docs/referenceresources/wwc_procedures_v3_0_standards_handbook.pdf