
ADMINISTRATION AND SCORING MANUAL
FOR THE OQ-45.2
(OUTCOME QUESTIONNAIRE)
IMPORTANT NOTICE!
BEFORE BREAKING THE SEAL ON THIS MANUAL
READ THE AGREEMENT ON THE BACK OF THIS MANUAL
JANUARY 2004
AMERICAN PROFESSIONAL CREDENTIALING SERVICES, L.L.C.
(Toll Free) 1-888-MH SCORE (1-888-647-2673)
E-MAIL: apcs@oqfamily.com
WEB: www.oqfamily.com
Copyright 1994, 1996, 2004 by
American Professional Credentialing Services L.L.C.
Michael J. Lambert, Jared J. Morton, Derick Hatfield, Cory Harmon, Stacy Hamilton,
Rory C. Reid, Kenichi Shimokawa, Cody Christopherson, and Gary M. Burlingame
BRIGHAM YOUNG UNIVERSITY
Revised January 2004
Acknowledgments
We wish to recognize individuals and organizations that have contributed and acted as partners
in the development of the OQ-45.2. Funding for this project came from the College of Family, Home,
and Social Sciences, Brigham Young University. Without the kind support of the University, a project of
this size could not have been undertaken.
Human Affairs International (HAI), and particularly Betty Lynn Davis, LCSW, ACSW (Vice
President of Quality Management & Training), Wayne Neff, Ph.D. (Implementation Manager, Clinical
Management), and Jeb Brown, Ph.D. (Director of Clinical Programs), were highly supportive in the
initial development of the OQ.
Under the direction of Curtis W. Reisinger, Ph.D., at Intermountain Health Care's Psych-Resource
Network of Salt Lake City, the members of its Center for Behavioral Healthcare Efficacy demonstrated
unabated commitment to a variety of projects related to the OQ-45.2. In addition, they worked to
ensure successful development and initial distribution of the OQ to interested users. Thanks also to
Peter Moran, Ph.D., and Leonard Doerfler, Ph.D., of the Boston Road Clinic and Assumption College,
whose early use of the OQ in their investigations of outcomes with inpatients was most helpful.


The Brigham Young University Counseling Center has been instrumental in testing the value of
the OQ-45.2 as a means of improving the quality of patient care. In particular we want to thank Drs.
David Smart, Stevan Nielsen, John Okiishi, David Vermeersch, and Ronald Chapman for their support
and leadership in showing how outcome research can be used to affect clinical practice. Without the
commitment of the fine clinicians at the Counseling Center, developing methods to implement quality
management and test their effects would not have been possible.
We would also like to thank the many students who, as members of the Center For Psychotherapy
Outcome Research Group at Brigham Young University, helped with data collection and analysis;
without their painstaking efforts the OQ-45.2 would only be an idea. Thanks also to the many people who
gave their time and effort by taking the OQ-45.2, especially those patients and nonpatients who filled
it out on a weekly basis.
Largely because of the unselfish support of these organizations and people, the OQ-45.2 is
now in use with the public. It is a pleasure to offer it at a low cost to the professional community for
unlimited use. We ask that OQ-45.2 users carefully follow our licensing requirements. We would
appreciate your support in encouraging your colleagues to properly license and use this tool. With this type
of support we will be able to continue to offer the OQ-45.2 as one of the most competitively priced
mental health outcome tools available.
American Professional Credentialing Services LLC
TABLE OF CONTENTS
Introduction
Administration of the OQ-45.2
Scoring
Test Interpretation
Psychometric Properties
Calculation of Cutoff Scores for Rating Recovery, Improvement, and Deterioration
Interpretation of Initial Score
Potential Uses of the Outcome Questionnaire
Clinical Applications of the Instrument for Outcomes Assessment
Additional Versions of the OQ
References
Technical Report #1: Factor Analysis
Technical Report #2: Psychometric Properties of the Dutch OQ-45.2
Spanish version of the OQ-45
Appendices A-G
LIST OF TABLES
Table 1: Normative Groups for the OQ - Total Score
Table 2: Normative Groups for the OQ - Domain Scores
Table 3: Comparison of Gender Scores on the OQ - Total Score
Table 4: Comparison of Gender Scores on the OQ - Domain Scores
Table 5: OQ Score by Age in a Sample of EAP Patients
Table 6: OQ Score by Ethnicity in a Sample of EAP Patients
Table 7: Comparative Outcomes of Native American, Latino/a, African American,
Asian/Pacific Islander, and Caucasian Clients
Table 8: Test-Retest Reliability and Internal Consistency Values for the
OQ Total and Domain Scores
Table 9: Correlation Coefficients between Weekly Testing on the
OQ Over a 10 Week Period
Table 10: Validity Estimates for the OQ
Table 11: Validity Data from Patient Populations
Table 12: Amount of Improvement Demonstrated by the OQ after
Seven Sessions of Therapy
Table 13: Average Slopes, t and d Values Based on Comparisons between
Average Slopes, and Allocation by Sensitivity to Change for Clinical
and Nonclinical Samples on the 45 Items, Subscales, and Total Score
of the Outcome Questionnaire
Table 14: Average Slopes, t and d Values Based on Comparisons Between
Average Slopes, and Allocation by Sensitivity to Change for Clinical
and Nonclinical Samples on the 45 Items, Subscales, and Total Score of
the Outcome Questionnaire (Counseling Center Samples)
Table 15: Comparison of Level of Psychopathology as Measured by the OQ across
Patient and Nonpatient Samples
Table 16: Sensitivity and Specificity of the OQ-45
Table 17: Outpatient Benchmarks for the OQ-45
Table 18: Number of Patients, by Site, Who Demonstrated Reliable Negative
Change (Deteriorated), Did Not Demonstrate Reliable Change
(No Change), Demonstrated Reliable Positive Change (Improved),
and Demonstrated Reliable Change into the Functional Range (Recovered)
LIST OF FIGURES
Figure 1: Mean OQ-45 scores in Mainland, Hawaii, Pacific, Asian/Chinese,
Korean Samples
Figure 2: Outcome Questionnaire (OQ) Item Response Curves for Item 42:
"I feel blue."
Figure 3: Outcome Questionnaire (OQ) Item Response Curves for Item 35:
"I feel afraid of open spaces, driving, being on buses, subways, & so forth."
Figure 4: Outcome Questionnaire (OQ) Total Score Response Curves
Figure 5: Relationship Between Number of Sessions of Therapy,
Pretest OQ Raw Score, and Rapidity of Improvement
Figure 6: CS Probability
Figure 7: RC Probability
Figure 8: Expected Recovery Curve for Intake OQ-45 Total Scores 87-88
Figure 9: Treatment Gains for Signal Alarm Cases Following Feedback to
Therapists about Potential Treatment Failure Versus No-Feedback
INTRODUCTION
The OQ-45.2 (Outcome Questionnaire; herein referred to as the OQ) measures patient progress in
therapy, and is designed to be repeatedly administered during the course of treatment and at
termination. Patient progress is measured along several important dimensions, based on Lambert's
(1983) conceptualization suggesting that three aspects of the patient's life be monitored:
1) Subjective discomfort (intrapsychic functioning), 2) Interpersonal relationships, and 3) Social
role performance. These areas of functioning suggest a continuum covering how the person feels
inside, how they are getting along with significant others, and how they are doing in important
life tasks, such as work and school. In addition, the OQ was designed to be used as a baseline
screening instrument to apply to gross treatment assignment decisions. The OQ was not designed
to be used for patient diagnosis, a task that is appropriate for much longer tests such as the
MMPI-2. The OQ is conceptualized as having three levels of usage: 1) To measure current level of
distress; 2) As an outcome measure to be administered prior to and following treatment
interventions, or to monitor ongoing treatment response; and 3) To accompany computerized decision
support tools to improve the quality of patient care.
The OQ was designed to address limitations of other current outcome measures. Specifically, the OQ
is available at low cost, sensitive to change over short periods of time, and brief, while
maintaining high levels of reliability and validity. The OQ was also designed to assess common
symptoms across a wide range of adult mental disorders and syndromes, including stress-related
illnesses and V codes.
Test Development
The selection of specific items was determined by several considerations. First, items were
selected that address commonly occurring problems across a wide variety of disorders. Second,
items needed to tap the symptoms that are most likely to occur across patients, regardless of
their unique problems. Third, items needed to measure personally and socially relevant
characteristics that affect the quality of life of the individual. Finally, the number of items
was limited so that administration of the OQ assists, rather than hinders, customary clinical
practice. The length of the OQ makes it tolerable to patients and suitable for repeated testing
while providing clinicians with data that can be used for decision making. Preliminary
information on the basic characteristics of the OQ was published by Burlingame, Lambert,
Reisinger, Neff, and Mosier (1995).
The rationale behind selection of each of the three domains (subscales) constituting the OQ is
described below. Results of a large-scale factor analysis can be found near the end of the manual
in Technical Report #1.
Symptom Distress (SD)
This subscale, measuring subjective (symptom) distress, was derived from: 1) a 1988 NIMH study
(Regier et al., 1988) that identified the most prevalent types of mental disorders across five
U.S. catchment areas; and 2) a review of a nationwide insurance company's records on the
frequency of diagnosed DSM-III-R disorders. The 1988 epidemiological study of 18,571 people
across the United States showed that 15.4% of the population over 18 years of age fulfilled
diagnostic criteria for a mental disorder. Approximately 12% of the total population received
either an anxiety diagnosis or an affective disorder classification. The insurance company data
reporting codes given to 2,145 patients indicated nearly one-third of the diagnoses given
involved a form of affective disorder. An additional third dealt with some kind of anxiety
disorder, including posttraumatic stress disorder. These data suggest that the most common
intrapsychic symptoms to be measured are depression and anxiety-based, particularly when
adjustment disorders are also taken into account. However, considerable research suggests that
the symptoms of anxiety and depression cannot be easily separated and tend to occur
simultaneously and across a wide variety of patients who are diagnosed with a variety of other
disorders (e.g., Feldman, 1993). Therefore, the OQ was heavily loaded with such items, but no
attempt has been made to provide separate scales for anxiety and depression symptoms. Next to
these disorders, substance abuse was the most common diagnosis, and thus, items on substance
abuse were also included in the OQ.
Interpersonal Relations (IR)
The OQ includes items that measure satisfaction with, as well as problems in, interpersonal
relations. Research on life satisfaction and quality of life suggests that people consider
positive relationships essential to happiness (Andrews & Witney, 1974; Beiser, 1983; Blau, 1977;
Deiner, 1984; Veit & Ware, 1983). Research on patients seeking therapy has shown that the most
frequent problems addressed in therapy are interpersonal in nature (Horowitz, 1979; Horowitz,
Rosenberg, Baer, Ureno, & Villasenor, 1988). While factors associated with quality of life vary
from study to study, most emphasize the importance of intimate relationships and their central
contribution to well-being (Deiner, 1984; Zautra, 1983). In addition, interpersonal problems are
clearly related to intrapersonal distress, either as a direct cause or result of
psychopathology, or as both a cause and a result (Horowitz et al., 1988). Therefore, items
dealing with friendships, family, family life, and marriage were included for assessment. Items
were included that attempt to measure friction, conflict, isolation, inadequacy, and withdrawal
in interpersonal relationships. These items were derived from marital and family therapy
literature, as well as from research on those interpersonal problems most often described by
patients who are undergoing psychotherapy (Horowitz et al., 1991).
Social Role (SR)
Social role performance was assessed by focusing on the patient's level of dissatisfaction,
conflict, distress, and inadequacy in tasks related to their employment, family roles, and
leisure life. Assessment of social roles suggests that a person's intrapsychic problems and
symptoms can affect their ability to work, love, and play. This is supported by the quality of
life research already discussed, as well as by the rationale that once people start to develop
symptoms it is common for these symptoms to have an effect on their personal and work lives
(Frisch, Cornell, Villanueva, & Retzlaff, 1992). Kopta, Howard, Lowrey, & Beutler (1994) also
suggest that these symptoms can exist somewhat independently of intrapsychic symptoms and
subjective distress. Thus, items were developed that measure performance in societal tasks, such
as work and leisure. Satisfaction in these areas is highly correlated with ratings of overall
life satisfaction (Beiser, 1983; Blau, 1977; Frisch et al., 1992; Veit & Ware, 1983).
Overall, the OQ is proposed as a brief screening and outcome assessment scale that attempts to
measure the subjective experience of a person, as well as the way they function in the world. A
copy of the license agreement is found in Appendix G.
ADMINISTRATION
The OQ is self-administering and requires no instructions beyond those printed on the answer sheet.
Participants should be encouraged to complete all items.
It should be mentioned that participants taking this
test can be affected by the attitudes of those who are in
charge of the administration. It is important for the test
administrator to encourage the participant to fill out the
scale in an honest and conscientious manner. Negative
attitudes by clinicians or others who administer this test
can severely impair its validity, as can personal reasons
respondents may have for wanting to give a less than
candid picture of themselves.
Time
Under usual circumstances participants will complete the scale in about five minutes. Some
especially careful individuals may require as much as 15 minutes, while others can complete the
test in three to four minutes.
Narrative Administration
Under special circumstances the OQ can be administered orally. If the patient is unable to read,
physically unable to write, or if the test is administered by phone, for example, in a follow-up
study, completion of the test can be accomplished by reading items to the patient. This can be
accomplished by giving the patient a card with a 0-4 numerical scale (i.e., never to almost
always), or by asking them to write the scale out and refer to it while the administrator reads
the items. The administrator may then enter the item responses on the blank test or directly
into a data base. This procedure, however, will often increase the time of administration.
SCORING
Scoring the OQ is a straightforward procedure involving simple addition of item values. The OQ
provides a total score and three individual domain scores. Each item is scored on a five-point
Likert scale (range 0-4). Special attention must be given to nine items that are scored in
reverse (1, 12, 13, 20, 21, 24, 31, 37, & 43). In order to alleviate any possible confusion
these items may create, we recommend using computer-scanned scoring methods or software
solutions. The Total Score (TOT) is calculated by summing the patient's ratings across all 45
items. This yields a total score ranging from 0-180. The higher the score, the more disturbed
the individual.
The Symptom Distress score (SD) is calculated by summing the patient's ratings on items 2, 3, 5,
6, 8, 9, 10, 11, 13, 15, 22, 23, 24, 25, 27, 29, 31, 33, 34, 35, 36, 40, 41, 42, and 45. The
Symptom Distress score has a range from 0-100.
The Interpersonal Relations score (IR) is calculated by summing the patient's ratings on items
1, 7, 16, 17, 18, 19, 20, 26, 30, 37, and 43. This score ranges from 0-44.
The Social Role score (SR) is calculated by summing the patient's ratings on items 4, 12, 14,
21, 28, 32, 38, 39, and 44. The Social Role scale has a range from 0-36.
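The scoring rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the licensed scoring software: the function names are hypothetical, and it assumes that every rating is recorded as the raw position on the 0-4 scale and that a reverse-scored item is converted as 4 minus the rating. If the form in use already prints reversed values for those nine items, the reversal step should be skipped.

```python
# Item assignments and reverse-scored items as listed in this manual.
SD_ITEMS = [2, 3, 5, 6, 8, 9, 10, 11, 13, 15, 22, 23, 24, 25, 27, 29,
            31, 33, 34, 35, 36, 40, 41, 42, 45]
IR_ITEMS = [1, 7, 16, 17, 18, 19, 20, 26, 30, 37, 43]
SR_ITEMS = [4, 12, 14, 21, 28, 32, 38, 39, 44]
REVERSED = {1, 12, 13, 20, 21, 24, 31, 37, 43}


def item_value(item, rating):
    """Return the scored value (0-4) of one item, reversing where required."""
    if rating not in (0, 1, 2, 3, 4):
        raise ValueError(f"item {item}: rating must be an integer from 0 to 4")
    # Assumption: reverse scoring means 4 minus the raw rating.
    return 4 - rating if item in REVERSED else rating


def score_oq(ratings):
    """Score a complete protocol given a dict {item number: raw rating}."""
    if set(ratings) != set(range(1, 46)):
        raise ValueError("expected ratings for items 1 through 45")
    values = {item: item_value(item, r) for item, r in ratings.items()}
    sd = sum(values[i] for i in SD_ITEMS)
    ir = sum(values[i] for i in IR_ITEMS)
    sr = sum(values[i] for i in SR_ITEMS)
    # The three domains partition the 45 items, so their sum is the total.
    return {"SD": sd, "IR": ir, "SR": sr, "TOT": sd + ir + sr}
```

Because the three item lists cover items 1 through 45 without overlap, the maximum scores of 100, 44, and 36 add up to the stated total range of 0-180.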
Computer Scoring
Automated scoring can be accomplished through a variety of means such as scoring software, fax
to file, web, and scanning. Information about the availability and cost of these systems can be
obtained from apcs@oqfamily.com or www.oqfamily.com. We highly recommend software solutions or
internet scoring and administration because they are efficient, accurate, and allow the user to
take advantage of quality assurance tools to be discussed later in this manual.
Template Free Scoring
A self-scoring copy of the OQ is included with this manual (see Appendix A). To use this form of
the test, the participant simply writes the selected numeric value in the corresponding blank
space. After values are entered on the appropriate blanks on the test form, sum up the patient's
responses on each subscale, then add the subscales to calculate the total score. A sample-scored
copy of the OQ is presented in Appendix B. Any critical item with an answer other than zero
should be flagged for clinician attention. The critical items are 8 (suicide), 11, 32
(drug/alcohol abuse screening), and 44 (work violence).
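As a rough illustration of the critical-item check, the following Python sketch returns the critical items answered with anything other than zero. The function name and data layout are hypothetical. None of the four critical items appears in the list of reverse-scored items, so the raw rating and the scored value are the same here.

```python
# Critical items and their short labels as given in this manual.
CRITICAL_ITEMS = {
    8: "suicide",
    11: "drug/alcohol abuse screening",
    32: "drug/alcohol abuse screening",
    44: "work violence",
}


def flag_critical_items(ratings):
    """Return {item: label} for critical items answered other than zero.

    `ratings` maps item numbers to 0-4 answers. Assumption: an omitted
    critical item (absent or None) is not flagged here and should be
    followed up separately.
    """
    flagged = {}
    for item, label in CRITICAL_ITEMS.items():
        answer = ratings.get(item)
        if answer is not None and answer != 0:
            flagged[item] = label
    return flagged
```

For example, a protocol with item 11 answered 2 and item 44 answered 1 would return both of those items for clinician attention.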
Missing Data
In the event that participants omit answers to items, substitute values are prepared by
computing the mean of the remaining domain items and rounding to the nearest whole number. This
value is then inserted into the test in place of the missing value.
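The substitution rule can be sketched as follows. This is a minimal sketch under stated assumptions, not the publisher's procedure: the function name is hypothetical, the mean is taken over scored values (after any reverse scoring), and a mean ending in exactly .5 is rounded up, a tie-breaking choice the manual does not specify.

```python
import math

# Domain item assignments as listed in the Scoring section of this manual.
SD_ITEMS = [2, 3, 5, 6, 8, 9, 10, 11, 13, 15, 22, 23, 24, 25, 27, 29,
            31, 33, 34, 35, 36, 40, 41, 42, 45]
IR_ITEMS = [1, 7, 16, 17, 18, 19, 20, 26, 30, 37, 43]
SR_ITEMS = [4, 12, 14, 21, 28, 32, 38, 39, 44]


def fill_missing(values):
    """Return a copy of `values` with omitted items replaced.

    `values` maps item numbers to scored values (0-4); an omitted item is
    either absent or set to None.
    """
    filled = dict(values)
    for domain in (SD_ITEMS, IR_ITEMS, SR_ITEMS):
        answered = [values[i] for i in domain if values.get(i) is not None]
        missing = [i for i in domain if values.get(i) is None]
        if not missing:
            continue
        if not answered:
            raise ValueError("a domain has no answered items to average")
        mean = sum(answered) / len(answered)
        # Assumption: halves round up; Python's round() would give 2 for 2.5.
        substitute = math.floor(mean + 0.5)
        for i in missing:
            filled[i] = substitute
    return filled
```

Averaging within the domain rather than across the whole test keeps the substitute value on the same footing as the other items of that subscale.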
TEST INTERPRETATION
Normative Data
Normative data were drawn from several samples collected across a variety of geographical
locations in the United States. The undergraduate samples from Utah, Idaho, and Ohio were tested
in a classroom setting, with a proctor administering the tests to the students after reading the
test directions and obtaining informed consent. At this time a formal consent form was also
completed. In order to ensure candid responses, participants' names were detached from their
test results after they consented to participate. A participant number was assigned at that
time. Testing lasted approximately 20 minutes. Retest administration followed the same procedure
three weeks following the initial testing period. Stability coefficients based on the Pearson
Product Moment Correlation Coefficient allow estimates of the reliability of testing performed
on a weekly basis.
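For readers who want to see how such a stability coefficient is computed, here is a small Python sketch of the Pearson product-moment correlation between test and retest scores. The function name is hypothetical and the scores are invented for illustration; they are not data from the normative samples.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical total scores for five participants at test and retest.
test_scores = [42, 55, 38, 61, 47]
retest_scores = [40, 57, 35, 58, 49]
r = pearson_r(test_scores, retest_scores)
```

A value of r close to 1 indicates that participants kept nearly the same rank order and spacing of scores across the two administrations.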
The community sample was drawn from a variety of locations. A sub-sample of 208 individuals was
collected from Utah. Participants were chosen by selecting each tenth name in the local Utah
County phone directory. They were then contacted by phone. At this time, adults in the household
were asked if they would fill out questionnaires in order to help the researchers better
understand the tests and how people respond to them. If they consented to participate, they were
mailed questionnaires along with the consent form and a return envelope. After a week they were
contacted by phone to see if they had complied. If they had not, they were encouraged to do so.
Responses were anonymous to encourage candid reporting.
Additional normative groups were collected from business settings. A large national insurance
firm with 800 employees allowed us to administer the OQ. A letter was sent under the signature
of the primary author to each of the employees. The purpose of testing was explained and they
were asked to complete the OQ and return it in a project-provided envelope. Completion of the
test was voluntary and employees were instructed not to provide their name or other identifying
information. Of the 800 OQs that were mailed out, 365 (45%) were returned. This same procedure
was also replicated in Ohio in a variety of business settings.
The data collected from the various community and business locations were analyzed for
differences using a one-way ANOVA. As no significant differences were found, the community data
were merged into one large sample of 815 participants.
Data from the clinical samples were typically collected by clinic receptionists who administered
the OQ prior to the patient's first therapy session and any subsequent therapy session. Included
in the test packet was information pertaining to participant confidentiality as well as a formal
consent form. The University Counseling Center data came from a counseling center at a large
private Western University. Student clients were included in the sample whether or not they
received a DSM diagnosis. The employee assistance program (EAP) patients came from a database
supplied by Human Affairs International. This EAP patient sample sought or was referred for
assistance and received a DSM-III-R diagnosis. EAP patients who came for help or were referred
but who were not diagnosed or treated for an emotional problem were excluded from the study, as
were patients who were immediately referred for in- or outpatient treatment. The data summarize
responses from patients across seven different states. The outpatient clinic sample was drawn
from a university-based outpatient clinic used to train clinicians in social work, clinical
psychology, and marriage and family therapy. The community mental health center sample was drawn
from an Ohio-based community mental health center serving a mostly rural catchment area.
Inpatient data came from samples in Utah and Massachusetts. Data from the clinical samples have
been combined as the values of the groups are comparable.
At this point in time, normative data on the following samples have been analyzed: college
undergraduates, community volunteers, University Counseling Center clients, employee assistance
program patients, university outpatient clinic patients, community mental health center
patients, and inpatients. Normative data for the OQ Total Score are presented in Table 1 and for
the domain scores in Table 2. The data presented in these tables are divided by the site of the
data collection. Tables reflecting gender, age, and ethnic groups are provided in later sections
of this manual.
It is apparent from these tables that there are clear differences between the non-patient
samples and the patient samples in mean scores. These differences are discussed more fully under
the topic of construct validity.
Gender Differences
For the groups where gender data were available, it is apparent that no differences exist
between the average scores of males and females (see Table 3 and Table 4). Inferential
statistics (F test) confirm the obvious similarities between male and female OQ scores. This is
true in both patient and non-patient samples. Thus, it does not appear to be necessary to have
distinct male/female norms or interpretative graphs. Callahan and Hynan (2002), on the other
hand, reported some differences based on gender (with females scoring higher) within patient
samples but no differences within their non-patient samples. While reporting some statistically
different mean scores, they did not report scores in a form that allowed interpretation of the
extent to which differences were clinically relevant.
Age Differences
The OQ was administered to adults between the ages of 17 and 80. Data at the upper end of the
age continuum are not yet sufficient to draw firm conclusions, but the data analyzed up to this
point do not suggest a significant correlation between age and OQ score. Exemplary data on this
topic are presented in Table 5. These data are from the Employee Assistance Program database.
Callahan & Hynan (2002) have reported some age differences in a study of undergraduates, people
in an internet sample, and clinic clients. They reported no differences in the normal samples,
but among patients, those under 20 years of age were significantly more disturbed than other age
groupings. Those in the age group from 20 to 39 had the lowest scores.

Table 1
Normative Groups for the OQ Total Score

Sample                            N     Mean    S.D.
Undergraduate Students (Utah)     235   42.15   16.61
Undergraduate Students (Idaho)    131   51.34   24.45
Undergraduate Students (Ohio)     172   45.63   18.06
Community                         815   45.19   18.57
EAP Clinical Services             441   73.61   21.39
University Counseling Center      486   75.16   16.74
Outpatient Clinics                342   83.09   22.23
Inpatient                         207   88.8    26.66

Table 2
Normative Groups for the OQ Domain Scores

                                        Distress        Interpersonal   Social Role
Sample                            N     Mean    S.D.    Mean    S.D.    Mean    S.D.
Undergraduate Students (Utah)     235   22.96   10.48    8.78   4.97    10.40   3.62
Undergraduate Students (Idaho)    131   27.51   14.55   12.42   7.20    11.41   4.73
Undergraduate Students (Ohio)     172   25.20   11.04   10.30   5.33    10.13   3.69
Community                         815   25.43   11.55   10.20   5.56     9.56   3.87
EAP Clinical Services             441   42.87   14.33   17.15   6.05    13.77   4.90
University Counseling Center      486   41.28   14.53   18.57   4.28    14.64   3.96
Outpatient Clinics                342   49.40   15.05   19.68   5.93    14.01   5.30
Inpatient                         207   49.92   15.97   20.73   7.44    15.90   7.67

Table 3
Comparison of Gender Scores on the OQ Total Score

Sample                            N     Mean    (S.D.)
Undergraduate                     238   42.33   (16.60)
  Male                             91   42.73   (15.89)
  Female                          147   42.1    (17.21)
Community                         102   48.16   (18.23)
  Male                             46   49.2    (17.59)
  Female                           56   48.43   (18.48)
Employee Assistance Program       504   73.02   (21.05)
  Male                            198   73.52   (21.87)
  Female                          306   72.7    (20.70)
University Outpatient Clinic       76   78.01   (25.71)
  Male                             23   76.27   (26.53)
  Female                           53   81.82   (23.58)
Ethnicity and Cross-Cultural Considerations
The OQ has been administered to adults of several ethnic groups. The data available from some
ethnic groups are not yet sufficient to draw definite conclusions. Limited data for members of
the African American, and elevated scores for African Americans were the following: 11, "After
heavy drinking, I need a drink the next morning to get going"; 19, "I have frequent arguments";
18, "I feel lonely"; and 26, "I feel annoyed by people who criticize my drinking." The questions
that showed elevated scores for Caucasians were both positive items: 20, "I feel loved and
wanted"; and 37, "I feel my love relationships are full and complete."
Table 4
Comparison of Gender Scores on the OQ Domain Scores

                                        Distress        Interpersonal   Social Role
Sample                            N     Mean (S.D.)     Mean (S.D.)     Mean (S.D.)
Undergraduate                     238   23.08 (10.53)    8.95 (5.39)    10.37 (3.62)
  Male                             91   22.71 (10.07)    9.81 (6.24)    10.43 (3.63)
  Female                          147   23.43 (10.89)    8.31 (4.72)    10.35 (3.65)
Community                         102   25.73 (10.26)   10.81 (5.74)     9.81 (3.91)
  Male                             46   25.37 (9.70)    11.51 (5.83)    10.43 (3.39)
  Female                           56   26.52 (10.85)   10.52 (5.39)     9.48 (3.95)
EAP                               504   41.83 (14.15)   17.13 (6.03)    13.76 (4.83)
  Male                            198   41.64 (14.48)   17.49 (6.26)    14.41 (5.04)
  Female                          306   41.96 (13.96)   16.9 (5.88)     13.33 (4.64)
University Outpatient Clinic       76   42.88 (14.72)   17.25 (6.61)    14.24 (5.72)
  Male                             23   40.86 (15.08)   17.86 (6.42)    14.27 (5.75)
  Female                           53   45.34 (13.82)   17.8 (6.17)     14.7 (5.62)

Table 5
OQ Score by Age in a Sample of EAP Patients

                                    Symptom         Interpersonal   Social Role
                    Total Score     Distress        Relations       Performance
Age Range     N     Mean (S.D.)     Mean (S.D.)     Mean (S.D.)     Mean (S.D.)
<20            21   71.95 (22.72)   42.10 (15.41)   15.62 (6.56)    14.95 (4.73)
20-39         303   73.44 (21.08)   41.99 (14.02)   17.50 (6.06)    13.73 (4.76)
40-59         172   72.49 (21.49)   41.37 (14.46)   16.80 (5.84)    13.76 (4.82)
>60             8   71.75 (13.86)   45.37 (10.24)   14.50 (7.23)    11.38 (7.56)

Table 6
OQ Score by Ethnicity in a Sample of EAP Patients

                                         Symptom        Interpersonal   Social Role
                           Total Score   Distress       Relations       Performance
Race               N       Mean (S.D.)   Mean (S.D.)    Mean (S.D.)     Mean (S.D.)
Caucasian          1,931   63.9 (22.7)   35.6 (14.7)    16.0 (6.5)      12.1 (4.8)
African American     274   64.7 (24.1)   35.1 (14.8)    16.6 (6.8)      12.7 (5.4)
Hispanic              36   63.5 (22.7)   36.7 (13.8)    15.5 (6.7)      12.7 (5.0)
Other                 37   66.1 (21.0)   37.0 (12.4)    16.1 (5.5)      12.2 (5.0)

Contrary to Total Score similarities for the above-mentioned groups, Gregersen, Nebeker, Seely
and Lambert (2005) found OQ-45 total score differences between Asians, Pacific Islanders, and
Caucasians. In an effort to explore the generalizability of norms developed for the OQ-45 on
different populations of the Pacific Rim, this study investigated total score differences of
non-patient students, whose ethnic identity included Caucasian, Japanese, Chinese, Korean,
Filipino, Fijian, Maori, Kiribati, Cook Islander, Hawaiian, Samoan, and Tongan. In order to
secure adequate sample sizes, the preceding groups were reclassified into Mainland Caucasian
(US), Hawaiian Caucasian (H), Pacific Islander (PI), Asian (AS), Chinese (C), and Korean (K).
Caucasians had significantly lower OQ-45 Total Scores than all other groups, and Pacific
Islanders had significantly lower scores than Asians and Koreans (see Figure 1).
[Figure 1. Differences in OQ-45 test scores for students from
different cultures. Chart title: "Mean OQ-45 scores in Mainland,
Hawaii, Pacific, Asian/Chinese, Korean Samples." Vertical axis:
mean score, 0 to 60. Horizontal axis: country of origin (US, H,
PI, As, C, K).]
This finding of ethnic differences is consistent with
many other comparative studies of Asian and Cauca-
sian populations. Examples of such differences often
include higher rates of expressed symptomatology and
higher rates of psychopathology in Asian populations
(Cheng, Leong, & Geist, 1993; Hsu & Folstein, 1997;
Okazaki, 1997). Cautious interpretation of ethnic dif-
ferences is called for since confounding linguistic and
cultural considerations, including socioeconomic status
(Dana, 1998), and differing symptom patterns between
cultures (Cho & Kim, 1998), in addition to degree of
acculturation (Abe & Zane, 1990) and degree of iden-
tity with their native culture (Hishinuma et al., 2000)
can account for such differences (see Zane, Hall, Sue,
Young, & Nunez, 2003).
The findings of the Gregersen et al. (2005) study
highlight the necessity of contextual score interpreta-
tion for Asians and Pacific Islanders (Okazaki & Sue,
2000). Particularly for recent immigrants and their fami-
lies, scores on the OQ-45 should be interpreted with
caution. Normative sampling of clinical and asymptom-
atic Asian samples needs to be performed to determine
clinical significance and reliable change indices for these
populations. Until such data are obtained, clinicians and
third-party providers using the OQ-45 should remem-
ber: (a) there may be a response bias toward endorsing
negative items and denying positive items; (b) the col-
lectivist heritage of many Asian respondents may clash
with the individualistic questions of the OQ-45; and, (c)
although some evidence suggests higher rates of psy-
chopathology in Asian populations, elevations in the OQ-
45 scores should be interpreted in light of the specific
client's linguistic and cultural background. Together,
these factors may result in elevated scores and reduced
internal and external validity when using the OQ-45 with
Asian populations, particularly those with less expo-
sure to and experience with Western culture. In testing
with Pacific Islander populations, clinicians and third-party
providers should be aware that there also may be
elevated scores resulting from higher family pressures
in these cultures (Booth, 1999). Again, higher test scores
in some ethnic minority samples should be interpreted
with caution as it is currently unclear whether such el-
evations indicate higher prevalence of problems, or in-
dicate linguistic and cultural factors affecting the re-
porting of such issues.
Despite difficulties in interpreting scores from some
samples, the Gregersen et al. (2005) study concludes
that in spite of significant differences in total OQ scores
and response patterns, the OQ can still be a helpful
measure for tracking psychotherapeutic outcome within
ethnic populations. Since the OQ was designed to
measure clinical change resulting from therapy,
participants' scores from repeated administrations of the OQ
should be highly related to each other and provide
idiographic validity in within-subject designs. However,
cut-off scores for estimating the meaningfulness of
individual change specific to particular ethnic populations
will need to be developed. In spite of varied racial and
cultural response sets and a lack of race-specific norms,
the OQ is probably still capable of providing meaningful
psychotherapeutic outcome data. When clinicians
and third-party providers practice acceptable standards
of care and base treatment decisions upon comprehensive
data from multiple sources (including cultural factors
and individual psychosocial history), repeated
administrations of the OQ should provide, at the very
least, an adequate marker of the direction of movement
during the course of treatment.
To examine this supposition, Campbell et al.
(2003) examined outcomes of African American (n = 29),
Latino/a (n = 279), Native American (n = 50), and
Asian/Pacific Islander (n = 118) clients, compared to
equal-size samples of Caucasian clients matched on
each ethnic group's initial level of disturbance. Clients
were all treated in the same university counseling
center. The results showed that clients of self-identified
ethnic groups had outcomes equal to the Caucasians with
the surprising exception of Native American clients, who
had significantly better outcomes. The results of this
analysis are presented in Table 7.

Table 7.
Comparative outcomes of Native American, Latino/a, African American, Asian/Pacific Islander and Caucasian Clients

                                Pre-test           Post-test
Cultural Groups            N    M       SD         M       SD        CS %   RCI %   Deterioration %
Native American           50    76.08   24.255     60.02   28.388    12     13      0
Caucasian                 50    76.08   24.256     65.04   26.226    13     9       2
Latino/a                 279    69.75   23.995     61.66   25.131    8.1    11.1    2.3
Caucasian                279    69.74   23.969     61.59   23.908    9      10.2    3.4
African American          29    69.44   22.976     59.86   25.366    10.3   12.1    1.7
Caucasian                 29    69.45   22.982     61.28   23.048    5.2    12.1    1.7
Asian/Pacific Islander   118    75.82   23.532     65.85   24.361    9.7    9.3     3
Caucasian                118    75.82   23.52      63.97   22.058    13.1   9.3     2.5
Note. CS = Clinically Significant Change; RCI = Reliable Change Index.
In addition to these data, studies from other coun-
tries provide evidence that the OQ-45 is useful for mea-
suring outcome cross-culturally. For example, de la Para
and Bergen (2002) reported successful use of the OQ-
45 in Chile with lower and middle class patients receiv-
ing treatment in either inpatient or outpatient settings.
They noted considerable similarity between non-patient
samples gathered in the United States and data gathered
in Santiago, Chile (n=129, M=48.7, SD=19.3). They
report higher scores among their patient samples than
were shown in American patients (outpatients, n=124,
M=100.4, SD=21.7; inpatients, n=30, M=92.0,
SD=27.3; and emergency service crisis intervention,
n=32, M=115.8, SD=23.4). These normative data were
interpreted as suggesting that relatively few services in
Chile are available and these are reserved for those most
in need. Some small changes were needed in the Span-
ish translation used in Chile in order to accommodate
the especially low educational level of some of the par-
ticipants and the sentence structure differences in the
Spanish language.
These authors also noted reliability and validity
coefficients that were quite high and similar to those
obtained in the United States. For example, de la Para
and Bergen (2002) found an internal consistency coeffi-
cient of .91 and a test-retest reliability coefficient of
.82, figures almost identical to those published in this
manual. In addition, the Spanish version appeared to be
sensitive to the effects of treatment, with large changes
evident in those who completed treatment, moderate
changes in those whose therapy was still ongoing, and
little or no change in those who did not undergo treat-
ment. The authors noted that with slight modifications,
the OQ-45 is a useful outcome measure in the context
of Chilean patient samples across a wide variety of so-
cioeconomic levels.
Harlinger, Auger, Garcia, and Rodriguez (2002)
conducted similar studies in Puerto Rico using a Span-
ish version of the OQ-45. They reported normative data
on 71 non-patients (M=41.16, SD=18.62) and several
patient groups with mean scores ranging from 69.93 to
84.20. These data are lower than those found in Chile
and similar to those reported in samples from the United
States. They also note reliability (e.g., internal
consistency = .88) and validity data similar to those
reported later in this manual. They reported that scores
decreased substantially over time in patients who underwent treatment.
Data from other cultures have provided similar re-
sults. The OQ-45 is used extensively in Germany for
monitoring treatment response during psychotherapy
(Percevic, Lambert, & Kordy, 2004). In a large scale
study undertaken in Germany, Lambert, Hannover,
Nisslmuller, Richard, and Kordy (2002) found norma-
tive data that were very similar to data collected in the
United States. For example, they found non-patients
(n=232) had a mean score of 46.19 (SD=18.52), a fig-
ure almost identical to that found in the United States.
In addition, they reported internal consistency to be .93
and three-week test-retest reliability of .89, as well as
validity coefficients ranging from .45 to .76, figures very
similar to those reported later in this manual for the
English version of the OQ-45. Data from studies
conducted in the Netherlands (de Jong, 2003) are presented
as a technical report near the end of this manual (Tech-
nical Report #2). These data suggest both similarities
and differences between norms in the USA and the Neth-
erlands.
In spite of significant differences in total OQ-45
scores and response patterns across some ethnic groups,
the OQ-45 has been found to be a helpful measure for
tracking psychotherapeutic outcome within ethnic
populations. Since the OQ-45 was designed to measure
clinical change resulting from therapy, a client's scores from
repeated administrations of the OQ-45 should be highly
related to each other. In spite of varied
ethnic and cultural response sets and a lack of more
ethnic-specific norms, the OQ-45 appears to be capable
of providing meaningful psychotherapeutic outcome data
within specific cultures.
PSYCHOMETRIC PROPERTIES
Reliability
Reliability was assessed using a sample of 157 stu-
dents from a large western University. The sample had
a mean age of 23.04 (SD =3.41) and was 34.3% male
and 65.7% female. The ethnic composition of the sample
was 93.8% Caucasian, 1.5% Hispanic, 1.5% Asians or
Pacific Islanders, and 3.2% other. Internal consis-
tency was also calculated on a subset of 298 patients
from the EAP sample. Internal consistency was found
to be high, and test-retest values were significant at the
.01 level. Test-retest and internal consistency reliabil-
ity values are summarized in Table 8.
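As an illustration of how an internal consistency value of the kind summarized in Table 8 can be computed, the following is a minimal sketch of coefficient alpha (Cronbach, 1951). It assumes that item responses are stored one row per respondent and one column per item. The function name and the sample responses are hypothetical and are not drawn from the samples described in this manual.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents' item scores.

    responses: one inner list per respondent, one score per item
    (hypothetical layout; every respondent must answer every item).
    """
    k = len(responses[0])  # number of items
    # Sample variance of each item across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of the respondents' summed scores.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from five people to four items scored 0-4.
sample = [
    [0, 1, 0, 1],
    [2, 2, 1, 2],
    [3, 3, 4, 3],
    [1, 1, 2, 1],
    [4, 3, 4, 4],
]
alpha = cronbach_alpha(sample)
```

Values close to 1 indicate that the items rise and fall together across respondents, which is the pattern reflected in the high internal consistency coefficients reported for the Total Score.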
In addition to the above data, the OQ was administered
to a sample of 56 undergraduate students on a
weekly basis for a period of 10 weeks. These data were
collected primarily to assess the stability of OQ scores
over time in a non-patient sample to compare with
clinical participants undergoing treatment. Table 9 presents
the correlation coefficients between OQ scores at week
one and each subsequent OQ score. These data suggest
that the OQ is fairly stable over time, with correlations
with week-one scores declining gradually as the interval
lengthens (from .82 at week two to .66 at week ten). Figures
2, 3, and 4, presented later in this manual, give a graphical
presentation of similar data from an independent
study aimed at evaluating the effects of repeated
administrations. These figures illustrate the stability of scores
across time in persons who are not in treatment and
average decreasing scores in people who are in
psychotherapy.
TABLE 8
Test-Retest Reliability and Internal Consistency Values for the OQ Total
and Domain Scores[3]

                     Test-Retest[1]    Internal Consistency[2]
                     Student           Student          Patient
Symptom Distress     .78 (N=157)       .92 (N=157)      .91 (N=298)
Interpersonal        .80 (N=157)       .74 (N=157)      .74 (N=294)
Social Role          .82 (N=157)       .70 (N=157)      .71 (N=295)
OQ Total             .84 (N=157)       .93 (N=157)      .93 (N=289)
[1] Pearson product-moment correlation coefficient (Cohen & Cohen, 1983)
[2] Coefficient alpha (Cronbach, 1951)
[3] All variables significant (p < .01)

TABLE 9
Correlation Coefficients Between Weekly Testing on the OQ Over a Ten Week Period

Administrations Compared     Correlation
Week One - Week Two          0.82
Week One - Week Three        0.86
Week One - Week Four         0.82
Week One - Week Five         0.77
Week One - Week Six          0.73
Week One - Week Seven        0.72
Week One - Week Eight        0.71
Week One - Week Nine         0.67
Week One - Week Ten          0.66
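The week-by-week stability coefficients described above can be sketched as a Pearson product-moment correlation between week-one scores and the scores from each later administration. The example below is a minimal illustration under the assumption that every respondent completed every administration. The function names and the scores are hypothetical and are not the data from the 56 students.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def week_one_correlations(scores_by_week):
    """Correlate week-one scores with each later week's scores."""
    baseline = scores_by_week[0]
    return [pearson_r(baseline, later) for later in scores_by_week[1:]]


# Hypothetical total scores for six respondents over four weekly
# administrations; each inner list holds one week's scores.
weekly_scores = [
    [48, 62, 35, 71, 55, 40],  # week one
    [45, 60, 38, 69, 52, 43],  # week two
    [50, 58, 33, 74, 57, 39],  # week three
    [47, 61, 30, 66, 59, 45],  # week four
]
correlations = week_one_correlations(weekly_scores)
```

Each element of `correlations` corresponds to one row of a table laid out like Table 9 (week one with week two, week one with week three, and so on).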
Validity
Concurrent validity was estimated for the student
sample by calculating Pearson product-moment
correlation coefficients (Cohen & Cohen, 1983) between the OQ
total score and individual domain scores and their
respective counterparts on the Symptom Checklist-90-R
(SCL-90-R; Derogatis, 1977); Beck Depression Inventory
(BDI; Beck et al., 1961); Zung Self-Rating
Depression Scale (ZSDS; Zung, 1965); Zung Self-Rating
Anxiety Scale (ZSAS; Zung, 1971); Taylor Manifest
Anxiety Scale (TMAS; Taylor, 1953); State-Trait Anxiety
Inventory (STAI; Spielberger, 1983); Inventory of
Interpersonal Problems (IIP; Horowitz et al., 1988);
Social Adjustment Scale (SAS; Weissman & Bothwell,
1976); and the SF-36 Medical Outcome Questionnaire
(Ware, Snow, Kosinski, & Gandek, 1994). In addition, a
small patient sample (N=18) took the OQ and the
Friedman Well-Being Scale (Friedman, 1994). Concurrent
validity coefficients for the OQ and its individual domains with
the criterion measures were all significant beyond the
.01 level. These results are presented in
Table 10.
Since the initial validity data were collected, a small-
scale validity study was completed involving three clini-
cal samples (Umphress, Lambert, Smart, Barlow, &
Clouse, 1997). These include individuals recruited from
a college counseling center (N=53), patients recruited
from an outpatient clinic (N=106), and an inpatient
sample (N=24) who were tested as soon as possible af-
ter hospital admission. Patients who were excluded from
the inpatient sample either refused to participate or were
in a mental health state such that they could not be ap-
proached about the study.
Description and details of sample characteristics,
measures, and procedures are contained in the published
report of this research (Umphress et al., 1997). Each
participant in the study completed the OQ, the Symptom
Checklist-90-R, the Social Adjustment Rating Scale
self-report form, and the Inventory of Interpersonal
Problems. The validity coefficients from this analysis are
presented in Table 11.
TABLE 10
Validity Estimates for the OQ

                      Symptom           Interpersonal     Social        OQ
Criterion             Distress          Relations         Role          Total
GSI (SCL-90-R)[a]     .61* (.76)[1]     (.53)             (.47)         .78* (.73)
BDI[b]                .63*                                              .80*
ZSDS[c]               .88                                               .88
ZSAS[d]               .81                                               .81
TMA[e]                .88                                               .86
STAI[f] (Y-1)         .50*                                              .64*
STAI[f] (Y-2)         .65*                                              .80*
IIP[g]                (.64)             .62 (.55)         (.51)         .54 (.66)
SAS[h]                                                    .4353         .65
SF-36[i]              .80               .48                             .81
FW-B[j]               .77                                               .81
* These values were obtained with a preliminary 43-item version of the current 45-item test.
[a] GSI = Global Severity Index of the Symptom Checklist-90-Revised
[b] BDI = Beck Depression Inventory
[c] ZSDS = Zung Self-Rating Depression Scale
[d] ZSAS = Zung Self-Rating Anxiety Scale
[e] TMA = Taylor Manifest Anxiety
[f] STAI = State-Trait Anxiety Inventory (Y-1 = State Anxiety; Y-2 = Trait Anxiety)
[g] IIP = Inventory of Interpersonal Problems
[h] SAS = Social Adjustment Scale
[i] SF-36 = Correlations are with the Mental Health Scale with SD, Social Functioning with IR, and Global Functioning with total OQ.
[j] FW-B = Friedman Well-Being Scale, composite score
[1] Figures in parentheses are from a study of a German normative sample (Lambert, Hannover et al., 2002)

TABLE 11
Validity Data From Patient Populations*

Sample                         SCL-90-R (GSI)   IIP (Total Score)   SAS (Total Score)
College Counseling Center
  OQ Total Score               0.78             0.66                0.79
  OQ Symptom Distress          0.82             0.60                0.75
  OQ Interpersonal             0.45             0.49                0.53
  OQ Social Role               0.55             0.63                0.73
Outpatient Clinic
  OQ Total Score               0.84             0.74                0.71
  OQ Symptom Distress          0.84             0.70                0.65
  OQ Interpersonal             0.62             0.64                0.62
  OQ Social Role               0.55             0.55                0.57
Inpatient
  OQ Total Score               0.88             0.81                0.81
  OQ Symptom Distress          0.92             0.86                0.79
  OQ Interpersonal             0.68             0.57                0.69
  OQ Social Role               0.51             0.54                0.54
* All values significant (p < .05).
This study was undertaken to supplement validity
data that had been collected with non-disturbed college
populations. As can be noted from Tables 10 and 11,
the validity data are comparable to those that had
already been collected. Notably, the OQ Total Score
correlated highly with the Global Severity Index (GSI)
of the SCL-90-R in each of the patient samples (range
.78 - .88). This finding was similar to the correlations
found between the GSI and the Symptom Distress
Subscale of the OQ (range .82 - .92). These results
suggest considerable overlap between these indices of
patient symptomatic complaints and related
disturbances.
Results from the Social Role and Interpersonal
Subscales were less convincing. The Interpersonal
Subscale correlated significantly with the measure of
interpersonal problems (IIP) (range .49 - .64) across
the three samples, but just as highly or even more highly
with the Social Adjustment Rating Scale. The reverse
was equally true. The Social Role Subscale correlated
moderately across samples with the SAS (range .54 - .73)
but also correlated with the IIP. This finding suggests
that all three scales measure similar constructs despite
attempts to distinguish functioning in different areas.
It appears from these data (in combination with those
collected from college students) that the OQ has high
to moderately high concurrent validity with a wide
variety of measures that are intended to measure similar
variables. Correlations are strongest with the Total
Score. Clinicians can be confident that the OQ Total
Score provides an index of mental health, one that
correlates quite highly with a variety of scales intended to
measure symptom clusters of anxiety, depression,
quality of life, social adjustment, and interpersonal
functioning. The status of the three subscales is less certain.
The Symptom Distress subscale correlates very highly
with measures of symptomatic disturbance (correlations
typically in the mid .80s). Both the Interpersonal
Relations and Social Role Subscales show modest
correlations (.60s) with symptomatic scales as well as scales
aimed at measuring problems in other areas of
functioning.
A recent study compared the utility of the OQ and
the BASIS-32, a self-report questionnaire that assesses
symptoms and social functioning in inpatients. Factor
analysis yielded five subscales in the BASIS-32:
depression and anxiety, impulsive and addictive
behaviors, psychosis, daily living and role functioning, and
relation to self and others. The intake and release scores
of 261 patients on these two measures were compared.
Results indicated the total scores of the two measures
were correlated (r=.64), with the two measures sharing
41% of the variance. The OQ Symptom Distress scale
significantly correlated with the BASIS-32 Depression
and Anxiety subscale (r=.72), and the OQ Interpersonal
Relations scale significantly correlated with the
BASIS-32 Relation to Self and Others subscale (r=.43).
However, the correlation between the OQ Social Role
Subscale and the BASIS-32 Daily Living and Role
Functioning Subscale (r=.28) was unexpectedly weak
(Doerfler, Addis, & Moran, 2002).
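The shared-variance figure follows from squaring the correlation coefficient: r = .64 gives an r squared of about .41. The short sketch below applies the same arithmetic to the correlations cited in the preceding paragraph. The function name is illustrative only.

```python
def shared_variance_percent(r):
    """Percentage of variance two measures share, given their correlation r."""
    return 100 * r ** 2


# Correlations reported by Doerfler, Addis, and Moran (2002), as cited above.
reported = {
    "Total scores": 0.64,
    "Symptom Distress with Depression and Anxiety": 0.72,
    "Interpersonal Relations with Relation to Self and Others": 0.43,
    "Social Role with Daily Living and Role Functioning": 0.28,
}

for label, r in reported.items():
    percent = shared_variance_percent(r)
    print(f"{label}: r = {r:.2f}, shared variance = {percent:.0f}%")
```

For the total scores this reproduces the 41% reported in the text. The much smaller proportion for the Social Role comparison shows why that correlation was described as unexpectedly weak.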
Kaufman (1997) provided correlations between
patient reports on the OQ (after the sixth session) and the
therapist-rated Global Assessment of Functioning Score
(GAFS; completed after the third psychotherapy session) in a
doctoral dissertation study. She found therapist ratings
on the GAFS correlated .78 with the OQ-45,
suggesting fair correspondence between estimates of disturbance
from these two independent sources.
Along similar lines, Lueck (2003) correlated OQ-
45 scores with screening diagnoses based on a
computer-administered SCID interview given to over 300
clients. He found a correlation of .87 between the num-
ber of diagnoses that a client screened for (zero to six)
and intake OQ. Results were interpreted as indicating
that both measures reflect the severity of disturbance
experienced by a client.
In a follow-up study of 302 former clients, Nielsen
et al. (2003) compared results obtained on the OQ-45
with those obtained with items from the Consumer Re-
ports (CR) effectiveness scale. Consistent with other
research examining satisfaction ratings and ratings based
on outcome scales, this study found a correlation of .52
between OQ-45 change scores and CR retrospective
ratings of amount of change. The OQ-45 also corre-
lated significantly with CR ratings of emotional state.
Sensitivity to Change
The OQ's construct validity depends in part on the
ability of the OQ to reflect change following
interventions such as psychotherapy. While retest scores for
individuals who are not in treatment are not expected to
fluctuate systematically over time, it is expected that the
scores of patients receiving psychological or
psychopharmacological interventions would decrease over time.
Past psychotherapy research shows that most patients
typically improve in therapy, and a portion improve in
placebo treatments. Detectable gains can be expected to take
place by the eighth therapy session (Lambert & Ogles, 2004).
Given the consistent nature of these findings, the
OQ would be considered to have construct validity
(measuring changes in level of psychological
disturbance) if the scores for patients after seven sessions of
therapy were lower than their pre-therapy levels. This
hypothesis was tested by following a subset of patients
in treatment at a university outpatient clinic. Of the 76
patients who took the OQ prior to entering therapy, 40
patients had at least seven therapy sessions. As expected,
a t test between the means of the patient pretest scores
and their post-test scores after seven sessions of therapy
revealed statistically significant improvement. These
data are presented in Table 12.
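The 39 degrees of freedom reported for 40 patients are consistent with a dependent-samples (paired) t test, in which each patient's post-test score is subtracted from his or her own pretest score. The following sketch shows that computation under this assumption. The scores are hypothetical, the function name is illustrative, and the sketch is not a reconstruction of the original analysis.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Dependent-samples t statistic and its degrees of freedom.

    A positive t means scores were lower (improved) at post-test.
    """
    differences = [before - after for before, after in zip(pre, post)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical total scores for six patients before therapy and after
# seven sessions (not data from the sample described above).
pre_scores = [92, 78, 105, 66, 84, 99]
post_scores = [75, 70, 88, 68, 71, 80]
t_value, df = paired_t(pre_scores, post_scores)
```

Because the test works on within-patient differences, it cannot be recomputed from the group means and standard deviations alone; the individual pre-test and post-test pairs are required.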
In addition to these data, Vermeersch, Lambert, and
Burlingame (2000) evaluated the sensitivity to change
of each item, each subscale, and the total score of the
OQ by contrasting changes that take place over time
with and without treatment. This analysis used patient
data from multiple treatment settings to calculate a slope
of change for each patient on each item and then
averaged across these slopes (i.e., using linear regression
techniques). Table 13 presents the main findings of this
analysis, detailing the average change over time (slopes)
for the patient and control groups as well as the
difference between rate of change for the treated and control
groups as estimated with the t-test and Cohen's d. The t
results express the significance of difference between
groups, while the d expresses the size of the difference
in standard deviation units. This type of data is seldom
available for psychological tests but is an essential
aspect of evaluating the adequacy of a test and its subscales
(as well as each item) for measuring change. As can be
seen at the bottom of Table 13, not all items were equally
sensitive to the effects of an intervention with the samples
used in the present analysis and in the presence of
relatively brief treatment.
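The slope-based approach described above can be sketched as follows: fit a least-squares line to each person's repeated scores, average the resulting slopes within each group, and compare the groups with an independent-samples t statistic and Cohen's d. This is a minimal sketch under stated assumptions (equally spaced administrations, a pooled standard deviation for d, and a sign convention in which positive values mean the treated group declined faster). The function names and data are hypothetical, and the published analysis may have differed in these details.

```python
from math import sqrt
from statistics import mean, variance


def ols_slope(scores):
    """Least-squares slope of scores on administration number 0, 1, 2, ..."""
    n = len(scores)
    x_bar = (n - 1) / 2
    y_bar = mean(scores)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(scores))
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    return sxy / sxx


def compare_slopes(treated, controls):
    """Return (mean treated slope, mean control slope, t, d)."""
    treated_slopes = [ols_slope(person) for person in treated]
    control_slopes = [ols_slope(person) for person in controls]
    n_t, n_c = len(treated_slopes), len(control_slopes)
    pooled_variance = (
        (n_t - 1) * variance(treated_slopes)
        + (n_c - 1) * variance(control_slopes)
    ) / (n_t + n_c - 2)
    # Positive difference: treated scores fall faster than control scores.
    difference = mean(control_slopes) - mean(treated_slopes)
    d = difference / sqrt(pooled_variance)
    t = difference / sqrt(pooled_variance * (1 / n_t + 1 / n_c))
    return mean(treated_slopes), mean(control_slopes), t, d


# Hypothetical repeated scores on one item for three treated and three
# untreated persons (one inner list per person).
treated_scores = [[3, 2, 2, 1], [4, 4, 3, 2], [2, 2, 1, 1]]
control_scores = [[2, 2, 2, 2], [3, 2, 3, 3], [1, 2, 1, 1]]
treated_mean, control_mean, t_value, d_value = compare_slopes(
    treated_scores, control_scores
)
```

Expressing the group difference as d places items with different score ranges on a common scale, which is what allows the items in Table 13 to be ranked by sensitivity.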
Another recent study (Vermeersch et al., 2004) ana-
lyzed sensitivity for each individual item and compared
the results for a different treated and untreated popula-
tion. Data for both treated and untreated samples in this
study were drawn from archival databases. The un-
treated (control) sample was composed of 248 under-
graduate students taking psychology courses at a large
western university.
These students completed the OQ on a weekly, bi-
weekly, or monthly basis over the course of 12 weeks.
A large portion of the sample was collected as part of a
project aimed at assessing the presence of a test-retest
artifact in which participants repeated testing at vari-
ous time intervals (Durham et al., 2002). The remain-
der of the control data was collected on a weekly basis
by Lambert et al. (1996) for the purpose of assessing
test-retest reliability of the OQ. Participants in both
studies were told that they would be taking the OQ
multiple times but were not informed of the specific
hypotheses and purposes of the investigators. Control
participants who received psychotherapy or psychop-
harmacological treatment at any point during their par-
ticipation in the study were excluded from data analy-
ses. This sample was 64% female, averaged 21.7 years
of age, and was 94% Caucasian. The mean number of
OQ administrations completed by participants in this
group was 8.3 (SD =1.2), and the mean initial OQ total
score was 48.87.
The treated (experimental) sample was composed
of 5,553 counseling center clients seen by 527 thera-
pists working in 40 university counseling centers
throughout the United States. Data for the treated sample
were primarily collected as part of the Research Con-
sortium of Counseling and Psychological Services in
Higher Education (Drum & Baron, 1998), a large-scale
collaborative research effort in which many counseling
centers nationwide participated. Data collected by a
non-participating university counseling center were also
included in the sample. The client sample used in the
Vermeersch et al. (2000) study was not part of the larger
client sample used in the current study. Clients in the
current study received personal counseling from licensed
psychologists, postdoctoral psychologists, predoctoral
psychology interns, and graduate student therapists. This
sample was 70% female, averaged 22.46 years of age,
and was 83.5% Caucasian. Thirty-six percent had a
mood or anxiety disorder diagnosis. The mean number
of sessions completed by clients in this group was 3.77
(SD =2.39), and the mean pretreatment OQ total score
was 70.41.
TABLE 12
Amount of Improvement Demonstrated by the OQ After Seven Sessions of Therapy

OQ Score                     N   Pre-test Mean (S.D.)   Post-test Mean (S.D.)   t-Value (D.F.)
Total Score                 40   84.65 (24.14)          67.18 (27.12)           4.78* (39)
Symptom Distress            40   46.20 (14.42)          36.65 (16.58)           4.26* (39)
Interpersonal Relations     40   18.35 (5.75)           15.67 (6.08)            3.30* (39)
Social Role Performance     40   15.83 (6.00)           11.98 (5.68)            4.30* (39)
* p < .001
TABLE 13
Average Slopes, t and d Values Based on Comparisons Between Average Slopes, and Allocation by Sensitivity to Change for Clinical and
Nonclinical Samples on the 45 Items, Subscales, and Total Score of the Outcome Questionnaire

                                                                            Slope
Item, Subscale, and Total Score                                             Patients[a] Controls[b]   t      d

Significantly more negatively sloping in patients than in controls (sensitive to change)
42. I feel blue. [c]                                                        -0.0814     -0.0105      6.63*  0.44
40. I feel something is wrong with my mind. [c]                             -0.074      -0.011       5.78*  0.38
15. I feel worthless. [c]                                                   -0.0642     -0.0071      5.68*  0.38
23. I feel hopeless about the future. [c]                                   -0.0645     -0.0046      5.67*  0.38
3. I feel no interest in things. [c]                                        -0.0524      0.0027      5.57*  0.37
28. I am not working/studying as well as I used to. [d]                     -0.0524      0.0202      5.51*  0.37
9. I feel weak [c]                                                          -0.0701     -0.016       5.31*  0.35
4. I feel stressed at work/school. [d]                                      -0.0936     -0.383       4.87*  0.32
27. I have an upset stomach [c]                                             -0.0583     -0.0079      4.82*  0.32
25. Disturbing thoughts come to my mind that I cannot get rid of. [c]       -0.1167     -0.0616      4.73*  0.31
5. I blame myself for things [c]                                            -0.1214     -0.0734      4.52*  0.30
10. I feel fearful. [c]                                                     -0.0849     -0.036       4.49*  0.30
18. I feel lonely [e]                                                       -0.0697     -0.0215      4.47*  0.30
33. I feel that something bad is going to happen [c]                        -0.0579     -0.0187      3.83*  0.25
6. I feel irritated [c]                                                     -0.0682     -0.0322      3.78*  0.25
19. I have frequent arguments [e]                                           -0.0515     -0.0171      3.76*  0.25
26. I feel annoyed by people who criticize my drinking (or drug use). [e]   -0.0119      0.0064      3.75*  0.25
29. My heart pounds too much. [e]                                           -0.0497     -0.015       3.35*  0.22
44. I feel angry enough at work/school to do something I might regret [d]   -0.0416     -0.0125      3.32*  0.22
8. I have thoughts of ending my life [c]                                    -0.0338     -0.0089      3.18*  0.21
30. I have trouble getting along with friends and close acquaintances [c]   -0.017       0.0099      3.04*  0.20
21. I enjoy my spare time [d]                                               -0.0184      0.0133      3.03*  0.20
11. After heavy drinking, I need a drink the next morning to get going [e]  -0.0073      0.0034      3.02*  0.20
13. I am a happy person [e]                                                 -0.0171      0.0082      3.01*  0.20
7. I feel unhappy in my marriage/significant relationships [e]              -0.0445     -0.009       2.98*  0.20
36. I feel nervous [c]                                                      -0.0495     -0.0178      2.97*  0.20
38. I feel that I am not doing well at work/school [d]                      -0.0199      0.0101      2.92*  0.19
31. I am satisfied with my life [c]                                         -0.431      -0.0085      2.86*  0.19
22. I have difficulty concentrating [c]                                     -0.052      -0.0244      2.65*  0.18
12. I find my work/school satisfying [d]                                    -0.0036      0.0266      2.58*  0.17
43. I am satisfied with my relationships with others [e]                    -0.0136      0.0111      2.56*  0.17
41. I have trouble falling or staying asleep [c]                            -0.0551     -0.023       2.49*  0.17
24. I like myself [c]                                                       -0.0109      0.0105      2.26*  0.15
32. I have trouble at work/school because of drinking or drug use. [d]      -0.0062      0.0026      2.26*  0.15
16. I am concerned with family troubles [e]                                 -0.0707     -0.476       2.02*  0.13
39. I have too many disagreements at work/school [d]                        -0.017       0.0011      2.01*  0.13
45. I have headaches [e]                                                    -0.0529     -0.0317      2.00*  0.13
Subjective Distress Subscale                                                -1.4944     -0.4362      7.51*  0.50
Interpersonal Relationships Subscale                                        -0.3229     -0.0673      4.71*  0.31
Social Role Subscale                                                        -0.2479     -0.0238      6.36*  0.42
Total Score                                                                 -2.2128     -0.5155      7.61*  0.50

Items with slopes that differ at a level of nonsignificance, are positive in the patient sample, or are
significantly more negatively sloping in controls than in patients (not sensitive to change)
20. I feel loved and wanted. [e]                                            -0.019       0.0003      1.94   0.13
14. I work/study too much [d]                                               -0.0381     -0.0284      0.88   0.06
17. I have an unfulfilling sex life [e]                                     -0.0209     -0.0107      0.86   0.06
2. I tire quickly [c]                                                       -0.0425     -0.0354      0.72   0.05
35. I feel afraid of open spaces, of driving or of being on buses,
    subways & so forth. [c,f]                                                0.0004      0.0052      0.64   0.04
37. I feel my love relationships are full and complete. [e]                 -0.0071     -0.004       0.25   0.02
1. I get along well with others [e,f]                                        0.0222      0.0229      0.09   0.01
34. I have sore muscles [c]                                                 -0.0236     -0.0519     -2.72  -0.18

[a] N = 1,176.
[b] N = 284.
[c] Subjective Distress subscale.
[d] Social Role subscale.
[e] Interpersonal Relations subscale.
[f] Item that demonstrated a positive slope in individuals receiving therapy and was therefore excluded as a possible
change-sensitive item based on results of initial data analysis.
* p < .05.
Results of the initial data analysis indicated that in
the treated sample, 43 OQ item slopes met the first cri-
terion for change sensitivity, in that these items demon-
strated change in the theoretically proposed direction
(i.e., clients improved over time as illustrated by a nega-
tive item slope). Of these 43 items, 35 demonstrated a
slope that was significantly different from zero. The two
OQ items (items 1 and 35) that did not meet the first
criterion for change sensitivity failed to do so because
their slopes demonstrated change in the opposite direc-
tion of what would be expected (i.e., clients worsened
over time as illustrated by a positive item slope). Nei-
ther of these two positively sloping items demonstrated
a slope that significantly differed from zero. Each of
the three OQ subscales and the total score obtained from
the clinical sample demonstrated change in the theoreti-
cally proposed direction and were significant.
Results of the data analysis for the untreated sample
indicated that 30 OQ items demonstrated a negative
slope (i.e., controls improved over time). Of these 30
items, 8 demonstrated slopes that differed significantly
from zero. There were 15 OQ items that demonstrated
positive slopes (i.e., controls worsened over time). Of
these 15 positively sloping items, none of them demon-
strated slopes that significantly differed from zero. Each
of the three OQ subscales and the total score obtained
from the control sample demonstrated a significant nega-
tive slope. Table 14 contains the slope estimates (i.e.,
average change rate) for the items, subscales, and total
score of the OQ for the treated and control samples.
The slope estimates obtained for the samples were
then used to calculate the slope estimate comparisons
and effect sizes, which were of primary interest in this
study.
The treated versus untreated comparison indicated
that 34 OQ items (76%) met the second criterion for
change sensitivity, in that scores on these items decreased
significantly more over time in the treated sample than
in the untreated sample (i.e., those who were treated
improved at a significantly faster rate than those who
were not treated). Eleven OQ items (24%) did not meet
the second criterion for change sensitivity, in that scores
on these items changed at a level of nonsignificance in
relation to one another (i.e., treated and untreated rates,
direction of change, or both did not differ significantly).
One OQ item (Item 34) decreased significantly more in the untreated sample than in the treated sample. Organization of the 34 change-sensitive items according to subscale indicated that 88% (22 of 25) of all SD subscale items, 55% (6 of 11) of all IR subscale items, and 67% (6 of 9) of all SR subscale items were sensitive to change. Furthermore, each of the three OQ subscales and the total score met the second criterion for change sensitivity, in that treated individuals' scores decreased at a significantly faster rate than untreated individuals' scores. The t values obtained by comparing the treated versus untreated item, subscale, and total score slope estimates allow for the items, subscales, and total score in Table 14 to be arranged by change sensitivity.
As in the Vermeersch et al. (2000) study, effect sizes (represented by d values) for the treated versus untreated comparisons were calculated from the obtained t values using the conversion formula $d = t\,(1/N_e + 1/N_c)^{1/2}$ (Ray & Shadish, 1996). Lipsey (1990) has defined a small effect as a value less than .33, a medium effect size as a value between .33 and .55, and a large effect size as a value larger than .55. Applying these effect size classification ranges to the obtained d values indicated that one OQ item (Item 42), the SD subscale, and the total score produced large effect sizes, ranging from .59 to .66. Fifteen OQ items, as well as the IR and SR subscales, produced medium effect sizes, ranging from .33 to .55. The remaining 29 OQ items (18 met change sensitivity criteria and 11 failed to meet change sensitivity criteria), three of which produced negative effect sizes (items 16, 17, and 34), yielded small effect sizes ranging from .14 to .32 (as presented in Table 14).
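The conversion from t to d described above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and it assumes that N_e and N_c in the formula are the treated and control sample sizes (5,553 and 248 in Table 14) and that the Lipsey (1990) ranges are applied exactly as quoted in the text.

```python
import math


def d_from_t(t, n_treated, n_control):
    """Convert a t value to d using d = t * sqrt(1/N_e + 1/N_c)."""
    return t * math.sqrt(1.0 / n_treated + 1.0 / n_control)


def lipsey_label(d):
    """Label an effect size with the ranges quoted in the text."""
    if d > 0.55:
        return "large"
    if d >= 0.33:
        return "medium"
    return "small"


# Total Score row of Table 14: t = 9.15 with 5,553 clients and 248 controls.
total_d = d_from_t(9.15, 5553, 248)
print(round(total_d, 2), lipsey_label(total_d))
```

With the Total Score inputs the result rounds to the tabled value of .59, which suggests the tabled d values follow this conversion.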
Figure 2 was drawn from Table data to illustrate the response curves for Item 42 ("I feel blue") for a treated and an untreated population, highlighting the sensitivity of this item. Figure 3 shows the response curves for Item 35 ("I feel afraid of open spaces, driving, being on buses, subways, & so forth"). As shown, sensitivity varied from item to item. Item 42 is especially sensitive to change, whereas Item 35 shows similar change for untreated and treated populations. The overall sensitivity to change for the total score in this large sample of patients is illustrated in Figure 4. This figure highlights the overall consistent pattern of sensitivity to change for the OQ.
TABLE 14
Average Slopes, t and d Values Based on Comparisons Between Average Slopes, and Allocation by Sensitivity to Change for Clinical and Nonclinical Samples on the 45 Items, Subscales, and Total Score of the Outcome Questionnaire

Item, Subscale, and Total Score | Slope: Clients [a] | Slope: Controls [b] | Clients vs. Controls: t | Clients vs. Controls: d

Significantly more negatively sloping in clients than in controls (sensitive to change)
Total Score | -2.3786 | -0.5262 | 9.15*** | 0.59
Symptom Distress subscale | -1.6596 | -0.4388 | 9.28*** | 0.6
Interpersonal Relations subscale | -0.3569 | -0.0613 | 5.68*** | 0.37
Social Role subscale | -0.3184 | -0.0279 | 6.78*** | 0.44
42. I feel blue. [c] | -0.108 | -0.0101 | 10.13*** | 0.66
40. I feel something is wrong with my mind. [c] | -0.1038 | -0.0106 | 8.41*** | 0.55
15. I feel worthless. [c] | -0.0793 | -0.0067 | 7.57*** | 0.49
25. Disturbing thoughts come to my mind that I cannot get rid of. [c] | -0.1394 | -0.0617 | 7.54*** | 0.49
31. I am satisfied with my life. [c] | -0.0483 | 0.0103 | 6.95*** | 0.45
3. I feel no interest in things. [c] | -0.0602 | 0.0026 | 6.70*** | 0.44
18. I feel lonely. [e] | -0.0866 | -0.0211 | 6.68*** | 0.43
13. I am a happy person. [c] | -0.0358 | 0.008 | 6.17*** | 0.4
5. I blame myself for things. [c] | -0.1295 | -0.0736 | 6.11*** | 0.4
28. I am not working/studying as well as I used to. [d] | -0.0497 | 0.0163 | 5.35*** | 0.35
9. I feel weak. [c] | -0.0691 | -0.0156 | 5.29*** | 0.34
8. I have thoughts of ending my life. [c] | -0.0492 | -0.0092 | 5.24*** | 0.34
43. I am satisfied with my relationships with others. [e] | -0.0308 | 0.0173 | 5.17*** | 0.34
37. I feel my love relationships are full and complete. [e] | -0.0575 | -0.0035 | 5.16*** | 0.34
21. I enjoy my spare time. [d] | -0.0335 | 0.0134 | 5.14*** | 0.33
6. I feel irritated. [c] | -0.0736 | -0.0322 | 5.12*** | 0.33
4. I feel stressed at work/school. [d] | -0.086 | -0.0381 | 4.97*** | 0.32
20. I feel loved and wanted. [e] | -0.0377 | -0.0005 | 4.83*** | 0.31
10. I feel fearful. [c] | -0.0848 | -0.037 | 4.74*** | 0.31
33. I feel that something bad is going to happen. [c] | -0.062 | -0.0182 | 4.72*** | 0.31
24. I like myself. [c] | -0.0258 | 0.0107 | 4.39*** | 0.29
23. I feel hopeless about the future. [c] | -0.0481 | -0.0048 | 4.29*** | 0.28
26. I feel annoyed by people who criticize my drinking (or drug use). [e] | -0.0088 | 0.0064 | 4.20*** | 0.27
27. I have an upset stomach. [c] | -0.0489 | -0.0104 | 3.88*** | 0.25
12. I find my work/school satisfying. [d] | -0.0088 | 0.0226 | 3.82*** | 0.25
44. I feel angry enough at work/school to do something I might regret. [d] | -0.0387 | -0.0123 | 3.70*** | 0.24
36. I feel nervous. [c] | -0.0462 | -0.0177 | 3.59*** | 0.23
38. I feel that I am not doing well at work/school. [d] | -0.0453 | -0.0082 | 3.55*** | 0.23
2. I tire quickly. [c] | -0.0626 | -0.0346 | 3.54*** | 0.23
22. I have difficulty concentrating. [c] | -0.0528 | -0.0243 | 3.36*** | 0.22
11. After heavy drinking, I need a drink to get going the next morning. [c] | -0.0055 | 0.0034 | 3.06** | 0.2
29. My heart pounds too much. [c] | -0.0328 | -0.0147 | 2.20* | 0.14
30. I have trouble getting along with friends and close acquaintances. [e] | -0.0084 | 0.0103 | 2.00* | 0.13
41. I have trouble falling asleep or staying asleep. [c] | -0.042 | -0.023 | 1.96* | 0.13

Items with slopes that differ at a level of nonsignificance, are positive in the client sample, or are significantly more negatively sloping in controls than in clients (not sensitive to change)
7. I feel unhappy in my marriage/significant relationship. [e] | -0.0338 | -0.0101 | 1.92 | 0.12
32. I have trouble at work/school because of drinking or drug use. [d] | -0.0021 | 0.003 | 1.71 | 0.11
19. I have frequent arguments. [e] | -0.0272 | -0.0173 | 1.31 | 0.09
39. I have too many disagreements at work/school. [d] | -0.0059 | 0.0029 | 1.3 | 0.08
1. I get along well with others. [e,f] | 0.0215 | 0.0237 | 1.17 | 0.08
45. I have headaches. [c] | -0.0401 | -0.0317 | 1.1 | 0.07
35. I feel afraid of open spaces, driving, being on buses, subways, & so forth. [c,f] | 0.0001 | 0.0054 | 0.76 | 0.05
14. I work/study too much. [d] | -0.0302 | -0.0295 | 0.43 | 0.03
16. I am concerned about family troubles. [e] | -0.0472 | -0.0475 | -0.32 | -0.02
17. I have an unfulfilling sex life. [e] | -0.0006 | -0.0101 | -0.62 | -0.04
34. I have sore muscles. [c] | -0.0282 | -0.0516 | -2.13* | -0.14

[a] N = 5,553.
[b] N = 248.
[c] Symptom Distress subscale.
[d] Social Role subscale.
[e] Interpersonal Relations subscale.
[f] Item that demonstrated a positive slope in individuals receiving counseling and was therefore excluded as a possible change-sensitive item based on results of initial data analysis.
* p < .05, ** p < .01, *** p < .001
Figure 2. Outcome Questionnaire (OQ) item response curves for Item 42: "I feel blue." [Line graph: OQ Item Score (0 to 4) by Session (1 to 9); Clients (N = 5553) and Controls (N = 248).]
Figure 3. Outcome Questionnaire (OQ) item response curves for Item 35: "I feel afraid of open spaces, driving, being on buses, subways, & so forth." [Line graph: OQ Item Score (0 to 4) by Session (1 to 9); Clients (N = 5553) and Controls (N = 248).]
Figure 4. Total Score response curve comparing treated and untreated persons. [Line graph: OQ Total Score (30 to 75) by Session (1 to 9); Clients (N = 5553) and Controls (N = 248).]
Outcome in the EAP setting
A pilot study of persons seeking or being referred for help in Employee Assistance Programs managed by Human Affairs International provides interesting data on change. Seven sites across the country provided data, but no attempt was made to collect OQ data on every employee who asked for assistance. It was possible to collect data on 78 patients who completed the OQ at pretreatment and had at least two therapy visits. Of the 78 patients, 58 (74%) had pretreatment scores that placed them in the dysfunctional range. Their pretreatment mean was 82.34 (SD = 15.82), whereas the posttreatment mean was 66.01 (SD = 22.46). These patients had a mean of three sessions of treatment and a maximum of eight sessions.
The number of participants who met criteria for
clinically significant improvement (i.e., passing the cut-
off of 63 and improving by at least 14 points) suggests
that patients improve in very brief treatments even when
the standard of improvement is rigorous. The total num-
ber of participants who significantly improved within
eight sessions was 22 of 58 (38%): 9 Recovered after 1
session, 6 Recovered after 2 sessions, 5 Recovered af-
ter 3 sessions, 1 Recovered after 4 sessions, 1 Recov-
ered after 5 sessions.
Five additional patients (8.6%) improved by at least
14 points but did not pass the cutoff. Two patients (3%)
got worse (i.e., at least a 14 point increase), and 50% of
the patients did not meet the criteria for having changed
in either way. Of those participants beginning in the
functional range (20 of 78), nine improved by at least
14 points.
Doerfler et al. (2002) also reported that the OQ was very sensitive to change in a short-term hospitalization setting. They concluded that this sensitivity, combined with the well-established utility of the OQ in outpatient settings, may make the OQ an advantageous instrument for outcome assessment across various levels of care (e.g., inpatient, day treatment, outpatient) (p. 19).
Sensitivity to Psychopathology
Support for the construct validity of the OQ was also sought by comparing the EAP and outpatient psychotherapy clinical samples' scores on the OQ with those of the community and undergraduate non-clinical samples. It was assumed that statistically significant differences between the means of the clinical and normative samples would suggest that the OQ could distinguish between these groups. Further, it was expected that the mean scores for the groups would be ordered from the most pathological to least pathological. We expected the outpatient psychotherapy group to be most disturbed, followed by the EAP sample, the community sample, and the undergraduate sample. A one-way ANOVA was conducted to determine the difference between sample means. Comparisons between the clinical and non-clinical samples were significant at the .001 level. T tests were conducted following the ANOVA for the purpose of post hoc comparisons as well as to quantify the differences between the various samples. These results are presented in Table 15.
The data in Table 15 clearly suggest that the OQ reflects pathology in line with expectations: there were no statistically significant differences between the non-patient groups, but clear differences emerged between clinical and non-clinical samples. There were also statistically reliable differences between levels of pathology within patient samples.
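The post hoc comparisons can be illustrated from the summary statistics alone. The sketch below assumes an ordinary pooled-variance independent-samples t test, which is consistent with the degrees of freedom reported in Table 15 (N1 + N2 - 2) but is not stated explicitly in the text; the function name is hypothetical.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance t statistic and degrees of freedom from summary data."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


# Employee Assistance Program vs. Community (non-patient), values from Table 15.
t_value, df = pooled_t(441, 73.61, 21.39, 815, 45.19, 18.57)
print(round(t_value, 2), df)
```

Under that assumption the EAP versus community comparison comes out close to the tabled t of 24.52 with 1254 degrees of freedom.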
Sensitivity and Specificity
Sensitivity is the proportion of true positives that are correctly identified by a test. The sensitivity of the OQ is 0.84 (see Table 16), which means that 84% of the true members of the Normal group (non-patient) were properly classified as normal and 16% were misclassified (erroneously put in the abnormal group) using the cutoff score of 63.
Specificity is the proportion of true negatives that are correctly identified. The specificity of the OQ is .83 (see Table 16), indicating that 83% of the true members of the abnormal group (patients) were placed in the abnormal group using the cutoff score of 63.
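As a minimal sketch of how such classification rates are obtained, the following Python applies the cutoff of 63 to two groups of known membership. The scores are invented purely for illustration, the function name is hypothetical, and treating scores at or below the cutoff as "normal" is an assumption based on how the cutoff is described elsewhere in this manual.

```python
CUTOFF = 63  # Total Score cutoff used in the text


def classification_rates(nonpatient_scores, patient_scores, cutoff=CUTOFF):
    """Return the proportion of each known group that the cutoff classifies correctly."""
    # Non-patients are correctly classified when they score at or below the cutoff.
    normal_correct = sum(1 for s in nonpatient_scores if s <= cutoff) / len(nonpatient_scores)
    # Patients are correctly classified when they score above the cutoff.
    abnormal_correct = sum(1 for s in patient_scores if s > cutoff) / len(patient_scores)
    return normal_correct, abnormal_correct


# Hypothetical total scores, not data from the manual.
nonpatients = [30, 45, 52, 61, 70]
patients = [58, 66, 79, 85, 92]
normal_rate, abnormal_rate = classification_rates(nonpatients, patients)
print(normal_rate, abnormal_rate)
```

The first rate corresponds to what the text calls sensitivity (0.84 in Table 16) and the second to what it calls specificity (0.83).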
TABLE 15
Comparison of level of psychopathology as measured by the OQ across patient and nonpatient samples

Comparison Group | N | Mean (S.D.) | t-Value (D.F.)

Undergraduate (non-patient) | 438 | 46.49 (19.82) |
Community (non-patient) | 815 | 45.19 (18.57) | 1.15 (1251)

Community (non-patient) | 815 | 45.19 (18.57) |
Employee Assistance Program | 441 | 73.61 (21.39) | 24.52* (1254)

Employee Assistance Program | 441 | 73.61 (21.39) |
Outpatient Clinics | 342 | 83.09 (22.23) | 6.05* (781)

F Ratio = 274.2 (significant, p < .001)
TABLE 16
Sensitivity and Specificity of the OQ

Criterion Group | Predicted Group: Normal Sample | Predicted Group: Abnormal Sample
Normal Sample | 0.84 | 0.16
Abnormal Sample | 0.17 | 0.83
CALCULATION OF CUTOFF SCORES
FOR RATING RECOVERY, IMPROVEMENT,
AND DETERIORATION
Defining normal functioning, dysfunction, and meaningful change are central purposes of outcome measures. Clinically significant change refers to change in patient functioning that is meaningful for individuals who undergo psychosocial or medical interventions. This concept has considerable value in research aimed at classifying each individual patient's status with regard to normative functioning. In this regard it allows researchers to focus on the functioning of each patient rather than on group averages and the statistical significance of between-group comparisons. Research using operationalizations of clinical significance has been especially useful in estimating dose-response relationships (e.g., Anderson & Lambert, 2001) and in outcome management systems that employ it as a marker for recovery and deterioration (Lambert, Whipple, Smart, Vermeersch, Nielsen, & Hawkins, 2001). In addition, it has been used to estimate the relative value of empirically supported therapies as examined in clinical trials (Hansen, Lambert, & Forman, 2002).
In all these uses, it is the degree of change in the individual that is of primary interest. Such a focus is thought not only to be of scientific importance but also to lead to narrowing the gap between clinical research and clinical practice. Thus, the concept and its operationalization have generated considerable interest. Following its introduction by Jacobson, Follette, and Revenstorf (1984), it was regarded as an important advance in methodology (Lambert, Shapiro, & Bergin, 1986) and, by some journal editors, as an expected statistic in outcome studies. It has also generated considerable attention in special journal sections devoted to the topic (e.g., Jacobson, 1988; Kendall, 1999; Tingey, Lambert, Burlingame, & Hansen, 1996).
The original proposal of Jacobson et al. (1984), later modified by Jacobson and Truax (1991), suggested a two-step criterion for clinically significant change. First, a cutoff point for a measure of psychological functioning is established that is conceptualized as a cutoff between two populations: patient/nonfunctional and non-patient/functional. To this end, Jacobson and Truax identified three reasonable cutoffs for consideration. The first, Cutoff A, was defined as the point two standard deviations beyond the range of the pre-therapy mean. Cutoff A assumes an outcome score below this score is very unlikely to belong to the patient population. On the other hand, it is hardly possible to make conclusions about recovery because no information on a functional comparison group is included. The second, Cutoff B, was defined as the point two standard deviations within a recognized functional mean. This cutoff is not difficult for most clients to attain because of the overlap between the dysfunctional and functional distributions. The third, Cutoff C, was a weighted midpoint between the means of a functional and dysfunctional sample. When both data sets are available and there is overlap between the two distributions, C represents the best choice for a cutoff point because it is the least arbitrary (Jacobson, Roberts, Berns, & McGlinchey, 1999).
The second step of the Jacobson-Truax method is to determine whether a client's change from pre- to posttest is reliable, rather than simply an artifact of measurement error. To assess this, Jacobson et al. (1984) proposed a reliable change index (RCI) that each participant has to meet or surpass in order to demonstrate that his or her change is not simply due to chance.
Based on these two criteria, the Jacobson-Truax method classifies individuals as "Recovered" (i.e., passed both cutoff and RCI criteria), "Improved" (i.e., passed RCI criteria but not the cutoff), "Unchanged" (i.e., passed neither criterion), or "Deteriorated" (i.e., passed RCI criteria, but in a worsening direction).
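The four-way classification can be sketched as a small function. This is an illustrative reading of the two criteria, not the scoring software: the function name is hypothetical, the defaults of 63 (cutoff) and 14 (RCI) are the Total Score values given elsewhere in this manual, and it assumes that lower scores indicate better functioning and that "passing the cutoff" means moving from above the cutoff to at or below it.

```python
def classify_change(pre, post, cutoff=63, rci=14):
    """Classify pre-to-post change with the two Jacobson-Truax criteria."""
    change = post - pre  # negative change means improvement on the OQ
    if change >= rci:
        return "Deteriorated"
    if change <= -rci:
        # Reliable improvement; recovered only if the cutoff was also crossed.
        if pre > cutoff and post <= cutoff:
            return "Recovered"
        return "Improved"
    return "Unchanged"


print(classify_change(82, 60))  # reliable improvement that crosses the cutoff
print(classify_change(82, 66))  # reliable improvement, still above the cutoff
print(classify_change(70, 65))  # change smaller than the RCI
print(classify_change(60, 75))  # reliable worsening
```

A client who starts at or below the cutoff and improves reliably is labeled "Improved" here, since no cutoff is crossed; that choice is an assumption of this sketch.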
The Jacobson-Truax method for assessing clinically meaningful change is among the most frequently reported by researchers. In a review of outcome studies reporting clinical significance analyses, published over a 9-year period in the Journal of Consulting and Clinical Psychology, Ogles, Lunnen, and Bonesteel (2001) noted that the clinical significance method originally proposed by Jacobson et al. was used in 35% of studies that employed some form of clinical significance. No other method came close in terms of frequency of use. Since the original approach of Jacobson and his colleagues, there has been general consensus on a conceptual definition of clinical significance: A patient's status is characterized as clinically significantly changed when at the beginning of treatment it was in the nonfunctional range and at the end of treatment it was in the functional range and when that change is statistically reliable. From a mathematical perspective, there are multiple ways to realize this definition (Bauer, Lambert, & Nielsen, 2004). We have relied here on the method proposed by Jacobson and Truax (1991) because it is the most common method and produces estimates that are similar to most other statistical formulas.
Calculation of Cutoff C. A cutoff score for demarking Cutoff C was calculated on the normative data presented in this manual. It is the middle point between the community non-patient sample and data combined from several of the outpatient samples.
The formula used to devise these cutoffs was:
$$c = \frac{(SD_1)(\text{mean}_2) + (SD_2)(\text{mean}_1)}{SD_1 + SD_2}$$
where the subscripts 1 and 2 index the two samples being compared.
Using this formula, cutoffs can be derived between
any two normative samples for comparative purposes
in evaluating treatment outcome. We recommend the
cutoff scores we present in this manual for general pur-
poses as they are based on large and diverse samples. If
special populations are being assessed, however, it may
be appropriate to construct new normative samples and
compute new cutoffs. For example, applications out-
side of the United States that are based on samples from
local populations may be more appropriate.
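For readers deriving a cutoff from their own normative samples, the Cutoff C formula translates directly into code. The sketch below uses a hypothetical function name and, as an example only, the community and outpatient clinic rows of Table 15. The manual's published cutoff was derived from data combined across several outpatient samples, so this single-sample illustration need not reproduce it exactly.

```python
def cutoff_c(mean_1, sd_1, mean_2, sd_2):
    """Weighted midpoint between two sample means: (SD1*mean2 + SD2*mean1) / (SD1 + SD2)."""
    return (sd_1 * mean_2 + sd_2 * mean_1) / (sd_1 + sd_2)


# Illustration with two rows of Table 15 (community vs. outpatient clinics).
community_mean, community_sd = 45.19, 18.57
outpatient_mean, outpatient_sd = 83.09, 22.23
example_cutoff = cutoff_c(community_mean, community_sd, outpatient_mean, outpatient_sd)
print(round(example_cutoff, 2))
```

Because each mean is weighted by the other sample's standard deviation, the cutoff sits closer to the mean of the sample with the smaller spread.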
Reliable Change Index. Similarly, a reliable change index (RCI) was derived based on the work of Jacobson and Truax (1991). The RCI is computed as follows.
The standard error of measurement ($S_E$) is computed using the internal consistency value of the OQ, which is 0.93, and a pooled standard deviation value (SD). The resulting $S_E$ value is inserted into the standard error of difference formula ($S_{diff}$). This value is then multiplied by the z-value of the significance level desired, in this case 1.96 (p < 0.05). The resulting value represents the size of the change needed to achieve reliable change.
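The steps just described can be sketched with the standard Jacobson and Truax (1991) expressions, S_E = SD * sqrt(1 - r) and S_diff = sqrt(2 * S_E^2). These expressions are supplied here as an assumption about what the manual's formula contained. The function names are hypothetical, and the pooled standard deviation of 19.0 is an illustrative value only, since the manual's own pooled SD is not given in this section.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """S_E = SD * sqrt(1 - r), with r the internal consistency estimate."""
    return sd * math.sqrt(1.0 - reliability)


def standard_error_of_difference(s_e):
    """S_diff = sqrt(2 * S_E^2)."""
    return math.sqrt(2.0 * s_e ** 2)


def reliable_change_threshold(sd, reliability, z=1.96):
    """Size of change needed for reliable change at the chosen z-value."""
    s_e = standard_error_of_measurement(sd, reliability)
    return z * standard_error_of_difference(s_e)


# Internal consistency of 0.93 is from the text; SD = 19.0 is hypothetical.
threshold = reliable_change_threshold(sd=19.0, reliability=0.93)
print(round(threshold, 2))
```

With these inputs the threshold falls just under 14 points, in line with the Total Score RCI reported in this manual.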
As with the cutoff score, we recommend using the
RCI presented here for most general purposes as it is
based on large and diverse normative samples. If spe-
cialized or more specific RCI values are desired, ap-
propriate norms can be gathered and new RCI values
can be derived using the formulas given above.
Distribution cutoffs for the OQ total score and the subscale scores are as follows: Total Score = 63/64; Symptom Distress = 36/37; Interpersonal Relations = 15/16; and Social Role = 12/13. Scores at or below these values fall in the non-patient range. These cutoff values are used in the patient progress monitoring graphs found in Appendices C–F. The RCI for the Total Score is 14. The RCIs for the subscales are as follows: Symptom Distress = 10, Interpersonal Relations = 8, and Social Role = 7.
Although little work has been done to validate the Jacobson and Truax (1991) cutoff score formulas as a method of providing adequate demarcations of meaningful patient change, some validity data have been published. Beckstead et al. (2003) examined the OQ cutoff scores for clinical significance by comparing concordance rates with cutoff scores based on other measures of psychotherapy outcome. The OQ and the SCL-90-R (Derogatis, 1983), the SAS-SR, SAS-OR (Weissman, Prusoff, Thompson, Harding, & Myers, 1978), the IIP-S (Hansen, Umphress, & Lambert, 1998), and the QOLI (Frisch, 1988) were administered to participants in pre- and post-treatment assessments. It was found that at pretest the mean concordance rate for classifying patients as functional or dysfunctional was 75%; at posttest it was 77.5%, with one-third to just less than one-half (43%) of the clients being classified perfectly across all six measures at pre- and post-testing. At pretest, at least three out of the five comparative measures agreed 85% of the time with the OQ classification as clinical or non-clinical. At posttest, the percentage was 82.2%. Finally, regarding clinically significant change, 64.6% of the time at least three out of five measurements agreed with the OQ classification as meeting or not meeting criteria for clinically significant change. The results suggested similarity between the OQ and the other measures in the study, which offers preliminary support for the use of the OQ alone (instead of a battery of measures) to classify clients as functional or dysfunctional and to detect clinically significant change.
Lunnen and Ogles (1998) also reported a study that simultaneously used the OQ and other measures of outcome for the purpose of validating clinical significance cutoffs. The purpose of their study was to explore the practical meaning of cutoff scores and criteria for the Reliable Change Index. These authors compared the perceived level of change as subjectively reported from three distinct perspectives (patient, therapist, and significant other). They also compared reports of the therapeutic alliance and satisfaction across outcome groups. The results of this study suggested that those patients who were classified as improved (20-point positive change on the OQ Total score based on sample-specific standard deviation rather than manual-based cutoff) also were rated as most improved on therapist and client ratings of perceived change. They also tended to have higher alliance scores. Surprisingly perhaps, satisfaction scores did not, for the most part, distinguish between improvers, no-changers, and deteriorators.
Although more work needs to be done to validate
the current cutoff scores, they appear to have important
practical value and to be a central aspect of effectively
using the OQ-45.
INTERPRETATION OF INITIAL SCORES
To use the OQ clinically, the clinician should consider three elements: the participant's answers to certain select items, the total score (TOT), and the subscale scores. Interpretive graphs are included for the total and subscale scores (see Appendix C).
Item Evaluation
The clinician should first consider patient ratings on certain critical items. Item 8 is a screening item for potential suicide that should be investigated further if the participant gives any rating higher than 0 (never). Items 11, 26, and 32 are substance abuse items and should also be investigated further if ratings other than 0 (never) are given. Item 44 screens for violence at work; any rating other than 0 (never) should be investigated for the possibility of current and/or future work conflicts that lead to violent acts against fellow employees.
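The critical-item check lends itself to a simple screening routine. The sketch below is illustrative only: the function name and the dictionary-of-ratings input format are assumptions, and it flags nothing beyond the five items and the "any rating other than 0" rule named in the text.

```python
# Critical items and what each one screens for, as described in the text.
CRITICAL_ITEMS = {
    8: "potential suicide",
    11: "substance abuse",
    26: "substance abuse",
    32: "substance abuse",
    44: "violence at work",
}


def flag_critical_items(responses):
    """Return (item, concern) pairs for critical items rated above 0 (never).

    `responses` is assumed to map item numbers to ratings where 0 means never.
    """
    return [
        (item, CRITICAL_ITEMS[item])
        for item in sorted(CRITICAL_ITEMS)
        if responses.get(item, 0) > 0
    ]


# Hypothetical response set for illustration.
example_responses = {8: 1, 11: 0, 26: 2, 32: 0, 44: 0, 42: 3}
for item, concern in flag_critical_items(example_responses):
    print(f"Item {item}: follow up on {concern}")
```

A flag produced this way is a prompt for further clinical inquiry, as the text describes, not a finding in itself.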
Total Score (TOT)
A high total score indicates that the patient admits to a large number of symptoms of distress (mainly anxiety, depression, somatic problems, and stress) as well as interpersonal difficulties, difficulties in social roles (e.g., work problems), and difficulties in their quality of life. In general, lower scores suggest that the patient is no more disturbed than the general population.
An effective way to use the OQ in clinical settings is to compare a patient's score with different normative samples. Ideally, normative data from inpatients, outpatients, community samples, and asymptomatic individuals would be available. At this time, only cutoff scores comparing patient and non-patient samples are available for the OQ. The cutoff score is presented in Appendix C. Cutoff scores for the total score and subscale scores were derived using the procedures suggested by Jacobson and Truax (1991). As can be seen in the Total Score graph, the cutoff for entering the community population has been set at 63. When a patient's score falls at or below 63, it is more likely that they are part of the community sample than the patient sample. In addition, when a patient's score changes by 14 points or more in either direction from pretest, this change is said to be reliable. Changes of 14 points or more suggest movement by the patient that reliably (p < .05) exceeds the measurement error of the OQ.
Extremely low scores (<20) from those who are entering treatment are uncommon; such scores indicate that the person is admitting to little disturbance. It is possible that they have a problem that is so specific and limited that it causes them little difficulty and is therefore reflected accurately by their score on the OQ-45. It is more likely that they are not being open about their concerns. Low test scores in treatment samples are not uncommon in people who take the test under duress, such as involuntarily committed patients and substance-abusing patients referred by employers or spouses.
Subscale Scores
To identify specific problem areas, subscale scores can be consulted. The OQ reports three subscale scores: Symptom Distress, Interpersonal Relations, and Social Role. It is not possible for a patient to have a high Total Score without also having high subscale scores. On the other hand, a low total score does not mean that the patient does not have problems in one or more subscale domains.
Symptom Distress (SD). Research suggests that the most common disorders are anxiety disorders, affective disorders, adjustment disorders, and stress-related illnesses. The Symptom Distress subscale is composed of items that have been found to reflect the symptoms of these disorders. A high score indicates that patients are bothered by these symptoms, and low scores indicate either absence or denial of symptoms. Symptom Distress scores correlate highly with measures of depression, such as the Beck Depression Inventory. They also correlate highly with measures of anxiety, such as the State Trait Anxiety Inventory (see section on psychometric properties). The cutoff for this subscale was derived by the same method used for the total score cutoff. The graph is presented in Appendix D. As noted, the cutoff for symptom distress is 36. When a participant's score falls below this point, they are scoring like people in the non-patient sample. Reliable change is considered to occur after a patient's score has changed 10 points.
Interpersonal Relationship (IR). Research suggests that most patients experience difficulty in interpersonal relationships in addition to the subjective discomfort reflected in the Symptom Distress subscale. Interpersonal Relationship subscale items assess such complaints as loneliness, conflict with others, and marriage and family difficulties. High scores suggest concerns in those areas, and low scores suggest both the absence of interpersonal problems and satisfaction with the quality of intimate relationships. The cutoff for Interpersonal Relationships (IR) is presented in Appendix E. Scores below the cutoff of 15 suggest the patient is experiencing a level of satisfaction in relationships that is equivalent to normal functioning. Reliable change is considered to occur after a patient's score has changed 8 points.
Social Role Performance (SR). Dysfunction may extend beyond a person's subjective sense of discomfort and beyond their closest relationships into the behaviors that are commonly expected to be manifested by adults in our society. The Social Role subscale measures the extent to which difficulties fulfilling workplace, student, or home duties are present. Conflicts at work, overwork, distress, and inefficiency in these roles are assessed. High scores indicate difficulty in social roles, while low scores indicate adequate social role performance. Additional attention should be given to low scores to determine whether they result from social role satisfaction or from participant unemployment (e.g., the participant arbitrarily marking the items "0" for never or not applicable). The cutoff score for SR is 12. The graph for this subscale is located in Appendix F. Reliable change is considered to occur after a patient's score has changed 7 points on this subscale.
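The cutoffs and reliable-change values for the total score and the three subscales can be collected into one lookup for scoring scripts. This is a hedged sketch: the names are hypothetical, and it treats scores at or below the cutoff as falling in the non-patient range and a change of at least the RCI in either direction as reliable, which is one reading of the text.

```python
# (cutoff, RCI) pairs taken from the values stated in this manual.
SCALE_RULES = {
    "Total Score": (63, 14),
    "Symptom Distress": (36, 10),
    "Interpersonal Relations": (15, 8),
    "Social Role": (12, 7),
}


def interpret_score(scale, score, baseline=None):
    """Summarize a score against its cutoff and, if a baseline is given, its RCI."""
    cutoff, rci = SCALE_RULES[scale]
    result = {"scale": scale, "in_nonpatient_range": score <= cutoff}
    if baseline is not None:
        change = score - baseline
        result["change"] = change
        result["reliable_change"] = abs(change) >= rci
    return result


# Hypothetical profile: intake and current Symptom Distress scores.
print(interpret_score("Symptom Distress", 34, baseline=47))
print(interpret_score("Social Role", 14))
```

Low Social Role scores still deserve the follow-up described above, since a lookup like this cannot tell satisfaction from items left at 0 as not applicable.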
POTENTIAL USES OF THE
OUTCOME QUESTIONNAIRE
Use of the OQ for Treatment Planning
The OQ can be used in treatment planning if it is employed with other patient data. For example, Human Affairs International (HAI), a large multi-state managed care company, used the OQ-45 total score at the inception of treatment to assist clinicians in initial level of care decisions. Because their system is proprietary, specific details cannot be offered, but generalities of procedures can be explained. HAI's system used the OQ-45 intake score to sort clients into categories of high (85 and above), medium (64–84), or low (63 or below) functioning. Other patient information, such as history of psychological treatment (e.g., no history of psychological treatment, recent inpatient care), motivation for treatment, and diagnosis, was combined through algorithms to produce computer-generated suggestions for clinicians and care managers for treatment planning or referral.
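Only the score bands of the intake sorting are described, so the sketch below covers just that step. The function name is hypothetical, and the rest of HAI's proprietary algorithm (treatment history, motivation, diagnosis) is deliberately not modeled.

```python
def intake_band(total_score):
    """Sort an OQ-45 intake total score into the three bands named in the text."""
    if total_score >= 85:
        return "high"
    if total_score >= 64:
        return "medium"
    return "low"


# Hypothetical intake scores for illustration.
for score in (92, 70, 63):
    print(score, intake_band(score))
```

The lower boundary of the medium band coincides with the 63/64 Total Score cutoff reported earlier in this manual.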
Based on the composite patient picture at intake,
some patients were retained in a brief therapy format
(one to eight sessions) whereas others were referred for
longer-term outpatient treatment, medication consulta-
tion, substance abuse intervention, group therapy, and
the like. The OQ-45 played an important role in such
decisions by providing a marker for initial level of dis-
turbance. In this context, it is considered an index of
current psychopathology to be used in conjunction with
clinical judgments, diagnostic formulations, and related
information.
As therapy continues, changes in OQ score (using
intake as the baseline) are used in conjunction with other
information to form additional algorithms for treatment
planning and decision making regarding the patient. For
example, changes in OQ-45 scores can be used to trig-
ger decisions regarding termination, step down to less
intensive and costly treatments, or shift to other alter-
nate treatments such as medication. In addition, the
early discovery of negative change can be very helpful
in sparking reviews of current treatment strategies, thus
preventing or reducing patient dropout, as well as ulti-
mate negative effects from treatment. Some evidence
suggests that the best predictor of dropout from outpa-
tient treatment as well as ultimate patient outcome is
negative change from intake to session three. Consid-
erable research is necessary before we can be confident
that the OQ is appropriate for such uses since decisions
may need to be based on the degree of acceleration
in change and not just the direction. This will be
discussed more fully when the issue of tracking patient
progress is addressed.
An important strength of the OQ-45 for treatment
planning is the large amount of data that have been col-
lected and analyzed to predict the amount of therapy
needed to produce reliable and clinically significant
change. To date, the best empirical estimates for setting
reasonable treatment lengths come from studies that have
attempted to understand the relationship between thera-
peutic units of intervention (sessions) and patient re-
covery status (clinically significant change), so-called
dose-response research. Patients in this research typi-
cally completed the OQ-45 prior to each weekly therapy
session. Completion of the pretest occurred immedi-
ately before the first session; the first post-test then pre-
ceded the second session, and the second post-test pre-
ceded the third session, and so on. This procedure was
consistent with OQ-45 instructions asking patients to
describe their functioning over the last week. Patients
received an OQ-45 from the clinic receptionist at the
time of their appointment, completed it in a waiting area,
and returned it to the receptionist before beginning their
session.
The outcome criteria used in these studies required
an operational definition of the positive treatment re-
sponse of each individual patient. In this research, pa-
tients were considered recovered when they met both
of the criteria for clinically significant change by mov-
ing from the OQ-45 dysfunctional distribution into the
OQ-45 functional distribution (i.e., scored less than 64)
and showing positive gains of sufficient magnitude to
be considered statistically reliable (improvement of at
least 14 points). Since the aim of these studies was not
only to assess whether a patient had recovered but also
to indicate when that recovery occurred, a third crite-
rion had to be specified. Session-by-session assessment
of change raised the possibility that some patients might
be observed continuing in therapy after obtaining a clini-
cally significant change (recovered) or might fluctuate
between recovered and non-recovered status prior to
termination. Therefore, patients were considered recov-
ered at the earliest session at which they persistently
met the criteria for clinically significant change (i.e.,
during the remainder of therapy they did not return to a
non-recovered status).
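The three criteria above (crossing the cutoff, reliable improvement, and persistence through the end of therapy) can be expressed as a short routine. The sketch below is an assumption-laden illustration: the function name and the list layout (pretest first, then one score per later administration) are hypothetical, while the cutoff of 64 and the 14-point reliable change value come from the text.

```python
CUTOFF = 64  # total scores below 64 fall in the functional distribution
RCI = 14     # points of improvement required for reliable change


def first_sustained_recovery(scores):
    """Return the index of the earliest administration from which the patient
    meets both criteria at every remaining administration, or None.

    scores[0] is the pretest; scores[k] is the k-th post-test, completed
    after k sessions.
    """
    baseline = scores[0]
    if baseline < CUTOFF:
        return None  # did not start in the dysfunctional distribution
    earliest = None
    # Walk backward from termination; stop at the first lapse.
    for k in range(len(scores) - 1, 0, -1):
        meets_both = scores[k] < CUTOFF and baseline - scores[k] >= RCI
        if not meets_both:
            break
        earliest = k
    return earliest


# The patient dips below the cutoff at index 2, relapses at index 3,
# and is persistently recovered from index 4 onward.
print(first_sustained_recovery([90, 80, 62, 70, 60, 58]))
```

Walking backward from the final score is one simple way to honor the "did not return to a non-recovered status" requirement.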
In analyzing participant results, recovered pa-
tients, as discussed, met both criteria for clinically sig-
nificant change. Improved patients met the criterion
for statistical reliability by improving by at least 14 OQ-
45 points but remained within the same dysfunctional
or functional distribution they were in before starting
therapy. Deteriorated patients moved at least 14
points in the direction of increasing psychopathology.
Patients showing no change neither improved nor
deteriorated by 14 or more points during therapy.
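The four outcome categories can be summarized as a small decision rule. The following is a sketch under stated assumptions: the function name is hypothetical, only pretest and final scores are compared, and the thresholds (cutoff 64, reliable change 14) are those given in the text.

```python
def classify_outcome(pre, post, cutoff=64, rci=14):
    """Classify pre-to-post change on the OQ-45 total score.

    Lower scores are better, so a positive `change` means improvement.
    """
    change = pre - post
    if change >= rci:
        # Reliable improvement; "recovered" additionally requires moving
        # from the dysfunctional range into the functional range.
        if pre >= cutoff and post < cutoff:
            return "recovered"
        return "improved"
    if change <= -rci:
        return "deteriorated"
    return "no change"


for pre, post in [(84, 62), (100, 80), (70, 85), (70, 65)]:
    print(pre, post, classify_outcome(pre, post))
```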
A study of change with persons seeking, or being
referred for help, in employee assistance programs man-
aged by HAI provides interesting data on change (Lam-
bert & Huefner, 1996). One hundred and fifty sites
across the country provided data, but no attempt was
made to collect OQ data on every employee that asked
for assistance. It was possible to collect data on 3,302
patients who took the pretest and had at least two therapy
visits. The maximum number of visits was 10. Twenty-
one hundred patients had pretreatment scores that placed
them in the dysfunctional range. Their pretreatment
mean Total score was 84.14 (SD = 15.82; range 64-148),
and the mean Total score at post-treatment was
70.81 (SD = 22.46; range 6-150). These patients had a
mean of 3.9 sessions of treatment.
The number of participants who met criteria for
clinically significant improvement [i.e., passing the Total
score cutoff (63) and improving by the RCI (14)]
suggests that patients improve in very brief
treatments, even when the standard of improvement is
rigorous. Thirty percent (n=627) of clients significantly
improved within 10 sessions. After one session, 107
recovered; after two, 147 recovered; after three, 110
recovered; after four, 82 recovered; and after five, 57
recovered, with 124 more improving through the 10th
session.
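The counts reported above can be checked with simple arithmetic. The snippet below only re-adds the figures quoted in the text; the variable names are illustrative.

```python
# Patients meeting the criteria after each of the first five sessions
recovered_by_session = {1: 107, 2: 147, 3: 110, 4: 82, 5: 57}
later_sessions = 124            # sessions 6 through 10
dysfunctional_at_intake = 2100  # patients starting in the dysfunctional range

total = sum(recovered_by_session.values()) + later_sessions
share = total / dysfunctional_at_intake
mean_drop = 84.14 - 70.81       # pretreatment mean minus post-treatment mean

print(total)                    # number meeting the criteria within 10 sessions
print(round(share * 100, 1))    # percentage of the dysfunctional-range group
print(round(mean_drop, 2))      # average point decrease in Total score
```

The per-session counts sum to the reported n of 627, which is about 30% of the 2,100 patients who began in the dysfunctional range.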
Another way to characterize change following
therapy is displayed in Figure 5
1
. Figure 5 uses sloping
procedures to show change on OQ

scores after time in


reference to entry into the ranks of the non-patient
sample. In this graph, one can see that there is a rela-
tionship between severity of disturbance (initial OQ

elevation) and number of sessions to (group) recovery.


When patients are grouped by the number of sessions
they had, it appears that these groups are rank ordered
in regard to initial test scores. Patients in this database
were drawn from an EAP sample similar to that de-
scribed by Lambert and Huefner (1996).
Lambert and associates (e.g., Anderson & Lambert,
2001; Hansen, Lambert, & Forman, 2002; Kadera,
Lambert, & Andrews, 1996) have reported the results
of several studies in this area. In general, they have demonstrated
the following: (1) about 18-20 sessions are
needed for 50% of patients to recover, (2) reliable change
is achieved faster than recovery, (3) patients with higher
scores recover more slowly (need more sessions) even
though they make larger gains during treatment, (4)
patients show not only great variability from one an-
other in their responses to therapy, but also show wide
fluctuation in their subjective estimates of the intensity
of their symptoms over the course of treatment (few
patients show steady week-to-week linear change), and
(5) although neither therapists nor patients received feed-
back about OQ-45 scores, there is fairly high concor-
dance between when termination occurs and meeting
criteria for recovery. A graph of the dose-response
relationship comparing dysfunctional samples (initial
OQ-45 score 64 or above), using survival analysis statistics,
is presented in Figure 6. Similarities in research
results concerning clinically significant (CS) change in
studies conducted by Anderson and Lambert (2001),
Kadera et al. (1996), and
Wolgast et al. (2003) are illustrated in Figure 7.
Figure 7 presents the survival recovery curves for
all patients who entered treatment (regardless of their
intake score) and who had at least one treatment session
following intake. The event of interest here was reliable
change. As can be seen, 50% of patients are expected to
meet criterion for reliable change after about 8 sessions
of psychotherapy. Of course, a great deal of future research
needs to be done before treatment planning (in
the form of estimating optimal treatment length) is based
on a firm empirical foundation. The OQ-45 is well suited
to such a task.
In addition to using the OQ-45 as part of the pro-
cess of initial decision making, the OQ-45 can be used
to help focus the treatment on specific aspects of pa-
tient difficulties. Although validity data do not provide
strong support for the use of OQ-45 subtest scores, these
scores can provide the clinician with clues about areas
of dysfunction. Some patients, for example, may express
greater distress related to Interpersonal Functioning
while others may appear to have greater dysfunction
in Social Role performance. Occasionally, studying
a patient's profile of scores on each of the subscales
provides a dramatic illustration of poor functioning in a
particular domain.
The OQ-45 was designed to measure patient
progress and the eventual outcome of mental health ser-
vices. Though it is possible that certain patterns of OQ-
45 responses may coincide with specific symptomatic
presentations related to diagnostic considerations, it
would be difficult to justify the use of such patterns as a
guide for treatment planning at this time. The OQ-45
provides valuable feedback on patient progress that helps
in evaluating treatment efficacy and in deciding whether
to terminate or continue a current treatment protocol, but
it is simply not capable (by itself) of leading an individual
therapist to the most productive treatment strategy. The
OQ-45 is an outcome instrument in the same manner
that the MMPI-2 is a diagnostic tool. Both are invaluable
within their specific arena, but much less effective
beyond those boundaries.
Figure 5. Relationship Between Number of Sessions of Therapy, Pretest OQ Total Score, and Rapidity of Improvement
¹ Figure reprinted from Lambert, M. J., & Huefner, J. APA Workshop, August 1997, Chicago.
Figure 6. Probability of Clinically Significant Change as a Function of Treatment Dosage.
Figure 7. Probability of Reliable Change as a Function of Treatment Dosage.
[Chart: x-axis, Sessions Received (1 to 19); y-axis, RC Probability (0 to 0.8); series: Wolgast, Lambert, Puschner; Anderson and Lambert]
Use of the OQ for Treatment Monitoring
Considerations for Frequency of Monitoring
Treatment Progress. The information provided by the
OQ is most meaningful when it is first administered to
a patient prior to applying any therapeutic interventions.
The initial administration is best provided during the
intake process. Remember that any intervention, even
an intake interview, is likely to cause patient improve-
ment; therefore, delaying administration of the OQ-45
will result in an underestimation of treatment effects.
Since the OQ takes a relatively small amount of time
to complete, taking the test on multiple occasions should
not place much of a burden on the client. Subsequent
administrations may be given weekly, or at any deter-
mined midpoint intervals, and at the conclusion of treat-
ment. Since routine treatment typically ends with the
patient leaving treatment at their convenience, data col-
lected on an interval less often than weekly will result in
failure to collect end of treatment data. Irregular ad-
ministration of the OQ-45 typically results in such high
rates of missing data that the purposes of collecting data
can be fatally compromised. While information about
improvement following a specific session may be very
meaningful, perhaps more important is the ability to see
the patterns and trends exhibited by a specific patient
across the course of therapy. We highly recommend
weekly outcome assessment, at least for the first 10 treat-
ment sessions.
Identification of Potential Treatment Failures.
Significant progress has been made in using the OQ-45
to identify patients at risk for treatment failure. Two
parallel methods have been developed: a rational (expert
judge) method and a statistical method. Either
method can be applied by providing information to thera-
pists in the form of graphs and messages. Both methods
presume that the essence of improving outcomes for
poorly responding patients is a signaling system that
attempts to identify the failing patient before termina-
tion of services has occurred. Both methods require that
the patient provide session-by-session OQ-45 data and
that it be evaluated between sessions to classify a
patient's treatment response as a positive or negative
sign for likely functioning at treatment termination. In
patient-focused research, such a signaling system is
based on the assumption that termination status can in
fact be predicted prior to termination and that providing
treatment progress information to the therapist will posi-
tively affect final outcome.
Rational method. Information regarding early re-
sponse to treatment (dramatic response during the first
three sessions; Haas, Hill, Lambert, & Morrell, 2002),
the dose-response relationship (and its size; Anderson
& Lambert, 2001; Howard, Kopta, Krause, & Orlinsky,
1986), and the reliability of the OQ-45 were used to
create the rational algorithms. Expert judges then agreed
upon cut scores for classifying patients as either (1) on
track for a positive outcome or (2) predicted to leave
treatment before receiving therapeutic benefit or to be at
risk for having a negative treatment outcome. For simplicity
of communication in the clinical setting, the patients
identified as at risk are referred to as signal-alarm
cases. This is a term that has precedent in
other research aimed at improving the quality of patient
care (Kordy et al., 2001).
Empirically derived method. The empirically derived
method employed a large database and statistical
model to identify poorly responding patients. The database
for the expected recovery curves was drawn from
numerous sites that were collapsed into a national data-
base for research using the OQ-45. This database was
created by a research agreement that allows various
provider groups, managed care organizations, and other
treatment settings to use the OQ-45 without a licensing
fee in return for submitting all data gathered to the
Brigham Young University Psychotherapy Research
Center. Submitting groups included a wide range of treat-
ment settings, and patients who were treated by licensed
professionals using a variety of techniques. This resulted
in a total aggregate sample of 11,492 patients with two
or more OQ-45 administrations.
An initial graphical analysis of the data revealed
decelerating growth curves similar to those identified in
previous studies on recovery curves. In their 1986 study,
Howard et al. clearly established a lawful linear rela-
tionship between the log of the number of sessions and
the normalized probability of patient improvement. This
lognormal relationship appears to be quite common in
psychotherapy outcome studies and illuminates the fact
that larger doses, or number of sessions, are required to
produce a higher percentage of recovered patients.
A similar relationship was found with these data,
and subsequent analyses showed that a log transforma-
tion of the session number also produced a data set that
more closely approximated a normal curve. This allowed
the analysis to proceed using elements of a general lin-
ear model, since the data no longer violated the requi-
site assumptions of normality.
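The log-linear relationship described above can be written compactly. This is a notational sketch only: the symbols $p_n$, $\beta_0$, and $\beta_1$ are labels chosen here for illustration and are not taken from the manual or from Howard et al. (1986).

```latex
\Phi^{-1}(p_n) = \beta_0 + \beta_1 \ln(n)
```

Here $p_n$ is the proportion of patients improved after $n$ sessions and $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution (the "normalized probability"). Because $\ln(n)$ grows slowly, each further gain in $p_n$ requires a larger number of additional sessions, which is the decelerating pattern noted in the text.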
Ideally it would be possible to generate a recovery
curve for every possible intake score on the OQ-45 be-
tween 0 and 180. Though the data set used for this pur-
pose was large, it was not of sufficient size to be able to
establish an individual recovery curve for each intake
score because the statistical techniques require a larger
number of cases for reliable modeling. OQ-45 scores
falling at the extremes of the continuum are quite rare.
Therefore, the full range of scores was divided into dis-
tinct groups by percentiles. This yielded 50 groups, iden-
tified by intake score, with no fewer than 220 patients
in each band, representing approximately two percent
of the total sample. The resulting distribution across
intake scores was approximately normal, with intake
score increments as small as one point at the group av-
erage and a larger spread between intake scores at the
two extreme tails.
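Grouping by percentile can be sketched as follows. The manual does not spell out the exact binning procedure, so this is an assumption: the hypothetical `percentile_band` function assigns a band from the share of intake scores strictly below a given score, which keeps all patients with the same intake score in the same band.

```python
from bisect import bisect_left


def percentile_band(score, sorted_scores, n_bands=50):
    """Return a 0-based band index for `score`.

    `sorted_scores` is the full list of intake scores in ascending order.
    The band is based on the fraction of scores strictly below `score`.
    """
    below = bisect_left(sorted_scores, score)
    band = below * n_bands // len(sorted_scores)
    return min(band, n_bands - 1)


# Toy data standing in for the intake scores of a large sample
toy_scores = sorted([45, 52, 60, 60, 71, 71, 71, 83, 90, 112])
for s in (45, 71, 112):
    print(s, percentile_band(s, toy_scores))
```

With real data, bands near the middle of the distribution would span very few score points, while bands in the tails would span many, as the text describes.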
The resulting groups of data were analyzed using
the PROC MIXED functions of the Statistical Analysis
System (SAS) to generate a linear model for recovery
curves. This was necessary for several reasons, such as
the nested nature of these data, missing data points for
many of the patients at various sessions, and the influ-
ence of both fixed and random variables on the eventual
estimated recovery curves. This form of mixed model
analysis is also called Hierarchical Linear Modeling
(HLM), Multilevel Linear Modeling, Variance Compo-
nents Modeling, Random Coefficient Regression Mod-
eling, and Systematically Varying Slopes Modeling
(Finch, Lambert, & Schaalje, 2001).
For creation of the expected recovery curves, this
modeling technique was applied to each of the groups
created by dividing intake scores into 50 clusters by
percentile. A random-slope, random-intercept linear
model for the OQ-45 total score by the log of each session
number was created, accounting for the within-subject
variance of each participant, the between-subject
variance, and the between-site variance. Mean estimates were
calculated for each session from 1 through 20 for each
of the 50 subdivisions by intake score. Error estimates
from the fixed effects, random effects, and correlations
were combined into an aggregate error term for the esti-
mates of the OQ-45 total score at each session. This
combined error term was then used to establish the up-
per and lower bounds of tolerance intervals for each of
the coefficients. The tolerance interval is a quality con-
trol protocol often used in engineering applications.
Tolerance intervals determine the probability that a given
OQ-45 score at a given session will fall within a speci-
fied interval. With large data sets the estimated upper
and lower limits are equivalent to prediction intervals
(Ostle & Malone, 1996). Thus, the tolerance intervals
allowed for the identification of OQ-45 total score val-
ues that have an established probability of falling out-
side of the upper and lower limits of the tolerance inter-
val.
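The fixed-effect part of such a model (a mean score that is linear in the log of session number) can be illustrated with a short sketch. The intercept and slope below are hypothetical values chosen for illustration; they are not the coefficients estimated from the national database, and the random effects and error terms are omitted.

```python
import math


def expected_scores(intercept, slope, sessions=range(1, 21)):
    """Mean OQ-45 total score at each session: intercept + slope * ln(session)."""
    return {s: intercept + slope * math.log(s) for s in sessions}


# Hypothetical coefficients for a band of patients entering near a score of 87
curve = expected_scores(intercept=87.0, slope=-6.0)
for s in (1, 2, 5, 10, 20):
    print(s, round(curve[s], 1))
```

Because the predictor is the log of the session number, the expected drop between sessions 1 and 2 is larger than the drop between sessions 19 and 20, which gives the decelerating curve shape.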
Tolerance intervals were calculated for the expected
mean OQ-45 total score at each session. A two-tailed,
80% tolerance interval was then created around each of
these estimates. This provided a cutoff score for each
session for identifying patients that might be included
in the 10% of clients likely to fail in therapy or drop out
early. Next, a two-tailed, 68% tolerance interval was
calculated for each expected mean by session number.
This provided a cutoff score for individuals whose
progress in therapy was either above or below the ex-
pected recovery rate by at least one standard deviation.
With each mean estimate and the upper and lower bounds
for two-tailed 80% and 68% tolerance intervals calcu-
lated, it was possible to plot lines across the mean esti-
mates of OQ-45 total scores for each session as well as
for each upper and lower bound of the tolerance inter-
vals. This produced a visual representation of the ex-
pected recovery curve by OQ-45 total scores across each
session centered within the upper and lower cutoff
bounds of each tail of the tolerance intervals.
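The relationship between the two interval widths can be illustrated with normal quantiles. This is a simplified sketch, not the manual's computation: it treats the combined error term as a known standard deviation (the large-sample case the text mentions), and the mean of 80 and SD of 14 are hypothetical inputs.

```python
from statistics import NormalDist


def two_tailed_bounds(mean, sd, coverage):
    """Lower and upper bounds that enclose `coverage` of a normal distribution."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return mean - z * sd, mean + z * sd


expected_mean, error_sd = 80.0, 14.0  # hypothetical values for one session
lo80, hi80 = two_tailed_bounds(expected_mean, error_sd, 0.80)
lo68, hi68 = two_tailed_bounds(expected_mean, error_sd, 0.68)

print(round(lo68, 1), round(hi68, 1))  # about one SD either side of the mean
print(round(lo80, 1), round(hi80, 1))  # leaves 10% of cases in each tail
```

A two-tailed 80% interval leaves 10% above the upper bound, and a 68% interval corresponds to roughly one standard deviation on either side of the expected score.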
These coefficients and tolerance intervals formed
the core of the empirically derived warning system by
providing table values and charts of predicted thera-
peutic gains against which any given patient can be com-
pared. After an individual has completed an OQ-45
administration, the total score can then be compared to
the corresponding session value for others beginning
therapy with a comparable pretest score. If at any ses-
sion following intake the OQ-45 total score for a pa-
tient is within the 68% tolerance interval shown on the
chart, then therapy is proceeding as anticipated for this
particular patient and a green message can be given as
feedback for the therapist to proceed as usual. If the
same OQ-45 score falls outside of the upper 68% toler-
ance interval (upper 16%) but is still within the upper
bound of the 80% tolerance interval, the patient is be-
ginning to deviate by greater than one standard devia-
tion from what is expected of a typical person at this
point in therapy, and the therapist would receive a yellow
message as a warning to attend to this patient's
progress. This one standard deviation unit approximates
a 14 point increase in the OQ-45 score, the marker for
reliable change. If this same OQ-45 score falls above
the upper limits of the 80% tolerance interval (upper
10%), then the patient is deviating significantly in a
negative direction from what is predicted for patients at
this point in therapy, and his or her recovery curve is
within the range of scores predicted for the 10% of pa-
tients whose progress is most in question. The 10%
boundary is consistent with the estimate that about 5-
10% of patients deteriorate following psychotherapy
(Lambert & Ogles, 2004). At this point the therapist
would receive a warning message that therapy may be
heading toward an unsuccessful conclusion and that the
therapist needs to consider an alternative course of ac-
tion. As with the rationally derived method, those pa-
tients who receive either red or yellow warnings are re-
ferred to as signal-alarm cases.
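The color logic described above can be sketched as a simple comparison against the two upper bounds. The function name and arguments are hypothetical. The sketch covers only the green, yellow, and red messages; the white and blue cutoffs are not defined in this passage, and scores below the lower bounds are treated as green here purely as a simplifying assumption.

```python
def feedback_signal(score, upper68, upper80):
    """Return the feedback color for one OQ-45 total score at one session.

    upper68 and upper80 are the upper bounds of the 68% and 80% tolerance
    intervals for patients with a comparable intake score.
    """
    if upper68 > upper80:
        raise ValueError("the 68% bound should not exceed the 80% bound")
    if score > upper80:
        return "red"     # among the 10% whose progress is most in question
    if score > upper68:
        return "yellow"  # more than about one SD worse than expected
    return "green"       # progressing as anticipated


# Hypothetical bounds for one session
for score in (78, 95, 101):
    print(score, feedback_signal(score, upper68=93.9, upper80=97.9))
```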
Figure 8 depicts a sample graph or quality manage-
ment chart of a patient who scored an 87 on the OQ-45
at intake and whose response to treatment was plotted
across 20 sessions. Therapy proceeded along the ex-
pected course for this moderately depressed patient with
worsening occurring at the sixth session. At this point
in therapy, the patient had just lost her job, an event that
may have caused her worsening. Over the ensuing weeks
she had several job offers, and was able to return to
work. This patient continued to make progress through
session 16 as she had returned to the green zone. Us-
ing this system, the therapist would be given a white
signal at session 19 indicating that it might be time to
terminate. The patient continued to improve through
the twentieth and final session.
The accuracy of the algorithms has been tested and
both rational and statistical methods appear to be suc-
cessful at identifying patients who have negative treat-
ment outcomes. Lambert, Whipple, Bishop et al. (2002)
examined predictive accuracy with 492 clients who were
in treatment at a university counseling center. Thirty-six
(7.3%) of these clients deteriorated during treatment.
Twenty-nine of these deteriorators (80.6%) were identified
prior to termination using the rational algorithms,
and 7 (19.4%) were missed. This level of accuracy came
at the expense of misidentifying as signal-alarm cases
95 (20.8%) of the 456 clients who did not in fact deteriorate.
These rates compared favorably with identification
procedures based on a purely statistical approach
(Finch et al., 2001) that identified all 36 (100%) of the
deteriorated clients but misclassified 82 (18%) of the clients
as signal-alarm cases who did not in fact deteriorate. In
contrast to the empirical method, one advantage of the
rationally derived method is that it identified potential
treatment failures more rapidly and was more likely to
limit identification to patients who were initially more
disturbed and therefore of greatest clinical concern.
Methods for identifying signal-alarm cases have been
embedded in the software products for administering
and scoring the OQ-45.
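The accuracy figures quoted above can be recomputed from the reported counts. In this sketch the function name is illustrative; the false-alarm percentages in the text are reproduced when false alarms are divided by the 456 clients (492 minus 36) who did not deteriorate.

```python
def detection_rates(hits, false_alarms, deteriorated, total):
    """Return (hit rate among deteriorators, false-alarm rate among the rest)."""
    not_deteriorated = total - deteriorated
    return hits / deteriorated, false_alarms / not_deteriorated


rational = detection_rates(hits=29, false_alarms=95, deteriorated=36, total=492)
statistical = detection_rates(hits=36, false_alarms=82, deteriorated=36, total=492)

print([round(r * 100, 1) for r in rational])     # rational method, percent
print([round(r * 100, 1) for r in statistical])  # statistical method, percent
```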
Does use of the signal-alarm system enhance patient
outcomes? Lambert et al. (2001) undertook a study
to determine if providing therapists with feedback regarding
patient progress would affect patient outcome
and the number of sessions attended. This application
of patient-focused research enables practitioners to determine
if a specific intervention is working for a specific
patient. Using the rational algorithms, these researchers
hypothesized the following: (1) when therapists
are notified that a patient is not progressing adequately,
the patient would show a better outcome than
a similar patient whose therapist was not notified, and
(2) patients of therapists receiving feedback would
show more cost-effective attendance than similar patients
of therapists not receiving feedback.
[Chart: Expected Recovery Curve for Intake OQ-45 Total Scores 87-88. x-axis: Session Number (1 to 20); y-axis ticks: 0 to 180; plotted lines: Red Warning Cutoff, Yellow Warning Cutoff, Expected Recovery, Blue Warning Cutoff, White Warning Cutoff]
To test these hypotheses, data were collected from
clients at a university counseling center until at least 30
patients in both the experimental and the control groups
had received or could have received a signal warning
(i.e., a notification of inadequate progress). The fol-
lowing four treatment conditions were then established:
on-track patients with therapists receiving feedback (OT-
Fb); on-track patients with therapists not receiving feed-
back (OT-NFb); not-on-track patients (signal-alarm
cases) with therapists receiving feedback (NOT-Fb); and
not-on-track patients with therapists not receiving feed-
back (NOT-NFb).
Both hypotheses were confirmed. It was found that
the NOT-Fb group had significantly lower OQ scores
at termination than the NOT-NFb group, which actually
showed worsening. Twenty-six percent of the NOT-Fb
cases reached Jacobson and Truax's (1991) criterion
for reliable or clinically significant change versus
16% of those in the NOT-NFb group. It was also found
that NOT-Fb patients received significantly more ses-
sions than NOT-NFb patients and that patients in the
OT-Fb condition received significantly fewer sessions
than the patients in the OT-NFb condition. This result
was interpreted as being consistent with the second hypothesis:
feedback increased sessions for NOT patients
while decreasing them for OT patients. This finding
suggests that this feedback system may make therapy
more efficient, in that therapists are able to spend less
time on clients who have improved, and more time on
those who need additional attention.
With the intent of addressing limitations resulting
from the small sample size of the Lambert et al. (2001)
study, a replication was performed by Lambert et al.
(2002), with the primary difference being a substan-
tially larger sample size (1020 vs. 609 participants),
which included a substantially larger number of signal-
alarm cases (240 vs. 66 participants). It was again found
that feedback to therapists improved outcome. The
NOT-Fb group had lower scores at termination than the
NOT-NFb group, which again showed worsening. At
termination, the mean OQ score of those in the NOT-Fb
group was 73.87 (SD=25.34), while that of the NOT-NFb
group was 83.72 (SD=21.05). When data from
both studies were combined, it was found that 15.2% of
those in the NOT-Fb group deteriorated and 30.5% improved
or recovered, compared to 23.2% and 17.5% in
the NOT-NFb group (chi-square = 8.33, df = 2, p = .016).
Figure 8. Expected Recovery Curve for Intake OQ-45 Total Scores 87-88.
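The reported p value can be checked directly, because a chi-square distribution with 2 degrees of freedom has a closed-form upper-tail probability, exp(-x/2). The helper name below is illustrative.

```python
import math


def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-x / 2)


p_value = chi2_upper_tail_df2(8.33)
print(round(p_value, 3))
```

The result agrees with the reported value of .016 to three decimal places.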
One significant limitation of the feedback research
has been the fact that the majority of patients predicted
to have a poor outcome and whose therapists received
feedback did not attain a satisfactory outcome at termi-
nation, even though their improvement surpassed that
of patients whose therapists did not receive feedback.
This suggests that a strengthened feedback manipula-
tion is necessary if better outcomes are desired for pa-
tients predicted to have a poor treatment response. To
address this limitation, Whipple et al. (2003) replicated
the two prior studies while addressing an additional in-
tervention. Therapists who were treating NOT patients
and receiving feedback were also provided with a set of
clinical support tools (CSTs) to systematically direct
their attention toward certain factors known to be im-
portant in psychotherapy outcome. These factors were
quality of the therapeutic relationship, patient change-
related motivation, patient social support network, pos-
sible need to reevaluate diagnostic formulations, and pos-
sible need for medication referral.
The same four groups used in the previous feed-
back studies were used in this experiment. However, a
fifth group (NOT-Fb+CST) was created, which con-
sisted of the NOT-Fb patients with whom the therapists
utilized one or more of the CSTs (n=59). It was found
that the NOT-Fb+CST group improved significantly
more than the NOT-Fb group (p<.05), and it was again found
that the NOT-Fb group improved significantly more than
the NOT-NFb group (p<.05). It was also found that the
NOT-Fb+CST group improved more than the NOT-Fb
group (p<.05) in the point-of-feedback to termination
period. Furthermore, even when the data were subjected
to the rigorous criteria of clinically meaningful change,
it was found that 8.5% of those in the NOT-Fb+CST
group deteriorated and 49.2% improved, compared to
13.6% and 33% in the NOT-Fb group, and 19.1% and
25.2% in the NOT-NFb group. Regarding the amount
of psychotherapy, the results showed that the NOT-
Fb+CST group received significantly more sessions than
the NOT-Fb group and the NOT-NFb group (p<.001).
Although additional research is necessary, these results
indicate that the use of CSTs with therapist feedback
may significantly improve psychotherapy outcome.
The results of the feedback studies are presented
graphically in Figure 9. The size of the treatment effect
is d=.40. When one considers that the comparison group
was treatment as usual (rather than a no-treatment
control group), this result is substantial.
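For readers unfamiliar with the d statistic, the sketch below computes a standardized mean difference from the termination means and standard deviations reported for the Lambert et al. (2002) signal-alarm groups. It is an illustration only: it assumes equal group sizes when pooling the standard deviations, and it is not the calculation behind the d=.40 summary figure, which covers the feedback studies together.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference using a simple equal-n pooled SD."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# NOT-NFb (no feedback) versus NOT-Fb (feedback) at termination
d = cohens_d(83.72, 21.05, 73.87, 25.34)
print(round(d, 2))
```

The value from this single comparison is close to the overall effect size reported in the text.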
Use of the OQ for Treatment Outcomes Assessment
Formal outcome research is a manifold enterprise
ideally incorporating numerous measures of patients'
subjective discomfort, expert judge ratings, physiological
indices, environmental data sources such as employee
reports of work performance, and the like (Lambert &
Lambert, 1999). While it is commonly accepted that
such a multidimensional approach offers greatly im-
proved means of charting patient progress in terms of
both scientific rigor and comprehensive assessment,
practical considerations encountered in routine clinical
practice limit a clinical researcher's ability to conduct
comprehensive assessments that integrate criteria from
multiple sources.
Outcome-minded third-party payers show a con-
tinued interest in brief measures of patient improvement
that tap a variety of potential outcomes, without the
attendant methodological complexities of formal out-
come research. The OQ is designed for repeated measurement
of client status through the course of therapy
and at termination. Ease of administration and scoring,
low cost, sensitivity to changes in psychological dis-
tress over short periods of time, and ability to tap a
wide array of symptoms and role performance difficul-
ties make this instrument useful in a variety of clinical
and counseling applications.
As has been previously mentioned, the OQ was
formulated in accordance with Lambert's (1983) organizational
scheme for outcome assessment, suggesting
that three dimensions or content areas be evaluated:
intrapersonal (subjective discomfort) or symptomatic
distress, interpersonal functioning, and social role performance.
Use of this conceptualization seems justified
in that its breadth affords a comprehensive review
that encompasses both the patient's inner life and
functioning in applied situations like work and school.
In addition, some items were included to tap positive
states of mental health and life functioning. It was believed
that these items would not only assess quality of
life as perceived by the client, but also increase the range
of measurement so that the test did not suffer from an
artificially low ceiling, as is true of tests that measure only
the presence or absence of psychopathology and
exclude aspects of healthy functioning.
Essentially, the OQ was developed in an attempt to bridge the gap
between the demands of the health care providers and the stringent
requirements of the research community. While there are obvious
shortcomings to such a compromise, the net result is a
psychometrically sound instrument that can be used in real-world
applications to assess mental health care treatment outcome.
[Figure 9: plot of OQ Total Score (vertical axis, 50 to 95) at three
points (Pretreatment OQ, Warning OQ, Termination OQ) for four groups:
NOT-NFb (N=286), NOT-Fb (N=298), OT-NFb (N=985), OT-Fb (N=1,036).]
Figure 9. Treatment gains for signal alarm cases following feedback to
therapists about potential treatment failure versus no feedback.
CLINICAL APPLICATIONS OF THE INSTRUMENT FOR OUTCOME ASSESSMENT
Benchmarks for Evaluating Outcomes in Clinical
Practice.
Some provider groups have expressed an interest in having benchmarks
for comparing outcomes in their setting with the outcomes in other
settings. Table 17 presents several benchmarks for such comparisons.
These data were gathered from a variety of settings that will be of
interest to persons who want this kind of information. The table
includes pre-treatment OQ scores for outpatients, EAP settings,
university counseling centers, and Medicaid/Medicare patients. It
also provides an estimate of the number of sessions that were
necessary to achieve the post-treatment gains that were reported, as
well as the median number of sessions necessary for at least half the
patients within each sample to show a reliable gain. Table 18
presents benchmark data in terms of the percentage of patients
attaining clinically significant treatment status across different
treatment settings.
TABLE 17
Outpatient Benchmarks for the OQ-45

Benchmark                                         Outpatient     EAP            Univ. Counseling  Medicaid/Medicare
                                                  (N=1,715)      (N=3,589)      Center (N=1,188)  (N=460)
Pre-test mean (sd)                                80.98 (24.84)  68.48 (22.88)  67.84 (22.90)     85.30
Post-test mean (sd)                               72.46 (27.13)  59.37 (22.82)  60.19 (24.24)     79.01 (27.28)
Number of sessions, mean (sd)                     4.34 (4.22)    3.74 (2.08)    5.75 (5.36)       4.44 (3.23)
Change per session, mean                          -1.98          -2.44          -1.43             -1.42
Mean number of sessions for patients who improve  3.62 (3.41)    3.38 (1.88)    4.70 (4.39)       4.02 (3.13)
Median session for 50% of cases to improve        6.06           5.79           8.21              10.29
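As a reading aid for Table 17, the per-session change can be roughly approximated from the group means. The sketch below assumes that change per session is the post-test mean minus the pre-test mean, divided by the mean number of sessions. That formula is an assumption, not something the table states; the reported figures may instead average each patient's own change per session, which would account for small discrepancies. Negative values indicate falling OQ Total Scores.

```python
# Approximate "change per session" from the group means in Table 17.
# Assumption: (post - pre) / mean sessions; the manual does not state the formula.
BENCHMARKS = {
    # setting: (pre-test mean, post-test mean, mean number of sessions)
    "Outpatient": (80.98, 72.46, 4.34),
    "EAP": (68.48, 59.37, 3.74),
    "University Counseling Center": (67.84, 60.19, 5.75),
    "Medicaid/Medicare": (85.30, 79.01, 4.44),
}


def change_per_session(pre, post, sessions):
    """Average change in OQ Total Score per session (negative = lower score)."""
    return (post - pre) / sessions


approx = {
    setting: round(change_per_session(*values), 2)
    for setting, values in BENCHMARKS.items()
}

for setting, value in approx.items():
    print(f"{setting}: {value:+.2f} OQ points per session")
```

Under this assumption the EAP and Medicaid/Medicare approximations agree with the tabled values to two decimals, while the other two settings differ slightly.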
TABLE 18
Number of Patients, By Site, Who Demonstrated Reliable Negative Change
(Deteriorated), Did Not Demonstrate Reliable Change (No Change),
Demonstrated Reliable Positive Change (Improved), and Demonstrated
Reliable Change into the Functional Range (Recovered)

Site                          Deteriorated  No Change     Improved      Recovered
Employee Assistance Program   216 (6.6%)    1911 (58.5%)  645 (19.7%)   497 (15.2%)
University Counseling Center  115 (9.7%)    684 (57.6%)   239 (20.1%)   150 (12.6%)
Local HMO                     84 (14.1%)    321 (53.9%)   122 (20.5%)   68 (11.4%)
National HMO                  40 (7.5%)     258 (48.1%)   153 (28.5%)   85 (15.9%)
Training CMH                  4 (3.2%)      57 (45.6%)    39 (31.2%)    25 (20.0%)
State CMH                     37 (10.2%)    219 (60.7%)   74 (20.5%)    31 (8.6%)
Total                         496 (8.2%)    3448 (56.8%)  1272 (20.9%)  856 (14.1%)
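The four outcome categories in Table 18 can be illustrated with a short sketch. It assumes a reliable change index of 14 points and a clinical cutoff of 63 on the OQ Total Score; neither value is stated in this section, so both are exposed as parameters and should be checked against the scoring chapter. The function names are hypothetical. The second function simply recomputes the row percentages from the tabled counts.

```python
# Classify a patient's outcome from pretreatment and final OQ Total Scores.
# Assumed parameters (not stated in this section): reliable change index of
# 14 points; scores of 63 or below fall in the functional range.
RELIABLE_CHANGE = 14
FUNCTIONAL_CUTOFF = 63


def classify_outcome(pre, post, rci=RELIABLE_CHANGE, cutoff=FUNCTIONAL_CUTOFF):
    """Return one of the four Table 18 categories for a single patient."""
    change = post - pre  # higher OQ scores mean more distress
    if change >= rci:
        return "Deteriorated"
    if change <= -rci:
        # Reliable improvement that also crosses into the functional range
        if pre > cutoff and post <= cutoff:
            return "Recovered"
        return "Improved"
    return "No Change"


def row_percentages(counts):
    """Express each category count as a percentage of the row total."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts for the Employee Assistance Program row of Table 18
eap = {"Deteriorated": 216, "No Change": 1911, "Improved": 645, "Recovered": 497}

print(classify_outcome(80, 60))
print(row_percentages(eap))
```

Keeping the two thresholds as arguments makes it easy to substitute population-specific values if a setting uses different norms.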
Provision of Feedback on Patient Treatment Response
Feedback based on the results of OQ administrations may be used in a
wide range of applications. Clients frequently ask what purpose the
measure serves and inquire about their personal results. The course
of action to be followed here is typically left to the clinician and
may even include full disclosure of the results. Such an inquiry is
essentially the equivalent of a client asking, "How am I doing . . .
am I getting better?" and should be handled accordingly on a
case-by-case basis. Charting the progress of a specific client may
also be quite informative to a clinician and can even provide
validating feedback as to therapeutic setbacks, stagnation, or rate
and pattern of progress. Hawkins et al. (2004) studied the effects of
providing therapists and patients feedback on patient progress. This
treatment condition was contrasted with a no-feedback condition
(treatment as usual) and a therapist-feedback-only condition. Results
suggested the value of patient/therapist feedback.
For a third-party provider, the most meaningful feedback is typically
provided by an aggregate of clients and sessions. Once OQ results
have been accumulated across multiple clients and sessions, the
resulting data may provide critical feedback on the progress of
patients, typical patterns of improvement for the patients of
different clinicians, and the effectiveness of treatments found in
various hospitals and regions. Various research groups and health
care systems are committed to improving the quality of patient care
through the provision of feedback to providers, patients, health care
systems, and administrators. We believe the OQ-45 is ideal for such
work.
ADDITIONAL VERSIONS OF THE OQ

Given the practical constraints of applied clinical settings, some
providers have called for shorter versions of the OQ in hopes that
the advantages of decreased administration and scoring time would not
appreciably reduce the reliability and validity of the Total Score.
As a consequence, we have created one shorter version to assess
outcome (OQ-10.1). The advantage of this scale is its brevity.
Psychometric data show that it maintains adequate psychometric
properties, although it is not as reliable or valid as the 45-item
version. At the present time, this version of the OQ is being
released only for research purposes (a manual can be obtained from
APCS).
Another shorter version of the OQ is the Life Status Questionnaire
(LSQ-30), a 30-item version of the OQ. Thirty of the original 45
questions were selected based on their sensitivity-to-change scores,
and the fifteen questions least sensitive to change were dropped. The
LSQ-30 is also being released only for research purposes at this
time.
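The item-selection rule described for the LSQ-30 (keep the 30 most change-sensitive items, drop the 15 least sensitive) can be sketched as follows. The sensitivity scores here are invented placeholders, not the published item statistics, and the function name is hypothetical.

```python
def select_items(sensitivity, keep=30):
    """Return the item numbers of the `keep` most change-sensitive items.

    `sensitivity` maps item number -> sensitivity-to-change score
    (higher = more sensitive). The result is sorted by item number.
    """
    ranked = sorted(sensitivity, key=lambda item: sensitivity[item], reverse=True)
    return sorted(ranked[:keep])


# Hypothetical scores for items 1..45, for illustration only.
scores = {item: ((item * 7) % 45) / 45 for item in range(1, 46)}

kept = select_items(scores, keep=30)
dropped = sorted(set(scores) - set(kept))

print("Items kept:", kept)
print("Items dropped:", dropped)
```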
Currently, research is being conducted on a newly devised Severe
Outcome Questionnaire (SOQ-45), which will be used with individuals
who are persistently and severely mentally ill.
Foreign Versions of the OQ

The OQ has been translated into several foreign languages: Spanish,
German, French, Dutch, Swedish, Norwegian, Hebrew, Japanese, and
others. The German and Spanish versions have been validated. A full
report of the German study can be found in Lambert, Hannover et al.
(2002).
References
Abe, J. S., & Zane, N. W. (1990). Psychological maladjustment among Asian and White American college students: Controlling for confounds. Journal of Counseling Psychology, 37, 437-444.
Anderson, E. M., & Lambert, M. J. (2001). A survival analysis of clinically significant change in outpatient psychotherapy. Journal of Clinical Psychology, 57, 875-888.
Andrews, F. M., & Withey, S. B. (1974). Developing measures of perceived life quality: Results from several national surveys. Social Indicators Research, 1, 1-26.
Barkham, M., Margison, F., Leach, C., Lucock, M., Mellor-Clark, J., Evans, C. et al. (2001). Service profiling and outcomes benchmarking using the CORE-OM: Toward practice-based evidence in the psychological therapies. Journal of Consulting and Clinical Psychology, 69, 184-196.
Bauer, S., Lambert, M. J., & Nielsen, S. L. (2004). Clinical significance methods: A comparison of statistical techniques. Journal of Personality Assessment, 82, 60-70.
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General Psychiatry, 4, 53-63.
Beckstead, D. J., Hatch, A. L., Lambert, M. J., Eggett, D. L., Goates, M. K., & Vermeersch, D. A. (2003). Clinical significance of the Outcome Questionnaire (OQ-45.2). The Behavior Analyst Today, 4, 79-90.
Beiser, M. (1983). Components and correlates of mental well being. Journal of Health and Social Behavior, 15, 320-327.
Beutler, L. E. (2001). Comparisons among quality assurance systems: From outcome assessment to clinical utility. Journal of Consulting and Clinical Psychology, 69, 197-204.
Blau, T. H. (1977). Quality of life, social interaction, and criteria of change. Professional Psychology, 8, 464-473.
Booth, H. (1999). Gender, power and social change: Youth suicide among Fiji Indians and Western Samoans. The Journal of the Polynesian Society, 108, 30-68.
Brown, G., Burlingame, G. M., Lambert, M. J., Joan, E., & Vaccaro, J. (2001). Pushing the quality envelope: A new outcomes management system. Psychiatric Services, 52, 925-934.
Brown, G., & Lambert, M. J. (June, 1998). Tracking patient progress: Decision making for cases who are not benefiting from psychotherapy. Paper presented at the annual meetings of the Society for Psychotherapy Research, Snowbird, Utah.
Burlingame, G. M., Lambert, M. J., Reisinger, C. W., Neff, W. M., & Mosier, J. (1995). Pragmatics of tracking mental health outcomes in a managed care setting. The Journal of Mental Health Administration, 22, 226-236.
Cheng, D., Leong, F. T., & Geist, R. (1993). Cultural differences in psychological distress between Asian and Caucasian American college students. Journal of Multicultural Counseling and Development, 21, 182-190.
Cho, M. J., & Kim, K. H. (1998). Use of the Center for Epidemiologic Studies Depression (CES-D) Scale in Korea. The Journal of Nervous and Mental Disease, 186, 304-310.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). New Jersey: Lawrence Erlbaum Associates.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.
Dana, R. H. (1998). Multicultural assessment of personality and psychopathology in the United States: Still art, not yet science, and controversial. European Journal of Psychological Assessment, 14, 62-70.
De la Para, G., & von Bergen, A. (June, 2002). Use of the Outcome Questionnaire-45 in the Spanish language version as applied in Chile. Paper presented at the annual meetings of the Society for Psychotherapy Research, Santa Barbara, CA.
Derogatis, L. R. (1983). The SCL-90-R: Administration, scoring and procedures manual-II. Towson, MD: Clinical Psychometric Research.
Derogatis, L. R. (1977). The SCL 90 manual: Scoring, administration and procedures for the SCL 90. Baltimore: Johns Hopkins University School of Medicine, Clinical Psychometrics Unit.
Diener, E. (1984). Subjective well being. Psychological Bulletin, 95(3), 542-575.
Doerfler, L. A., Addis, M. E., & Moran, P. W. (2002). Evaluating mental health outcomes in an inpatient setting: Convergent and divergent validity of the OQ-45 and BASIS-32. The Journal of Behavioral Health Services and Research, 29, 394-404.
Drum, D. J., & Baron, A. (1998). Highlights of the Research Consortium Outcomes Project. Paper presented at the annual meeting of the Research Consortium of Counseling and Psychological Services in Higher Education, Santa Fe, New Mexico.
Durham, C. J., McGrath, L. D., Burlingame, G. M., Schaalje, G. B., Lambert, M. J., & Davies, D. R. (2002). The effects of repeated administrations on self report and parent report scales. Journal of Psychoeducational Assessment, 20, 240-257.
Feldman, L. A. (1993). Distinguishing depression and anxiety in self report: Evidence from confirmatory factor analysis on nonclinical and clinical samples. Journal of Consulting and Clinical Psychology, 61, 631-638.
Finch, A. E., Lambert, M. J., & Schaalje, B. G. (2001). Psychotherapy quality control: The statistical generation of expected recovery curves for integration in an early warning system. Clinical Psychology and Psychotherapy, 8, 231-242.
Friedman, P. H. (1994). Process of change, therapeutic bond and outcome measures of psychotherapy: An integrative approach. Foundation for Well Being Research Bulletin, 105.
Frisch, M. B., Cornell, J., Villanueva, M., & Retzlaff, P. J. (1992). Clinical validation of the Quality of Life Inventory: A measure of life satisfaction for use in treatment planning and outcome assessment. Psychological Assessment, 4, 92-101.
Frisch, B. M. (1988). Quality of Life Inventory. Unpublished manuscript.
Gregersen, A. T., Nebeker, R. S., Seely, K. I., & Lambert, M. J. (2005). Social validation of the Outcome Questionnaire: An assessment of Asian and Pacific Islander college students. Journal of Multicultural Counseling and Development, 33(2).
Haas, E., Hill, R., Lambert, M. J., & Morrell, B. (2002). Do early responders to psychotherapy maintain treatment gains? Journal of Clinical Psychology, 58, 1157-1172.
Hansen, N. B., Lambert, M. J., & Forman, E. V. (2002). The psychotherapy dose-response effect and its implications for treatment delivery services. Clinical Psychology: Science and Practice, 9, 329-343.
Hansen, N. B., Umphress, V., & Lambert, M. J. (1998). The reliability and validity of a short form of the Inventory of Interpersonal Problems. Journal of Psychoeducational Assessment, 16, 201-214.
Harlinger, D. D., Auger, C., Garcia, A., & Rodriguez, J. (2002). Cuestionario Resultados de Intervencion (Assessing Outcome in Clinical Practice).
Hatfield, D. R., & Ogles, B. M. (2002). The current climate of outcome measures use in clinical practice. Unpublished manuscript. Department of Psychology, Ohio University, Athens, OH.
Hawkins, E. J., Lambert, M. J., Vermeersch, D. A., & Slade, K. (in press). The therapeutic effects of providing client progress information to clients and therapists. Psychotherapy Research.
Hawkins, E. J., Lambert, M. J., Vermeersch, D., & Slade, K. (June, 2002). The effects of providing patient progress information to therapists and patients. Paper presented at the annual meeting of the Society for Psychotherapy Research, Santa Barbara, CA.
Horowitz, L. M. (1979). On the cognitive structure of interpersonal problems treated in psychotherapy. Journal of Consulting and Clinical Psychology, 47, 5-15.
Horowitz, L. M., Locke, K. D., Morse, M. B., Waikar, S. V., Dryer, D. C., Tarnow, E. et al. (1991). Self derogations and the integration theory. Journal of Personality and Social Psychology, 61, 68-79.
Horowitz, L. M., Rosenberg, S. E., Baer, B. A., Ureno, G., & Villasenor, V. S. (1988). Inventory of interpersonal problems: Psychometric properties and clinical applications. Journal of Consulting and Clinical Psychology, 56, 885-892.
Howard, K. I., Kopta, S. M., Krause, M. S., & Orlinsky, D. E. (1986). The dose-effect relationship in psychotherapy. American Psychologist, 41, 159-164.
Howard, K. I., Moras, K., Brill, P. L., Martinovich, Z., & Lutz, W. (1996). Efficacy, effectiveness, and client progress. American Psychologist, 51, 1059-1064.
Hsu, L. K., & Folstein, M. F. (1997). Somatoform disorders in Caucasian and Chinese Americans. The Journal of Nervous and Mental Disease, 185, 382-387.
Jacobson, N. S., Follette, W. C., & Revenstorf, D. (1984). Psychotherapy outcome research: Methods for reporting variability and evaluating clinical significance. Behavior Therapy, 15, 336-352.
Jacobson, N. S., Roberts, L. J., Berns, S. B., & McGlinchey, J. B. (1999). Methods for defining and determining the clinical significance of treatment effects: Description, application, and alternatives. Journal of Consulting and Clinical Psychology, 67, 300-307.
Jacobson, N. S., & Truax, P. (1991). Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology, 59, 12-19.
Kadera, S. W., Lambert, M. J., & Andrews, A. A. (1996). How much therapy is really enough? A session-by-session analysis of the psychotherapy dose-effect relationship. Journal of Psychotherapy Practice and Research, 5, 132-151.
Kaufman, M. B. (1997). Effects of therapist self-monitoring on therapeutic alliance and subsequent therapeutic outcome. Doctoral dissertation, Department of Psychology, Seton Hall University.
Kendall, P. C. (1999). Clinical significance. Journal of Consulting and Clinical Psychology, 67, 283-285.
Kendall, P. C., Marrs-Garcia, A., Nath, S. R., & Sheldrick, R. C. (1999). Normative comparisons for the evaluation of clinical significance. Journal of Consulting and Clinical Psychology, 67, 285-299.
Kopta, S. M., Howard, K. I., Lowrey, J. L., & Beutler, L. E. (1994). Patterns of symptomatic recovery in psychotherapy. Journal of Consulting and Clinical Psychology, 62, 1009-1016.
Kordy, H., Hannover, W., & Richard, M. (2001). Computer-assisted feedback-driven quality management for psychotherapy: The Stuttgart-Heidelberg model. Journal of Consulting and Clinical Psychology, 69, 173-183.
Lambert, M. J. (1983). Introduction to assessment of psychotherapy outcome: Historical perspective and current issues. In M. J. Lambert, E. R. Christensen, & S. S. DeJulio (Eds.), The assessment of psychotherapy outcome (pp. 3-32). New York: John Wiley and Sons.
Lambert, M. J. (2001). Psychotherapy outcome and quality improvement: Introduction to the special section on client-focused research. Journal of Consulting and Clinical Psychology, 69, 147-149.
Lambert, M. J., Burlingame, G. M., Umphress, V. J., Hansen, N. B., Vermeersch, D., Clouse, G., & Yanchar, S. (1996). The reliability and validity of the Outcome Questionnaire. Clinical Psychology and Psychotherapy, 3, 106-116.
Lambert, M. J., Hannover, W., Nisslmuller, K., Richard, M., & Kordy, H. (2002). Fragebogen zum Ergebnis von Psychotherapie: Zur Reliabilität und Validität der deutschen Übersetzung des Outcome Questionnaire 45.2 (OQ-45.2) [Questionnaire on the results of psychotherapy: Reliability and validity of the German translation of the Outcome Questionnaire 45.2 (OQ-45.2)]. Zeitschrift für Klinische Psychologie und Psychotherapie, 31, 40-47.
Lambert, M. J., Hansen, N. B., & Finch, A. E. (2001). Client-focused research: Using client outcome data to enhance treatment effects. Journal of Consulting and Clinical Psychology, 69, 159-172.
Lambert, M. J., Hansen, N. B., Umphress, V., Lunnen, K., Okiishi, J., Burlingame, G. et al. (1996). Administration and scoring manual for the Outcome Questionnaire (OQ 45.2). Wilmington, DE: American Professional Credentialing Services.
Lambert, M. J., & Huefner, J. C. (1996). Measuring clinically significant improvement in the EAP environment. EAP, 6, 22-23.
Lambert, M. J., & Lambert, J. M. (1999). Use of psychological tests for assessing treatment outcome. In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcome assessment (2nd ed.). Mahwah, NJ: Lawrence Erlbaum.
Lambert, M. J., & Ogles, B. M. (2004). The efficacy and effectiveness of psychotherapy. In M. J. Lambert (Ed.), Bergin and Garfield's handbook of psychotherapy and behavior change (5th ed., pp. 805-821). New York, NY: Wiley.
Lambert, M. J., Okiishi, J. C., Finch, A. E., & Johnson, L. (1998). Outcome assessment: From conceptualization to implementation. Professional Psychology: Research and Practice, 29, 63-70.
Lambert, M. J., Shapiro, D. A., & Bergin, A. E. (1986). The effects of psychotherapy. In S. L. Garfield & A. E. Bergin (Eds.), Handbook of psychotherapy and behavior change (3rd ed.). New York: John Wiley and Sons.
Lambert, M. J., Whipple, J. L., Bishop, M. J., Vermeersch, D. A., Gray, G. V., & Finch, A. E. (2002). Comparison of empirically derived and rationally derived methods for identifying clients at risk for treatment failure. Clinical Psychology and Psychotherapy, 9, 149-164.
Lambert, M. J., Whipple, J. L., Smart, D. W., Vermeersch, D. A., Nielsen, S. L., & Hawkins, E. J. (2001). The effects of providing therapists with feedback on client progress during psychotherapy: Are outcomes enhanced? Psychotherapy Research, 11, 49-68.
Lambert, M. J., Whipple, J. L., Vermeersch, D. A., Smart, D. W., Hawkins, E. J., Nielsen, S. L., & Goates, M. K. (2002). Enhancing psychotherapy outcomes via providing feedback on client progress: A replication. Clinical Psychology and Psychotherapy, 9, 91-103.
Leuger, R. J., Howard, K. I., Martinovich, Z., Lutz, W., Anderson, E. E., & Grissom, G. (2001). Assessing treatment progress of individual clients using expected treatment response models. Journal of Consulting and Clinical Psychology, 69, 150-158.
Lipsey, M. W. (1990). Design sensitivity. Newbury Park: Sage Publications.
Lunnen, C., & Ogles, B. M. (1998). A multi-perspective, multi-variable evaluation of reliable change. Journal of Consulting and Clinical Psychology, 66, 400-410.
Meuller, R., Lambert, M. J., & Burlingame, B. M. (1998). The Outcome Questionnaire: A confirmatory factor analysis. Journal of Personality Assessment, 70, 246-262.
Meyer, G. J. (1998, August). A consumer's perspective on outcome assessment. In K. L. Moreland (Chair), Different perspectives on the measurement of treatment outcomes. Presented at the 106th annual convention of the American Psychological Association, San Francisco, CA.
Meyer, F., & Schulte, D. (2002). Zur Validität der Beurteilung des Therapieerfolgs durch therapeutin. Zeitschrift für Klinische Psychologie und Psychotherapie, 31, 53-61.
Mohr, D. C. (1995). Negative outcome in psychotherapy: A critical review. Clinical Psychology: Science and Practice, 2, 1-27.
Nebeker, R. S., Lambert, M. J., & Huefner, J. C. (1995). Ethnic differences on the Outcome Questionnaire. Psychological Reports, 77, 875-879.
Nielsen, S. L., Smart, D. W., Isakson, R., Worthen, V., Gregersen, A. T., & Lambert, M. J. (in press). The Consumer Reports effectiveness score: What did consumers report? Journal of Counseling Psychology.
Ogles, B. M., Lunnen, K. M., & Bonesteel, K. (2001). Clinical significance: History, definitions and applications. Clinical Psychology Review, 21, 421-446.
Okazaki, S. (1997). Sources of ethnic differences between Asian American and White American college students on measures of depression and social anxiety. Journal of Abnormal Psychology, 106, 52-60.
Okazaki, S., & Sue, S. (2000). Implications of test revisions for assessment with Asian Americans. Psychological Assessment, 12, 272-280.
Ostle, B., & Malone, L. C. (1996). Statistics in research: Basic concepts and techniques for research workers (4th ed., pp. 110-144). Ames, IA: Iowa State University Press.
Park, K. B., Upshaw, H. S., & Koh, S. D. (1988). East Asians' responses to Western health items. Journal of Cross-Cultural Psychology, 21, 423-427.
Percevic, R., Lambert, M. J., & Kordy, H. (2004). Computer assisted monitoring of psychotherapy outcome. Journal of Clinical Psychology, 60(3).
Ray, J. W., & Shadish, W. R. (1996). How interchangeable are different estimators of effect size? Journal of Consulting and Clinical Psychology, 64(6), 1316-1325.
Regier, D. A., Boyd, J. H., Burke, J. D., Jr., Rae, D. S., Myers, J. K., Kramer, M. et al. (1988). One month prevalence of mental disorders in the United States. Archives of General Psychiatry, 45, 977-986.
Spielberger, C. D. (1983). Manual for the State-Trait Anxiety Inventory STAI (Form Y). Palo Alto, CA: Consulting Psychologists Press.
Strupp, H. H. (1996). The tripartite model and the Consumer Reports study. American Psychologist, 51, 1017-1024.
Taylor, J. A. (1953). A personality scale of manifest anxiety. Journal of Abnormal and Social Psychology, 48, 285-290.
Tingey, R. C., Lambert, M. J., Burlingame, G. M., & Hansen, N. B. (1996). Assessing clinical significance: Proposed extensions to method. Psychotherapy Research, 6, 109-123.
Tsushima, W. T., & Onorato, V. A. (1982). Comparison of MMPI scores of White and Japanese-American medical patients. Journal of Consulting and Clinical Psychology, 50(1), 150-151.
Umphress, V. J., Lambert, M. J., Smart, D. W., Barlow, S. H., & Clouse, G. (1997). Concurrent and construct validity of the Outcome Questionnaire. Journal of Psychoeducational Assessment, 15, 40-55.
Veit, C. T., & Ware, J. E. (1983). The structure of psychological distress and well being in general populations. Journal of Consulting and Clinical Psychology, 51(5), 730-742.
Vermeersch, D. A., Lambert, M. J., & Burlingame, G. M. (2000). Outcome Questionnaire-45: Item sensitivity to change. Journal of Personality Assessment, 74, 242-261.
Vermeersch, D. A., & Lambert, M. J. (2003). A research agenda for humanistic psychology in the 21st century. Journal of Humanistic Psychology, 43, 106-120.
Vermeersch, D. A., Whipple, J. L., Lambert, M. J., Hawkins, E. J., Burchfield, C. M., & Okiishi, J. C. (2004). Outcome Questionnaire: Item sensitivity to changes in counseling center clients. Journal of Counseling Psychology, 51.
Ware, J. E., Snow, K. K., Kasinski, M., & Gandek, B. (1994). SF-36 Health Survey manual and interpretation guide. Boston: The Health Institute, New England Medical Center.
Weissman, M. M., & Bothwell, S. (1976). Assessment of social adjustment by patient self report. Archives of General Psychiatry, 33, 1111-1115.
Weissman, M. M., Prusoff, B., Thompson, D., Harding, P., & Myers, J. (1978). Social adjustment by self-report in a community sample and in psychiatric outpatients. Journal of Nervous and Mental Disease, 166, 317-326.
Whipple, J. L., Lambert, M. J., Vermeersch, D. A., Smart, D. W., Nielsen, S. L. et al. (2003). Improving the effects of psychotherapy: The use of early identification of treatment failure and problem solving strategies in routine practice. Journal of Counseling Psychology, 50, 59-68.
Wolgast, B. M., Lambert, M. J., & Puschner, B. (2003). The dose response relationship in a college counseling center: Implications for setting session limits. Journal of College Student Psychotherapy, 8.
Ying, Y. (1988). Depressive symptomatology among Chinese-Americans as measured by the CES-D. Journal of Clinical Psychology, 44, 739-746.
Zane, N., Hall, G. C. N., Sue, S., Young, K., & Nunez, J. (2003). Research on psychotherapy with culturally diverse populations. In M. J. Lambert (Ed.), Bergin and Garfield's handbook of psychotherapy and behavior change (5th ed., pp. 767-804). New York: Wiley.
Zautra, A. J. (1983). Social resources and quality of life. American Journal of Community Psychology, 11, 275-290.
Zung, W. W. (1965). A self-rating depression scale. Archives of General Psychiatry, 12, 63-70.
Zung, W. W. (1971). A rating instrument for anxiety disorders. Psychosomatics, 6, 371-379.
Technical Report 1: Factor Analytic Study of the OQ-45.2
Study Conducted by Reed M. Meuler, Summarized by Cade Napierski and Ian Kellems
INTRODUCTION
Many of the current outcome measures focus on one of three domains:
subjective discomfort, interpersonal relationships, or social role
performance (Lambert & Hill, 1994). Even though these areas are
important to assess, research has generally neglected interpersonal
and social role functioning in favor of evaluating only subjective
distress (Lambert, Ogles, & Masters, 1992). It would be time and cost
effective to use a measure that attempts to evaluate all three of the
above domains concurrently. In addition, a single measure that
assesses the above areas at one time could be used across studies and
would assist researchers in assessing all three of the important
content domains (Lambert, Ogles, & Masters, 1992).
Another important consideration in outcome re-
search is how often to measure change occurring in a
patient while she or he is undergoing psychotherapy. It
may be an asset to the outcome researcher to be able to
measure change at many different occasions. Doing so
with a measure that is sensitive to change would allow
the researcher to study the effects of therapy on a dose
response basis as well as evaluate the quality of ongo-
ing treatment for quality assurance reasons (Burlingame,
Lambert, Reisinger, Neff, & Mosier, 1995; Lambert &
Hill, 1994). In regard to dose response studies, some of
the multiscale measures presently available are far too
long to administer frequently and those which are short
enough to administer on a per session basis may not
assess the three crucial domains suggested above (Lam-
bert & Hill, 1994).
Because of the considerations already discussed, it seems important
in the field of outcome research and quality assurance that a
self-report questionnaire be developed that will assess subjective
discomfort, interpersonal relationships, and social role performance
in patients over frequent administrations. These areas of functioning
suggest a continuum from "how the person feels inside, how they are
getting along with significant others, and how they are doing in
important life tasks such as work and school" (Lambert et al., 1994b,
p. 1). The measure should also be sensitive to clinically significant
change. The OQ has been designed specifically for these purposes.
Subjective discomfort, interpersonal relationships, and social role
performance are each assessed through their own subscale in the
instrument, and a summation of these subscales gives an overall score
of adjustment from which clinically significant change can be
assessed. The 22-item Subjective Discomfort subscale evaluates
symptoms such as depression and anxiety. The Interpersonal
Relationships subscale consists of 11 items that attempt to assess
functioning in interpersonal relationships by measuring "friction,
conflict, isolation, inadequacy, and withdrawal" (Lambert et al.,
1994b, p. 8). The final subscale, entitled Social Role, consists of
nine items and attempts to measure performance in tasks such as work
and leisure.
Even though the research already completed enhances confidence in the
OQ, the theoretical factor structure underlying its interpretation
has yet to be empirically tested. A critical aspect in developing a
new clinical measure is to demonstrate its ability to measure the
theorized latent constructs that it purports to assess: this is the
crux of the issue of construct validity (Allen & Yen, 1979). One
important method of establishing construct validity is through factor
analytic studies of the measure in question (Kazdin, 1992; Cole,
1987; Cronbach, 1984). Confirmatory factor analytic (CFA) studies can
test a hypothesis regarding a specific latent factor structure and
the relationships between such factors (Kazdin, 1992).
METHOD
Participants
The total sample (n=1085) consisted of clinical
(n=504; 46.45%), acute clinical (n=168; 15.48%), com-
munity (n=104; 9.59%), and college student (n=309;
28.48%) participants. It included 63.2% female and
36.8% male participants, with a mean age for all par-
ticipants of 32.41 (SD=11.89) years. Of the partici-
pants, 87.9% were Caucasian, 7.7% were African
American, 3.4% were Asian or Pacific Islander, and
1.0% were Hispanic.
Procedure
The CFA procedure used within the proposed study
is based upon the work of J oreskog and Sorbom (1989).
The results of CFAs are analyzed using several types of
specific results and overall indices. In general, results
are reviewed according to the general mathematical ap-
propriateness of the solution, via construct loadings
developed from the model in question, and with regard
to the global fit of the model. Global fit of the model
can be assessed by using many indices that fall into two
categories: goodness of fit and parsimonious goodness
of fit (Loehlin, 1992). Kline (1991, p. 478), writing about
what he terms "goodness of fit tunnel vision," states that
researchers must understand that global goodness of
fit indices provide limited information about the adequacy
of path models; it is possible to obtain respectable
values of fit indices even though some parameter
estimates are nonsensical. Others have suggested that
assessment of fit should involve assessment of the en-
tire model, not simply the goodness of fit criteria
(Breckler, 1990; Mulaik et al., 1989). In their 1993
article, Reise, Widaman, and Pugh stated that "no CFA
model should be accepted or rejected on statistical
grounds alone: theory, judgment, and persuasive argument
should play a key role in defending the adequacy
of any estimated CFA model" (p. 554).
The present research used the entire data pool (i.e.,
patient, community, and student populations) to
create two random groups from which to conduct initial
and secondary CFA studies. This has been suggested as
a way to assess factor invariance, which can add to con-
fidence in the results of factor analytic studies (Comrey,
1988). Chi square analyses of sex and race differences
and t tests of age between groups suggested no signifi-
cant differences.
The study proceeded in two phases after the groups
had been formed:
Phase 1: Model Development. The first phase initiated
formal CFA studies on Group 1. It was hypothesized
that a three-factor solution, with item loadings
from the OQ on their respective theorized OQ
subscales, would be shown to be a viable model by a
CFA. The CFA procedures in Phase 1 followed Cole's
(1987) suggestions.
In the present study, the first model tested fol-
lowed the theoretical position of the developers, the three
factor model presented in the introduction. Its signifi-
cance was tested and, following the test, alternate theo-
retical models were generated and tested.
Phase 2: Large Sample Cross Validation. This phase
proceeded to cross validate the reliable CFA models from
Phase 1 on the second half of the split sample (Group
2).
RESULTS
Phase 1: Model Development
Prior to the analysis of each model, a summary of
the model will be provided in figure form.
Model 1. This model was constructed in accordance
with the construct system reviewed in the OQ's manual,
with all items loading upon their theoretical scales.
Review of the analysis revealed no gross errors
regarding the appropriateness of the solution.
However, two items, 14 and 32, loaded very poorly
on the Social Role factor. To improve the global fit
of the model, these items were eliminated.
Review of the data suggests that the model is sufficient
with regard to the solution derived. The correlations
between the factors were very high. Subjective
Discomfort correlated at .92 with Interpersonal
Relations and .89 with Social Role. Interpersonal
Relations and Social Role correlated at .84.
The model exceeded cutoff scores on two of the six
fit indicators. Because the integrity of the model solution
was established, all items contributed to the appropriate
constructs in a significant manner, more than one
cutoff statistic was surpassed, and this general construct
system has been endorsed in the OQ manual, the model
was retained for the cross validation phase.
Model 2. The factor correlation matrix observed in
the first model suggested that the Social Role and Inter-
personal Relations scales could be collapsed into one
scale as they were highly correlated. These two subscales
also appear to have similar content domains. Due to
these observations, Model 2 was constructed with two
factors: (1) Subjective Discomfort (i.e., the patient
reporting on his/her internal world), and (2) Life Functioning
(i.e., the patient reporting on his/her external
world).
The model derived a solution in an appropriate
manner. Three of six global fit indicators surpassed
their cutoff marks, and one more (Chi-Square/df) approached
its cutoff of 2.0. There was a correlation between
the two factors of .96. It appears that this model
is sound, is adequately loaded on the appropriate factors,
and has support of fit indices.
Model 3. This model, which collapsed all items
into one factor, was constructed based upon the introduction
and findings in the previous models. More specifically,
the high correlation between the two factors in
Model 2 led to the construct used in Model 3. To
make this a more accurate model, three items (11, 14,
and 32) were excluded from this analysis due to factor
loadings below .30.
The solution obtained was appropriate for further
examination. Three of six global fit indicators (RMSR,
AGFI, & CN) were significant, suggesting that this
model is appropriate for inclusion in the cross valida-
tion phase of the research.
Phase 2: Large Sample Cross Validation
In Phase 2, Models 1, 2, and 3 were cross vali-
dated on the second half of the split sample. Model
specifications were not changed.
Model 1. This model derived a solution in a man-
ner that allows for further discussion of the data. The
CD of .991 suggests that the items were adequate measures
of the latent constructs. The factor correlations
were high: Subjective Discomfort/Interpersonal Relations
= .93; Subjective Discomfort/Social Role =
.87; and Interpersonal Relations/Social Role = .83. In
addition, three of the six indicators of global fit (RMSR,
AGFI, CN) validated the model with this half of the
split sample while one (Chi Square/df) approached the
cutoff level.
Model 2. The replication of this model derived an
appropriate model for review. The correlation was high
(.950) between the Subjective Discomfort and Life
Situation factors. The global fit indices supported the
viability of this model in the cross validation sample.
Three, again RMSR, AGFI, and CN, exceeded their
cutoff levels and Chi Square/df approached its level.
Model 3. This model's replication also derived an
appropriate solution. Three of five (RMSR, AGFI, CN)
indices of fit exceeded their cutoff scores.
DISCUSSION
Implication of Results
None of the models assessed passed all six of the
goodness of fit criteria in Phase 1. Only those statistics
that tended to be most biased toward large samples (Chi
square) or models with many parameters (GFI) were
never at a sufficient level to support the model. How-
ever, those indices which accounted for model size or
were robust against large samples did support the three
models. All three were cross validated successfully in
Phase 2. These models will be discussed with regard to
their factor structure and implications for use of the
OQ.
Model 1 is closest to the original structure of the
OQ. All three factors were retained, as were 40 of the
45 items included on the three scales. While barely adequate
with goodness of fit criteria in Phase 1, it was
not rejected as a participant in the cross validation phase
of the study. Review of the literature supports this deci-
sion as this was a theoretically derived model (Reise,
Widaman, & Pugh, 1993; Cole, 1987).
As has been found in previous research on the OQ,
one problem with the model is the high correlation between
factors (subscales). A possible explanation for
this is that the items on both the Interpersonal Relations
and Social Role Performance scales involve the evaluation
of aspects of a person's external situations. In other
words, the participant is evaluating his or her life situation
on these two scales while questions from the
Subjective Discomfort scale prompt answers regarding
the person's internal state. Thus, there appears to
be a dichotomy in the questions as they selectively focus
on either external (life situation) or internal
(subjective distress) events.
Model 2 was developed because of high correla-
tions between Interpersonal Relations and Social Role
Performance scales. It maintained the whole of the
Subjective Discomfort factor while incorporating
both the Interpersonal Relations and Social Role Performance
items into one Life Functioning factor. Model 2
was adequate with regard to goodness of fit criteria when
used in both halves of the sample. Thus, it appears that
Life Functioning may be an adequate construct to pro-
vide a foundation for combining the two scales.
As was found in Model 1, a high correlation was
observed between the factors. It is possible, then, that
the OQ falls into the category of many other measures
that intimate a multidimensional structure when, in fact,
they are unidimensional. Model 3 was developed to
force the OQ to fall into one factor and then be assessed
with regard to adequacy of fit.
Five items that had poor loadings on a previous iteration
of this one factor model were excluded from the
final run. Results indicated a model that was adequate
with regard to fit and construct loadings. It appears
that the hypothesis that the OQ is unidimensional has
not been disproved at this point in the study. The most highly
loading items included those from the original
Subjective Discomfort scale and suggest that the one
factor solution might best be seen as a global severity
factor.
Overall, two of the three models tested (1 and 2)
were multifactorial in nature. While the multidimensional
models of the OQ are appealing as based upon
theory, the results of the present study failed to support
them. The underlying theory may not be incorrect, but
the domains appear to be so highly correlated that they
effectively represent a single factor.
Clinicians may continue to use the three existing
scales of the OQ. There was no problem regarding
item groupings on those scales. The content groupings
will provide clinicians with valuable information regarding
various aspects of their patients' lives in a manner
that clinicians can readily incorporate into treatment. It
may be important to note that the present study had several
limitations, one of which was methodology (i.e.,
CFA). For example, with large samples, such as the
one used in this study, there is an increased Type II error
rate.
CONCLUSION
Although the multidimensional construct system has
not been statistically supported, in general clinical use
the three subscales developed for the OQ may still be
of use to the interested clinician.
References
Allen, M. & Yen, W. (1979). Introduction to measurement theory. Belmont, MA: Wadsworth, Inc.
Breckler, S.J. (1990). Applications of covariance structure modeling in psychology: Cause for concern? Psychological Bulletin, 107, 260-273.
Burlingame, G.M., Lambert, M.J., Reisinger, C.W., Neff, J., & Mosier, J. (1995). Pragmatics of tracking mental health outcomes in a managed care setting. Journal of Mental Health Administration, 22, 226-236.
Cole, D. (1987). Utility of confirmatory factor analysis in test validation research. Journal of Consulting and Clinical Psychology, 55, 584-594.
Comrey, A. (1988). Factor analytic methods of scale development in personality and clinical psychology. Journal of Consulting and Clinical Psychology, 56, 754-761.
Cronbach, L. (1984). Essentials of psychological testing (4th ed.). New York, NY: Harper and Row Publishers.
Jöreskog, K. & Sörbom, D. (1989). LISREL 7: A guide to the program and applications (2nd ed.). Chicago, IL: SPSS, Inc.
Kazdin, A. (1992). Methodological issues and strategies in clinical research. Washington, DC: American Psychological Association.
Kline, R.B. (1991). Latent variable path analysis in clinical research: A beginner's tour guide. Journal of Clinical Psychology, 47, 471-484.
Lambert, M.J., Burlingame, G.M., Umphress, V., Hansen, N., Yanchar, S., Vermeersch, D., & Clouse, G. (1994). The reliability and validity of Outcome Questionnaire. Clinical Psychology and Psychotherapy, 3, 106-116.
Lambert, M.J. & Hill, C. (1994). Assessing psychotherapy outcomes and processes. In A.E. Bergin and S.L. Garfield (Eds.), Handbook of psychotherapy and behavior change (4th ed., pp. 72-113). New York: John Wiley and Sons.
Lambert, M.J., Ogles, B., & Masters, K. (1992). Choosing outcome assessment devices: An organizational and conceptual scheme. Journal of Counseling and Development, 70, 527-539.
Loehlin, J.C. (1992). Latent variable models: An introduction to factor, path, and structural analysis (2nd ed.). New Jersey: Lawrence Erlbaum Associates, Publishers.
Mulaik, S.A., James, L.R., Van Alstine, J., Bennett, N., Lind, S., & Stilwell, C.D. (1989). Evaluation of goodness of fit indices for structural equation models. Psychological Bulletin, 105, 430-445.
Reise, S., Widaman, K., & Pugh, R. (1993). Confirmatory factor analysis and item response theory: Two approaches for exploring measurement invariance. Psychological Bulletin, 114, 552-556.
Technical Report 2: Psychometric Properties of the Dutch OQ-45.2, Preliminary Results
By Kim de Jong, GGZ Noord-Holland-Noord, the Netherlands
k.dejong@ggz-nhn.nl
METHOD
Participants
Participants included three normal (student and community)
and two clinical groups. A student sample of
268 undergraduates was collected from the psychology
department of the University of Amsterdam. A group of
266 students filled in the OQ for the second time after
two weeks.
A clinical outpatient sample of 192 persons was
collected at two locations of GGZ Noord-Holland-Noord,
a public mental health center. The data were
collected by a professional after the intake. Data from
the two locations have been combined. A group of 43 patients
also filled in the OQ before they received treatment
advice, 2-3 weeks after the intake. A different group
of 108 patients filled in the SCL-90 together with the
OQ.
A community subsample of 123 individuals was
collected in Noord-Holland. Participants were contacted
by phone by choosing the first name on each even-numbered
page of the phone directory of Zaanstreek-Waterland. All
adults in the household were asked if they would fill out
the questionnaire. If they consented to participate, questionnaires
and consent forms were mailed to them.
Another subsample of 362 individuals was collected
from 14 commercial and non-profit business settings.
Completion of the test was voluntary and anonymous.
A community subsample of 66 individuals was removed
from the total sample, as it was biased by a large
group of highly educated people. Also, their scores were
significantly higher than those of the other community samples.
Missing Values
Missing values were replaced with
substitutes, computed by taking the mean of the remaining
domain items and rounding it to the nearest whole number.
Measures
All samples completed the OQ. The OQ requires
subjects to rate their feelings on a five-point Likert scale
ranging from 0 to 4. Three domain scores (Symptomatic
Distress, Interpersonal Relations, and Social Role)
and a total score were calculated.
The Groningse Vragenlijst Sociaal Gedrag, 45-item
version (GVSG-45; Jong & Lubbe, 2001) was
used to validate the Interpersonal Relations and Social
Role domain scores of the OQ. The GVSG-45 requires
subjects to rate 45 statements on a five-point Likert scale
ranging from 1 to 5. The GVSG-45 has 9 domain scores:
Parents, Partner, Children of 15 years or younger of
age, Children older than 15 years, Friends, School, Oc-
cupation, Housework and Leisure. For each domain a
cut off point is known, indicating whether there is a prob-
lem in this specific domain or not. For this research two
measures that are not in the original questionnaire were
calculated. As a measure of interpersonal problems we
used the mean score of the Parents, Partner, both Chil-
dren domains and the Friends domain, further referred
to as Functioning on Interpersonal Relationships (FIR).
As a measure of problems with the social role we calcu-
lated the mean score of the School, Occupation, House-
work and Leisure domains, further referred to as Func-
tioning on Social Role (FSR). The two questionnaires
used in the US for validation of the SR and IR domain
scores, the Inventory of Interpersonal Problems (IIP)
and the Social Adjustment Scale (SAS) were both not
available in Dutch.
The Dutch translation of the SCL-90 (Arrindell &
Ettema, 1975) was used to validate the Symptomatic
Distress domain score of the Outcome Questionnaire.
The General Severity Index (GSI) was produced by cal-
culating the mean total score, and ranged from 1 to 5
points. The same measure was used in the US for vali-
dation research of the original OQ.
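The substitution rule for missing values described in the Method section can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name `impute_missing` is hypothetical, and rounding exact halves upward is an assumption, since the report says only that the mean was rounded "to the nearest number."

```python
import math

def impute_missing(domain_items):
    """Fill missing responses (None) in one domain with the rounded mean
    of the answered items in that same domain.

    Assumption: ties (x.5) are rounded up; the report does not say.
    """
    answered = [v for v in domain_items if v is not None]
    if not answered:
        raise ValueError("no answered items to compute a substitute from")
    mean = sum(answered) / len(answered)
    substitute = math.floor(mean + 0.5)  # round to nearest, halves up
    return [substitute if v is None else v for v in domain_items]

# One missing response among four items of a domain
print(impute_missing([0, 4, None, 3]))
```

The substitute is computed once per domain, so every missing item in that domain receives the same value.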
RESULTS
Table 1: Means and standard deviations for the OQ Total and Domain scores
                          outpatient      student         community sample
                          (n=192)         (n=268)         (n=485)
                          M      SD       M      SD       M      SD
Symptom distress          46.8   14.1     27.4   11.5     21.8   10.2
Interpersonal relations   16.1    6.2     11.5    5.1      8.1    4.6
Social role               13.4    5.5     10.4    3.7      8.0    3.5
OQ Total score            76.2   22.6     49.2   18.3     37.9   16.0
Table 2: Reliability for the OQ Total and Domain scores
                          Internal consistency (Cronbach's α)    Test-retest reliability (pmcc)
                          outpatient   student    community      outpatient   student
                          (n=192)      (n=268)    (n=485)        (n=42)       (n=266)
Symptom distress          0.89         0.90       0.89           0.76         0.81
Interpersonal relations   0.74         0.74       0.76           0.83         0.71
Social role               0.70         0.61       0.58           0.74         0.73
OQ Total score            0.92         0.92       0.91           0.79         0.82
Table 3: Validity estimates
                          outpatient (n=108)   student (n=268)
                          GSI                  GSI     FIR     FSR
Symptom distress          0.82                 0.78    0.42    0.54
Interpersonal relations   0.67                 0.59    0.51    0.51
Social role               0.53                 0.57    0.38    0.55
OQ Total score            0.82                 0.77    0.49    0.60
Cutoff score
Students were not included in the comparison of
community sample and clinical sample, as they may not
be reflective of the normal community population. Cut-
offs for the OQ Total score and the domain scores are as
follows: Total: 54; Symptom Distress: 33; Interpersonal
relations: 12; Social role: 11.
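The report does not state how these cutoffs were derived. As a hedged sketch, the values above are reproduced if one assumes the Jacobson and Truax cutoff c (the point between the clinical and community means, weighted by the two standard deviations), applied to the outpatient and community means and standard deviations reported in Table 1 and rounded up to the next whole score. The formula choice, the rounding rule, and the name `cutoff_c` are assumptions, not taken from the report.

```python
import math

def cutoff_c(mean_clin, sd_clin, mean_norm, sd_norm):
    """Jacobson-Truax cutoff c between a clinical and a normative sample."""
    return (sd_clin * mean_norm + sd_norm * mean_clin) / (sd_clin + sd_norm)

# (outpatient M, outpatient SD, community M, community SD), read from Table 1
table1 = {
    "Total": (76.2, 22.6, 37.9, 16.0),
    "Symptom Distress": (46.8, 14.1, 21.8, 10.2),
    "Interpersonal Relations": (16.1, 6.2, 8.1, 4.6),
    "Social Role": (13.4, 5.5, 8.0, 3.5),
}

# Assumption: cutoffs are rounded up to the next whole score
cutoffs = {name: math.ceil(cutoff_c(*vals)) for name, vals in table1.items()}
print(cutoffs)
```

Under these assumptions the sketch yields 54, 33, 12, and 11, matching the cutoffs listed above.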
Reliable change indices
The RCIs are calculated in two different ways. The
first method uses the internal consistencies of the OQ
total score (.92) and the domain scores (.74 to .89) in
the outpatient sample to compute the standard errors of
measurement. The RCI for the OQ Total score is 18.
The RCIs for each of the subscales are as follows: Symp-
tom Distress- 13, Interpersonal Relations- 9 and Social
Role- 9.
The second method uses the test-retest reliability
values of the OQ Total score (.79) and domain scores
(.74 to .84) in the outpatient sample. The RCI for the
OQ Total score is 29. The RCIs for each of the subscales
are as follows: Symptom Distress- 20, Interpersonal
Relations- 8 and Social Role- 8.
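The two sets of RCIs can be illustrated with a short calculation. This is a sketch under stated assumptions, not the author's documented procedure: it assumes the usual Jacobson and Truax reliable change index (1.96 times the standard error of the difference, which is the square root of 2 times the standard error of measurement), the outpatient standard deviations from Table 1, the outpatient reliabilities from Table 2, and rounding up to the next whole score. The name `rci` is hypothetical.

```python
import math

def rci(sd, reliability, z=1.96):
    """Reliable change index: z * sqrt(2) * SD * sqrt(1 - reliability)."""
    sem = sd * math.sqrt(1.0 - reliability)   # standard error of measurement
    s_diff = math.sqrt(2.0) * sem             # standard error of the difference
    return z * s_diff

# Outpatient standard deviations (Table 1) and reliabilities (Table 2)
outpatient_sd = {"Total": 22.6, "SD": 14.1, "IR": 6.2, "SR": 5.5}
alpha = {"Total": 0.92, "SD": 0.89, "IR": 0.74, "SR": 0.70}
retest = {"Total": 0.79, "SD": 0.76, "IR": 0.83, "SR": 0.74}

# Assumption: each RCI is rounded up to the next whole score
rci_alpha = {k: math.ceil(rci(outpatient_sd[k], alpha[k])) for k in outpatient_sd}
rci_retest = {k: math.ceil(rci(outpatient_sd[k], retest[k])) for k in outpatient_sd}
print(rci_alpha)
print(rci_retest)
```

With these inputs the first method gives 18, 13, 9, and 9, and the second gives 29, 20, 8, and 8, in agreement with the values reported above.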
Sensitivity to psychopathology
There was a statistically significant difference between
the community sample and clinical samples on
the OQ Total score (F(3, 940) = 205, p < .001) as well
as the domain scores (SD: F(3, 940) = 218, p < .001;
IR: F(3, 940) = 115, p < .001; SR: F(3, 940) = 84.6,
p < .001).
Sensitivity and specificity
Sensitivity =0.84
Specificity =0.86
The OQ in Spanish
A Spanish language version of the OQ is presented
on the following page. Although it is identical to the
OQ in content, it should be noted that there has been
little data published on this version and it has not yet
been standardized for a North American Spanish-speaking
population.
Appendices
A - G
Appendix A
Scoring the Outcome Questionnaire (OQ-45.2)
The OQ-45.2 is the latest update of the original
Outcome Questionnaire (OQ). The items on this version
remain essentially the same as those found on the
original form, with a few cosmetic alterations. The most
significant change in this version is found in a new scoring
method that eliminates the need for scoring templates.
It is hoped that this new scoring method will streamline
the processing of the OQ, making it faster and easier
to arrive at a total score for each patient.
Scoring
The new scoring procedure for the OQ is intended
to be straightforward and efficient. Following each of
the item statements are five small boxes for the patient
to mark a response. To the right of each patient response
box is a numerical score value assigned to each
possible reply. To score the OQ, simply find the scoring
value of the response marked by the patient, and
then write that number in the small box found to the
right of each item in the shaded area on the right side of
the questionnaire form. There is one scoring rectangle
for each item that will automatically place the score for
an item into its specific subscale category. When the
score for each item has been written in the corresponding
box, add up each vertical column of numbers, preferably
with a calculator to increase accuracy, and write
the total for each column in the space provided in the
bottom right hand corner of the sheet. This will leave
three column totals, each representing one of the three
subscales for the OQ. When these three column totals
are added together, a total score for the questionnaire
will be obtained, which should then be written in the
total box found at the bottom. When scoring the OQ,
please be aware of those items that are reverse scored,
with the score values running from four to zero rather
than zero to four. Improper scoring of those items will
result in an inaccurate assessment score.
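The reverse-scoring rule in the scoring instructions can be sketched in a few lines. This is an illustration only, not an official scoring program: the reverse-scored item numbers are read from the score keys printed on the questionnaire form in this manual (the items whose keys run 4 3 2 1 0), the function names are hypothetical, and subscale membership is not reproduced here, so only the total score is computed.

```python
# Items whose printed key runs 4 3 2 1 0 on the form (reverse scored).
REVERSE_ITEMS = frozenset({1, 12, 13, 20, 21, 24, 31, 37, 43})

def item_score(item, column):
    """Score one item; `column` is the marked box, 0 (Never) to 4 (Almost Always)."""
    if not 1 <= item <= 45:
        raise ValueError(f"item number out of range: {item}")
    if not 0 <= column <= 4:
        raise ValueError(f"response column out of range: {column}")
    return 4 - column if item in REVERSE_ITEMS else column

def total_score(marked):
    """Sum the item scores for a dict mapping item number -> marked column."""
    return sum(item_score(item, column) for item, column in marked.items())

all_never = {item: 0 for item in range(1, 46)}
all_always = {item: 4 for item in range(1, 46)}
print(total_score(all_never))   # only the 9 reverse items contribute: 9 x 4
print(total_score(all_always))  # only the 36 regular items contribute: 36 x 4
```

Marking "Never" on every item therefore does not produce a total of zero, which is the kind of error the warning about reverse-scored items is meant to prevent.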
Appendix B
Sample of OQ-45.2 Scoring Method
[See Next Page]
OUTCOME QUESTIONNAIRE (OQ-45.2)
NAME:____________________________________________________________ ID NUMBER:_________________________________
AGE:_____________ SEX: M F SESSION #:_____________ DATE: _____________
INSTRUCTIONS: Looking back over the last week, including today, help us understand how you have been feeling. Read each item and
mark the answer that best describes your current situation. For this questionnaire, work is defined as employment, school, housework,
volunteer work, and so forth. Please do not make any marks in the column DO NOT MARK BELOW.
1. I get along well with others ...............................................................................................
2. I tire quickly. .......................................................................................................................
3. I feel no interest in things.. ...............................................................................................
4. I feel stressed at work/school.. ........................................................................................
5. I blame myself for things. .................................................................................................
6. I feel irritated. .....................................................................................................................
7. I feel unhappy in my marriage/significant relationship. ..................................................
8. I have thoughts of ending my life. ....................................................................................
9. I feel weak. ........................................................................................................................
10. I feel fearful.. ......................................................................................................................
11. After heavy drinking, I need a drink the next morning to get going
(If you do not drink, mark never) ....................................................................................
12. I find my work/school satisfying. .......................................................................................
13. I am a happy person. ........................................................................................................
14. I work/study too much. ......................................................................................................
15. I feel worthless.. ................................................................................................................
16. I am concerned about family troubles. ............................................................................
17. I have an unfulfilling sex life. ............................................................................................
18. I feel lonely. ........................................................................................................................
19. I have frequent arguments. ...............................................................................................
20. I feel loved and wanted .....................................................................................................
21. I enjoy my spare time. .......................................................................................................
22. I have difficulty concentrating. ..........................................................................................
23. I feel hopeless about the future. .......................................................................................
24. I like myself. .......................................................................................................................
25. Disturbing thoughts come into my mind that I cannot get rid of. ...................................
26. I feel annoyed by people who criticize my drinking (or drug use.)
(If not applicable, mark never) .......................................................................................
27. I have an upset stomach. ..................................................................................................
28. I am not working/studying as well as I used to.. .............................................................
29. My heart pounds too much. ..............................................................................................
30. I have trouble getting along with friends and close acquaintances. ..............................
31. I am satisfied with my life. ................................................................................................
32. I have trouble at work/school because of drinking or drug use
(If not applicable, mark never) .......................................................................................
33. I feel that something bad is going to happen ..................................................................
34. I have sore muscles ..........................................................................................................
35. I feel afraid of open spaces, of driving or of being on buses, subways and so forth. ..
36. I feel nervous .....................................................................................................................
37. I feel my love relationships are full and complete. .........................................................
38. I feel that I am not doing well at work/school ..................................................................
39. I have too many disagreements at work/school ..............................................................
40. I feel something is wrong with my mind. .........................................................................
41. I have trouble falling or staying asleep ............................................................................
42. I feel blue. ........................................................................................................................
43. I am satisfied with my relationships with others .............................................................
44. I feel angry enough at work/school to do something I might regret ..............................
45. I have headaches ..............................................................................................................
Never Rarely Sometimes Frequently Almost Always
4 3 2 1 0
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
4 3 2 1 0
4 3 2 1 0
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
4 3 2 1 0
4 3 2 1 0
0 1 2 3 4
0 1 2 3 4
4 3 2 1 0
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
4 3 2 1 0
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
4 3 2 1 0
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
0 1 2 3 4
4 3 2 1 0
0 1 2 3 4
0 1 2 3 4
DO NOT MARK
THIS SECTION
SD + IR + SR
Total=
Developed by Michael J. Lambert, Ph.D. and Gary M. Burlingame, Ph.D.
Copyright 1999, 2003 American Professional Credentialing Services, LLC
All rights reserved. License Required For All Users.
P.O. Box 970354, Orem, UTAH 84097-0354 PH: 888.647.2673
[Sample scoring: completed form for CRAIG, 43, session 1, Jan 1, 2004, ID number CH-00234, with item scores written in the scoring boxes; column totals SD = 64, IR = 28, SR = 23; Total = 115]
Appendix C
Interpretive Graph for OQ-45.2 Total Score
TOTAL SCORE GRAPH WITH NORMATIVE CUTOFF
[Graph: vertical axis from 0 to 160 and up in steps of 10, with the cutoff marked at 63; horizontal axis Date 1 through Date 10; fields for Patient Name, Total Score, and Session Number]
Appendix D
Interpretive Graph for OQ-45.2 Symptom Distress (SD) Subscale
SCALE 1: SYMPTOM DISTRESS (SD) GRAPH WITH NORMATIVE CUTOFF
[Graph: vertical axis from 0 to 90 in steps of 5, with the cutoff marked at 36; horizontal axis Date 1 through Date 10; fields for Patient Name, Symptom Distress Score, and Session Number]
Appendix E
Interpretive Graph for OQ-45.2 Interpersonal Relations (IR) Subscale
SCALE 2: INTERPERSONAL RELATIONSHIPS (IR) GRAPH WITH NORMATIVE CUTOFF
[Graph: vertical axis from 0 to 45 in steps of 5, with the cutoff marked at 15; horizontal axis Date 1 through Date 10; fields for Patient Name, Relationship Score, and Session Number]
Appendix F
Interpretive Graph for OQ-45.2 Social Role (SR) Subscale
SCALE 3: SOCIAL ROLE (SR) GRAPH WITH NORMATIVE CUTOFF
[Graph: vertical axis from 0 to 40 in steps of 5, with the cutoff marked at 12; horizontal axis Date 1 through Date 10; fields for Patient Name, Social Role Score, and Session Number]
Appendix G
Order Form, License Application and License Agreement
American Professional Credentialing Services
APPLICATION FOR LICENSE TO USE OQ-45.2
Applicant:
Applicant's Address:
Telephone: ( )
INSTRUCTIONS:
1. Read Application and License Agreement
2. Complete and Sign Application
3. Return Application with Fee to:
American Professional Credentialing Services LLC
PO BOX 970354
Orem, UT. 84097-0354
4. If the Application is not accepted, the Fee will be refunded.
TERMS AND CONDITIONS:
Applicant has read the License Agreement on the reverse side of this Application and accepts and agrees to the
License Agreement. The IHC Center for Behavioral Healthcare Efficacy of IHC Hospitals, Inc. (the "Center")
accepts this Application and grants this License to Applicant subject to the License Agreement. The License
is not effective or granted unless this Application is signed by an authorized officer or representative of the Center.
AGREED TO AND ACCEPTED BY:
_____________________________________________
Applicant
_____________________________________________
Authorized Signature
_____________________________________________
Print Name and Title
_____________________________________________
Date
OQ-45.2 LICENSE AGREEMENT
1. Licensee. If the American Professional Credentialing Services LLC (hereafter APCS) has
accepted and signed the Application of the Applicant (see reverse side of this agreement), then the Applicant is the
Licensee under this License Agreement.
2. OQ-45.2. OQ-45.2 means the mental health care protocol, outcome tracking measures, and
work of authorship provided by APCS to Licensee under the designation OQ-45.2.
3. License. Pursuant to the terms and conditions of this Agreement, APCS grants to Licensee a
license to use, copy, and distribute OQ-45.2, but only in connection with Licensee's bona fide mental health care
practice (the License) as the Applicant has applied and been approved for.
4. Modifications. Licensee may not modify or change the content, wording, or organization of
OQ-45.2 or create any derivative work based on OQ-45.2. Licensee may, however, scan OQ-45.2 or put it
into other formats, provided that the content, wording and organization are not substantively modified or changed.
5. Copies, Notices and Credits. Any and all copies of the OQ-45.2 made by Licensee must include
the copyright notice and other notices and credits in the OQ-45.2. Such notices may not be deleted, omitted,
obscured or changed by Licensee.
6. Use, Distribution and Charges. The OQ-45.2 may only be used and distributed by Licensee in
connection with Licensee's bona fide mental health care practice and may not be used or distributed for any other
purpose. Without limiting the generality of the foregoing, Licensee may not distribute copies of the OQ-45.2 to
other persons for use by other persons. Such other persons should apply to APCS for a license to use OQ-45.2.
Licensee may not charge any client, patient, organization or other entity for use of the OQ-45.2.
7. Responsibility. BEFORE USING OR RELYING UPON THE OQ-45.2 IT IS THE RESPONSIBILITY
OF LICENSEE TO ASCERTAIN THE SUITABILITY OF THE OQ-45.2 FOR ANY AND ALL
USES MADE BY LICENSEE. THE OQ-45.2 IS NOT A DIAGNOSTIC TOOL AND SHOULD NOT BE
USED AS SUCH. THE OQ-45.2 IS NOT A SUBSTITUTE FOR AN INDEPENDENT MEDICAL OR OTHER
APPROPRIATE PROFESSIONAL EVALUATION. ANY AND ALL USE OF AND RELIANCE ON THE
OQ-45.2 BY LICENSEE IS AT LICENSEE'S SOLE RISK AND IS LICENSEE'S SOLE RESPONSIBILITY.
LICENSEE SHALL INDEMNIFY APCS AND ITS OFFICERS, DIRECTORS, EMPLOYEES, AND
REPRESENTATIVES, AND THE AUTHORS OF THE OQ-45.2 AGAINST, AND HOLD THEM HARMLESS
FROM, ANY AND ALL CLAIMS AND LAWSUITS ARISING FROM OR RELATING TO ANY USE OF OR
RELIANCE ON THE OQ-45.2 PROVIDED BY APCS TO LICENSEE. THIS OBLIGATION TO
INDEMNIFY AND HOLD HARMLESS INCLUDES A PROMISE TO PAY ANY AND ALL JUDGMENTS,
DAMAGES, ATTORNEYS' FEES, COSTS AND EXPENSES ARISING FROM ANY SUCH CLAIM OR
LAWSUIT.
8. Refund and Disclaimer. If after reviewing the OQ-45.2 and before using it, Licensee finds it
unsatisfactory, Licensee may return the OQ-45.2 to APCS within 30 days of purchase for a full refund of the Fee.
In the event of a return, the License shall terminate. LICENSEE ACCEPTS THE OQ-45.2 AS IS WITHOUT
WARRANTY OF ANY KIND. APCS DISCLAIMS ANY AND ALL IMPLIED WARRANTIES, INCLUDING
IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND
NONINFRINGEMENT. APCS DOES NOT WARRANT THAT THE OQ-45.2 IS WITHOUT ERROR OR
DEFECT. APCS SHALL NOT BE LIABLE FOR ANY CONSEQUENTIAL, INDIRECT, SPECIAL,
INCIDENTAL OR PUNITIVE DAMAGES. THE AGGREGATE LIABILITY OF APCS FOR ANY AND ALL
CAUSES OF ACTION (INCLUDING THOSE BASED ON CONTRACT, WARRANTY, TORT,
NEGLIGENCE, STRICT LIABILITY, FRAUD, MALPRACTICE, OR OTHERWISE) SHALL NOT EXCEED THE
FEE PAID BY LICENSEE TO APCS. THIS LICENSE AGREEMENT, AND SECTIONS 7 AND 8 IN
PARTICULAR, DEFINE A MUTUALLY AGREED UPON ALLOCATION OF RISK. THE FEE REFLECTS
SUCH ALLOCATION OF RISK.
9. Construction. The language used in this Agreement is the language chosen by the Parties to
express their mutual intent, and no rule of strict construction shall be applied against any Party.
10. Entire Agreement. This Agreement is the entire agreement of the Parties relating to the OQ-45.2.
11. Governing Law. This Agreement is made and entered into in the state of Delaware and shall be
governed by the laws of the state of Delaware. In the event of any litigation or arbitration between the Parties,
such litigation or arbitration shall be conducted in Delaware and the Parties hereby agree and submit to such
jurisdiction and venue. Notice to commence any litigation or arbitration should be directed to: American
Professional Credentialing Services LLC,
12. Modification. This Agreement may only be modified or amended in writing and must be signed
by both Parties.