Rogers 1999

Law and Human Behavior, Vol. 23, No. 4, 1999
Validation of the Millon Clinical Multiaxial Inventory for
Axis II Disorders: Does It Meet the Daubert Standard?
Richard Rogers,1,3 Randall T. Salekin,2 and Kenneth W. Sewell1
Relevant to forensic practice, the Supreme Court in Daubert v. Merrell Dow Pharma-
ceuticals, Inc. (1993) established the boundaries for the admissibility of scientific
evidence that take into account its trustworthiness as assessed via evidentiary reliabil-
ity. In conducting forensic evaluations, psychologists and other mental health profes-
sionals must be able to offer valid diagnoses, including Axis II disorders. The most
widely available measure of personality disorders is the Millon Clinical Multiaxial
Inventory (MCMI) and its subsequent revisions (MCMI-II and MCMI-III). We
address the critical question, "Do the MCMI-II and MCMI-III meet the requirements
of Daubert?" Fundamental problems in the scientific validity and error rates for
MCMI-III appear to preclude its admissibility under Daubert for the assessment of
Axis II disorders. We address the construct validity for the MCMI and MCMI-II via
a meta-analysis of 33 studies. The resulting multitrait-multimethod approach allowed
us to address their convergent and discriminant validity through method effects
(Marsh, 1990). With reference to Daubert, the results suggest a circumscribed use
for the MCMI-II with good evidence of construct validity for Avoidant, Schizotypal,
and Borderline personality disorders.
The Supreme Court's decision in Daubert v. Merrell Dow Pharmaceuticals, Inc.
(1993; hereinafter cited as Daubert) changed the landscape of expert testimony.
Recent commentaries on Daubert have emphasized its general implications for
psychological (Goodman-Delahunty, 1997; Rotgers & Barrett, 1996) and psychiatric
(Zonana, 1994) evidence, but do not grapple with the admissibility of specific
psychological tests. As an overview, the relevance of Daubert and subsequent cases
to psychological practice is examined. We then consider the development and
validation of the Millon Clinical Multiaxial Inventory (MCMI) in light of Daubert.
1 Department of Psychology, University of North Texas, Denton, Texas 76203-1280.
2 Department of Psychology, Florida International University, Miami, Florida 33199.
3 Correspondence should be addressed to Richard Rogers, Department of Psychology, University of
North Texas, P.O. Box 311280, Denton, Texas 76203-1280.
0147-7307/99/0800-0425$16.00/1 © 1999 American Psychology-Law Society/Division 41 of the American Psychological Association
DAUBERT AND SUBSEQUENT CASES
According to Daubert, scientific knowledge is based on the scientific method
and must meet the standard of evidentiary reliability. As explained in Daubert's
Footnote 9, evidentiary reliability is construed as trustworthiness of the data; for
scientific evidence, "evidentiary reliability will be based on scientific validity"
(p. 2795; italics in the original). From this perspective, four guidelines were
promulgated in Daubert (pp. 2796-2797) for the admissibility of scientific evidence: (a)
testing hypotheses to see if they can be falsified, (b) subjecting theory and methods
to peer review and publication, (c) considering the known or potential rate of error,
and (d) appraising the general acceptance within the relevant scientific community.
Richardson, Ginsburg, Gatowski, and Dobbin (1995) provided an insightful
analysis of Daubert and its implications for expert psychological testimony. In
overturning Frye v. United States (1923), the Supreme Court provided extensive
guidelines on the admissibility of expert testimony. While an argument can be
advanced that some psychological evidence is "clinical" rather than scientific (Mel-
ton, Petrila, Poythress, & Slobogin, 1997, pp. 21-22), formal psychological tests are
based on empirical validation. More specifically, interpretations of tests (e.g., Millon
Clinical Multiaxial Inventory [MCMI] and its revisions) should be considered under
Daubert because of their nomothetic basis, grounded in reliability, validity, and
generalizability.
Following Daubert, the Supreme Court of Texas in E.I. du Pont de Nemours
and Company Inc. v. Robinson (1995; hereinafter Robinson) applied the epithet of
"junk science" to expert testimony that does not pass muster. The majority opinion
augmented the Daubert standard with two potential refinements regarding the
admissibility of scientific evidence: (a) "the extent to which the technique relies
upon subjective interpretation of the expert" (p. 557) and (b) whether research
on the technique extends beyond its courtroom application. The first potential
refinement has far-reaching implications for interpretation of psychological testing;
even with multiscale inventories, different conclusions are often reached about the
same protocol. Although the Robinson case involved the effects of pesticide, its
commentary on scientific method is germane. The majority opinion faulted the
expert's methodology, which included hypotheses and an experimental design. They
ruled that the expert (a) had not excluded other possible causes beyond pesticides,
(b) had completed the study only for the purposes of litigation, and (c) did not
have established error rates. They affirmed that error rates must be established for
the methodology and not simply the results of an isolated study (p. 559).
Several appellate decisions have grappled with the admissibility of psychologi-
cal tests and measures in light of Daubert. Importantly, each decision rested on the
measure's validity with reference to its specific application. In Gier by and through
Gier v. Educational Services Unit (1995), the 8th Circuit Court of Appeals found
that the Child Behavior Checklist had not been validated for either the relevant
population (i.e., mentally retarded children) or evidentiary purpose (i.e., the detec-
tion of child abuse). The Court also expressed concern regarding the establishment
of error rates, given the lack of a gold standard for child abuse identification.
Two appellate decisions have directly addressed the admissibility of the MCMI.
In S.V. v. R.V. (1996), the Supreme Court of Texas considered the admissibility of
expert testimony on recovered memories. As a relatively minor point in the Court's
lengthy decision, it dismissed the use of the MCMI and MMPI because (a) their
results were inconclusive and, more importantly, (b) a profile of a sex abuser does
not prove sex abuse. With the reliance of multiscale inventories on clinical correlates,
the true ramifications of this decision remain to be explored. Unlike S.V. v. R.V.,
the Supreme Court of New Hampshire focused directly on the admissibility of the
MCMI-II and MMPI-2 in State v. Cavaliere (1995). Addressing sex offender profiles,
the Court determined that the heterogeneity of test data for sex offenders precluded
its admissibility. In considering the issue of validation, the Court expressed a desire
for data directly germane to sex offender profiles for defendants not admitting to
their offenses. Regarding such profiles, it questioned whether studies of sex offend-
ers admitting to their offenses applied to those denying them. In both MCMI cases,
the courts appear to require that accurate classifications be rendered, based on
reliable and scientifically acceptable methodology.
The Daubert standard places the onus on trial courts to establish the
admissibility of scientific testimony. The frequency and range of issues facing trial courts in
applying Daubert are unknown. However, trial courts seem willing to grapple with
established psychological tests and their admissibility. For example, Reed (1996)
reviewed Chapple v. Ganger (1994) in which the trial judge excluded certain test
batteries for neuropsychological assessment. According to Reed's analysis, the court
appeared to rely on the Standards for educational and psychological testing (Ameri-
can Psychological Association, 1985) for establishing test validity in light of Daubert.
The U.S. Supreme Court continues to hear cases with relevance to the Daubert
standard. In U.S. v. Scheffer (1998), the Court considered the constitutionality of
a per se rule against the admission of polygraph evidence in military court. In citing
Daubert and other decisions on the admissibility of evidence, the majority opinion
concluded that the polygraph had insufficient validity to overturn the per se rule.
In his dissent, Justice Stevens argued that the majority opinion was inconsistent
with Daubert because it obviated the trial judge's flexible inquiry on the scientific
merits of expert testimony. Despite Justice Stevens's contention, U.S. v. Scheffer
leaves intact the Daubert standard for scientific testimony. In General Electric
Company v. Joiner (1997), the Supreme Court reviewed four epidemiological studies
of lung cancer and observed that studies either did not focus sufficiently on exposure
to polychlorinated biphenyls (PCB) or that the investigators had not drawn specific
conclusions about the role of PCBs and lung cancer. In light of Daubert, they affirmed
that judges must be careful about unwarranted extrapolations from the data based
only on "the ipse dixit of the expert" (p. 519). More recently, the U.S. Supreme
Court has granted certiorari to review Carmichael v. Samyang Tire, Inc. (1997)
under the case name of Kumho Tire Co. v. Carmichael. In this case, the 11th Circuit
Court of Appeals had ruled that Daubert did not apply because the expert on
defective tires was not "scientific," but rather relied on years of experience and
expertise at "telltale markings" (p. 1436). If the Court of Appeals is upheld, the
question remains whether experienced clinicians can rely on their own "telltale"
signs to circumvent Daubert.
The application of Daubert to psychological tests is inevasible, given their
reliance on scientific principles. The American Psychological Association (1985)
set forth the essential principles for the empirical validation of psychological tests.
As a primary standard, multiple forms of validation (i.e., some combination of
criterion-related, construct, and content validity) should be established for each
test. As an important parallel to Daubert and more recent cases, the American
Psychological Association (1985) affirmed that a test is validated for specific
applications (e.g., Axis II disorders). Put succinctly, blanket statements about the test itself
are unacceptable: "It is improper to use the unqualified phrase 'the validity of the
test'" (American Psychological Association, 1985, p. 13).
OVERVIEW OF THE MCMI AND ITS REVISIONS
McCann and Dyer (1996) recently advocated the use of the Millon Clinical
Multiaxial Inventory-II (MCMI-II) to address a broad spectrum of forensic issues,
both civil (e.g., child custody, personal injury, fitness for duty, and fitness to parent)
and criminal (e.g., sex offenders, domestic violence, competency to stand trial, and
insanity). This advocacy of forensic applications appears to have the full blessing
and support of Millon (1996). At present, a substantial proportion of forensic
psychologists (see, e.g., Borum & Grisso, 1995) already use the MCMI or MCMI-
II in their evaluations for the courts and subsequent expert testimony. A question
yet to be fully addressed is whether any version of MCMI is sufficiently validated
as to (a) warrant its use in forensic assessments and (b) meet the standards of
admissibility as set forth in Daubert.
A primary focus of this paper is a critical examination of the MCMI and the
validity of its three versions: MCMI (Millon, 1983), MCMI-II (Millon, 1987), and
MCMI-III (Millon, 1994). The MCMI is distinguished from other multiscale
inventories by two core features: (a) its systematic evaluation of Axis II disorders and (b)
its direct linkage with current diagnostic nomenclature, i.e., DSM-III, DSM-III-
R, and DSM-IV (American Psychiatric Association, 1980, 1987, 1994). Without
established diagnostic accuracy, its admissibility is brought into question under
Daubert.
This review addresses specifically the usefulness of the MCMI versions in
determining Axis II disorders for several reasons. First and foremost, the major
contribution of the MCMI and MCMI-II is to the assessment of personality disor-
ders. Within the MCMI-II conceptualization, Axis I disorders are accorded a subsid-
iary role, "In contrast to the personality disorders (Axis II), the clinical syndrome
disorders comprising Axis I are best seen as extensions or distortions of the patients'
basic personality patterns" (Millon, 1987, p. 30). Second, the determination of
antisocial personality disorder and other severe personality disorders often plays a
critical role in forensic evaluations. Third, for psychopathology associated with Axis
I disorders, forensic psychologists are likely to select other multiscale inventories,
i.e., the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) or
the Personality Assessment Inventory (PAI; Morey, 1991), that have much more
extensive validation with forensic and correctional populations.
Table 1. Descriptive Characteristics of the MCMI, MCMI-II, and MCMI-III

Version    New items (%)   New scales   Item weighting   Item overlap^a   Scale length
MCMI       NA              NA           NA               4.21 scales      36.80 items
MCMI-II    45 (27.5%)      5            Yes              5.44 scales      38.08 items
MCMI-III   95 (54.3%)      2            Yes              2.50 scales      16.80 items

^a Item overlap refers to the average number of scales on which each item is included.

A basic issue is whether the MCMI, MCMI-II, and MCMI-III should be considered
successive refinements of the same measure or as three distinct measures.
Urging the former, Millon (1994; see also Millon & Davis, 1997) described the
succeeding versions as "an evolving assessment instrument, significantly upgraded
and refined" (p. 1). From a more dispassionate stance, the modifications of MCMI-
II and MCMI-III are both sweeping and fundamental. As summarized in Table 1,
integral changes are made in the items, scale composition, and scoring. Specifically,
more than half the items (54.3%) on the MCMI-III are different, two new scales
are added, the weighting of items has changed integrally (i.e., dropped from two
levels of prototypicality on MCMI-II to one level on MCMI-III), and the scales
are less than half (55.9% fewer items) their previous length. Based on Table 1, we
are forced to conclude that MCMI, MCMI-II, and MCMI-III are essentially different
measures. Therefore, the validation of each MCMI revision must stand on its own
and cannot "borrow" from other versions.
The remainder of the paper is organized into three sections. First, we examine
the current version (i.e., the MCMI-III) with attention to its diagnostic and construct
validity. Second, we evaluate systematically the construct validity of the earlier
versions, the MCMI and MCMI-II, via a meta-analysis with a focus on Axis II
disorders. These earlier versions have a substantial advantage over the MCMI-III
because of their extensive validation studies. Third and finally, we address Daubert
as a framework for understanding the admissibility of the MCMI versions.
MCMI-III
McCann and Dyer (1996) recommend against the use of the MCMI-III in
forensic evaluations. Although hedging slightly,4 they concluded that the MCMI-
III lacks sufficient factor-analytic and criterion-validity studies to be used in forensic
practice. We concur with McCann and Dyer and augment their position with serious
concerns about the diagnostic accuracy and construct validity of the MCMI-III.
The current validation of the MCMI-III for Axis II disorders is based on Millon's
(1994) validation and several additional studies; Millon's research has come under
severe criticism.
4 They suggest that the MCMI-III might be used in "forensic settings when defending the test on
cross-examination is not required" (McCann & Dyer, 1996, p. 21). However, even their own example (i.e.,
sentencing evaluations) does not preclude the possibility of expert testimony with cross-examination.
Moreover, this logic is not compelling. Why would psychologists ever wish to use a test in a forensic
setting that they cannot adequately justify?
Retzlaff (1996, p. 436) contended that less-than-rigorous standards were
employed in Millon's (1994) criterion-related validity study for the MCMI-III. First,
clinicians were asked to make diagnoses when the criteria were unavailable or
poorly established. Second, clinicians were not fully informed regarding the purpose
of the study, which may have affected their patient selection. Third, diagnoses were
often based on limited contact with unknown reliability and under pressure to get
"enough" cases.
Retzlaff (1996) calculated the positive predictive power (PPP) for personality
disorders based on Millon's research. PPP provides a critical estimate of diagnostic
usefulness; it supplies the likelihood that an elevated score correctly identifies the
intended disorder. Unfortunately, the PPP values were disconcertingly low for Axis
II disorders with a range from .07 to .32 and a median of .18. Simply put, Millon's
own data, as recalculated by Retzlaff (1996), suggest that elevated scores are likely
to lead to the wrong diagnosis in more than 4 of 5 cases.
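
The arithmetic behind this statement can be sketched briefly. The following is a minimal illustration, assuming the usual definition of PPP as true positives divided by all elevated scores; the function name and the counts (18 and 82) are hypothetical and were chosen only to reproduce the median PPP of .18, not taken from Millon's or Retzlaff's data.

```python
def positive_predictive_power(true_positives: int, false_positives: int) -> float:
    """Proportion of elevated scores that correctly identify the intended disorder."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no elevated scores to evaluate")
    return true_positives / flagged


# Hypothetical counts chosen only to yield the reported median PPP of .18.
ppp = positive_predictive_power(true_positives=18, false_positives=82)
error_rate = 1 - ppp  # share of elevated scores that point to the wrong diagnosis

print(f"PPP = {ppp:.2f}; proportion of wrong identifications = {error_rate:.2f}")
```

With a PPP of .18, the proportion of incorrect identifications is .82, which is slightly more than 4 of 5 cases.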
For the purposes of this article, we examined the MCMI-III construct validity
with a reanalysis of Millon's (1994) own data on 998 clinical participants recruited
from several hundred clinicians. We augmented the Millon data with two additional
MCMI-III studies: Davis and Hays (1997) examined three MCMI-III scales on 283
psychiatric inpatients. Dyce, O'Connor, Parkins, and Janzen (1997) examined the
intercorrelations of MCMI-III personality disorder scales on 614 university partici-
pants. In keeping with accepted standards of construct validity (American Psycho-
logical Association, 1985, Standards 1.8 and 1.9; see also Campbell & Fiske, 1959),
we evaluated the MCMI-III in terms of convergent and discriminant validity. Mea-
sures of the same disorder (i.e., convergent validity) were expected to have moder-
ately high correlations that are greater in magnitude than either correlations with
different disorders or scale intercorrelations (i.e., discriminant validity).
As reported in Table 2, we combined the three studies to evaluate convergent
and discriminant validity of the MCMI-III. Each of the three studies has method-
ological limitations. In addition to the previously outlined problems with Millon's
(1994) research, Davis and Hays (1997) focused on only three personality disorder
scales, while Dyce et al. (1997) employed a nonclinical sample. For purposes of
clarity, we also report Millon's data separately so that readers can inspect directly
his MCMI-III validation.5 Table 2 addresses convergent validity and one form of
discriminant validity, namely the intercorrelations of MCMI-III scales (i.e., hetero-
trait-monomethod correlations). The second component of discriminant validity is
whether the other measures (e.g., clinician ratings) are more correlated with the
target scales than other scales (i.e., heterotrait-heteromethod correlations); these
critical data were not presented by Millon (1994) or the other MCMI-III studies.
As further evidence of construct validity, Bagozzi and Yi (1991) recommended
that the proportion of cases in which discriminant validity correlations exceed
convergent validity coefficients (termed "comparison violations") be computed.
When comparison violations exceed 33%, construct validity is considered "low."
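
As a rough illustration of how a comparison-violation proportion might be computed for a single scale, consider the sketch below. The function name and the correlation values are hypothetical, and the comparison uses signed correlations, which is an assumption; the text does not state whether signed or absolute values were compared.

```python
def comparison_violation_rate(convergent_r, discriminant_rs):
    """Share of discriminant correlations that exceed the convergent correlation."""
    if not discriminant_rs:
        raise ValueError("at least one discriminant correlation is required")
    violations = sum(1 for r in discriminant_rs if r > convergent_r)
    return violations / len(discriminant_rs)


# Hypothetical values for one scale (not taken from any MCMI study).
convergent = 0.24
discriminant = [0.41, 0.10, 0.35, 0.30, 0.05, 0.28]

rate = comparison_violation_rate(convergent, discriminant)
# The 33% cutoff for "low" construct validity is the one cited above.
print(f"{rate:.2f}", "low" if rate > 0.33 else "not low")
```

In this invented example, four of the six discriminant correlations exceed the convergent correlation, so the proportion (.67) falls above the 33% cutoff.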
5 Millon (1994) also reports correlations between MCMI-III Axis II scales and measures of
psychopathology that are typically associated with Axis I syndromes (e.g., Beck Depression Inventory, State-Trait
Anxiety Inventory, SCL-90-R, MMPI-2 clinical scales). As expected, these correlations do not appear
to form any particular pattern with MCMI-III Axis II scales. For the specific validation of the MCMI-
III for Axis II disorders, these comparisons are uninformative.
Table 2. Evidence of Convergent and Discriminant Validity of the MCMI-III Axis II Scales for the
Assessment of Personality Disorders

                                     Construct validity
MCMI-III scale           Convergent    Discriminant    Comparison violations
1 Schizoid               .24 (.24)     .41 (.28)       .74 (.77)
2A Avoidant              .24 (.24)     .35 (.36)       .73 (.77)
2B Depressive            .18 (.18)     .34 (.36)       .77 (.77)
3 Dependent              .07 (.07)     .42 (.29)       .77 (.77)
4 Histrionic             .31 (.31)     -.31 (-.37)     .12 (.15)
5 Narcissistic           .27 (.27)     -.12 (-.24)     .12 (.15)
6A Antisocial            .25 (.25)     .27 (.28)       .70 (.77)
6B Aggressive            .14 (.14)     .34 (.30)       .77 (.77)
7 Compulsive             .30 (.30)     -.31 (-.33)     .08 (.15)
8A Passive-Aggressive    .23 (.23)     .41 (.40)       .77 (.77)
8B Self-Defeating        .20 (.20)     .46 (.36)       .77 (.77)
S Schizotypal            .16 (.16)     .38 (.36)       .77 (.77)
C Borderline             .24 (.24)     .41 (.38)       .77 (.77)
P Paranoid               .19 (.19)     .38 (.34)       .77 (.77)
M of Axis II Scales      .22 (.22)     .25 (.24)       .62 (.64)
Note. Combined data from three MCMI-III validity studies (Davis & Hays, 1997; Dyce, O'Connor,
Parkins, & Janzen, 1997; Millon, 1994). For purposes of comparison, correlations based on Millon's
(1994) own validation data are presented in parentheses. Convergent validity, correlations between
external criteria and target MCMI-III scales. Discriminant validity, the intercorrelations of the MCMI-
III Axis II scales (i.e., heterotrait-monomethod correlations). Comparison violations, proportion of
discriminant correlations that exceed convergent correlations.
As summarized in Table 2, we failed to establish construct validity on three
grounds:
1. The convergent validities were disconcertingly low, ranging from .07 to .31
(M r = .22). In keeping with the Fiske and Campbell (1992) guidelines,
only 2 of the 14 scales evidenced even marginal convergent validity (Scale
4, r = .31; Scale 7, r = .30). However, each scale accounted for only a
nominal percentage (<10%) of the variance.
2. For 11 scales (i.e., 1, 2A, 2B, 3, 6A, 6B, 8A, 8B, S, C, and P), the M
discriminant correlations are higher than convergent validities.
3. Axis II scales averaged an unacceptable 62% comparison violations.
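
The three points above can be checked against the combined-sample values in Table 2. The sketch below is only a reader's check under stated assumptions: the dictionary is transcribed from the non-parenthesized Table 2 entries, the variable names are illustrative, and simple arithmetic means are used, which may differ from the averaging procedure actually applied.

```python
# Combined-sample values transcribed from Table 2:
# scale -> (convergent, discriminant, comparison violations)
table2 = {
    "1": (.24, .41, .74), "2A": (.24, .35, .73), "2B": (.18, .34, .77),
    "3": (.07, .42, .77), "4": (.31, -.31, .12), "5": (.27, -.12, .12),
    "6A": (.25, .27, .70), "6B": (.14, .34, .77), "7": (.30, -.31, .08),
    "8A": (.23, .41, .77), "8B": (.20, .46, .77), "S": (.16, .38, .77),
    "C": (.24, .41, .77), "P": (.19, .38, .77),
}

# Point 1: scales whose convergent correlation reaches .30
marginal = [s for s, (conv, _, _) in table2.items() if conv >= .30]

# Point 2: scales whose mean discriminant correlation exceeds the convergent one
higher = [s for s, (conv, disc, _) in table2.items() if disc > conv]

# Points 1 and 3: simple arithmetic means across the 14 scales (an assumption)
mean_conv = sum(v[0] for v in table2.values()) / len(table2)
mean_viol = sum(v[2] for v in table2.values()) / len(table2)

print(marginal, len(higher), round(mean_conv, 2), round(mean_viol, 2))
```

Under these assumptions the tabled values reproduce the figures in the list: two scales at or above .30, 11 scales with discriminant exceeding convergent correlations, a mean convergent validity of .22, and mean comparison violations of .62.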
Beyond the reported efforts to establish construct and criterion-related validity,
Millon (1994) also addressed the content validity of the MCMI-III by the creation
of 95 new items to "ensure that all MCMI-III scales paralleled the diagnostic criteria
proposed for DSM-IV" (p. 17). The challenge of content validity is not simply
superficial similarities in content, but whether the item accurately ascertains the
domain. For Axis II disorders, the challenge is threefold: (a) Can we assess the
relationship between item endorsement and the presence of that personality charac-
teristic? (b) Can we establish the necessary "clinically significant distress or impair-
ment" resulting from a constellation of personality characteristics, i.e., Criterion C
of personality disorders (American Psychiatric Association, 1994, p. 630)? (c) Can
we establish its enduring presence from adolescence or early adulthood (i.e., Crite-
rion D)? Unfortunately, Millon (1994) did not describe the procedures in sufficient
detail for either (a) the process of selecting items or (b) expert judgments regarding
item content. Therefore, we are not in a position to evaluate the rigor of the
methodology or the level of consensus achieved by experts.6 We also heed the
warning of Anastasi (1988) that content validation of personality measures is often
necessary for their development, but insufficient for their validation.
In summary, the MCMI-III is fundamentally different in item content, scale
composition, and scoring from its predecessors. The basic MCMI-III validation rests on
one, albeit very large, study that has been assailed by Retzlaff (1996) for its lack
of methodological rigor. Millon (1994) and the two more recent MCMI-III studies
have failed to establish its construct validity. Finally, Retzlaff's (1996) reanalysis
of Millon's own data indicated diagnostic inaccuracy of the MCMI-III for Axis
II disorders.
MCMI AND MCMI-II
A challenge in assessing the current diagnostic validity of the MCMI and
MCMI-II is that their external criteria are often outmoded. Studies with formal
diagnoses are based on outdated diagnostic standards (i.e., DSM-III or DSM-III-
R) or earlier versions of structured interviews, e.g., Structured Interview for DSM-
III Personality Disorders (SIDP; Pfohl, Stangl, & Zimmerman, 1982). We believe
that discarding these studies in toto would be unduly stringent. Although estimates
of diagnostic efficiency (e.g., PPP and Negative Predictive Power [NPP]) are neces-
sarily constrained, construct validity can still be established via a multitrait-
multimethod (MTMM) approach to convergent and discriminant validity (Camp-
bell & Fiske, 1959). In addressing the standards of scientific validity, the American
Psychological Association (1985, p. 46) declared that construct validity is "of pri-
mary importance for clinical and personality tests."
The MTMM model provides estimates of convergent validity (monotrait-heter-
omethod) in comparison to two components of discriminant validity: (a) intercorre-
lations for the same measure (heterotrait-monomethod) and (b) correlations across
different scales and measures (heterotrait-heteromethod). Two studies (Morey &
LeVine, 1988; Wise, 1994) provide useful illustrations of this approach;
unfortunately, both studies were restricted to only two measures, namely, the MCMI and
the MMPI personality disorder scales (Morey, Waugh, & Blashfield, 1985). For the
current study, we utilized method effects (Marsh, 1990) to examine specifically the
convergent and discriminant validity of the MCMI and MCMI-II.
6 American Psychological Association (1985) stated that logical and empirical procedures might be used
instead of expert ratings. Still, these procedures need to be explicated.
Method
Compilation of MCMI and MCMI-II Studies
We performed computerized bibliographic searches for articles on all versions
of the MCMI. We then reviewed the studies to determine whether each article
pertained to personality disorders. Specifically, our search procedures included (a)
a search of PsychLit for the inclusive years of 1983-1997 and (b) examination of
MCMI test manuals, previous MCMI reviews, and empirical studies for references
omitted from PsychLit. As a final step, we reviewed recent issues (i.e., last 12
months) of major assessment journals and personality journals (i.e., Assessment,
Journal of Clinical Psychology, Journal of Personality Assessment, Journal of Per-
sonality Disorders, and Psychological Assessment) for any studies or reviews that
were too recent to be included in the PsychLit data base. Studies were checked to
ensure that the same data were not reported in multiple investigations (Beamon,
1991). For studies with insufficient data on effect sizes, we contacted the principal
investigators for complete data on a minimum of two occasions. In total, we found
33 investigations that addressed MCMI or MCMI-II personality disorders in rela-
tionship to other measures of Axis II disorders. Four additional studies were
dropped for insufficient data.
Effect Sizes
In the present study, we provided effect size estimates to allow systematic
comparisons across studies. First, we expressed the relationship between a version
of the MCMI and a second personality measure in terms of the product-moment
correlations. Rosenthal (1991) recommended the use of r over d for three reasons:
(a) no adjustments are needed as occurs when moving from independent to corre-
lated t tests, (b) data for the calculation of d are often not provided and difficult
to obtain, and (c) r affords a simpler and more practical estimate than d (e.g., direct
estimate of shared variance).
We also transformed the correlations to Fisher's zs to normalize the distribu-
tion. We then combined the Fisher zs across studies for each of the personality
disorders after they had been weighted by sample size using a weighted least-
squares procedure. This provided a mean weighted Fisher's z for each personality
disorder and a test of significance of that z. After the inferential statistical tests
were completed, we transformed the mean Fisher's z for a characteristic back into
product-moment correlations for descriptive purposes.
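
The transformation steps described above can be sketched as follows. This is a simplified illustration, not the procedure as actually implemented: it assumes each study's weight is simply its sample size, it reduces the weighted least-squares step to a weighted mean of the zs, and it omits the significance test. The function name and the example values are hypothetical.

```python
import math


def weighted_mean_r(correlations, sample_sizes):
    """Pool correlations through Fisher's z, weighting each study by its N."""
    if not correlations or len(correlations) != len(sample_sizes):
        raise ValueError("need one sample size per correlation")
    # r -> z (Fisher's transformation); requires |r| < 1
    zs = [math.atanh(r) for r in correlations]
    total_n = sum(sample_sizes)
    mean_z = sum(n * z for n, z in zip(sample_sizes, zs)) / total_n
    # z -> r, for descriptive reporting
    return math.tanh(mean_z)


# Hypothetical pair of study results (r, N) for one personality disorder.
pooled = weighted_mean_r([0.27, 0.45], [85, 119])
print(f"weighted mean r = {pooled:.2f}")
```

Because the larger study carries more weight, the pooled value lies between the two correlations and closer to the one from the larger sample.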
We followed the conventions of Campbell and Fiske (1959) for establishing
the effect sizes for convergent and discriminant validity. For convergent validity
(monotrait-heteromethod), correlations were recorded for specific MCMI Axis II
scales and the corresponding personality disorder as measured by an external crite-
rion. Two forms of discriminant validity were examined: (a) heterotrait-mono-
method correlations were inspected for each scale via the mean intercorrelations
of other MCMI Axis II scales with that scale, and (b) heterotrait-heteromethod
correlations for each scale were addressed by the mean correlation of noncorre-
sponding scales from an external measure.
Interpretation of Convergent and Discriminant Validity
We established a priori standards for effect sizes based on conventions from
the convergent-discriminant validity literature. We employed Browne's (1989) and
Bagozzi and Yi's (1991) recommendation that "significant" effect sizes for conver-
gent validity account for at least 50% of the variance (i.e., rs > .70). We also
followed the Fiske and Campbell (1992) guidelines and described as "modest"
effect sizes ranging from .30 to .50; while statistically significant, they account for
only a nominal percentage of the variance.
The basic premise of discriminant validity is that the convergent validity (mono-
trait-heteromethod) must exceed the discriminant validity (both heterotrait-mono-
method and heterotrait-heteromethod). As an example, correlations for Scale 5
(Narcissistic) should be higher for the narcissistic personality disorder (i.e., mono-
trait-heteromethod) than for such disorders as borderline or histrionic (i.e., hetero-
trait-heteromethod) or intercorrelations among MCMI scales (heterotrait-mono-
method). Cases where this does not occur are termed "comparison violations." We
followed the guidelines of Bagozzi and Yi (1991) and Byrne and Goffin (1993) for
the percentage of comparison violations: high (<5%), moderate (5-33%), and low
(>33%) discriminant validity.
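
These a priori standards can be restated as simple decision rules. The sketch below is illustrative only: the function names are hypothetical, the handling of exact boundary values is an assumption, and correlations that fall outside the two described ranges are returned as "unlabeled" because the text assigns them no term. Note that r = .70 corresponds to a shared variance of .49, so the 50% criterion is approximate.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance shared by two measures (r squared)."""
    return r * r


def convergent_label(r: float) -> str:
    """Label a convergent validity coefficient using the stated conventions."""
    if r > 0.70:
        return "significant"
    if 0.30 <= r <= 0.50:
        return "modest"
    return "unlabeled"  # ranges not given a term in the text


def discriminant_label(violation_rate: float) -> str:
    """Label discriminant validity from the proportion of comparison violations."""
    if violation_rate < 0.05:
        return "high"
    if violation_rate <= 0.33:
        return "moderate"
    return "low"


print(convergent_label(0.31), round(shared_variance(0.31), 2))
print(discriminant_label(0.62), discriminant_label(0.12))
```

For example, a convergent correlation of .31 is labeled "modest" and shares roughly 10% of the variance, while a 62% violation rate is labeled "low" discriminant validity.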
Results
Table 3 is a compilation of 33 MCMI and MCMI-II studies that includes
demographic data and a methodological summary of mean effect sizes. The majority
Table 3. Demographic and Methodological Characteristics for 33 MCMI and MCMI-II Studies of
Personality Disorders with Effect Sizes (M r)

Authors  MCMI version  Criterion measure  Sample  Axis II examined  N  Age (years)  % Male  Mean r
Auerbach (1984)  I  NPI  Undergraduate students  1  50  36.1  51.0  .55
    RNPI  .59
Cantrell & Dana (1987)  I  CL  Outpatients  11  72  33.7  40.0  .36(k)
Chatham, Tibbals, & Harrington (1993)  I  MMPI-Ash  Outpatients  1  70  100.0  -.26
    MMPI-W&G  .37
    MMPI-Morey  .66
    MMPI-Raskin  .68
    MMPI-Mfl  -.04
Chick, Sheaffer, & Goggin (1993)  I  PSCL  Inpatients and outpatients  11  101  44.0  93.0  .08
Dubro & Wetzler (1989)  I  MMPI-PD  Inpatients and outpatients  11  56  42.4  93.0  .40
Hart, Dutton, & Newlove (1993)  II  PDE  Outpatients  11  85  36.1  100.0  .27
Hart, Forth, & Hare (1991)  II  PCL-R  Inmates  1  119  30.3  100.0  .45
Hogg, Jackson, Rudd, & Edwards (1990)  I  SIDP  Inpatients  11  40  26.3  80.0  .27
Jackson, Gazis, Rudd, & Edwards (1991)  I  SIDP  Inpatients  11  82  31.2  62.0  .29
Kennedy, McVey, & Katz (1990)  I  DIB  Inpatients  1  44  26.4  0.0  .54
King (1994)  I  CL  Outpatients  12  82  >21  97.6  0.0(k)
Lewis & Harder (1991)  I  CL  Outpatients  1  60  29.0  45.0  .37
    DIB  .43
    Kernberg  .77
    BSI  .77
Marlowe, Husband, Bonieskie, Kirby, & Platt (1997)  II  SCID-II  Outpatients  13  144  32.7  66.0  .36
McCann (1989)  I  MMPI-PD  Inpatients  11  47  28.3  55.0  .47
McCann (1991)  II  MMPI-PD  Inpatients  11  80  30.1  53.0  .62
Millon (1983)  I  CL  Inpatients and outpatients  14  978  58.0
Millon (1987)  II  CL  Inpatients and outpatients  13  859
Morey & LeVine (1988)  I  MMPI-PD  Inpatients and outpatients  11  76  37.9  27.6  .52
Nazikian, Rudd, Edwards, & Jackson (1990)  I  SIDP  Inpatients  11  31  33.0  52.0  .31
Patrick (1993)  I  SIDP  Inpatients and outpatients  1  198  35.4  56.0  .37
Prifitera & Ryan (1984)  I  NPI  Inpatients  1  50  36.1  96.0  .66
Reich, Noyes, & Troughton (1987)  II  SIDP  Community volunteers  11  88  37.7  41.0  .24
Renneberg, Chambless, Dowdall, Fauerbach, & Gracely (1992)  I  SCID-II  Outpatients  8  54  34.4  22.2  .28
Sansone & Fine (1992)  I  BSI, DIB  Inpatients and outpatients  1  28  33.2  0.0  .75
Schuler, Snibbe, & Buckwalter (1994)  I  MMPI-PD  Inpatients  11  104  36.9  40.4  .48
Smith-Silberman, Roth, Segal, & Burns (1997)  II  CATI  Inpatients  13  30  63.3  40.0  .43
Soldz, Budman, Demby, & Merry (1993)  II  PDE  Outpatients  12  97  36.8  45.0  .41
Torgersen & Alnaes (1990)  I  SIDP  Outpatients  10  272  30.0  .26
Widiger & Sanderson (1987)  I  PIQ  Inpatients  4  53  27.1  60.0  .40
Wierzbicki & Gorman (1995)  II  PDQ-R  Students  11  113  23.9  .45
Wise (1994)  I  MMPI-PD  Inpatients  11  72  47.0  31.0  .44
Wise (1996)  II  MMPI-2-PD  Inpatients  11  72  45.0  29.0  .56
Zarrella, Schuerger, & Ritz (1990)  I  MMPI-MWB  Outpatients  11  100  36.7  51.0  .48
    Inmates  11  212  29.7  100.0  .36
Note. Four additional studies (Curtis & Cowell, 1993; Divac-Jovanovic, Svrakic, & Lecic-Tosevski, 1993;
Inch & Crossley, 1993; Piersma, 1987) of the MCMI and personality disorders are not included because
insufficient data were available in publications and attempts to secure additional data from authors were
unsuccessful. CL, Clinician diagnoses; BSI, Borderline Syndrome Index; DIB-R, Diagnostic Interview for
Borderlines-Revised; MMPI-PD, Morey et al.'s personality scales for the MMPI; NPI, Narcissistic
Personality Inventory; PCL-R, Psychopathy Checklist-Revised; PDE, Personality Disorder Examination;
PDI, Personality Disorder Inventory; PDQ, Personality Disorder Questionnaire; PIQ, Personality Inter-
view Questions; PSCL, Personality Symptom Checklist; RNPI, Revised Narcissistic Personality Inven-
tory; SCID-II, Structured Clinical Interview for DSM-III-R Axis II; SIDP, Structured Interview for
DSM-III Personality Disorders; CATI, Coolidge Axis II Inventory; Es, average effect size across the
different personality disorders estimated by Pearson r, k, agreement between clinician diagnosis and
MCMI personality disorders calculated with a kappa coefficient.
of Axis-II studies were devoted to the original MCMI (23 of 33, or 70.0%). Notably,
most of the studies were roughly comparable on the basis of gender. The meta-
analysis also provided a good representation of clinical samples: inpatients only
(12, or 36.4%), outpatients only (10, or 30.3%), and inpatient and outpatients com-
bined (7, or 21.2%). The remaining four studies (12.1%) were composed of inmates
and community samples.
Table 4 presents a summary of the convergent and discriminant validity of
MCMI scales for personality disorders. It provides information (correlations, and
weighted and unweighted Fisher z scores) for convergent and discriminant validity.
Table 4 also displays the percentage of comparison violations for each
personality disorder. Interestingly, M zr and zrw are almost identical, indicating that
differences in the size of individual samples do not have an appreciable bearing on
effect sizes. Averaging across disorders, we found that MCMI-II achieved moder-
ately higher convergent correlations than MCMI (i.e., .45 versus .36), but at the
expense of discriminant validity (pooled correlations of .20 versus .06). Reassuringly,
all significant convergent validities on MCMI remained significant on MCMI-II.
Striking differences were found for comparison violations. MCMI had no scales
with good discriminant validity (<5% comparison violations); in contrast, MCMI-
II had five scales (2A, 3, 4, S, and P) with good discriminant validity.
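The comparison of unweighted and sample-size-weighted mean effect sizes can be illustrated with a brief sketch. It assumes the conventional Fisher r-to-z transformation with each study weighted by N - 3 (the inverse of the sampling variance of z); the exact weighting used in this meta-analysis is not stated here, and the (r, N) pairs in the example are hypothetical.

```python
import math
from typing import Sequence, Tuple

Study = Tuple[float, int]  # (correlation r, sample size N)


def mean_r_unweighted(studies: Sequence[Study]) -> float:
    """Average the Fisher z values with equal weights, then back-transform to r."""
    zs = [math.atanh(r) for r, _ in studies]
    return math.tanh(sum(zs) / len(zs))


def mean_r_weighted(studies: Sequence[Study]) -> float:
    """Average the Fisher z values weighted by N - 3 (assumed weighting),
    then back-transform to r."""
    if any(n <= 3 for _, n in studies):
        raise ValueError("each study needs N > 3 for the N - 3 weight")
    total_weight = sum(n - 3 for _, n in studies)
    weighted_sum = sum((n - 3) * math.atanh(r) for r, n in studies)
    return math.tanh(weighted_sum / total_weight)


if __name__ == "__main__":
    # Hypothetical studies: a large sample with a small r, a small sample with a large r.
    studies = [(0.30, 103), (0.60, 13)]
    print(round(mean_r_unweighted(studies), 3))
    print(round(mean_r_weighted(studies), 3))
```

When the two estimates nearly coincide, as reported for Table 4, sample size is not systematically related to the magnitude of the correlations; in the hypothetical example they diverge because the larger study has the smaller correlation.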
In summary, the convergent validity estimates range markedly for both the
MCMI (rs from -.11 to .56) and MCMI-II (rs from -.05 to .69). As observed, the
MCMI-II has superior convergent validity to the original MCMI. In addition, the
MCMI-II appears to have improved its discriminant validity by reducing the comparison
violations from .20 in the original version to .11. More specifically, both the
heterotrait-monomethod (M r = .21) and heterotrait-heteromethod (M r = .18)
correlations remain low for 9 of the 13 MCMI-II scales.
APPLICATION OF DAUBERT
In clinical and forensic practice, most MCMI interpretations are based on
positive findings (e.g., the presence of an Axis II disorder). In such cases, the most
relevant statistic for error rate a la Daubert is false-positives. When negative findings
(e.g., the absence of an Axis II disorder) are offered, then the most relevant statistic
is false-negatives. Although the Court did not specify any minimum standard for
considering the error rate, we submit that the scientific community is unlikely to
support the "trustworthiness" of any finding that is more likely than not to be false
(Marlowe, 1995; Rogers & Shuman, in press). To achieve this minimum standard,
a test must achieve at least a positive predictive power (PPP) ≥ .50 for positive
findings, and negative predictive power (NPP) ≥ .50 for negative findings.
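The minimum standard described above can be expressed as a short calculation. The sketch below computes PPP and NPP from classification counts and checks each against .50; the function names are hypothetical, the threshold is assumed to be inclusive (at least .50), and the counts are invented solely to illustrate a test that fails for positive findings while passing for negative findings.

```python
def positive_predictive_power(true_pos: int, false_pos: int) -> float:
    """Proportion of positive test findings that are correct."""
    return true_pos / (true_pos + false_pos)


def negative_predictive_power(true_neg: int, false_neg: int) -> float:
    """Proportion of negative test findings that are correct."""
    return true_neg / (true_neg + false_neg)


def meets_minimum_standard(predictive_power: float, threshold: float = 0.50) -> bool:
    """A finding type passes if it is not more likely than not to be false.

    Assumption: the cutoff is inclusive (predictive power of at least .50).
    """
    return predictive_power >= threshold


if __name__ == "__main__":
    # Hypothetical counts: 100 positive findings and 100 negative findings.
    ppp = positive_predictive_power(true_pos=18, false_pos=82)
    npp = negative_predictive_power(true_neg=93, false_neg=7)
    print(ppp, meets_minimum_standard(ppp))
    print(npp, meets_minimum_standard(npp))
```

Because the false-positive percentage among positive findings is simply 1 - PPP, a PPP below .50 means that a positive finding is more likely wrong than right.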
The Court was unambiguous in its discussion of scientific validity as a specific
rather than global concept. As articulated by Justice Blackmun (Daubert, 1993, p.
2796), "scientific validity for one purpose is not necessarily scientific validity for
other unrelated purposes." The litmus test is the "fit" between a relevant issue and
the scientific validity of a measure for this issue. In this context, the scientific validity
of MCMI versions is not a unitary determination, but rather a series of specific
decisions related to discrete disorders and particular forensic applications.
Admissibility of the MCMI-III

To demonstrate the presence of an Axis II disorder, the MCMI-III does not
appear to meet Daubert for admissibility. Unlike the MCMI and MCMI-II, direct
data are available on the MCMI-III and DSM-IV disorders. As summarized by
Retzlaff (1996), any presentation of positive findings for Axis II disorders is likely
to be wrong 4 of 5 times (i.e., PPP = .18; false-positives = 82%).7 For negative
findings, the MCMI-III appears very effective (i.e., NPP = .93; false-negatives
= 7%).
The MCMI-III does not appear to reach Daubert's threshold for scientific
validity with respect to criterion-related or construct validity. As previously re-
viewed, the study to establish its criterion-related validity (Millon, 1994) has come
under severe criticism for its methodology and has yielded less than satisfactory
results (Retzlaff, 1996). On the matter of construct validity, Millon's own data
suggest that convergent validities for Axis II disorders are low (M r = .22), account-
ing for a minuscule percentage of the variance (4.8%). In the majority of comparisons
(62%), the three MCMI-III studies found that the discriminant validities exceed
the convergent validities. In light of the widely accepted Campbell and Fiske model,
the MCMI-III does not meet the minimum standards of construct validity.8
MCMI and MCMI-II

In the absence of DSM-IV studies, direct comparisons of MCMI and MCMI-
II scales to current diagnoses are not possible. However, substantial data exist on
construct validity of Axis II disorders. As highlighted in Table 4, convergent and
discriminant validity was found for 8 of the 12 original MCMI scales. Most notably,
Scale 2A (avoidant) evidenced moderate convergent validity with an average r of
.56 that accounted for 31.4% of the variance. Seven other scales evidenced modest
convergent validities (M rs from .38 to .49); the limitation of these convergent rs
is that they account for only modest percentages (i.e., 14.4-24.0%) of the variance.
Critical to construct validity, all MCMI scales averaged much lower on both types of
discriminant validity (i.e., heterotrait-monomethod and heterotrait-heteromethod
correlations) than convergent correlations.
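The percentages of variance cited for these convergent correlations follow from squaring r. As a brief check, the sketch below (with a hypothetical helper name) reproduces the figures reported in the text for the MCMI scales and for the MCMI-III mean convergent validity of .22.

```python
def percent_variance_explained(r: float) -> float:
    """Shared variance implied by a correlation, as a percentage (100 * r squared)."""
    return round(100 * r * r, 1)


if __name__ == "__main__":
    # Correlations taken from the text: Scale 2A (.56), the range for the
    # seven modest scales (.38 to .49), and the MCMI-III mean (.22).
    for r in (0.56, 0.38, 0.49, 0.22):
        print(r, percent_variance_explained(r))
```

Even a convergent correlation of .56, the highest reported for the original MCMI, leaves more than two thirds of the variance in the criterion unexplained.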
The MCMI-II outperformed the MCMI with respect to both convergent validity
and comparison violations. Three scales (2A, S, and C) yielded moderate convergent
rs that exceeded .50; these correlations accounted for 26.0-36.0% of the variance.
For two scales (2A and S), the discriminant validity remained high (<5% comparison
violations), while the third scale fell in the upper range of moderate discriminant
validity (8% comparison violations). All the scales on the MCMI-II with the excep-
tion of Scale 7 (-.05) manifested modest validity coefficients, as defined by Fiske
and Campbell (1992), and moderate to high discriminant validity.
Does the MCMI-II9 measure up to Daubert? We have found no court rulings
on the admissibility of the MCMI-II Axis II classifications under Daubert. Based
on our analysis of the MCMI-II, we offer the following observations:
7Although the article focuses on Axis II disorders, the MCMI-III produces very modest results for
clinical and severe syndromes with a median PPP of .31 (false-positives = 69%).
8Although Millon (1994) addressed the convergent validity of clinical syndromes, Axis II disorders were
only addressed indirectly with less than satisfactory results (see Appendix I, pp. 126-130).
9In light of the superior results for MCMI-II and coupled with the ethical requirement to avoid outmoded
tests (American Psychological Association, 1992), we limited this discussion to MCMI-II.
1. The MCMI-II cannot be equated with DSM-IV diagnoses. At present, the
error rate has not been established.
2. For Avoidant (2A), Schizotypal (S), and Borderline (C) personality disor-
ders, the MCMI-II offers good construct validity and can provide descriptive
data about these disorders.10
3. For Schizoid (1), Dependent (3), Histrionic (4), Narcissistic (5), Antisocial
(6A), Aggressive (6B), Negativistic (8A), Self-Defeating (8B), and Paranoid
(P) scales, modest construct validity was established. Clinicians may offer
hypotheses, but not conclusions about these scales. This observation is based
on the need to describe accurately the validity and reliability of tests (Ameri-
can Psychological Association, 1992, Standard 2.08) and acknowledge the
limitation of their data and conclusions (Standard 7.04[b]).
4. Compulsive Personality Disorder (7) has no demonstrable convergent-dis-
criminant validity and should not be used in clinical interpretation.
In closing, we respectfully differ from McCann and Dyer's (1996) declarations
about the wide applicability of the MCMI-II to criminal and civil forensic issues.
While the MCMI-II offers useful data on three Axis II disorders, its applicability
to forensic issues remains virtually untested. Many of the forensic issues (e.g., sanity)
have not been systematically evaluated with the MCMI-II (Rogers & Shuman, in
press). Therefore, the MCMI-II cannot be employed to address elements of legal
standards. In light of Daubert, the MCMI-II provides useful clinical data on Avoid-
ant, Schizotypal, and Borderline personality disorders.
CONCLUSIONS
Forensic psychologists have a professional obligation to review their use of
available psychological measures in light of Daubert. Although the burden in Daubert
falls on the shoulders of trial judges to establish the admissibility of scientific
evidence, critical reviews of specific tests, such as the MCMI-II and MCMI-III, may
provide the necessary framework to both experts and the courts. In the current
paper, we first examined the scientific evidence for the validity of the most current
version, MCMI-III, for Axis II disorders and found it markedly deficient with
respect to both criterion-related and construct validity. Our subsequent reviews of
the MCMI and MCMI-II focused specifically on construct validity. On a positive
note, we found good evidence of construct validity for three MCMI-II personality
disorders. On a sobering note, the remaining Axis II disorders have insufficient
construct validity for rendering any firm conclusions. In light of the standards for
test validation (American Psychological Association, 1985, p. 13) coupled with the
dictum on scientific validity in Daubert (1993, p. 2796), validity only exists in specific
10As a cautionary note, findings of Libb, Stankovic, Freeman, Sokol, Switzer, and Houck (1990) suggest
that Axis I-Axis II interactions may invalidate these scales. On the MCMI, depressed outpatients
warranted 55.5% fewer Schizoid, Avoidant, and Borderline personality disorders after only 12 weeks
of treatment for their depression. This potential confound of Axis I-Axis II disorders deserves further
investigation on the MCMI-II.
applications. With the MCMI-II, these applications are narrowly defined by its
circumscribed validity for Axis II disorders.
Within a larger framework, we envision the need to reevaluate systematically
commonly used psychological tests and measures in light of the Daubert standard.
As observed in post-Daubert decisions, appellate courts rely on the professional
literature in their determinations of falsifiability, error rate, and scientific acceptabil-
ity of psychological tests. Systematic reviews a la Daubert may address either general
issues of diagnostic validity or focus specifically on psycholegal constructs (e.g.,
competency to stand trial). Such scholarly efforts will assist both forensic psycholo-
gists in selecting psychological measures and the trial courts in determining their ac-
ceptability.
ACKNOWLEDGMENTS
Regarding Daubert and subsequent appellate cases, we would like to thank
Daniel W. Shuman, J. D., Keith R. Cruise, M. A., M. L. S., and two anonymous
reviewers for their insightful comments and review.
REFERENCES11
American Psychiatric Association (1968). Diagnostic and statistical manual of mental disorders (2nd
ed.). Washington, DC: Author.
American Psychiatric Association (1980). Diagnostic and statistical manual of mental disorders (3rd ed.).
Washington, DC: Author.
American Psychiatric Association (1987). Diagnostic and statistical manual of mental disorders (3rd ed.-
rev.). Washington, DC: Author.
American Psychiatric Association (1994). Diagnostic and statistical manual of mental disorders (4th ed.).
Washington, DC: Author.
American Psychological Association (1985). Standards for educational and psychological testing. Wash-
ington, DC: Author.
American Psychological Association (1992). Ethical principles of psychologists and code of conduct.
American Psychologist, 47, 1597-1611.
Anastasi, A. (1988). Psychological testing (6th ed.). New York: Macmillan.
* Auerbach, J. S. (1984). Validation of two scales for narcissistic personality disorder. Journal of Personal-
ity Assessment, 48, 649-653.
Bagozzi, R. P., & Yi, Y. (1991). Multitrait-multimethod matrices in consumer research. Journal of
Consumer Research, 17, 426-439.
Beamon, A. L. (1991). An empirical comparison of meta-analytic and traditional reviews. Personality
and Social Psychology Bulletin, 17, 252-257.
Borum, R., & Grisso, T. (1995). Psychological test use in criminal forensic evaluations. Professional
Psychology: Research and Practice, 26, 465-473.
Browne, M. W. (1989). Relationships between an additive model and a multiplicative model for multitrait-
multimethod matrices. In R. Coppi & S. Bolasco (Eds.), Multiway data analysis (pp. 507-520).
Amsterdam: Elsevier.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A., & Kaemmer, B. (1989). Manual for the
administration and scoring of the MMPI-2. Minneapolis, MN: University of Minnesota Press.
Byrne, B. M., & Goffin, R. D. (1993). Modeling MTMM data from additive and multiplicative covariance
structures: An audit of construct validity concordance. Multivariate Behavioral Research, 28, 67-96.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-
multimethod matrix. Psychological Bulletin, 56, 81-105.
11Asterisks denote references used in the meta-analysis.

*Cantrell, J. D., & Dana, R. H. (1987). Use of the Millon Clinical Multiaxial Inventory (MCMI) as a
screening instrument at a community mental health center. Journal of Clinical Psychology, 43,
366-375.
Carmichael v. Samyang Tire, Inc., 131 F.3d 1433 (11th Cir. 1997).
Chapple v. Ganger, 851 F. Supp. 1481 (E.D. Wash. 1994).
*Chatham, P. M., Tibbals, C. J., & Harrington, M. E. (1993). The MMPI and the MCMI in the evaluation
of narcissism in a clinical sample. Journal of Personality Assessment, 60, 239-251.
*Chick, D., Sheaffer, C. I., & Goggin, W. C. (1993). The relationship between MCMI personality
scales and clinician generated DSM-III-R personality disorder diagnoses. Journal of Personality
Assessment, 61, 264-276.
Curtis, J. M., & Cowell, D. R. (1993). Relation of birth order and scores on measures of pathological
narcissism. Psychological Reports, 72, 311-315.
Daubert v. Merrell Dow Pharmaceuticals, Inc., 113 S.Ct. 2786 (1993).
Davis, S. E., & Hays, L. W. (1997). An examination of the clinical validity of the MCMI-III Depressive
Personality Scale. Journal of Clinical Psychology, 53, 15-23.
Divac-Jovanovic, M., Svrakic, D., & Lecic-Tosevski, D. (1993). Personality disorders: Model for concep-
tual approach and classification. American Journal of Psychotherapy, 47, 558-571.
*Dubro, A. F., & Wetzler, S. (1989). An external validity study of the MMPI personality disorder scales.
Journal of Clinical Psychology, 45, 570-575.
*Dyce, J. A., O'Connor, B. P., Parkins, S. Y., & Janzen, H. L. (1997). Correlational structure of the
MCMI-III personality disorder scales and comparison with other data sets. Journal of Personality
E.I. du Pont de Nemours and Company Inc. v. Robinson, 923 S.W.2d 549 (Tex. 1995).
Fiske, D. W., & Campbell, D. T. (1992). Citations do not solve problems. Psychological Bulletin,
112, 393-395.
Frye v. United States, 293 F. 1013 (D.C. Cir. 1923).
General Electric Company v. Joiner, 118 S.Ct. 512 (1997).
Gier by and through Gier v. Educational Services Unit, 66 F.3d 940 (8th Cir. 1995).
Goodman-Delahunty, J. (1997). Forensic expertise in the wake of Daubert. Law and Human Behavior,
21, 121-140.
*Guthrie, P. C., & Mobley, B. D. (1994). A comparison of the differential diagnostic efficacy of three
personality disorder inventories. Journal of Clinical Psychology, 50, 656-665.
*Hart, S. D., Dutton, D. G., & Newlove, T. (1993). The prevalence of personality disorder among wife
assaulters. Journal of Personality Disorders, 7, 329-341.
*Hart, S. D., Forth, A. E., & Hare, R. D. (1991). The MCMI and psychopathy. Journal of Personality
Disorders, 5, 318-327.
*Hogg, B., Jackson, H. J., Rudd, R. P., & Edwards, J. (1990). Diagnosing personality disorders in recent-
onset schizophrenia. Journal of Nervous and Mental Disease, 178, 194-199.
Inch, R., & Crossley, M. (1993). Diagnostic utility of the MCMI-I and MCMI-II with psychiatric
outpatients. Journal of Clinical Psychology, 49, 358-366.
*Jackson, H. J., Gazis, J., Rudd, R. P., & Edwards, J. (1991). Concordance between two personality
disorder instruments with psychiatric inpatients. Comprehensive Psychiatry, 32, 252-260.
Kennedy, S. H., McVey, G., & Katz, R. (1990). Personality disorders in anorexia nervosa and bulimia
nervosa. Psychiatric Research, 24, 259-269.
King, R. E. (1994). Assessing aviators for personality pathology with the Millon Clinical Multiaxial
Inventory (MCMI). Aviation, Space, and Environmental Medicine, 65, 227-231.
Lewis, S. J., & Harder, D. W. (1991). A comparison of four measures to diagnose DSM-III-R borderline
personality disorder in outpatients. Journal of Nervous and Mental Disease, 179, 329-337.
Libb, J. W., Stankovic, S., Freeman, A., Sokol, R., Switzer, P., & Houck, C. (1990). Personality disorders
among depressed outpatients as identified by the MCMI. Journal of Clinical Psychology, 46, 277-284.
Marlowe, D. B. (1995). A hybrid decision framework for evaluating psychometric evidence. Behavioral
Sciences and the Law, 13, 207-259.
Marlowe, D. B., Husband, S. D., Bonieskie, L. M., Kirby, K. C., & Platt, O. (1997). Structured interview
versus self-report test vantages for the assessment of personality pathology in cocaine dependence.
Journal of Personality Disorders, 11, 177-190.
Marsh, H. W. (1990). Confirmatory factor analysis of multitrait-multimethod data: The construct valida-
tion of multidimensional self-concept responses. Journal of Personality, 58, 661-692.
McCann, J. T. (1989). MMPI personality disorder scales and the MCMI: Concurrent validity. Journal
of Clinical Psychology, 45, 365-369.
McCann, J. T. (1990). A multitrait-multimethod analysis of the MCMI-II clinical syndrome scales.
Journal of Personality Assessment, 55, 465-476.
*McCann, J. T. (1991). Convergent and discriminant validity of the MCMI-II and MMPI personality
disorder scales. Psychological Assessment: A Journal of Consulting and Clinical Psychology, 3, 9-18.
McCann, J. T., & Dyer, F. J. (1996). Forensic assessment with the Millon inventories. New York: Guilford.
Melton, G. B., Petrila, J., Poythress, N. G., & Slobogin, C. (1997). Psychological evaluations for the
courts (2nd ed.). New York: Guilford.
Millon, T. (1981). Disorders of personality: DSM-III, Axis II. New York: Wiley.
*Millon, T. (1983). The Millon Clinical Multiaxial Inventory manual (3rd ed.). Minneapolis: National
Computer Systems.
*Millon, T. (1987). Manual for the Millon Clinical Multiaxial Inventory-II (2nd ed.). Minneapolis: National
Computer Systems.
*Millon, T. (1994). The Millon Clinical Multiaxial Inventory-III manual. Minneapolis: National
Computer Systems.
Millon, T. (1996). Forward. In J. T. McCann, & F. J. Dyer, Forensic assessment with the Millon inventories
(pp. vii-ix). New York: Guilford.
Millon, T., & Davis, R. D. (1997). The MCMI-III: Present and future directions. Journal of Personality
Morey, L. C. (1991). Personality Assessment Inventory: Professional manual. Tampa: Psychological
Assessment Resources.
*Morey, L. C., & Le Vine, D. J. (1988). A multitrait-multimethod examination of Minnesota Multiphasic
Personality Inventory (MMPI) and Millon Clinical Multiaxial Inventory (MCMI). Journal of Psycho-
pathology and Behavioral Assessment, 10, 333-344.
Morey, L. C., Waugh, M. H., & Blashfield, R. K. (1985). MMPI scales for DSM-III personality disorders:
A preliminary validation study. Journal of Personality Assessment, 49, 245-251.
*Nazikian, H., Rudd, R. P., Edwards, J., & Jackson, H. J. (1990). Australian and New Zealand Journal
of Psychiatry, 24, 37-46.
*Patrick, J. (1993). Validation of the MCMI-I borderline personality disorder scale with a well-defined
criterion sample. Journal of Clinical Psychology, 49, 28-32.
Pfohl, B., Stangl, D., & Zimmerman, M. (1982). The Structured Interview for DSM-III Personality
Disorders (SIDP). Iowa City, IA: University of Iowa Press.
*Piersma, H. L. (1987). The MCMI as a measure of DSM-III Axis II diagnoses: An empirical comparison.
Journal of Clinical Psychology, 43, 478-483.
*Prifitera, A., & Ryan, J. J. (1984). Validity of the Narcissistic Personality Inventory (NPI) in a psychiatric
sample. Journal of Clinical Psychology, 40, 140-142.
Reed, J. E. (1996). Fixed vs. flexible neuropsychological test batteries under the Daubert standard for
the admissibility of scientific evidence. Behavioral Sciences and the Law, 14, 315-322.
*Reich, J., Noyes, R., & Troughton, E. (1987). Dependent personality disorder associated with phobic
avoidance in patients with panic disorder. American Journal of Psychiatry, 144, 323-326.
*Renneberg, B., Chambless, D. L., Dowdall, D. J., Fauerbach, J. A., & Gracely, E. J. (1992). The
Structured Clinical Interview for DSM-III-R, Axis II and the Millon Clinical Multiaxial Inventory:
A concurrent validity study of personality disorders among anxious patients. Journal of Personality
Disorders, 6, 117-124.
Retzlaff, P. (1996). MCMI-III validity: Bad test or bad validity. Journal of Personality Assessment,
66, 431-437.
Richardson, J. T., Ginsburg, G. P., Gatowski, S., & Dobbin, S. (1995). The problems of applying Daubert
to psychological syndrome evidence. Judicature, 79, 10-16.
Rogers, R. (1995). Diagnostic and structured interviewing: A handbook for psychologists. Odessa, FL:
Psychological Assessment Resources.
Rogers, R., & Shuman, D. W. (in press). Conducting insanity evaluations (2nd ed.). New York: Guilford.
Rosenthal, R. (1991). Meta-analytic procedures for social research. Beverly Hills, CA: Sage.
Rotgers, F., & Barrett, D. (1996). Daubert v. Merrell Dow and expert testimony by clinical psychologists:
Implications and recommendations for practice. Professional Psychology: Research and Practice,
27, 467-474.
*Sansone, R. A., & Fine, M. A. (1992). Borderline personality as a predictor of outcome in women
with eating disorders. Journal of Personality Disorders, 6, 176-186.
*Schuler, C. E., Snibbe, J. R., & Buckwalter, J. G. (1994). Validity of the MMPI Personality Disorder
scales (MMPI-PD). Journal of Clinical Psychology, 50, 220-227.
*Smith-Silberman, C., Roth, L., Segal, D. L., & Burns, W. J. (1997). Relationship between the Millon
Clinical Multiaxial Inventory-II and the Coolidge Axis II Inventory in chemically dependent mentally
ill older adults: A pilot study. Journal of Clinical Psychology, 53, 559-566.
*Soldz, S., Budman, S., Demby, A., & Merry, J. (1993). Diagnostic agreement between the Personality
Disorder Examination and the MCMI-II. Journal of Personality Assessment, 60, 486-499.
Spitzer, R. L., Williams, J. B. W., & Gibbon, M. (1987). Structured Clinical Interview for DSM-III-R
Personality Disorders (SCID-II). Washington, DC: American Psychiatric Association Press.
State v. Cavaliere, 663 A.2d 96 (N.H. 1995).
S.V. v. R.V., 933 S.W.2d 1 (Tex. 1996).
*Torgersen, S., & Alnaes, R. (1990). The relationship between the MCMI personality scales and
DSM-III, Axis II. Journal of Personality Assessment, 55, 698-707.
U.S. v. Scheffer, 118 S.Ct. 1261 (1998).
*Wetzler, S., & Dubro, A. (1990). Diagnosis of personality disorders by the Millon Clinical Multiaxial
Inventory. Journal of Nervous and Mental Disease, 178, 261-263.
*Widiger, T. A., & Sanderson, C. (1987). The convergent and discriminant validity of the MCMI as a
measure of the DSM-III personality disorders. Journal of Personality Assessment, 51, 228-242.
Widiger, T. A., Williams, J., Spitzer, R. L., & Frances, A. (1985). The MCMI as a measure of DSM-
III. Journal of Personality Assessment, 49, 366-378.
*Wierzbicki, M., & Gorman, J. L. (1995). Correspondence between student scores on the Millon Clinical
Multiaxial Inventory-II and Personality Diagnostic Questionnaire-Revised. Psychological Reports,
77, 1079-1082.
*Wise, E. G. (1994). Managed care and the psychometric validity of the MMPI and MCMI personality
disorder scales. Psychotherapy in Private Practice, 13, 81-97.
*Wise, E. G. (1996). Comparative validity of MMPI-2 and MCMI-II personality disorder classifications.
Journal of Personality Assessment, 66, 569-582.
*Zarrella, K. L., Schuerger, J. M., & Ritz, G. H. (1990). Estimation of MCMI DSM-III Axis II constructs
from MMPI scales and subscales. Journal of Personality Assessment, 55, 195-201.
Zonana, H. (1994). Daubert v. Merrell Dow Pharmaceuticals: A new standard for scientific evidence in
the courts? Bulletin of the American Academy of Psychiatry and the Law, 22, 309-325.
