Issues and Problems in Writing Assessment

EDWARD M. WHITE
California State University
As we move to the end of this century, the issues and problems in writing
assessment have shifted and become more complicated. Or perhaps it
would be more accurate to say that the complicated issues and problems
behind writing assessment have finally become evident, after a century in
which they have been concealed by special interests and technical prob-
lems. Teachers' concern for assessment that reflects their teaching has often
concealed an unwillingness to risk any assessment at all; testing firms'
devotion to multiple-choice tests has in part protected investments in
expensive instruments; statistical faith in the true score and reliability
coefficients has concealed the naive positivism inherent in statistics as a
means of discovering reality. As Gere (1985) has pointed out, the first
English Placement Test, at Harvard in 1874, served to define university
writing instruction as an administrative nuisance rather than a college field
of study; and, as Berlin (1987) has argued, that test also served to protect
the social status of the university.
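A compact statement of that statistical faith may be useful here; what follows is the standard classical test theory formulation, not anything peculiar to this article. An observed score X is modeled as a fixed "true score" T plus random error E, and the reliability coefficient is the share of observed-score variance attributable to true scores:

$$X = T + E, \qquad \rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}.$$

The positivism at issue is precisely the assumption that a stable T exists to be discovered at all.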
I do not mean to suggest wickedness by proposing that a certain self-in-
terest lies behind the varying approaches to writing assessment issues that
have emerged over the last century. Such self-interest is an inevitable and
proper result of a particular profession, a particular language community,
and a particular way of seeing the world. We need to recognize these
differences, and our own situatedness, if we are to come to terms with
perspectives other than our own. We define the issues and problems in our
field according to the view we take from our own vantage point. And we
tend to see the limitations and self-interest of other perspectives more
clearly than we do our own; the view from our own vantage point seems
nothing more than simple common sense to us and to those standing in the
same place. Thus writing teachers take it for granted that writing must
appear on any writing assessment, while those immersed in the business of
testing assume that test items will be scored by computer.
Correspondence and requests for reprints should be sent to Edward M. White, Department
of English, California State University, San Bernardino, CA, 92407.
Furthermore, each special group sees itself not as one among a variety
of interested parties but as the sole guardian of truth. Commercial or
non-profit test administrators are prone to scoff at teacher preferences
for direct writing measures, while many teachers see statistical issues such
as reliability as mere technicalities that actually interfere with useful
measurement [as does Elbow (1991) in his Foreword to Belanoff and Dickson
(1991)]. Teachers become furious or frustrated when assessments take
away their teaching time for what they see as useless bureaucratic pur-
poses, while those conducting such assessments see teacher opposition to
them as self-serving and destructive. We need to move beyond such atti-
tudes to recognize the appropriateness and value of views other than our
own, and we should recognize the necessity for mediating conflict in defin-
ing a new agenda. Until we come to understand the political play of these
forces and the need for conversations among those representing them, the
dialectic will remain constricted.
Certainly, the key problem in writing assessment is beyond specific
issues, problems, or technical concerns and is one this new journal may help
to resolve: Which group or groups will accumulate the power to define the
issues and problems of writing assessment? The diverse and often conflict-
ing stakeholders not only come from different perspectives on assessment
but also have developed different definitions of the purposes of writing.
For example:
• Writing teachers concerned about individual student growth and the writing process as a means of discovery are at odds with state and local governing bodies interested in reducing costs and identifying minimally proficient individuals and the programs which produce them.

• The conservative tendencies of the commercial and non-profit testing firms conflict with the innovative tendencies of theorists.

• The needs of researchers for precise measurement tools contradict the experience of generations of creative writers maintaining that writing is more an elusive art than a measurable skill.

• The practical demands for reliable scores, cumulative data, and short turnaround time put enormous pressure on the more theoretical requirements for validity, most particularly for a sufficiently complex definition of the construct "writing" and for attention to the consequences of assessment on the teaching of writing.
I propose here to set out the assumptions, perspectives, and demands of
several of the principal interest groups concerned with writing assessment.
I hope this journal can promote a respectful conversation among them and
thus foster orderly progress in writing assessment, which has reflected
political power struggles and educational faddism as much as the findings
of research over the last several decades.
ient teaching groups, and is also valuable for teachers, since, when the
test is driven by the curriculum, it gives them relatively homogeneous
groups to work with.1 But most of the testing devices that bother writing
teachers have strict institutional purposes and only negative effects on
students: proficiency tests designed to identify failures, admissions tests
to screen out the unworthy, value-added tests to convince legislators
that students have accumulated information, and the like. Such tests
seem to teachers to have little to do with the curriculum, their students,
or their own goals as teachers of writing, though other interest groups
find them of value. When teachers are forced, as they often are, to
choose between teaching to an inappropriate institutional test and help-
ing their students learn how to write, they are bound to consider evalu-
ation as an intrusion into the classroom.
But writing teachers also experience considerable internal conflict be-
cause they probably do more evaluation of student work than anyone else
on the faculty. At the same time that they condemn assessment, they know
that some sort of evaluation is crucial for learning ("How'm I doing?"),
and most teachers of writing spend many hours a week assessing student
work in an attempt to help students write better. Clearly, despite their usual
dislike of tests and grading, writing teachers need and use assessment to do
their jobs. Some kinds of evaluation are obviously more valuable, and less
offensive, to them than others. When they are convinced that assessment
will enhance student writing and support their teaching, teachers are per-
fectly ready to adopt assessment, perhaps even too ready. Most writing
teachers have a relatively uncritical faith in their own assessments of
student work, on which they spend inordinate amounts of time.2 It is not at
all unusual for writing teachers to spend the greater part of their evenings
and weekends grading student papers and writing comments on them in
the hope that an individual personal response to student writing will lead
to revision or, more likely, to better work on the next paper.
Many writing teachers resent as well as honor this time-consuming and
onerous work, for which they receive no additional compensation, and
express an undifferentiated anger about it in their metaphors; for example,
some will speak of "bleeding" all over a set of themes. The grading of papers is a powerful aspect of what Stephen North (1987) calls "lore," the way things are done by the profession. This lore has a power of its own, connected with professional identity as much as with assessment; the red pen, the grade at the end, the mystifying abbreviations (k, fs) scrawled in the margins, the avuncular advice and cautions at the end, all of these define the writing teacher as much as they describe the writing. Writing assessment that fails to recognize the depth and meaning of this lore will have little effect on teacher practice. Although assessment research can reduce their excessive workload and help them teach more effectively, writing teachers are rarely able to overcome their resistance to assessment and actually use some assessment concepts in their teaching.

1. Even this apparently benign use of assessment has been arousing substantial opposition from some writing teachers, who argue that placement creates negative groupings and labels for which the usual weak instruction does not compensate. See Elbow (in press, a) and Haswell (1991).

2. Teacher reliability in grading is probably not high (see Diederich, 1974; Sommers, 1982), but students obviously value teacher comments and appear to learn from them. See Lundsford and Straub (1994) for an analysis of teacher responding patterns.
Perhaps the thorniest objection teachers have to assessment challenges
the very nature of the activity, at least in its psychometrically-driven form:
its need to simplify and sample phenomena in order to measure them. If
there is one common bond among writing teachers at all levels, it is their
sense that writing and its pedagogy are exceedingly complex, that ever-
shifting individual and group contexts govern effective communication.
What works with one class fails in another; the same student turns out fine
work today and abysmal stuff tomorrow; the same piece of writing strikes
me as wonderful and you as ordinary. And yet it is the essence of traditional
assessment to sample a universe, to reduce variety to measurable units, to
present meaningful comparative data. Anyone seeking the support of writ-
ing teachers for assessment must deal as well with this resistance to over-
simplification and reductionism.
What, then, do teachers want from assessment? Something like the
following:
which would reflect the current curriculum and its underlying writing
theory. Although this teacher-driven assessment design would provide im-
mediate useful feedback to students and qualitative information, it would
not produce the kind of data that most other interest groups are likely to
value.
The issues and problems that would emerge from such an assessment
design would focus on pedagogical implications, such as the following:
However much sympathy I have for this perspective, I do not think the
nation can or will allow writing teachers to set the entire agenda for writing
assessment, any more than it will allow physicians to determine health
care policy. The predominant concern for teaching and learning omits too
many financial and outcomes issues, and the acceptance of current educa-
tional theories and patterns inhibits the development of innovations. The
teacher perspective is important, but it is one among many and it omits too
many matters of urgent importance to other interest groups.
know more about a subject if they write about it. What, the researchers
asked, does the word "know" mean in such a context, and what kinds of tests
could ascertain whether such knowing had occurred? After defining with
narrow precision the kinds of knowledge involved (drawing on theories of
knowledge change), the authors propose specific tests, none of which involve
writing, on such matters as ordered trees from recall, multidimensional
scaling, and hierarchical clustering analysis. In a similar way,
researcher assessments are likely to be indirect, include many different
kinds of writing subskills, allow numerical reduction, be time-consuming
and expensive, and focus on outcomes as a major part of the construct
evaluated. Teachers would not participate in the scoring of such assess-
ments, since as interested parties they would be less reliable than disinter-
ested researchers, and individual students would not receive responses to
their writing, since group findings would be the main concern. Although
this researcher-driven assessment design would provide data to support a
certain kind of research and theory-generation, it would not provide useful
feedback to students or to writing programs, and the data would be too
complex and specialized for institutional or governmental use.
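To give a concrete sense of such instruments, here is a minimal sketch of the hierarchical clustering analysis mentioned above. The subskill scores are invented and the routines are SciPy's stock ones, not any procedure the cited authors specify; the point is only that the output is a grouping of students, not a response to anyone's writing.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical data: each row is one student's scores on four
# writing subskills (say, recall, organization, syntax, mechanics).
scores = np.array([
    [72, 65, 80, 70],
    [90, 88, 85, 92],
    [45, 50, 55, 48],
    [88, 91, 84, 90],
    [70, 68, 75, 72],
    [48, 52, 50, 45],
])

# Agglomerative clustering on the distances between score profiles.
tree = linkage(scores, method="ward")

# Cut the tree into three groups; the researcher interprets the
# groups, and no individual student receives feedback.
groups = fcluster(tree, t=3, criterion="maxclust")
print(groups)  # e.g., [1 2 3 2 1 3]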
The issues and problems that would emerge from an assessment para-
digm reflecting theorists and researchers would focus on problems such as
the following:
The researchers and theorists are a tiny group compared with teachers,
but their voice is a powerful one. However, creators of knowledge, as the
atomic scientists in the Manhattan Project learned all too well, do not
determine the use of that knowledge. The research agenda will surely play
itself out in funded projects and continue to make its influence felt. But its
agenda is even less likely to dominate than that of the teachers, since the
practical demands made of writing assessment seem to be increasing in size
and scope.
3. The unusual concern by a testing firm for the consequences of a test and the scores it produces echoes a pervasive teacher interest as well. It further reflects an evolving notion of validity, sometimes called consequential validity, presently being advanced by such scholars as Messick (1989) and Cronbach (1988).
candidate's ability level, thus abbreviating the number of items that must
be answered), depends on the accumulated test items in the ETS item
bank, a vault of treasures dating back many years, heavily weighted toward
surface features of edited American English. Theories disputing the
value of this Fort Knox of testing get short shrift at ETS; the items have
shown they "work" and thus will be used. Like their phrenological forebears
(see Gould, 1981), the testing firms rely on their ability to produce
measurements that meet current needs, and theoretical niceties are expensive
fringes that are only sometimes attended to. ETS is so prominent that
it makes an easy target, but it is more responsible than most of its competi-
tors in the business of testing, whose profit line is even more tenuous and
whose commitment to theory is even slimmer.
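The parenthetical account of adaptive testing above can also be made concrete. The sketch below is a toy illustration assuming a one-parameter (Rasch) item model and an invented item bank, not ETS's actual machinery: the core loop picks the unused item nearest the current ability estimate, updates the estimate, and repeats, which is why fewer items need be answered.

import math
import random

def p_correct(theta, b):
    # Rasch-model probability that a candidate of ability theta
    # answers an item of difficulty b correctly.
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def adaptive_test(item_bank, answer, n_items=10):
    # Always administer the unused item whose difficulty is closest to
    # the current estimate, then nudge the estimate up or down by a
    # shrinking step (a crude stand-in for maximum-likelihood updating).
    theta, step = 0.0, 1.0
    remaining = list(item_bank)
    for _ in range(min(n_items, len(remaining))):
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
        theta += step if answer(item) else -step
        step *= 0.7
    return theta

# Simulate a candidate of true ability 1.2 on a hypothetical bank.
bank = [d / 4.0 for d in range(-8, 9)]  # difficulties from -2.0 to 2.0
estimate = adaptive_test(bank, lambda b: random.random() < p_correct(1.2, b))
print(f"estimated ability: {estimate:.2f}")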
The governing bodies that use the tests produced by ETS, the American
College Testing Service (ACT), and dozens of other commercial or non-
profit testing firms want scores and are not particularly concerned about
how those scores are obtained as long as they do not cost too much. They
want to know how many students failed a particular test and who should be
blamed; or they want to know what steps might improve scores at no
expense. One example of this cavalier attitude toward tests is the wide-
spread use of the SAT (formerly the Scholastic Aptitude Test, now the
Scholastic Assessment Tests, sponsored by the College Board and adminis-
tered by ETS) as a writing placement test, a purpose for which the test was
neither designed nor proposed by its designers and an entirely improper use.
For these reasons, we should expect a writing assessment designed by
testing firms for the use of governing bodies to look something like the
following:
If testing firms and their institutional clients were to control and define
assessment practice for the future, we would be likely to have as the ideal
assessment a series of multiple-choice tests, with occasional impromptu
essays added (as they are now in several programs such as Advanced
Placement) mainly as window dressing to please the teachers. Cooper
(1984) asserts that indirect measures are sufficient for accurate measure-
ment, and test scaling procedures in some programs even now minimize
the contribution of direct measures to the student's total score. Such assess-
ments would be passive, indirect, focus on surface scribal and dialectal
features, allow numerical reduction, be efficient and cheap, and use cor-
rectness as a major part of the construct evaluated. Teachers would partici-
pate in the scoring of essay portions (which would count for little in the
final scaled scores), and individual students would receive scores but no
responses to their writing. Although this testing-firm-driven assessment
design would provide data to support governmental and institutional pol-
icy, it would not provide useful feedback to students or to writing programs,
and teachers would ignore it as irrelevant to their instructional aims.
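The scaling point deserves a worked example. In a weighted composite, a component's effective influence is governed by its nominal weight multiplied by the spread of its scores, so an essay can be "half" the test on paper yet contribute little to the reported number. The weights and standard deviations below are invented for illustration and describe no actual testing program.

def effective_weights(weights, sds):
    # Approximate share of composite variance contributed by each
    # component when weighted scores are summed (correlations between
    # components ignored for simplicity).
    contrib = [(w * s) ** 2 for w, s in zip(weights, sds)]
    total = sum(contrib)
    return [c / total for c in contrib]

# Multiple-choice and essay nominally weighted 50/50, but the
# multiple-choice section is scaled with four times the spread.
mc_share, essay_share = effective_weights([0.5, 0.5], [40.0, 10.0])
print(f"multiple-choice: {mc_share:.0%}, essay: {essay_share:.0%}")
# prints: multiple-choice: 94%, essay: 6%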
The issues and problems that would emerge from such an assessment
design would focus on institutional problems such as the following:
Once again, it is obvious that enough forces are opposed to the power of
the testing firms to keep them from setting the national agenda, despite
their great influence with those running American institutions. But this
voice is ever present, reminding teachers, researchers, and theorists that the
testing of writing is big business in America, with important links to eco-
nomic and political conservatism.
Many of these issues have already become part of the national assess-
ment agenda, but it is by no means clear that they will receive the attention
required to resolve them. Indeed, this assessment design is so sharply at
odds with that of other interest groups that conflicts are sure to develop.
With the reduction in funds available for education in recent years, an
unspoken alliance seems to be developing between conservative governing
bodies, seeking to reduce educational costs by reducing enrollments and
increasing fees, and relatively liberal faculties, seeking to maintain educa-
tional quality; this alliance seems likely to reduce access to higher educa-
tion and to oppose the design of assessment in the interest of students and
marginalized groups in the name of standards.
REFERENCES
Belanoff, P., & Dickson, M. (Eds.). (1991). Portfolios: Process and product. Portsmouth, NH: Boynton/Cook.
Berlin, J. (1987). Rhetoric and reality. Carbondale, IL: Southern Illinois University Press.
Breland, H., Camp, R., Jones, R., Morris, M., & Rock, D. (1987). Assessing writing skill. New York: College Entrance Examination Board.
Camp, R. (1993). Changing the model for the direct assessment of writing. In M. Williamson & B. Huot (Eds.), Validating holistic scoring for writing assessment: Theoretical and empirical foundations (pp. 45-78). Cresskill, NJ: Hampton Press.
Christensen, F. (1967). Notes toward a new rhetoric. New York: Harper & Row.
Connors, R. (1981). The rise and fall of the modes of discourse. College Composition and Communication, 32, 444-463.
Cooper, P.L. (1984). The assessment of writing ability: A review of research (ETS Report No. 84-12). Princeton, NJ: Educational Testing Service.
Cronbach, L. (1988). Five perspectives on validity argument. In H. Wainer (Ed.), Test validity (pp. 3-17). Hillsdale, NJ: Erlbaum.
Diederich, P. (1974). Measuring growth in English. Urbana, IL: National Council of Teachers of English (NCTE).
Elbow, P. (1991). Foreword. In P. Belanoff & M. Dickson (Eds.), Portfolios: Process and product (pp. ix-xvi). Portsmouth, NH: Boynton/Cook.
Elbow, P. (in press, a). Do it better; do it less. In E. White, W. Lutz, & S. Kamusikiri (Eds.), The policies and politics of assessment in writing. New York: MLA.
Elbow, P. (in press, b). Writing assessment in the 21st century: A utopian view. In L. Bloom, D. Daiker, & E. White (Eds.), Composition in the 21st century: Crisis and change. Carbondale, IL: Southern Illinois University Press.
Farr, M., & Daniels, H. (1986). Language diversity and writing. New York: ERIC Clearinghouse on Reading and Communication Skills.
Gere, A.R. (1985). Empirical research in composition. In B. McClelland & T. Donovan (Eds.), Perspectives on research and scholarship in composition (pp. 110-124). New York: MLA.
Godshalk, F., Swineford, F., & Coffman, W. (1966). The measurement of writing ability. New York: College Entrance Examination Board.
Gould, S.J. (1981). The mismeasure of man. New York: Norton.
Guba, E., & Lincoln, Y. (1989). Fourth generation evaluation. Newbury Park, CA: Sage.
Haswell, R.H. (1991). Gaining ground in college writing: Tales of development and interpretation. Dallas, TX: Southern Methodist University Press.
Holland, N. (1975). Five readers reading. New Haven, CT: Yale University Press.
Hunt, K. (1977). Early blooming and late blooming syntactic structures. In C. Cooper & L. Odell (Eds.), Evaluating writing: Describing, measuring, judging. Urbana, IL: National Council of Teachers of English (NCTE).
Kamusikiri, S. (in press). African American English and writing assessment: An Afrocentric approach. In E. White, W. Lutz, & S. Kamusikiri (Eds.), The policies and politics of assessment in writing. New York: MLA.
Koenig, J., & Mitchell, K. (1988). Interim report on the MCAT essay pilot project. Journal of Medical Education, 6, 21-29.
Lundsford, R., & Straub, R. (1994). Twelve readers reading. Cresskill, NJ: Hampton Press.
Mahala, D., & Vivion, M. (1993). The role of AP and the composition program. WPA: Writing Program Administration, 17, 43-56.
Messick, S. (1989). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher, 18, 5-11.
Mitchell, K., & Anderson, J. (1986). Reliability of essay scoring for the MCAT essay. Educational and Psychological Measurement, 46, 771-775.
North, S. (1987). The making of knowledge in composition: Portrait of an emerging field. Upper Montclair, NJ: Boynton/Cook.
Schumacher, G., & Gradwohl Nash, J. (1991). Conceptualizing and measuring knowledge change due to writing. Research in the Teaching of English, 25, 67-96.
Sommers, N. (1982). Responding to student writing. College Composition and Communication, 33, 148-156.
White, E. (1994). Teaching and assessing writing (2nd ed.). San Francisco: Jossey-Bass.
White, E., & Polin, L. (1986). Research in effective teaching of writing: Final report (Report Nos. NIE-G-81-0011 & NIE-G-82-0024). Washington, DC: National Institute of Education. (ERIC Document No. ED 275 007)
White, E., & Thomas, L. (1981). Racial minorities and writing skills assessment in the California State University and Colleges. College English, 42, 276-283.
Williamson, M. (1993). An introduction to holistic scoring: The social, historical and theoretical context for writing assessment. In M. Williamson & B. Huot (Eds.), Validating holistic scoring for writing assessment: Theoretical and empirical foundations (pp. 1-43). Cresskill, NJ: Hampton Press.