
Chapter 3

Exposing the Distorted and Deceptive ‘Numbers Game’


William Spady

In recent days, I’ve been accused of advocating “Hard Core” OBE. To tell you the truth, I’ve
never heard of there being a version of ‘real’ OBE that isn’t ‘hard core’, whatever that means.
As I explained in the Introduction and again in Chapter 1, there are imitation versions of OBE
that I call the ‘CBO Syndrome’, and I include Outcomes Accreditation as one of those kinds.
Maybe I should have named them “No Core” OBE. And since the only other kind of OBE I’m
aware of is Bloom’s Mastery Learning model, maybe I should have named it “Soft Core” OBE,
especially since I’ve been forthright about it lacking compelling culminating Outcomes
frameworks and a fully systemic grounding.

Then even more recently, I read that there are two legitimate kinds of OBE: ‘Strong OBE’ and
‘weak obe’, with the OBE letters being so weak that they were presented in lower case, not caps.
The explanations offered for those two kinds reflect a full awareness of what ‘real’ OBE is – the
strong one – accompanied by a badly distorted description of what it implies for universities that
wish to retain their time-based regularities and methods of accounting, such as ‘units of
learning’ per dollar spent per year. To that I have no rejoinder because I don’t know what a
‘unit of learning’ is or how you go about defining one unless, like accountants, you are
hopelessly caught in education's all-pervasive Numbers Game. The description of weak obe
seemed to offer permission to anyone in higher education to do anything they wanted in the
name of 'obe', and it would be okay.

So, if you happen to be deeply concerned with ‘units of learning per dollar spent per year’ or
anything like it, I invite you to take a journey with me into the Qualitative side of 'real' OBE and
explore why ‘hard core’ OBE advocates like myself don’t believe that you can average words
(and Outcomes) the way you average numbers, or why our approach to ALIGNMENT is so
unconventionally STRONG and exacting. It may be because we’re so committed to realizing
Bloom’s philosophy of learning and instruction:
All students can learn and succeed if you show them clearly what it is, give them
time and targeted support to assimilate and practice it, and use exactly what you
showed them as the criterion when you assess them.
It’s very much related to what in the U.S. is called TRUTH IN ADVERTISING. Moreover,
based on what’s revealed in Chapter 2, Bloom’s ideals are much more likely to unfold if students
are engaged in developing life-performance abilities that they see as relevant to the future and
the role responsibilities that await them after graduation.

It’s all painfully straightforward, really, but it’s not how ‘weak obe’, the CBO Syndrome, or
Outcomes Accreditation operate. With them you have to wade through mountains of scores and
numbers that distort and disguise what the real learning is to find out who got the most points
'that count'. We 'hard core' OBE people would rather know what they know right up front by
what their performance transcripts report, and I think the students would too.

Among all the shifts in paradigmatic thinking and action that ‘real’ Outcomes of Significance
precipitate, then, it’s likely that none is more profound than the shifts involving the assessment
and credentialing of students' learning. To address that issue I'm devoting this chapter to
exposing and dispelling the false assumptions, distortions, and myths on which these practices,
the Numbers Game, and education's Sorting and Selecting functions are all based. And to get
started on this paradigm-shifting mission, I’m going to introduce you to one of the most
significant ‘Spadyisms’ ever:
YOU CAN’T ASSESS WHAT YOU HAVEN’T DEFINED
Of course you can’t, but . . .
• Millions of educators do it all the time,
• By assigning points and percents to whatever they consider relevant learning tasks and achievements to be,
• Then they pretend that all points for all of that dissimilar learning and achievement are created equal, so they average them and give them labels called grades,
• Then they rank students according to these contrived averages and grades,
• Then they reward the people with the highest invalid and distorted rankings with either scholarships or the best jobs,
• Then they accredit and rank institutions on all of these distortions in the name of Outcomes and OBE, even though they don't know what the students in those institutions actually know and can do competently.

This happens because the entire Educational enterprise has been caught in the grand illusion of
the Numbers Game for over a century, and it seems to have no alternative but to keep on
perpetuating itself because the entire culture plays it, and wouldn’t know what to do without it.
My colleagues and I do know what to do without it, however, and I hope that this chapter begins
paradigmatically dismantling the Game for you forthwith.

You Can’t Assess What You Haven’t Defined


Since you can’t assess what you haven’t defined, let’s start there by clarifying what Assess
means, and that, I believe, requires us to be thinking ‘substance’ not ‘numbers’. My
decades of experience with Outcomes has taught me that ‘Assess’ means:
Gathering evidence that directly matches and relates to a specified phenomenon,
and comparing that evidence against the phenomenon as it is defined.
If we can agree on the accuracy and intent of this definition, its relevance and usefulness require
us to start with the phenomenon in question – in this case a learning Outcome – and work
backwards. Without a tangible, observable demonstration of learning, there will be nothing for
us to assess. And, as we learned in Chapter 2:

• Demonstrations are tangible and observable actions,
• Actions are defined by words, not numbers,
• Demonstrations require action verbs, and
• Not all action verbs are created equal.
So let's consider what it will take to 'gather evidence that directly matches and relates to' each
of these four defining elements of an Outcome of Significance.

Demonstrations are Tangible and Observable Actions

If my OBE colleagues from 1986 were here to discuss this issue, I think they’d be unanimous
that answers on a paper-pencil test do not qualify as tangible and observable actions. To be sure,
they ‘are’ products of the various domains of mental processing that Bloom identified in his
Taxonomy of Cognitive Objectives, but mental processing isn’t action – and that was the most
important implication of the definition developed in 1986. From the very beginning it meant that
students would have to DO something that was unambiguously an observable and assessable
action, which, in turn, meant that the action itself contained and manifested the tangible, visible,
observable evidence that could then be compared against the pertinent Outcome statement.

This doesn't mean, of course, that answers on paper-pencil tests are meaningless in terms of
student learning; it simply means that they’re only evidence of a one-dimensional cognitive
learning process, not the rich array of things that subsequently emanated out of our 1986
definition of an Outcome.

Actions are Defined by Words, not Numbers

My 1986 colleagues were also explicit about two things: 1) Numbers and scores are NOT
Outcomes; they're symbols we attach to demonstrations of learning, and they neither define nor
assess those demonstrations in any way; and 2) Outcomes are carefully defined actions in which every
word matters because the presence or absence of any given word inevitably changes the
content, complexity and implications of what the action is expected to embody. This emphasis
on alignment wasn’t meant to turn OBE into a rigid and mechanistic approach to learning. It
was, instead, an attempt to improve teachers’ clarity and consistency in what and how they
taught, and then how they assessed the resulting student learning.

To all involved, these two things clearly meant that the actions defined and contained in
Outcomes statements were to be interpreted, taught, demonstrated and assessed word for word
BECAUSE each word is an essential element in the demonstration, and essential means
ESSENTIAL. That’s why they thought every word in the Outcome statement had to be present
in the demonstration and its assessment; otherwise you didn’t have that demonstration, you had
something else. Hence, OBE implementers have consistently used the words ‘criterion defined’
to describe what Outcome statements were and how they were to be treated. Translated into
today’s Outcomes Accreditation challenges, we would say that:
Every word in an Outcomes statement is an essential criterion in comparing
and assessing any tangible demonstration against the statement's content
and meaning. Therefore, a faulty or missing criterion means that you don’t
have the fully completed demonstration in front of you yet. As it stands, the
demonstration is still unfinished and incomplete.
Consequently, being criterion defined has nothing to do with taking points off for what’s
missing. Since every word/element/criterion is an essential component of the expected Outcome,
anything that’s missing profoundly alters the whole and must be restored. A missing criterion is
the equivalent of taking 100 points off, even though points don’t apply to words and essential
criteria in the first place, which I’ll explain more fully later in the chapter.

Demonstrations Require Action Verbs

All words in an Outcome statement are essential, but no word or words are more critical to
shaping its required demonstration than the verb or verbs used. This was probably the most
difficult thing for faculty and students to understand about OBE because their world had been
dominated by the mental processing verbs know and understand. As noted in Chapter 2,
virtually every classroom learning goal existing in the early 1980s began with words like: “The
student will know,” or “The student will understand.” So everywhere I went, I’d ask teachers,
“How would you ‘know’ they know or understand? What do they need to show you that
convinces you that they know or understand, and what forms might that ‘showing’ take? Which
action verbs clearly indicate that ‘knowing’ or ‘understanding’ has occurred?” At a bare
minimum, I’d suggest ‘describe’ and ‘explain’ as valuable alternatives to know and understand.

So, if teachers decided to use ‘describe’ in framing an Outcome, it meant that the assessment
had to be a description, and the students themselves had to do the describing in order to fulfill the
Outcome, not just answer questions on a test. And for that to happen, OBE teachers were
morally and technically obligated to teach students how to implement or demonstrate the action
verb in question, in this case ‘how to describe well’, not just ‘know’ the relevant subject matter.
However, becoming a good ‘describer’ might take students months or years, which raised a
huge challenge for both instructors and curriculum directors:
You can’t teach skills the way you ‘deliver’ content. Skill-development
generally requires much more time and lots of tangible application.

So yes, having teachers explicitly teach action verbs and explicitly assess them dramatically
expands their instructional responsibilities, but notice how much deeper it makes the learning for
students and how much it expands their competences as well. Moreover, notice too how totally
inappropriate paper-pencil tests are for assessing action verbs of all kinds, especially when
you get beyond what we called the 'wimpy' ones presented in Chapter 2.

Not All Action Verbs Are Created Equal

Action verbs vary enormously in their power, from ‘wimpy’ – words like circle, underline,
name, and list – to ‘complex’ – words like design, create, negotiate, mediate, and produce. They
also add another important dimension to what we have called Outcomes of Significance. If
Outcomes are really supposed to matter after students have graduated, then the things that matter most to
higher education graduates are complex abilities like these that both define and enhance their
career competences and effectiveness.

Moreover, if we apply the Law of Alignment to that reality, it means that higher education
professors need to be carefully defining, teaching, and assessing the kinds of action verbs that
clearly belong in the ‘complex powerful’ category. And if they do that, they will soon find that
much more instruction needs to shift from the lecture hall to the practice laboratory because
that’s where the actual skills of designing, creating, negotiating, mediating, and producing get
developed and refined; and it should be where they get assessed as well. This may seem obvious
on its face, but it implies enormous changes in traditional higher education practices. Lecturing
and the equivalent of paper-pencil testing, no matter how thorough and rigorous, in no way
signify that students can design, create, negotiate, mediate, or produce.

Therefore, one implication about assessing Outcomes is clear: faculty must take meticulous
care in defining their culminating Outcomes in the first place because everything flows directly
from them: what faculty teach, how faculty teach, where faculty teach, what students must
demonstrate, how faculty assess the results of what they have taught, and how faculty keep
records about what those assessments reveal about both their own teaching and students’ current
and improving performance status. So I urge educators to take a deeper look at the substance of
the learning and Outcomes they really intend to foster in their students, and to assess and validate
that substance with great care.

Dispelling the Myths that Uphold the ‘Numbers Game’


Here’s where the serious controversy in this chapter begins, especially if you believe that
Outcomes Accreditation is Outcome-Based, and that it gives universities and their programs a
clear path to lasting program effectiveness. It doesn’t because OA is a key player in keeping the
Numbers Game’s myths and illusions alive. What follows helps to expose those myths and
illusions for what they are.

Determining Rank

Over the years students get increasingly socialized by educators into believing that the scores and
percentages they receive for what they learn are more important than the learning itself. This
all-pervasive illusion intensifies during their high school years as students compete for the
highest possible Rank in their class, both to qualify for university admission and to
impress the admissions committee. Then that obsession is moved up the scale for four or more
years as universities and graduate schools routinely compel students to do the same things to
impress prospective employers.

So if it’s all that critical, then how is a student’s Rank determined? Via their Grade Point
Average (GPA), of course, which is often calculated to the third decimal place in order to justify
the ranking and give it the aura of scientific validity. Once that’s done, all the GPAs of all the
students in one’s reference group are compared and their Relative Ranking is determined.

Determining GPA

So if GPA is all that critical in determining rank, how is it determined? By adding up and
averaging all the grades students have received for every non-equivalent thing they've ever
done in every non-equivalent course they’ve taken over a four-year period. Hence, the numbers
needed to calculate the GPA are not only contrived in the first place, they are also attributed to
completely dissimilar things after that. This highly questionable and illusory averaging process
is actually the second step in a double numerical and symbol conversion process that begins, as
noted, with teachers assigning points and scores to all the non-equivalent assignments that
students have done and tests they’ve taken during a teacher’s course. Then teachers add up and
average all those non-equivalent points and scores, compare them against a well-established
institutional scale, and convert them into either letters or whole numbers that are called grades.
Following that, at various designated times during the year, teachers or other school officials
convert the letter grades back into whole numbers (e.g., A = 4) which they again average to
produce the GPA.
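For readers who think in code, here is a minimal sketch of that double conversion. The cut-off scores and the four-point letter values are illustrative assumptions about one hypothetical institution's scale, not a universal standard; the point is only to watch numbers get averaged, converted into symbols, reconverted into numbers, and averaged again.

```python
# A sketch of the double conversion described above; the cut-offs and
# letter values are illustrative assumptions, not any official scale.
SCALE = [(90, "A", 4.0), (80, "B", 3.0), (70, "C", 2.0), (60, "D", 1.0), (0, "F", 0.0)]

def letter_grade(scores):
    """Step 1: average non-equivalent points and convert the result to a letter."""
    avg = sum(scores) / len(scores)
    for cutoff, letter, _ in SCALE:
        if avg >= cutoff:
            return letter

def gpa(letters):
    """Step 2: convert the letters back into whole numbers and average again."""
    points = {letter: value for _, letter, value in SCALE}
    return sum(points[g] for g in letters) / len(letters)

# Three non-equivalent courses reduced to a single contrived number:
course_scores = [[92, 71], [88, 85, 79], [65]]
letters = [letter_grade(scores) for scores in course_scores]
print(letters, round(gpa(letters), 3))  # ['B', 'B', 'D'] 2.333 -- the 'third decimal place'
```

Notice that nothing in either step ever asks what any of those points were points of.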

Determining Points

So, if the points and scores for the non-equivalent student assignments and tests are all that
critical in establishing a student’s grade, how are the points and scores determined? This is done
individually and subjectively by teachers who are generally free to make up and assign points
for their courses on a task-by-task basis however they and/or their departmental colleagues see
fit. The only ‘standardized’ measure that might occur is an examination given by a department
to all sections of a course, thereby overriding particular instructors’ biases.

Passing Standards

Aside from an institution’s grading scale, the only ‘standard’ that governs this whole process is
its policy on how many points a student must average over a given period of time in order to
‘pass’ either an assignment, a test, or a course. However, that critical passing score varies from
country to country. For example, it’s usually 50 in the United Kingdom, 70 in the U.S., 75 in the
Philippines, and 80 at Saudi Arabian universities. No one knows 50, or 70, or 75, or 80 of what,
of course, but that’s the dividing line between success and failure that’s observed in those
respective countries. And you can be certain that all of their students are acutely aware of what
that critical score is since disaster will result if their average is 49, or 69, or 74, or 79
respectively. Even though the difference between pass and fail is a single point, that single
point either means “Life is good,” or “All is lost.” So, in effect, that single point might as well
be a million points to the students who have to live with its consequences.

Never Make Mistakes

Meanwhile, back to that all-important GPA. What is it really? Well, we think about it this way:
A GPA is the average of ALL of the non-equivalent learning mistakes a
student has ever made during her or his high school or university career.
Otherwise their GPA would be a 'perfect' 4.0. So if you want a high GPA, be sure to never
make mistakes – ever – because they will always be counted against you. And given that,
ranking students reflects the order of how 'imperfect' each is in the eyes of the
system, with the 'least imperfect' ranked highest. Moreover, from OBE's 'Expanded
Opportunity' perspective, GPAs deny that further improvement on recorded learning and
performance attempts is possible and that it should be acknowledged beyond students' first efforts.

Consequently, this routine use of GPAs evoked the following Spadyism:
GOD FORGIVES; GPAs DON'T!

So when you stop and think about it this way, the Numbers Game system of credentialing
reduces this entire distorted mish-mash of non-equivalencies down to just two
numbers that carry prodigious symbolic value but almost no substantive information: Class
Rank and GPA. So how much does this deeply entrenched ‘smoke and mirrors act’ actually tell
us about student capabilities? You decide, but not until you read what comes next.

The Many Myths Surrounding Everyday Tests

To reinforce my cynicism about The Game, I’m inviting you to take a routine ten-item test with
me – the kind used in classrooms everywhere to determine the points and scores on which all of
the above rests. What I’ll show you doesn’t depend on the length of the test or the passing score.
I’ve chosen ten items simply because it makes each item ‘worth’ ten points, and makes the test
easy to score. And I’ve chosen a passing score of 80 in honor of Benjamin Bloom’s legacy.
This means that students must get eight of the ten items correct or they ‘fail’ the test.

Before we proceed, let’s check the assumptions on which our test is based.
First, by assigning equal numerical value to each item, we’re falsely assuming that
each item represents an equal and equivalent ‘unit’ of knowledge or competence,
whatever they may be.
Second, we’re falsely assuming that the knowledge base being tested is composed
of ten, and only ten, of those units, and that they’re all equal to each other.
Third, if the knowledge base isn’t made up of ten equal units of substance, or
units of substance that can legitimately be divided into ten, then our ten items
don’t fairly represent the knowledge base in question.
Fourth, if the knowledge base in question is not, in fact, divided into ten equal units,
then this routine distribution of points and items misrepresents the knowledge base.
Fifth, all four of these assumptions presume that our test, and the millions like it,
are worth 100 points. If such tests are worth anything, it isn't 100 points. That's just an
arbitrary, but enormously convenient, number that's been in use for a couple of
centuries. The points are simply human inventions; the learning is real.
Sixth, we’re also assuming that both the knowledge base and the items used to test
it embody what is essential to know or do from that learning. In this latter case, the
word ‘essential’ means that it MUST be present, so we’ll watch for that. Otherwise,
if the learning isn’t essential, why give the test in the first place?

Alas, since most of these assumptions are highly doubtful, the legitimacy of our ten-item test,
and the millions just like it, is already in serious doubt. Nonetheless, let’s proceed, with you
answering the following questions with me as I pose them. And remember: This test is not a
trick! It’s routinely given in classrooms and lecture halls across the world. Here goes:
1. Must students get the first item right in order to pass? Answer: NO, missing the item
only takes ten points off, so they can still pass.
2. Must students get the second item right in order to pass? Answer: NO, missing the item
only takes ten points off, so they can still pass.
3. Must students get the third item right in order to pass? Answer: NO, missing the item
only takes ten points off, so they can still pass, but that depends on how they did on the
previous two items.
4. Must students get the fourth item right in order to pass? Answer: NO, missing the item
only takes ten points off, so they can still pass, but that depends on how they did on the
previous three items.
5. Must students get the fifth item right in order to pass? Answer: NO, missing the item
only takes ten points off, so they can still pass, but that depends on how they did on the
previous four items.
So I could continue this exercise all the way down to the tenth item if you wish, but we’ve
already seen enough about our routine ten-item test to draw the following sobering conclusions.
And remember, this was NOT a trick.
Conclusion One: It is NOT essential to get any particular item in this test right.
Students can get any two items wrong and still pass.
Conclusion Two: Therefore, there is no essential knowledge or skill in this test.
Since students can get ANY TWO items wrong and still pass, no item in
the test contains essential knowledge or skill.
Conclusion Three: The only essential thing in or about this test is for students to get
enough NON-ESSENTIAL items right. No item is essential, but you have
to get any eight of them right or you don’t pass.
Conclusion Four: NO TEST with a passing standard LESS THAN 100 is a test
of essential knowledge or skill. The only time an item is essential is when
you MUST get it right. Only a passing score of 100 will assure that.
Conclusion Five: If its passing standard was less than 100, every ten-item test
ever given in the history of the world at any grade level has been a
test of NON-ESSENTIAL knowledge and skill. Yet ‘routine’ tests
like this one have been determining students’ grades, GPA’s, class
ranks, and educational and career futures since teachers started using
tests to ‘measure’ student learning.

Given all of this, from an OBE perspective it's tragic that teachers have unwittingly spent all
these generations testing non-essential learning and reinforcing the myths of points, scores,
percents, and averages, when they could have been assessing Outcomes of Significance instead.
And to do that they would have only needed three things: 1) Clear Outcomes, 2) Clear criteria,
and 3) The substance they are seeking in high-quality demonstrations of student learning.
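If you'd like to check the arithmetic behind these conclusions mechanically, here is a minimal sketch that enumerates every way of missing items on our ten-item test (ten points per item, 80 to pass); the code is purely illustrative.

```python
from itertools import combinations

ITEMS = range(1, 11)   # ten items, ten points each
PASS_MARK = 80         # in honor of Bloom

def passes(missed_items):
    """Score after missing the given items; each miss takes ten points off."""
    return 100 - 10 * len(missed_items) >= PASS_MARK

# Conclusions One and Two: no single item is essential, because for every
# item there is a passing result that gets that very item wrong.
for item in ITEMS:
    assert any(item in missed and passes(missed)
               for missed in combinations(ITEMS, 2))

# Conclusion Three: any two items can be missed and the student still
# passes, but missing any three fails.
assert all(passes(missed) for missed in combinations(ITEMS, 2))
assert not any(passes(missed) for missed in combinations(ITEMS, 3))

print("No item is individually essential; any eight of the ten suffice.")
```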

Replacing the ‘Numbers Game’ with FACTS
Chapter 2 described how OBE couldn’t become ‘real’ until it gave up its obsession with 80. And
to do that it needed to shift its paradigm and its focus from points, scores, and percents to
authentic substance, the kind you can define, describe, observe, and verify. Now that I have
thoroughly challenged the legitimacy and validity of this Numbers Game, you can see why
letting go of it is such a challenge, drama, and trauma for so many educators at all levels of the
system. Their professional paradigm and decades of experience are grounded on its contrived
myths, and they initially don’t know where else to turn. They’ve been thriving on giving and
scoring (ten-item) tests like this and class assignments for decades, and this widely endorsed
deceptive practice has just been exposed by this ‘hard core’ thing called ‘Outcomes’. So, once
they’re confronted with the reality of what Outcomes are, they can’t legitimately go back to the
convenience and quick fixes of The Game. Now they have to deal with the FACTS of the
matter, and that’s what’s coming next!

Getting to the FACTS of the Matter

Whether they are called Stellar Performances, or First-rate Demonstrations, or Dazzling
Displays, or Impressive Exhibitions, all high-quality examples of three-dimensional expertise
share five attributes that I’m calling their FACTS. FACTS is an acronym for five criteria that
I’m treating here as ‘essential’ components of all role-performance Outcomes of Significance,
demonstrations, and assessments. These criteria are:
Fluency Accuracy Challenge Thoroughness Style
While performances usually integrate and even blend the five, each can and should be clearly
distinguished from the others for the purpose of pinpointing and explaining assessment data to
the performer – exactly what Bloom wanted formative assessments to do. Since each is an
indicator of ‘true mastery’ in its own right, each should be made clear, explained, and modeled
for students on a continuing basis.

Fluent demonstrations are performances that flow from beginning to end without hesitations,
interruptions, backtracking, or forgetting. Fluency, like the other four criteria, reflects a
genuine ‘command’ of the pertinent material, of the skills required, and of the setting in which
the demonstration takes place. When it’s Fluent, things look and feel ‘familiar’ and ‘at ease’.

While educators are likely to pay disproportionate attention to the Accuracy of information and
skills conveyed in a demonstration, Accuracy cannot be allowed to overshadow the other four
criteria in the assessment of true performances. Nor can it be downplayed, simply because
nothing erodes a quality demonstration faster than erroneous information and faulty technique.
The ideal here is a Flawless performance.

Challenge is about two key things: 1) the degree of difficulty of the pertinent material, of the
skills required, and of managing the setting in which the demonstration takes place; and 2) their
complexity – how many different factors are involved in each and must be brought together for
the demonstration to be complete and successful. Here, success looks like managing and
coordinating a Complex three-ring circus.

Thoroughness lives at the intersection of Breadth and Depth, and quality performances
embody both. One involves range, inclusion and scope of perception and skill; the other
requires focused attention to underlying meaning and detail. In fact, Thoroughness may more
directly determine one’s true ‘command’ and ‘mastery’ of an endeavor than any of the other
criteria, and it may well require the greatest degree of study and practice to achieve.

Style may not sound very academic, and it may seem superfluous to include as a criterion for Outcomes
assessments, but it adds the critical elements of flair, uniqueness, creativity, imagination,
interest, appeal, and originality to any performance. Moreover, Style stimulates the
heartstrings, not just the brain cells, and it allows the performer’s individual human qualities to
be expressed more fully. So if you want demonstrations to be more than ‘dry’ and ‘boring’, pay
attention to Style.

With these five criteria as essential components, we can forthrightly tell students that our
ultimate goal for them is to execute:
Fluent, Accurate, Challenging, Thorough, and Stylish culminating demonstrations
of learning on role-performance Outcomes of Significance.
Actually, that sounds just like what real professionals in any discipline do to gain the esteem of
their colleagues and clients, so I’ll proceed with that as our rationale.

Applying the FACTS to Assessment Rubrics

One of the most common strategies for assessing learning over the past two decades is the use of
what are called Rubrics. Conventionally, Rubrics are matrices that portray the essential
components of a performance on one axis, and the levels of performance quality on the other.
Although there is no formal rule about how many essential components there might be in a given
demonstration, educators often choose five for their Rubrics. Nor is there a rule about how many
levels or qualities of performance one should document in a Rubric, but the convention is four,
which yields a matrix with twenty cells. Finally, there’s a third convention that involves points,
scores, and weighting factors, and I’ll comment on that shortly.

The good news about Rubrics from an OBE perspective is that they can conscientiously and
consistently be used by teachers to implement its Clarity of Focus and High Expectations
principles on whatever Outcome or objective they may be addressing at the time. By seeing the
Rubric ahead of time, students learn about the Outcome’s components and what a ‘stellar’
performance will require from them. And, the Rubric guides teachers as well, for they too can
use it at the outset to define and focus in on an Outcome’s essential components and which
criteria will be used to denote their various levels of performance quality. With those factors
already mapped out, the Rubric becomes a template for guiding teaching, learning, assessment,
and even record-keeping. The hypothetical Rubric below illustrates these points.

There’s a lot to absorb in the sample Rubric I’ve created below, so I’ll explain it step by step,
first by focusing on the five FACTS that I’ve just described. They are arrayed at the top of
each column of the Rubric and, for our present purposes, represent five essential
components of the hypothetical Outcome's performance that we're intending to assess.
Although I’ve chosen to use the FACTS as essential components in this example, there may
be compelling reasons for faculty to specify essential components other than these
for their Outcomes. In that event, the names of their components would be the main
‘definers’ of the desired student demonstration, and those names would be shown at the
top of each column. The number of essential components in a Rubric can vary, but I’ve
simply followed convention and used five.

| Fluency | Accuracy | Challenge | Thoroughness | Style |
|---|---|---|---|---|
| Seamless flow, no hesitations | Full command of required material and skills | Clear mastery of complex info, skills, and setting | Broad, detailed command of info and skills | Pervasive display of originality and creativity |
| Good flow, minor hesitations | Minor mistakes and execution errors | Complexities integrated and managed well | Some related details and skills incorporated | Clear elements of individuality and imagination used |
| Notable hesitations and uncertainty | Notable errors in basic knowledge and skills | Only routine processes and info handled well | Little beyond defined skills and concepts used | Glimpses of individuality and creativity present |
| Uncertainty and hesitation breaks throughout | Significant errors and slips throughout | Mastery of only basic processes and material | Only surface information and skills used | Straightforward display of skills and information |

Next, as we go up any of the above columns and study the words in each cell, we’ll notice a
decided step-by-step improvement in the quality of the performance it describes.
Although the specific words used will vary according to the component we are viewing, the
pattern of quality variation should be consistent within each column. The highest
‘stellar’ quality appears in the top cell and the lowest ‘fully inadequate’ quality at the
bottom. Since this pattern is repeated in every column, we can readily see that the rows of
this Rubric consistently describe the four distinctive quality levels that I’ve chosen – again
as a matter of convention.

So take a minute to trace these two patterns, noting that the words in the twenty cells are
not 'perfect'. I claim nothing more than that they 'fit' the criterion and quality of performance in
question reasonably well, and that they 'fit' in the small spaces allowed. Once you've traced and
confirmed these substance and quality patterns, I'll describe some further conventions regarding
the use of Rubrics, some of which support a criterion-defined approach to assessment, and some
of which don’t.

For starters, I’ll address what it means to ‘pass’ when you use Rubrics as criterion-defined
templates rather than using points and scores. Here I’ll partly agree with convention and say that
the words I’ve chosen create a dividing line in this Rubric between its second and third rows.
When a student’s performance on a criterion reaches the third level, we can say that she or he has
reached the passing standard for that criterion. However, any performance in which ANY one
component is assessed to be in the bottom two rows keeps the demonstration as a whole in the
Incomplete/Inadequate range. Why? Because the assessment has shown that an essential
component of the demonstration still needs substantial improvement, whatever that single
component/criterion might be. Therefore, from this criterion-defined perspective:
Inadequate performance on just a single essential component undermines
the entire demonstration and must be brought to a minimally acceptable
level before the demonstration as a whole is deemed ‘competent’.
At a minimum, then, OBE wants ‘competent’ performances to reach at least the third row up
from the bottom on ALL specified criteria.
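Expressed as a sketch (numbering the four rows 1 at the bottom through 4 at the top, an assumption made only for illustration), the criterion-defined rule is a single all-or-nothing condition rather than a sum:

```python
def competent(levels):
    """Criterion-defined pass: EVERY essential component at level 3 or above."""
    return all(level >= 3 for level in levels)

print(competent([3, 4, 3, 3, 4]))  # True: all five criteria at or above the line
print(competent([4, 4, 4, 4, 2]))  # False: one inadequate criterion fails the whole
```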

Given that this may appear to be a ‘tough’ passing standard to some, let’s remember that a key
purpose of the Rubric is to provide substantive guidance to both teachers and students for
continued High Expectations learning and improvement. So following a given assessment, the
Rubric will indicate where the student currently stands in their learning and what the next level
of accomplishment is on each criterion. In conveying this message, the Rubric serves as both an
accurate and relevant record-keeping device and a potential learning guide, indicating to both
teacher and student what the next necessary learning steps are. OBE’s ultimate goal is having
every student eventually reach the ‘stellar’ level in the top row on ALL essential criteria,
where words like outstanding, exemplary, and masterful reside. Notice how reaching this
exemplary level brings all four of OBE’s Principles directly into play and also allows What and
Whether to override When and How as the ultimate definers of success.

Don’t Score Assessment Rubrics

So far I’ve been showing you the ‘good news’ about how well-constructed, conventional
Rubrics offer clear, useful criterion-defined information to both faculty and students. Now,
however, I’m going to break from convention and advise you NOT to score the Rubric.
Why? Because conventional ways of scoring can lead to faulty and misleading
conclusions . . . and grades. Here’s what can happen.

First, following convention, every cell in the bottom row gets 1 ‘scoring’ point; every cell in
the row above gets 2 points; every cell in the row above that gets 3 points; and the top row
cells all get 4. This conventional practice gives you a potential scoring range between 5
minimum and 20 maximum. Following this convention, if you were to rely on scores alone
to interpret the Rubric, you could easily end up with a major contradiction and deep
deception. Why?

Second, because there are only two scores that tell you anything of actual substance about
the demonstration. With a score of 20 you know that all five components were achieved at
the ‘stellar’ level, and with a score of 5 you know that all five were achieved at the lowest
level. But all of the other numbers, from 6 through 19, reveal nothing accurate about
the substance of the demonstration. Why?

Third, because when you consider the range of possibilities, there are five different ways of
getting a 6 or a 19; and there are fifteen different ways of getting a 7 or an 18. You’re very
likely to be impressed with the 19, of course, and will certainly want to give it a high grade.
But if you do, you won’t know from the score which component still needs further
improvement. Likewise, you're likely to give the 6 a failing grade, even though one of the
five criteria is definitely above the bottom level.

Fourth, here’s where the real danger lies: you’d also be tempted to give the score of 18 a
high grade as well, but don’t. Again, why? Because five of those fifteen ways include
getting only a 2 on one of the essential components, which means that the demonstration
falls into the Incomplete/Inadequate range. So if that discrepancy bothers you, please
know that your ability to interpret substantively what a score means and, therefore,
where to provide further support for a student, only gets worse and worse as scores
diverge further from either 5 or 20.
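Those counts, and the hidden danger, are easy to verify. A minimal sketch, assuming the conventional 1-through-4 scoring just described, enumerates every possible scored Rubric:

```python
from itertools import product
from collections import Counter

# Every possible scored rubric: five components, each at level 1-4.
outcomes = list(product(range(1, 5), repeat=5))

ways = Counter(sum(o) for o in outcomes)
print(ways[6], ways[19])   # 5 5   -- five ways each
print(ways[7], ways[18])   # 15 15 -- fifteen ways each

# The hidden danger in an 18: the ways that include a component at level 2,
# i.e. a demonstration still in the Incomplete/Inadequate range.
hidden = [o for o in outcomes if sum(o) == 18 and min(o) <= 2]
print(len(hidden))         # 5 -- a (4, 4, 4, 4, 2) pattern in each position
```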

Consequently, I strongly support using Rubrics to strengthen OBE’s Clarity of Focus and
High Expectations principles, to shift education’s prevailing paradigm, to identify the
essential components of an Outcome, to assess and record the results of a demonstration,
to pinpoint areas for improvement, and to document levels of performance quality. But as
vehicles for generating scores and grades, they’re no better than ‘routine’ ten-item tests
and all the misleading myths that constitute the Numbers Game. So run, don’t walk, from
scoring them.

Record-Keeping that Supports OBE


In the early part of Chapter 2, I began listing catch phrases called Spadyisms that were used
to reinforce key OBE concepts. Two of them very directly relate to record-keeping, which is an
inseparable aspect of Outcomes, opportunity, assessment, and learner success. They are:
OPPORTUNITY ENDS WHEN YOU GET GRADED IN INK
WHEN IN DOUBT, GRADE IN PENCIL
The first one describes the most prevalent kind of educational record-keeping, the
Permanent Record. Permanent Records are filled with ink grades – grades that are
deemed permanent and unchangeable once issued, usually on specified calendar dates, no
matter how much you learn about the subject or improve the skill in question after the
grade has been issued. Now link all of this back to what we read in Chapter 1 about
education being time-based and consider how it all ties together.

Education is time-based in large part because grades are time-based, and assessments are
time-based, and record-keeping is time-based, and CREDITS are time-based. So unless
Credits get based on something other than time, and unless record-keeping gets based on
something other than time, and unless assessments get based on something other than
time, and unless grades get based on something other than time, education is going to
remain time-based. Given that Outcomes are a viable alternative to this iron-clad
syndrome of factors, it behooves OBE implementers to break this syndrome and expand
learner opportunities. One way they can do that is by shifting the paradigm of record-
keeping.

The Shift to ‘Criterion Defined’ Open Transcripts

Back in the late 1970s, Dr. John Champlin, Superintendent of the Johnson City Central Schools
in New York state, recognized that a different kind of record-keeping device had to be
created; otherwise, Bloom’s Mastery Learning model would never fulfill its promises of
Expanded Opportunity and Success for All Learners. His idea, which was not at all novel in
the ‘real world’ at that time, was what he called an Open Transcript. Very simply it meant
that students’ transcripts would be open to continuous modification consistent with their
growing skills and knowledge base as learners. What had to change for this to happen, of
course, was the transcript’s frame of reference. Instead of days, weeks, and months, the
reference point would have to become the categories and levels of content, concepts, and
skills embedded in the curriculum. Anytime a specified kind of learning improved, the
change in status could be immediately recorded on the transcript, which meant that it was
being continuously updated.
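In today's terms the underlying data structure is almost trivially simple. Here is a minimal sketch (the outcome name and status labels are invented for illustration): the transcript is keyed to categories of learning rather than to dates, and a new demonstration simply overwrites the old status.

```python
from datetime import date

class OpenTranscript:
    """A sketch of Champlin's idea: records keyed to outcomes, always revisable."""

    def __init__(self):
        self.status = {}  # outcome name -> (current level, date last demonstrated)

    def record(self, outcome, level):
        # 'Grading in pencil': an improved demonstration overwrites the old entry.
        self.status[outcome] = (level, date.today())

transcript = OpenTranscript()
transcript.record("Explain a chemical process fluently", "Developing")
transcript.record("Explain a chemical process fluently", "Competent")  # improvement recorded
print(transcript.status)
```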

Champlin’s Open Transcript idea was immediately controversial for two obvious reasons.
First, it meant that the curriculum’s intended learning had to be identified, mapped, and
brought into the light of day from beneath the covers of textbooks, and that meant tons of
new work for both administrators and teachers. Second, it violated all the traditional
norms and rules pertaining to grading.

Please consider how 'radical' Champlin's paradigm-shifting notion was: Traditionally, it
was regarded as immoral and illegal to change any student grade because the ink grade
itself represented the 'permanent' part of the permanent record. That's what the student had
done during the school’s single-opportunity time-block performance schedule. However,
by his reasoning, it was immoral and illegal NOT to change a student’s grade because the
original grade was about learning and performance that had now improved. Hence,
Champlin and other leaders encouraged their teachers to grade learning, not time-blocks;
and to grade in pencil, not in ink. That's the essence of the Open Transcript.

As Wajid Hussain will point out in Chapters 7 and 8, modern technology has all but
eliminated the first objection by making computer-driven Open Transcripts easy and
convenient for educators at any level of the system anywhere in the world. The only point
of controversy, then, is tied to obsolete beliefs about what grades are supposed to signify
and skepticism about pencil grading.

The Further Shift to Performance Portfolios

Once OBE had in hand a definition of Outcomes as tangible demonstrations, and once its
advocates realized that Outcomes took many forms, the door opened to tangible
products being regarded as Outcomes too. Then, of course, came the realization that
conventional transcripts had no way of documenting and portraying what such Outcomes
were. So, after years of struggling with this dilemma, OBE advocates began to promote the
idea of Performance Portfolios. Portfolios were actual physical containers of various
sizes and shapes into which tangible learning products could be placed and replaced as
new and improved ones were developed – pictures, sculptures, papers, recordings,
woodwork and metalwork products, etc. So the Portfolio became a popular and paradigm-
shifting way to further expand and legitimate the meaning of competence and
‘demonstrations of learning’, and records could now be kept on every kind of knowledge,
skill, and tangible accomplishment – not just paper transcripts with grades on them.

Student-Led Conferencing Pushed the Limits

There is an element of mistrust built into all credentialing and accreditation systems. At
their core they’re asking the person or the institution to ‘prove’ that they are competent
and responsible enough to do the job they are seeking, whether it's being licensed
to practice a recognized profession, or being licensed to provide the training for that
profession. In both cases, some higher authority apart from the party involved sets the
standards that must be met and assesses and evaluates whether or not the candidate’s
performance meets or exceeds them. In other words, evaluation and credentialing
authority and power lie outside the candidate.

In large part what came to be called Student-Led Conferencing reversed that dynamic.
The assessment and evaluation roles were played by the student, while the external
parties, usually parents and the teacher, sat and listened as detailed reports about
expectations, demonstrated Outcomes, recent improvements, and new learning goals were
presented to them. Traditional report cards were no longer needed. To shift the
Paradigm even further, the students doing the presenting were often age six or seven, and
the conference they were directing was an expression of the Self-Directed Learner
Outcome described in Chapter 2.

OBE teachers reasoned that if children could be entrusted to direct their own learning,
they could also be entrusted to assess and evaluate it in relation to clearly identified
criteria and benchmarks. And in this they were correct. So conferences like this became
commonplace as students took charge of their learning, its assessment, and its
documentation. By all reports that I ever received, parents were literally 'thrilled and
amazed' that their children had such a command of their learning progress and how it would
continue to unfold.

As I have discussed this remarkably successful shift in responsibilities and dynamics
involving young children with university colleagues, we've realized what wonderful
preparation Student-Led Conferencing would be when:
• High school graduates face the challenge of 'selling themselves' in university admissions interviews, and
• As university graduates, they again have to sell themselves to prospective employers.

Just imagine primary and secondary teachers routinely saying to their students, “Since
you’re going to be interviewed for life-changing opportunities when you’re older, let’s start
now and develop those skills. And, with your Open Transcript and Performance
Portfolio as resources, I think you’ll easily be able to convince adults of your considerable
capabilities. So let’s start now by demonstrating to your parents what you’ve actually
accomplished.”

And then imagine my fellow authors and I saying to universities facing Outcomes
Accreditation, “So let’s start now with you implementing ‘real’ Outcome-Based Education
and demonstrating to the world what your students can actually accomplish.” With
Outcomes of Significance and clear performance criteria, I bet they could.
