Measuring complex achievement

Performance-based assessment

Chapter 11

Linn, R. L., & Gronlund, N. E. (2003). Measurement and Assessment in Teaching (9th ed.). Upper Saddle River, NJ: Prentice Hall.

Essay tests (previous presentations)

Example of one type of performance-based assessment
Many other types, including
artistic productions,
experiments in science,
oral presentations, and
the use of mathematics to solve real-world problems

The emphasis is on doing, not merely knowing; on process as well as product.

Performance assessments provide a basis for teachers to evaluate both the effectiveness of the process or procedure used and the product resulting from performance of a task.
Hands-on performance tasks that require students to manipulate objects, measure outcomes, and observe results of experimental manipulations are sometimes essential to capture the full array of skills needed to perform "authentic" tasks.

Performance assessments can use either a restricted format or an extended format.

Performance assessments possess a number of advantages and limitations.


Constructing performance assessments that effectively measure complex learning outcomes requires attention to
task development and
the ways in which performances are scored.

Criteria to be used in judging student performance are critical for reliable, fair, and valid assessment.

The specification of the criteria should begin at the time the tasks are being selected or developed.
Both the teacher and the student need to understand the criteria that will be used to judge performance.

Performance assessment
Performance assessments are sometimes referred to as authentic assessment or alternative assessment.

Alternative assessment highlights the contrast with traditional paper-and-pencil tests.

Authentic assessment emphasizes the application of the tasks in real-world settings.


Performance assessment is preferable

It is mainly used to measure the learning outcomes that cannot be measured well by objective test items.

Restricted Response Performance tasks

Narrow in definition
Instructions are more focused
Limitations on the types of performance expected are likely to be indicated
Focus on one specific skill (e.g., reading a passage aloud)
Sometimes they start with MCQ or short-answer questions, and then are extended by asking for
E.g., pp. 262-263

Extended Performance Tasks

Require students to seek more information from different sources (use the library; make observations; collect and analyze data; conduct a survey; use a computer; etc.)
Process / procedures
Product (such as construction and presentation of graphs and tables; use of photographs and drawings; construction of physical models; etc.)
E.g., p. 265

Clearly communicate instructional goals that involve complex understanding and skills in natural settings in and outside schools
Measure complex learning outcomes
Provide a means of assessing process/ procedure
as well as product
Apply methods suggested by modern learning
theory (students as active participants)

Like essay questions
Unreliability of ratings
Time-consuming in nature





1. Focus on learning outcomes that require complex cognitive skills and student performances.
2. Select or develop tasks that represent both the content and skills that are central to important learning outcomes.
3. Minimize the dependence of task performance on skills that are irrelevant to the intended purpose of the assessment task (e.g., requiring reading ability).





4. Provide the necessary scaffolding for students to be
able to understand the task and what is expected
(prior knowledge)
5. Construct task directions so that the students' task
is clearly indicated
6. Clearly communicate performance expectations in
terms of the scoring rubrics by which the
performances will be judged

Performance criteria

Criteria to be used in judging student performance are critical for reliable, fair, and valid assessment, and the specification of the criteria should begin at the time the tasks are selected or developed.
Two main ways

Scoring rubrics/rating scales

Scoring rubrics/rating scales

A scoring rubric is a set of guidelines for applying performance criteria to the performance of students.
It typically consists of verbal descriptions of performance or aspects of student responses that distinguish between advanced, proficient, partially proficient, and beginning levels of performance.
Analytic vs. holistic scoring

Rating scales are limited to quality judgments (e.g., excellent, good, fair, or poor) or scaled frequency judgments (e.g., always, frequently, sometimes, or never) for each level.
E.g., p. 272
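
The difference between analytic and holistic scoring can be illustrated with a small sketch. It assumes the four level names above map to the numbers 1 to 4, and the criterion names are invented for illustration; neither assumption comes from the textbook.

```python
# Hypothetical numeric values for the four rubric levels named on the slide.
LEVELS = {
    "beginning": 1,
    "partially proficient": 2,
    "proficient": 3,
    "advanced": 4,
}

def analytic_score(ratings):
    """Analytic scoring: rate each criterion separately, then sum the levels."""
    return sum(LEVELS[level] for level in ratings.values())

def holistic_score(level):
    """Holistic scoring: one overall judgment of the whole performance."""
    return LEVELS[level]

# Criterion names are made up for this example.
oral_presentation = {
    "organization": "proficient",
    "delivery": "advanced",
    "use of evidence": "partially proficient",
}

print(analytic_score(oral_presentation))  # 3 + 4 + 2 = 9
print(holistic_score("proficient"))       # 3
```

The analytic total keeps the per-criterion ratings available for feedback, while the holistic score records only a single overall level.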

Types of rating scales

(see the attached file)

Numerical rating scales

E.g., 274

Graphic rating scale

E.g., 274

Descriptive graphic rating scale

E.g., 275

Uses of rating scales

Two assessment areas
Process or procedure assessment
E.g., p.277

Judging the product (quality of writing, drawings, maps, graphs)

Common errors in rating

Personal bias
Generosity error
Severity error
Central tendency error

Halo effect (a general impression influences the ratings of individual characteristics)

Logical error (results when two characteristics are rated as more alike or less alike than they actually are)
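
As a rough illustration of the first three errors, the sketch below labels a single rater's pattern of ratings on a 1 to 5 scale. The function name, the scale, and the `margin` threshold are all assumptions made for this example, not standard values. A high or low average can also reflect a genuinely strong or weak group, so the labels are only prompts for a closer look.

```python
from statistics import mean

def describe_rater(ratings, low=1, high=5, margin=0.75):
    """Return a rough label for one rater's ratings on a low..high scale.

    The margin threshold is an arbitrary assumption for illustration.
    """
    midpoint = (low + high) / 2
    avg = mean(ratings)
    spread = max(ratings) - min(ratings)
    if avg >= high - margin:
        return "possible generosity error"   # ratings pile up at the top
    if avg <= low + margin:
        return "possible severity error"     # ratings pile up at the bottom
    if abs(avg - midpoint) <= margin and spread <= 1:
        return "possible central tendency error"  # ratings hug the middle
    return "no obvious pattern"

print(describe_rater([5, 5, 4, 5, 5]))
print(describe_rater([1, 2, 1, 1, 2]))
print(describe_rater([3, 3, 2, 3, 3]))
print(describe_rater([1, 3, 5, 2, 4]))
```

Halo and logical errors involve relationships between characteristics, so they cannot be detected from one distribution of ratings in this way.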

Principles of effective ratings

Characteristics should be educationally significant
Identify the learning outcomes that the task is intended to assess
Characteristics should be directly observable
Characteristics and points on the scale should be

clearly defined


Select the type of scoring rubric that is most appropriate for the task and the purpose of the assessment

Between three and seven rating positions should be provided

Rate performances of all students on one task before going on to the next one.

When possible, rate performances without knowledge of students' names.
When results from a performance assessment are likely to have long-term consequences for students, ratings from several observers should be combined.
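
One simple way to combine ratings from several observers is to average them per student, as in the sketch below. Averaging is an assumption here; the slides do not say how ratings should be combined. The student codes and ratings are invented, and anonymous codes are used in keeping with the advice to rate without knowing students' names.

```python
from statistics import mean

# Hypothetical data: anonymous student codes mapped to ratings from three observers.
ratings = {
    "S01": [4, 5, 4],
    "S02": [2, 3, 3],
}

def combine_ratings(ratings_by_student):
    """Average each student's ratings across observers, rounded to 2 decimals."""
    return {
        student: round(mean(scores), 2)
        for student, scores in ratings_by_student.items()
    }

def rating_spread(ratings_by_student):
    """Report the gap between the highest and lowest rating for each student."""
    return {
        student: max(scores) - min(scores)
        for student, scores in ratings_by_student.items()
    }

combined = combine_ratings(ratings)
spread = rating_spread(ratings)
print(combined)
print(spread)
```

A large spread for a student suggests the observers disagree and that the performance may deserve a second look before the combined rating is used.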

A checklist is similar in appearance and use to the rating scale
It differs in the type of judgment needed
A checklist calls for a simple yes-no judgment (action present or absent)
Useful at the primary level
E.g., p. 282
Useful in assessing those performance skills that can be divided into a series of specific actions
E.g., p. 283
Used to assess products
Combination of techniques (e.g., p. 284)
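
The yes-no judgment a checklist calls for can be sketched as follows. The listed actions are invented for illustration and are not taken from the textbook examples cited above.

```python
# Hypothetical series of specific actions for a measuring task.
STEPS = [
    "Selects the correct measuring tool",
    "Aligns the object with the zero mark",
    "Reads the scale at eye level",
    "Records the result with units",
]

def score_checklist(steps, observed):
    """Count the actions marked present and list those marked absent.

    `observed` maps each action to True (present) or False (absent);
    an action missing from the mapping is treated as absent.
    """
    missing = [step for step in steps if not observed.get(step, False)]
    return len(steps) - len(missing), missing

observed = {
    "Selects the correct measuring tool": True,
    "Aligns the object with the zero mark": True,
    "Reads the scale at eye level": False,
    "Records the result with units": True,
}

done, missing = score_checklist(STEPS, observed)
print(f"{done} of {len(STEPS)} actions observed")
print("Not observed:", missing)
```

Unlike a rating scale, nothing here records how well an action was carried out, only whether it occurred.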

Student participation in rating

From an instructional standpoint, it is often useful to have students rate themselves (or their products) and then compare the ratings with those of the teacher.
Discuss!