
SGDE 4013 Assessment in Learning

Introduction

Nurliyana Bukhari, Ph.D.

School of Education & Modern Languages (SEML)

Semester A191 | First Semester 2019/2020

1
OUTLINE

Measurement (Pengukuran)
Mental Measurement
Person property & Item property (ciri-ciri individu & ciri-ciri item)

Test (Ujian); Testing (Pengujian)

Assessment (Pentaksiran)

Evaluation (Penilaian)

2
WHAT IS MEASUREMENT?

The process of assigning numbers.


A procedure for assigning numbers (e.g., scores) to a specified attribute or characteristic (trait)
of a person or entity according to a specific set of rules.
For many of the characteristics measured in education and psychology, the number-assigning
rule is:
to count the correct answers
to sum points earned on an essay
The numbers describe the degree to which the person or entity possesses the attribute.
Measurement answers questions about “how much” and about how confident we are in the answers obtained.

Definitions based on
Linn & Miller (2005); Stevens (1945) as cited in Veloo & Awang-Hashim (2017)
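
The two number-assigning rules named on this slide (counting correct answers, summing points earned on an essay) can be sketched in Python. This is only an illustration: the answer key, the student's responses, and the essay points are made-up values, and the function names are hypothetical.

```python
def count_correct(responses, answer_key):
    """Rule 1: the score is the number of responses that match the key."""
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)


def sum_points(points_earned):
    """Rule 2: the score is the total of the points earned on an essay."""
    return sum(points_earned)


# Hypothetical data, not taken from any real test.
answer_key = ["B", "D", "A", "C", "A"]
student_responses = ["B", "D", "C", "C", "A"]
essay_points = [4, 3, 5]  # points awarded on three assumed rubric criteria

mc_score = count_correct(student_responses, answer_key)
essay_score = sum_points(essay_points)
print(mc_score, essay_score)
```

In both cases the rule is fixed in advance, so the resulting number describes the degree to which the person shows the attribute.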

3
HOW DOES MEASUREMENT WORK? THE RICE EXAMPLE

How much rice is there?


What are we measuring?
How are we measuring it?
What do the measures mean?
4
TRAITS VS. MEASURES

TRAITS          MEASURES
Distance        Miles, kilometers
Temperature     Degrees: F, C, Kelvin
Height          Feet, inches, meters, centimeters
Weight/Mass     Pounds, kg, grams
Intelligence    IQ score
Achievement     Mid-term exam score, SAT score
Depression      Beck depression score

5
THE MEASUREMENT INSTRUMENT

Measurement needs a measurement instrument/tool/device:


Weighing scale
Thermometer
MUET
IQ test

The measurement instrument is the tool used to assign a numeric value to the quantity of the
trait for the entity being measured.

Mental measurement uses tests, scales, and inventories as instruments.

6
THE MEASUREMENT PROCESS

The measurement process involves the interaction of the entity being measured (e.g., person)
and the measurement instrument (e.g., weighing scale; test (more specifically: items)).

This interaction is documented by how much the entity changes the measurement instrument.

For example, how far a person moves the “needle” of a weighing scale from 0 indicates the person’s weight.

7
THE MEASUREMENT PROCESS (cont…)

Entity Being Measured -> Interaction of Entity & Measurement Instrument -> Measure of Trait

For example:

Person -> Person steps on Weighing Scale, thus pushing down the top of the Weighing Scale -> Measure of Weight
8
MENTAL MEASUREMENT

Measurement of mental traits is a little different from measurement of physical traits.

Physical traits directly and uniquely interact with the measurement instrument to generate a
measurement of the trait.
Mental traits do not uniquely interact with the measurement instrument. Rather, the change in
the instrument is mediated by the target trait as well as by other traits or effects. These “other
traits” include both random and nonrandom effects.

Thus, the change in the measurement instrument in mental measurement is a “clouded” or
“foggy” representation of the actual trait.

Instruments for mental measurement are typically tests, scales, and batteries.

9
THE MENTAL MEASUREMENT PROCESS

Entity Being Measured -> Target Trait and Other Traits impact Measurement Instrument -> Measure of Trait

For example:

Person -> Level of Anxiety and Other Traits determine responses to items of an Anxiety Scale -> Measure of Anxiety
10
MEASUREMENT: A 2-WAY STREET

Intuitively, we tend to think about measurement as only concerning the determination of the
person’s trait.

But, in any measurement process, the measurement properties of the person and the properties
of the measurement instrument are intimately tied.

The person’s measure is determined by the properties of the instrument, and the instrument’s
properties are determined by the person being measured.

11
THE 2-WAY STREET: A PAIN TOLERANCE EXAMPLE

Consider a researcher measuring pain tolerance by having individuals walk across a beach of hot
sand.

Based on their reaction (i.e., behavior—how loudly they scream!), the researcher assigns a score
of pain tolerance to the person.

But the behavior (scream) also tells us how hot the sand is, and this interpretation will vary
depending on the pain tolerance of the study participants.

12
A Picture of the Pain Example

OUCH!

Does the scream measure the pain tolerance of the person?
OR
Does the scream measure the temperature of the sand?
13
THE 2-WAY STREET: REVISITING THE RICE EXAMPLE

Consider again the example of measuring the amount of rice.

We can measure it by determining how many cups of rice we have.

But, we can also flip the example to consider the measurement of each cup—that is, the size of
the cup can be measured by how many cup-fills it takes to hold all the rice.
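
The flipped rice example can be written as two small Python functions. This is a sketch only: the amount of rice and the cup sizes are invented numbers, and the function names are hypothetical.

```python
def cups_needed(rice_volume_ml, cup_volume_ml):
    """Measure the rice: how many cup-fills it takes to hold all of it."""
    return rice_volume_ml / cup_volume_ml


def cup_size(rice_volume_ml, number_of_cup_fills):
    """Flip the question: infer the cup's size from the number of cup-fills."""
    return rice_volume_ml / number_of_cup_fills


rice_ml = 3000                   # hypothetical amount of rice
small_cup, large_cup = 150, 250  # hypothetical cup sizes in millilitres

fills_small = cups_needed(rice_ml, small_cup)
fills_large = cups_needed(rice_ml, large_cup)
print(fills_small, fills_large)

# The same count of cup-fills, read the other way, recovers the cup size.
print(cup_size(rice_ml, fills_small))
```

The same observation (the number of cup-fills) tells us about the rice when the cup is known, and about the cup when the rice is known.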

14
THE 2-WAY STREET OF MENTAL MEASUREMENT

The responses of individuals to the items of the instrument not only tell us the trait level of the
individuals but also tell us about the properties of the items and of the instrument as a whole.
An individual who answers few items of a test correctly has a low ability level.
An item that only a few people answer correctly is a very difficult item.

In doing measurement, it is critical to consider both the properties of the individuals being
measured, as well as the properties of the instrument (i.e., items) used in the measurement
process.
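
A minimal Python sketch of this two-way reading of the same responses follows. The response matrix and the person names are invented for illustration. Proportion correct is used here as a simple item difficulty index, which is an assumption about how difficulty is summarized.

```python
# Rows are persons, columns are items; 1 = correct, 0 = incorrect.
# Hypothetical data for three persons and four items.
responses = {
    "Person A": [1, 1, 1, 0],
    "Person B": [1, 1, 0, 0],
    "Person C": [1, 0, 0, 0],
}

# Reading across a row tells us about the person (number of items correct).
person_scores = {name: sum(row) for name, row in responses.items()}

# Reading down a column tells us about the item (proportion of persons correct).
n_persons = len(responses)
n_items = len(next(iter(responses.values())))
proportion_correct = [
    sum(row[i] for row in responses.values()) / n_persons for i in range(n_items)
]

# The item with the lowest proportion correct is the most difficult one.
hardest_item = min(range(n_items), key=lambda i: proportion_correct[i])

print(person_scores)
print(proportion_correct)
print("Hardest item index:", hardest_item)
```

A low row total points to a low ability level, and a low column proportion points to a difficult item. Both come from the same table of responses.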
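Construct: pain tolerance
Entity being measured: person
Properties of the entity: can handle a hot surface (does not scream, or screams less loudly) or cannot handle a hot surface (screams loudly)
Instrument used to measure the construct: hot sand
Properties of the instrument: 20°C, 30°C, or 40°C

Construct: amount in a sack (quantity)
Entity being measured: rice
Properties of the entity: number of cups needed to hold the rice (also the size of the grains: Jasmine, long-grained Basmati, or Beras Pulut (glutinous rice))
Instrument used to measure the construct: cup
Properties of the instrument: too big, too small, too high

Construct: math achievement
Entity being measured: student
Properties of the entity: low, average, or high ability
Instrument used to measure the construct: multiple-choice items
Properties of the instrument: easy, moderate, difficult
15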
WHY IS MENTAL MEASUREMENT IMPORTANT IN OUR CURRENT SOCIETY?

Research in education, psychology, sociology, health science, etc.


Meritocracy and advancement
Educational policy and accountability
Psychology policy (testing for psychological and psychoeducational placements)

16
THE MEASUREMENT MODEL

We determine the properties of individuals (i.e., trait level) and items (e.g.,
difficulty, discrimination) using a measurement model that converts the
responses to the items into the relevant information (measures) about people
and items.

Measurement models also provide a framework for determining the quality of
the measures of people’s traits (reliability, validity).

Two established measurement models:


The true score model (TSM) specified under classical/traditional test theory (CTT)
Modern item response theory (IRT) models
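
As a rough illustration of the two model families, the Python sketch below simulates the true score model (observed score = true score + random error) and evaluates a two-parameter logistic IRT curve. The means, standard deviations, sample size, and item parameters are assumptions chosen for the example. True scores are never observed in practice, so the reliability computed here is possible only because the data are simulated.

```python
import math
import random
import statistics


def simulate_observed_scores(n_people, true_mean, true_sd, error_sd, seed=0):
    """True score model: observed score X = true score T + random error E."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(true_mean, true_sd) for _ in range(n_people)]
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return true_scores, observed


def reliability(true_scores, observed_scores):
    """Reliability as the share of observed-score variance due to true scores."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed_scores)


def irt_2pl(theta, difficulty, discrimination):
    """Two-parameter logistic IRT model: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Assumed values: true-score SD of 10 and error SD of 5 imply a
# theoretical reliability of 100 / (100 + 25) = 0.8.
true_scores, observed = simulate_observed_scores(5000, 50.0, 10.0, 5.0)
rel = reliability(true_scores, observed)
print(round(rel, 2))

# A person whose trait level equals the item difficulty has a 50% chance
# of answering correctly; a more able person has a higher chance.
p_equal = irt_2pl(theta=0.0, difficulty=0.0, discrimination=1.2)
p_higher = irt_2pl(theta=1.0, difficulty=0.0, discrimination=1.2)
print(p_equal, p_higher)
```

The true score model works at the level of the total score, while the IRT function works at the level of a single item and uses the item's difficulty and discrimination directly.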

17
HOW MEASUREMENT WORKS

PEOPLE and ITEMS -> RESPONSES -> MEASUREMENT MODEL

The measurement model yields:
Measures of each Person’s Trait level & Quality of Measures
Measures of Item and Instrument Properties & Quality of Measures
18
RECAP: WHAT IS MEASUREMENT?

19
WHAT IS AN (EDUCATIONAL) TEST?

An instrument, tool, or systematic procedure for observing, measuring and describing one or
more characteristics/traits of a student.

“An evaluative device or procedure in which a systematic sample of a test taker’s behavior in a
specified domain is obtained and scored using a standardized process” (AERA, APA, & NCME,
2014, p. 224).

Some perceive it as a narrower concept than assessment.

But sometimes, it is used synonymously with assessment (AERA, APA, & NCME, 2014).

20
TRAITS VS. TOOLS

TRAITS          TOOLS
Distance        Surveyor’s wheel, yardstick, tape measure
Temperature     Thermometers (glass, electric, meat, etc.)
Height          Stadiometer or height rod
Weight/Mass     Weight/mass scale, balance scale, force gauge
Intelligence    IQ test
Achievement     Mid-term exam, SAT college admission test
Depression      Beck Depression Inventory

21
RECAP: WHAT IS A TEST?

22
RECAP: WHAT ARE MEASUREMENT & TEST?

23
WHAT IS AN ASSESSMENT?

“Any systematic method of obtaining information, used to draw inferences about characteristics
of people, objects, or programs” (AERA, APA, & NCME, 2014, p. 216).

“A systematic process to measure or evaluate the characteristics or performance of individuals,
programs, or other entities, for purposes of drawing inferences” (AERA, APA, & NCME, 2014, p.
216).

A concept broader than a test but “sometimes used synonymously with test” (AERA, APA, &
NCME, 2014, p. 216).
SPM: Assessment
Physics Exam: Test

24
TYPES OF ASSESSMENT/TEST

Norm-Referenced vs. Criterion-Referenced Tests


Formative and Summative Assessments
School-Based Assessment
Authentic/Alternative Assessment

25
WHAT IS A (PROGRAM) EVALUATION?

Evaluation is the process of making a value judgment about the worth of an entity’s (e.g., a
student’s or a program’s) performance or product.

To evaluate means to determine the significance, worth, or condition of something, usually by
careful appraisal and study (https://www.merriam-webster.com/).

Judging the merit or worth (Lincoln & Guba, 1980; Scriven, 1967)
Merit: Refers to a program’s quality based on performance (context-free)
Worth: Refers to the value a program’s performance has for society (context-dependent). Can
only be established using intimate knowledge of local social, cultural, political, and value
factors.

The main purpose of a program evaluation is to provide empirical evidence that can be used to
substantiate judgments about merit or worth, in other words, about the quality and the use of the
thing (e.g., program, product, or policy) being studied. Evaluative judgments often include
recommendations for change.
26
FORMATIVE VS. SUMMATIVE EVALUATION?

FORMATIVE
Focus on the ongoing activity
The process of person/product/program
Improvement
…

SUMMATIVE
Meets/does not meet the objectives?
Making judgment on person/product/program
Decision making/conclusion

27
ON FORMATIVE & SUMMATIVE ASSESSMENT/EVALUATION

28
MEASUREMENT & EVALUATION

Evaluation encompasses but goes beyond the meaning of the terms “test” and “measurement”.
Evaluation depends upon measurement but is not synonymous with it. Measurement is a
quantitative determination of the level of an individual’s performance, while evaluation
is a qualitative judgment of how good or how satisfactory that performance has been.
Sound evaluation is based upon the results of accurate and relevant measurement.
Measurement affords the evaluator a quantitative description of externalized knowledge and
behavior.
Measurement describes a situation; evaluation judges its worth or value.
Measurement is only a tool to be used in evaluation. By itself, it is meaningless, but without it
evaluation is likely to be very erratic.
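
The split between describing and judging can be sketched in Python. This is an illustration only: the item scores and both pass marks are assumed values, and the function names are hypothetical.

```python
def measure(item_scores):
    """Measurement: a quantitative description (here, a percentage score)."""
    return 100 * sum(item_scores) / len(item_scores)


def evaluate(percent_score, pass_mark=50):
    """Evaluation: a value judgment of the measure against a chosen standard."""
    return "satisfactory" if percent_score >= pass_mark else "unsatisfactory"


# Hypothetical item scores for one student (1 = correct, 0 = incorrect).
item_scores = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

percent = measure(item_scores)
print(percent)

# The same measure can lead to different judgments under different standards.
print(evaluate(percent))                # assumed pass mark of 50
print(evaluate(percent, pass_mark=80))  # assumed stricter pass mark of 80
```

The number produced by `measure` does not change, while the judgment depends on the value, purpose, or goal against which it is appraised.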

29
MEASUREMENT & EVALUATION (cont…)

Not all uses of a test or a measurement in education can be considered evaluation, for evaluation
involves appraisal in the light of some particular value, purpose, or goal.
Evaluation is a continuous and comprehensive process covering every aspect of the educative
program. It is an integral part of education in which pupil and teacher are partners.
Evaluation signifies a wider process of judging students’ progress.
Measurement implies only a precise quantitative assessment of instructional outcomes.
Evaluation is integrated with the whole task of education, and not only with measurement and
examinations.
Evaluation goes beyond measurement in judging the desirability or value of the measure.
Evaluation is not only quantitative but also qualitative and includes value judgments.

30
ASSESSMENT & EVALUATION

While assessment and evaluation are highly interrelated and are often used interchangeably as
terms, they are not synonymous.
The process of assessment is to gather, summarize, interpret, and use data to decide a direction
for action.
The process of evaluation is to gather, summarize, interpret, and use data to determine the extent
to which an action was successful.

31
RECAP

Measurement refers to the set of procedures for assigning numbers (e.g., scores) to a specified attribute or characteristic (trait)
of a person or entity according to a specific set of rules.
Pengukuran (measurement) is a process of obtaining a numerical description (through numbers) of how much of a
measured characteristic an individual possesses, using a particular tool or procedure (translation by Open University Malaysia
(OUM), 2009).

Test is “[a]n evaluative device or procedure in which a systematic sample of a test taker’s behavior in a specified domain is
obtained and scored using a standardized process” (AERA, APA, & NCME, 2014, p. 224).
Ujian (test) is a systematic procedure for observing (a sample of) an individual’s conduct or behavior
and describing it with the aid of a numerical scale or a category system (Cronbach, 1970, as translated by OUM,
2009).

Assessment is “[a]ny systematic method of obtaining information [collection, analysis, and interpretation of information], used
to draw inferences about characteristics of people, objects, or programs” (AERA, APA, & NCME, 2014, p. 216).
Pentaksiran (assessment) is any systematic procedure for obtaining information in order to make inferences about the
traits/characteristics/attributes of individuals, objects, or programs (direct translation).

Evaluation is the process of making a value judgment and decision about the worth of an entity’s (i.e., students, programs)
product or performance.
Penilaian (evaluation) is the process of making a judgment or decision in assigning value, merit, quality, or worth to an
entity or matter (translation by OUM, 2009).
32
RECAP (MORE BRIEFLY)

Measurement (Pengukuran)
Mental Measurement
Person property & Item property (ciri-ciri individu & ciri-ciri item)

Test (Ujian); Testing (Pengujian)

Assessment (Pentaksiran)

Evaluation (Penilaian)

33
Related Joke of the Day
34
nurliyana@uum.edu.my

nurliyana.bukhari@alumni.uncg.edu

Room 291

35
