1. EVALUATION
1.1. Definition
1. https://en.oxforddictionaries.com/definition/evaluation
2. https://evaluationcanada.ca/what-is-evaluation
3. http://www.qualityresearchinternational.com/glossary/evaluation.htm
1.2. Purpose of Evaluation
Generally, we need to be able to make good decisions in every aspect of our lives. For
example, when you are torn between two universities for continuing your study, you
need to do an evaluation: you start by analyzing, searching for, and collecting data
about both universities, and then make an informed decision about which university you
will attend. From this example, we can see that the general purpose of evaluation is to
make good, accurate decisions. Ultimately, every student is expected to be able to make
such decisions.
The purposes of evaluation in the teaching and learning process are:
To ensure the teaching is meeting students' learning needs
To identify areas where teaching can be modified or improved
To provide feedback and encouragement to the teacher and the faculty
To support applications for promotion and career development.4
According to Doni, Sindu, and Bg Phalguna in their book Evaluasi Pendidikan, the
purposes of evaluation are divided into two kinds: general purposes and special purposes.
a. General purposes
To obtain evidence that indicates how far students' abilities and success extend
with respect to the curricular objectives after they have gone through the learning
process in the time that has been given.
To measure and assess the effectiveness of the teaching methods the teacher has
been using and of the learning process the students have been carrying out.
b. Special purposes
To make students willing to improve and raise their achievement.
To find the causes of students' effectiveness or ineffectiveness in the learning
process, in order to find ways to remedy it.
4. http://www.meddent.uwa.edu.au/teaching/faculty-evaluation/why-evaluate
1.3. Types of Evaluation5
5. https://cyfar.org/different-types-evaluation
Outcomes Evaluation
Focus: changes in comprehension, attitudes, behaviors, and practices that result from
program activities; can include both short- and long-term results.
Purpose: to decide whether the program/activity affects participants' outcomes, and to
establish and measure clear benefits of the program.
Sample questions: Did your participants report the desired changes after completing a
program cycle? What are the short- or long-term results observed among (or reported
by) participants?
Following Vedung (1997) and Foss Hansen (2005) we can schematize the theoretical
mainstream in the following way:
Source: Vedung (1997), “Public Policy and Program Evaluation”, Transaction Publisher.
Each of these models obviously has a different purpose and presents advantages
and disadvantages according to the object of evaluation.
Result models focus on the results of a given performance, program or
organization and they inform on whether the goals have been realized or
not and on all the possible effects of the program, both foreseen and
unforeseen. There are at least two distinct methodologies, reflecting distinct
methodological principles: goal-bound and goal-free procedures. Broadly
speaking, goal-bound evaluation is focused on the relative degree to
which a given product effectively meets a previously specified goal, while
goal-free evaluation measures the effectiveness of a given product
exclusively in terms of its actual effects; the goals and motivations of the
producer are ignored. Each approach has relative advantages and
disadvantages. On the one hand, goal-bound evaluation is ordinarily more
cost-effective than goal-free evaluation; on the other hand, measuring
effectiveness entirely in terms of the degree to which stated goals are met
can have at least two undesirable consequences: (a) since effectiveness is,
on this model, inversely proportional to expectations, effectiveness can be
raised simply by lowering expectations, and (b) deleterious or otherwise un-
wanted effects, if any, are left out of account, while unintended benefits, if
any, go unnoticed.
https://www.tillvaxtanalys.se/download/18.1af15a1f152a3475a818975/1454505626167/Evaluation+definitions+methods+and+models-06.pdf
Economic models, on the other hand, test whether a program's
productivity, effectiveness and utility have been satisfactory in terms of
expenses. Cost analysis is currently a somewhat controversial set of
methods in program evaluation. One reason for the controversy is that
these terms cover a wide range of methods, but are often used
interchangeably. Whatever position an evaluator takes in this controversy, it
is good to have some understanding of the concepts involved, because the
cost and effort involved in producing change is a concern in most impact
evaluations (Rossi & Freeman, 1993).
With all of these strategies to choose from, how can an evaluator decide?
Debates that rage within the evaluation profession are generally battles
between these different strategists, each claiming the superiority of
their position; but most recent developments in the debate have
focused on the recognition that there is no inherent incompatibility between
these broad strategies and that each of them brings something valuable to the
evaluation table. Attention has therefore increasingly turned to how one
might integrate results from evaluations that use different strategies, carried
out from different perspectives, and using different methods. Clearly, there
are no simple answers here. The problems are complex, and the
methodologies needed will, and should, be varied.
1.5. Steps of Evaluation
According to Buchori (1972), as cited in Zalili Sailan (2016), there are five steps of evaluation6:
1. Planning: in this step the teacher determines the purpose of the evaluation, the
aspects that will be assessed, the method that will be used, the preparation of
assessment tools, and the timing.
2. Collecting data: the teacher collects the data by conducting an assessment,
examining the results, and scoring them.
3. Processing the data: the assessment results are processed with statistical or
non-statistical techniques, depending on whether the data obtained are
quantitative or qualitative.
4. Interpretation: the teacher interprets the results of the data processing, basing
the interpretation on certain norms.
5. Using the assessment results: the teacher uses the interpreted assessment
results according to the purpose of the evaluation, for example to improve the
learning process, to remedy students' learning difficulties, to improve the
evaluation instruments, and to make an evaluation report (rapor).
6. Zalili Sailan, Teknik Evaluasi Hasil Belajar Bahasa dan Sastra Indonesia (Kendari: Metro Grapha, 2016), pp. 14-15.
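Steps 2-4 above (collecting, processing, and interpreting scores) can be sketched in a few lines of Python. The student names, scores, and the 75-point passing norm are purely illustrative assumptions, not data from the source:

```python
# Minimal sketch of evaluation steps 2-4: collect scores, process them
# statistically, and interpret each result against a fixed norm.
# Names, scores, and the 75-point norm are illustrative assumptions.
from statistics import mean, stdev

scores = {"Ani": 82, "Budi": 68, "Citra": 91, "Dewi": 74}  # step 2: collected data

avg = mean(scores.values())      # step 3: statistical processing
spread = stdev(scores.values())

PASSING_NORM = 75                # step 4: interpretation against a norm
results = {name: ("pass" if s >= PASSING_NORM else "needs remediation")
           for name, s in scores.items()}

print(f"class mean = {avg:.2f}, stdev = {spread:.2f}")
for name, verdict in results.items():
    print(f"{name}: {verdict}")
```

In step 5, results like these would feed back into teaching decisions and the evaluation report.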
2. ASSESSMENT
2.1. Definition
Assessment is a cyclic process used to identify areas for improvement of student learning
and to facilitate and validate institutional effectiveness. The Higher Learning Commission
offers the following formal definition: Assessment is the systematic collection,
examination, and interpretation of qualitative and quantitative data about student learning,
and the use of that information to document and improve student learning. Assessment is
not an administrative activity, a means of punishment, an intrusion into a faculty member’s
classrooms, or an infringement of academic freedom8.
Assessment has a different meaning from evaluation. The Task Group on Assessment and
Testing (TGAT) described assessment as all the methods used to assess the performance
of an individual or group9. Popham (1995: 3) defines assessment in the context of
education as a formal attempt to determine a student's status with regard to the interests of
education10. Boyer & Ewell define assessment as a process that provides information about
individual students, about a curriculum or program, about the institution, or about everything
related to the institutional system11: "processes that provide information about individual
students, about curricula or programs, about institutions, or about entire systems of
institutions" (Stark & Thomas, 1994: 46)12. "Assessment is the action or an instance of
making a judgment about something: the act of assessing something" (Merriam-Webster).
Assessment is the process of gathering and discussing information from multiple
and diverse sources in order to develop a deep understanding of what students
know, understand, and can do with their knowledge as a result of their educational
experiences; the process culminates when assessment results are used to improve
subsequent learning (Weimer, 2002). Based on the various descriptions above, it can be
concluded that assessment can be defined as the activity of interpreting the data
presented.
7. Handbook of Assessment from Stark State College of Technology, Revision 2010, p. 3.
8. Handbook of Assessment, Stark State College of Technology, 1960, p. 6.
9. National Curriculum: Task Group on Assessment and Testing, 1998, p. 3.
10. http://makalahlaporanterbaru1.blogspot.co.id/2012/11/makalah-language-testing.html
11. NASPAonline: Assessment Tips for Student Affairs Professionals, 2001, p. 1.
12. http://makalahlaporanterbaru1.blogspot.co.id/2012/11/makalah-language-testing.html
2.2. Purpose of Assessment
Purpose One: Communication
Assessment can be seen as an effective medium for communication between the
teacher and the learner. It is a way for the student to communicate their learning to
their teacher and for the teacher to communicate back to the student a commentary
on their learning. But to what end? To answer this, we offer the metaphor of
navigation. In order for navigation to take place—that is, the systematic and
deliberate effort to reach a specific place—two things need to be known: (1)
where you are and (2) where you are going. This metaphor offers us the
framework to discuss assessment as communication—students need to know
where they are in their learning and where they are supposed to be going with their
learning. Each of these will be dealt with in turn.
Purpose Two: Valuing What We Teach
http://peterliljedahl.com/wp-content/uploads/Four-Purposes-of-Assessment1.pdf
Purpose Three: Reporting Out
There exists a significant societal assumption that one of the primary purposes of
assessment is to sort, or rank, our students. Most evident in this regard, is the
requirement to assign an aggregated letter grade (sorting) and/or a percentage
(ranking) to represent the whole of a student's learning. However, there is a
much subtler and more damaging indicator of this assumption—equitability. That
is, there is an expectation that all students are to be assessed equally. Otherwise,
how can any sorting and/or ranking be considered accurate?
a. The assessment of student learning begins with educational values.
Assessment is not an end in itself but a vehicle for educational improvement. Its effective
practice, then, begins with and enacts a vision of the kinds of learning we most value for
students and strive to help them achieve. Educational values should drive not only what we
choose to assess but also how we do so. Where questions about educational mission and
values are skipped over, assessment threatens to be an exercise in measuring what’s easy,
rather than a process of improving what we really care about.
b. Assessment is most effective when it reflects an understanding of learning as
multidimensional, integrated, and revealed in performance over time.
Learning is a complex process. It entails not only what students know but what they can do
with what they know; it involves not only knowledge and abilities but values, attitudes, and
habits of mind that affect both academic success and performance beyond the classroom.
Assessment should reflect these understandings by employing a diverse array of methods,
including those that call for actual performance, using them over time so as to reveal
change, growth, and increasing degrees of integration. Such an approach aims for a more
complete and accurate picture of learning, and therefore firmer bases for improving our
students’ educational experience.
c. Assessment works best when the programs it seeks to improve have clear,
explicitly stated purposes.
d. Assessment requires attention to outcomes but also and equally to the experiences
that lead to those outcomes.
Information about outcomes is of high importance; where students “end up” matters greatly.
But to improve outcomes, we need to know about student experience along the way. We
need to know about the curricula, teaching, and the kind of student effort that led to
particular outcomes. Assessment can help us understand which students learn best under
what conditions; with such knowledge comes the capacity to improve the whole of their
learning.
e. Assessment works best when it is ongoing, not episodic.
Assessment is a process whose power is cumulative. Though isolated, “one‐shot”
assessment can be better than none, improvement over time is best fostered when
assessment entails a linked series of cohorts of students; it may mean collecting the same
examples of student performance or using the same instrument semester after semester. The
point is to monitor progress toward intended goals in a spirit of continuous improvement.
Along the way, the assessment process itself should be evaluated and refined in light of
emerging insights.
g. Assessment makes a difference when it begins with issues of use and illuminates
questions that people really care about.
Assessment alone changes little. Its greatest contribution comes on campuses where the
quality of teaching and learning is visibly valued and worked at. On such campuses, the
push to improve educational performance is a visible and primary goal of leadership;
improving the quality of undergraduate education is central to the institution’s planning,
budgeting, and personnel decisions. On such campuses, information about learning
outcomes is seen as an integral part of decision making, and avidly sought.
13. These principles were developed under the auspices of the AAHE Assessment Forum, with support from the Fund for the Improvement of Postsecondary Education and additional support for publication and dissemination from the Exxon Education Foundation. Copies may be made without restriction. The authors are Alexander W. Astin, Trudy W. Banta, K. Patricia Cross, Elaine El‐Khawas, Peter T. Ewell, Pat Hutchings, Theodore J. Marchese, Kay M. McClenney, Marcia Mentkowski, Margaret A. Miller, E. Thomas Moran, and Barbara D. Wright.
2.4. Types of Assessment
The term assessment is generally used to refer to all activities teachers use to help students
learn and to gauge student progress. Though the notion of assessment is generally more
complicated than the following categories suggest, assessment is often divided for the sake
of convenience using the following distinctions:
a. Summative assessment
Summative assessments are typically given at the end of an instructional period to
evaluate student learning against a benchmark. Examples include:
• State assessments
• Scores that are used for accountability of schools (AYP) and students (report card
grades).
b. Formative assessment
Formative assessment is generally carried out throughout a course or project.
Formative assessment, also referred to as "educative assessment," is used to aid
learning. In an educational setting, formative assessment might be a teacher (or
peer) or the learner providing feedback on a student's work, and it would not
necessarily be used for grading purposes. Formative assessments can also take the
form of diagnostic, standardized tests.
Formative assessments are generally low stakes, which means that they have
a low point value or none at all. Examples of formative assessments include asking
students to:
draw a concept map in class to represent their understanding of a topic
submit one or two sentences identifying the main point of a lecture
turn in a research proposal for early feedback
https://www.amle.org/BrowsebyTopic/WhatsNew/WNDet/TabId/270/ArtMID/888/ArticleID/286/Formative-and-Summative-Assessments-in-the-Classroom.aspx
c. Informal and formal assessment
Assessment can be either formal or informal. Formal assessment usually implies a
written document, such as a test, quiz, or paper. A formal assessment is given a
numerical score or grade based on student performance, whereas an informal
assessment does not contribute to a student's final grade. An informal assessment
usually occurs in a more casual manner
and may include observation, inventories, checklists, rating scales, rubrics, performance
and portfolio assessments, participation, peer and self-evaluation, and discussion.
d. Interim assessments
Interim assessments are used to evaluate where students are in their learning
progress and determine whether they are on track to performing well on future
assessments, such as standardized tests, end-of-course exams, and other forms of
“summative” assessment. Interim assessments are usually administered periodically
during a course or school year (for example, every six or eight weeks) and separately
from the process of instructing students (i.e., unlike formative assessments, which are
integrated into the instructional process).
e. Placement assessments
Placement assessments are used to “place” students into a course, course level, or
academic program. For example, an assessment may be used to determine whether a
student is ready for Algebra I or a higher-level algebra course, such as an honors-level
course. For this reason, placement assessments are administered before a course or
program begins, and the basic intent is to match students with appropriate learning
experiences that address their distinct learning needs.
f. Screening assessments
Screening assessments are used to determine whether students may need specialized
assistance or services, or whether they are ready to begin a course, grade level, or
academic program. Screening assessments may take a wide variety of forms in
educational settings, and they may be developmental, physical, cognitive, or academic.
A preschool screening test, for example, may be used to determine whether a young
child is physically, emotionally, socially, and intellectually ready to begin preschool,
while other screening tests may be used to evaluate health, potential learning
disabilities, and other student attributes.
https://www.edglossary.org/assessment/
2.5. Assessment Methods
Method (type of data) and description:
Alumni Survey (Indirect)
Surveying program alumni can provide information about program satisfaction,
preparation (transfer or workforce), employment status, and skills for success.
Surveys can ask alumni to identify what should be changed, altered, maintained,
improved, or expanded.
Capstone Project or Course (Direct)
A capstone project or course integrates the knowledge, concepts and skills that
students are to have acquired during the course of their study. Capstones provide a
means to assess student achievement across a discipline.
Reflective Student Essays (Direct/Indirect)
Reflective essays can be used as an assessment method to determine student
understanding of course content and/or issues, as well as students' opinions and
perceptions.
Rubrics/Scoring Guides
Rubrics/scoring guides outline identified criteria for
https://www.wssu.edu/about/assessment-and-research/niloa/_files/documents/assessmentmethods.pdf
3. MEASUREMENT
3.1. Definition
Measurement, beyond its general definition, refers to the set of procedures and the
principles for how to use the procedures in educational tests and assessments. Some of the
basic principles of measurement in educational evaluations would be raw scores, percentile
ranks, derived scores, standard scores, etc.
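The derived scores named above can be illustrated with a short sketch. The raw score list is an invented example, and the percentile rank here uses one common convention (percentage of scores strictly below):

```python
# Sketch of two basic derived scores: percentile rank and the standard
# (z) score. The raw score list is an illustrative assumption.
from statistics import mean, pstdev

raw = [55, 60, 65, 70, 70, 80, 85, 90]

def percentile_rank(x, scores):
    """Percentage of scores strictly below x (one common convention)."""
    below = sum(1 for s in scores if s < x)
    return 100.0 * below / len(scores)

mu = mean(raw)       # 71.875
sigma = pstdev(raw)  # population standard deviation

def z_score(x):
    """Standard score: distance from the mean in standard-deviation units."""
    return (x - mu) / sigma

print(percentile_rank(70, raw))   # 37.5: three of eight scores fall below 70
print(round(z_score(85), 2))      # about 1.15 standard deviations above the mean
```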
Allen & Yen define measurement as the systematic assignment of numbers to describe an
individual's attributes18. Thus, the essence of measurement is quantification: the
determination of the number of characteristics or states of an individual according to
certain rules. These states may be cognitive, affective, or psychomotor.
Measurement is a broader concept than testing: we can measure the characteristics of an
object without using tests, for example through observation, rating scales, or other ways
of obtaining quantitative information.
14. (Cangelosi, 1995: 21)
15. (Oriondo, 1998: 2)
16. (Griffin & Nix, 1991: 3)
17. (Ebel & Frisbie, 1986: 14)
18. (Djemari Mardapi, 2000: 1)
3.2 Measurement Scales: Traditional Classification
3.2.1. Nominal Scales
The word nominal is derived from nomen, the Latin word for name. Nominal
scales merely name differences and are used most often for qualitative variables in which
observations are classified into discrete groups. The key attribute for a nominal scale is
that there is no inherent quantitative difference among the categories. Sex, religion, and
race are three classic nominal scales used in the behavioral sciences. Taxonomic
categories (rodent, primate, canine) are nominal scales in biology. Variables on a
nominal scale are often called categorical variables.
3.2.2. Ordinal Scales
Ordinal scales rank-order observations. Class rank and horse race results are
examples. There are two salient attributes of an ordinal scale. First, there is an
underlying quantitative measure on which the observations differ. For class rank, this
underlying quantitative attribute might be composite grade point average, and for horse
race results it would be time to the finish line. The second attribute is that individual
differences on the underlying quantitative measure are either unavailable or
ignored. As a result, ranking the horses in a race as 1st, 2nd, 3rd, etc. hides the information
about whether the first-place horse won by several lengths or by a nose.
There are a few occasions in which ordinal scales may be preferred to using a
quantitative index of the underlying scale. College admission officers, for example, favor
class rank to overcome the problem of the different criteria used by school districts in
calculating GPA. In general, however, measurement of the underlying quantitative
dimension is preferred to rank-ordering observations because the resulting scale has
greater statistical power than the ordinal scale.
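The information loss described above can be seen in a tiny sketch; the horse names and finishing times are invented for illustration:

```python
# Sketch: converting finishing times (a quantitative measure) into ranks
# (an ordinal scale) discards the size of the gaps between observations.
# Horse names and times are illustrative assumptions.
times = {"Alpha": 119.4, "Bravo": 119.5, "Charlie": 125.0}  # seconds

order = sorted(times, key=times.get)                 # fastest first
ranks = {horse: place + 1 for place, horse in enumerate(order)}

print(ranks)  # Alpha 1st, Bravo 2nd, Charlie 3rd
# The ranks alone no longer show that Alpha beat Bravo by a nose (0.1 s)
# while Bravo beat Charlie by several lengths (5.5 s).
```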
3.2.3. Interval Scales
In ordinal scales, the interval between adjacent values is not constant. For example,
the difference in finishing time between the 1st place horse and the 2nd place horse need not
be the same as that between the 2nd and 3rd place horses. An interval scale has a constant interval
but lacks a true 0 point. As a result, one can add and subtract values on an interval scale, but
one cannot multiply or divide units.
Temperature as used in day-to-day weather reports is the classic example of an interval
scale. The assignment of the number 0 to a particular height in a column of mercury is an
arbitrary convenience, apparent to anyone familiar with the difference between the
Celsius and Fahrenheit scales. As a result, one cannot say that 30 °C is twice as warm as 15 °C,
because that statement involves implicit multiplication. To convince yourself, translate these
two into Fahrenheit and ask whether 86 °F is twice as warm as 59 °F.
Nevertheless, temperature has constant intervals between numbers, permitting one to
add and subtract. The difference between 28 °C and 21 °C is 7 Celsius units, as is the
difference between 53 °C and 46 °C. Again, convert these to Fahrenheit and ask whether the
difference between 82.4 °F and 69.8 °F is the same in Fahrenheit units as the
difference between 127.4 °F and 114.8 °F.
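A short sketch makes the two properties concrete, using the standard Celsius-to-Fahrenheit conversion:

```python
# Sketch of interval-scale behavior: equal differences survive the
# Celsius-to-Fahrenheit conversion, but ratios do not, because the
# zero point of each scale is arbitrary.
def c_to_f(c):
    return c * 9 / 5 + 32

# Equal intervals: 28-21 and 53-46 are both 7 C units, i.e. 12.6 F units.
d1 = c_to_f(28) - c_to_f(21)
d2 = c_to_f(53) - c_to_f(46)
print(d1, d2)  # both 12.6 (up to floating-point rounding)

# Ratios are not preserved: 30 C is numerically twice 15 C, yet ...
print(c_to_f(30) / c_to_f(15))  # 86/59, about 1.46 -- not 2
```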
3.2.4. Ratio Scales
A ratio scale has the property of equal intervals but also has a true 0 point. As a result,
one can multiply and divide as well as add and subtract using ratio scales. Units of time
(msec, hours), distance and length (cm, kilometers), weight (mg, kilos), and volume (cc) are
all ratio scales. Scales involving division of two ratio scales are also themselves ratio scales.
Hence, rates (miles per hour) and adjusted volumetric measures (mg/dL) are ratio scales. Note
that even though a ratio scale has a true 0 point, it is possible that the nature of the variable is
such that a value of 0 will never be observed. Human height is measured on a ratio scale, but
every human has a height greater than 0. Because of the multiplicative property of ratio
scales, it is possible to make statements such as: 60 mg of fluoxetine is three times as great as
20 mg.
http://psych.colorado.edu/~carey/Courses/PSYC5741/handouts/Measurement%20Scales.pdf
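The multiplicative property of ratio scales described above can be shown in a brief sketch; the doses, distance, and time values are illustrative assumptions:

```python
# Sketch: on a ratio scale (true zero point) multiplication and division
# are meaningful, so "three times as great" is a valid statement.
# All numbers are illustrative assumptions.
dose_a, dose_b = 60, 20           # mg: a ratio scale
print(dose_a / dose_b)            # 3.0 -- 60 mg really is three times 20 mg

# A quotient of two ratio scales is itself a ratio scale, e.g. speed:
distance_km, hours = 150.0, 2.5
speed = distance_km / hours
print(speed)                      # 60.0 km/h
```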
3.3. The function of Measurement
a. Instructional
1. Principal (basic purpose)
To determine what knowledge, skills, abilities, habits and attitudes have been
acquired.
To determine what progress or extent of learning attained.
To determine strengths, weaknesses, difficulties and needs of students.
2. Secondary (auxiliary functions for effective teaching and learning)
To help in study habits formation
To develop the effort-making capacity of students
To serve as aid for guidance, counselling, and prognosis
b. Administrative/supervisory
To maintain standards
To classify or select for special purposes
To determine teachers' efficiency and the effectiveness of the methods and strategies used
(strengths, weaknesses, needs), as well as standards of instruction
To serve as a basis or guide for curriculum making and development
The Differences among Test, Measurement, and Evaluation:
William Wiersma and Stephen G. Jurs (1990), in their book “Educational Measurement and
Testing”, remark that the terms testing, measurement, assessment and evaluation are used with
similar meanings, but they are not synonymous.
Test:
Major aspects of their definition are: i) the presentation of a standard set of tasks; ii) the student
performs the tasks; iii) the test is to be taken independently; iv) a measure of the learner's
characteristics; v) a quantitative comparison of the performance; vi) a technique of verbal
description; vii) test classification yields quantitative results.
Measurement
Measurement in this manner assesses the extent or quantity of something. It has an intimate
relationship with human beings: it is so closely related that it is rather difficult to say in which
aspect of our life it does not exist. We measure the height, weight and age of a child. Examiners
measure the intelligence and abilities of examinees in various fields. Some of these measurements
are physical. Physical measurement is direct, simple and very accurate; psychological and
educational measurements are complex, for they cannot be made through the system of
physical measurement. The measurement of intelligence is expressed in terms of the Intelligence
Quotient (IQ), and that of scholastic achievement in marks or in grades. Generally, there are three
types of measurement: (i) direct; (ii) indirect; and (iii) relative.
Evaluation
Evaluation includes measurement. It contains the notion of a value judgment. The important
stage in the process of gathering and using all the relevant and correct information is that of
evaluation. ‘Evaluation is a process of making a value judgement.’
(http://shodhganga.inflibnet.ac.in/jspui/bitstream/10603/134363/13/10_chapter3.pdf)
For the purpose of schematic representation, the three concepts of evaluation, measurement
and testing have traditionally been demonstrated in three concentric circles of varying sizes.
This is what Lynch (2001) has followed in depicting the relationship among these concepts.
The purpose of this representation is to show the relationship between superordinate and
subordinate concepts and the area of overlap between them. Thus, evaluation includes
measurement when decisions are made on the basis of information from quantitative methods.
And measurement includes testing when decision-making is done through the use of “a
specific sample of behavior” (Bachman 1990). However, the process of decision-making is by
no means restricted to the use of quantitative methods as the area not covered by measurement
circle shows. Also, tests are not the only means to measure individuals’ characteristics
as there are other types of measurement than tests, for example, measuring an individual’s
language proficiency by living with him for a long time.
http://drjj.uitm.edu.my/DRJJ/OBE%20FSG%20Dec07/OBEJan2010/DrJJ-Measure-assess-evaluate-ADPRIMA-n-more-17052012.pdf