
Validity and reliability in course-level assessment

Assessment is defined broadly as the process of providing credible evidence of resources, implementation actions, and outcomes for improvement purposes. At the course level, assessment may serve as a means of providing evidence for, and improving, the effectiveness of instruction (Banta & Palomba, 2015). It is part of a cyclic process that expects alignment across objectives, instruction, and assessment (Anderson et al., 2001); hence, designing and evaluating an assessment tool at the course level relies on a clear articulation of what students should know and be able to do (Suskie, 2008) by the end of the course or of a specific learning activity.
In their discussion of their institution's assessment plan for the student outcomes of the systems engineering program, Iqbal & Anderson (2016) described the assessment instruments used in the senior capstone design course; some were listed specifically as part of the computation of the final grade, while others were presented as evidence for the attainment of student outcomes. The instruments covered the assessment of course-related knowledge and skills (project reports); learner attitudes, values, and self-awareness (ethics video homework); and learner reactions to instruction (peer evaluation) (Angelo & Cross, 1993). The instruments also allowed the instructor to give both formative (weekly progress reports) and summative (all reports) feedback, and there were both direct and indirect sources of evidence (University of Connecticut, n.d.).
One of the characteristics of good assessments is that they yield reasonably accurate and truthful results (Suskie, 2008). This is made possible by ensuring the validity and reliability of assessment data; an evaluation of the validity and reliability of the assessment methods presented by Iqbal & Anderson (2016) is given in the sections below.
Strengths. Validity refers to the extent to which evidence is able to support a claim, and it is commonly supported by content-, construct-, criterion-, and consequence-related evidence (Moskal, Leydens, & Pavelich, 2002). The assessment instruments described by Iqbal & Anderson (2016) covered three of these forms of evidence, which may be considered a strength of their assessment method (see Table 1).

Table 1. Mapping of assessment instrument to type of evidence

Assessment Instrument     | Student Outcome Assessed | Type of Evidence
Weekly Progress Report    | 3c, 3d, 3k               | Content, Construct, Criterion
Midterm Report            | 3c, 3d, 3k               | Content, Construct, Criterion
Final Report              | 3c, 3d, 3k               | Content, Construct, Criterion
Final Oral Presentation   | 3g                       | Construct, Criterion
Peer Evaluation           | 3d                       | Criterion
Ethics Video Homework     | 3f                       | Content, Construct
Idea Evaluation Report    | 3h                       | Criterion
Post-Project Reflection   | 3i                       | Construct

Table 2 shows that at the program level, student outcomes are closely related to certain types of validity evidence (Moskal, Leydens, & Pavelich, 2002); the assessment instruments used by Iqbal & Anderson (2016) also covered the evidence needed for the outcomes they intended to assess through the capstone design course.
Table 2. UALR-SE student outcomes arranged by type of evidence
(adapted from Moskal, Leydens and Pavelich, 2002, p. 352)

Content evidence:
3c) an ability to design a system, component, or process to meet desired needs
3j) a knowledge of contemporary issues

Construct evidence:
3d) an ability to function on multi-disciplinary teams
3f) an understanding of professional and ethical responsibility
3g) an ability to communicate effectively

Criterion evidence:
3h) the broad education necessary to understand the impact of engineering solutions in the global and societal context
3i) a recognition of the need for, and ability to engage in, lifelong learning
3k) an ability to use the techniques, skills and modern engineering tools necessary for engineering practice

Reliability, on the other hand, refers to the consistency of assessment scores. Due to the nature of the course and of the assessment instruments, there was neither an opportunity nor a need to establish test reliability. Student work in this course is rated mainly in a subjective manner, and students are not tested in the sense used by Moskal, Leydens, and Pavelich (2002); hence, test reliability cannot be established. It is, however, possible to establish rater reliability. The final grade for the capstone project is determined by four components, while the final grade for the individual is determined by the final project grade and the results of the peer evaluation. Two of the components, the oral presentation and the peer evaluation, involve multiple raters; the use of rubrics serves as an attempt to ensure consistency across raters and may be considered a strength of the assessment at the course level.
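Although Iqbal & Anderson (2016) do not report inter-rater statistics, rater reliability for the multiple-rater components could be checked with a simple agreement index such as Cohen's kappa. The sketch below is a minimal, hypothetical illustration; the two raters, the four-level rubric scale, and the scores are assumptions made for the example and are not data from the paper.

```python
# Minimal sketch, assuming two raters and a four-level rubric scale (hypothetical
# scores, not data from Iqbal & Anderson): Cohen's kappa as a rough check of
# consistency across raters for a rubric-scored component such as the oral presentation.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters assigning categorical rubric levels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Expected agreement if both raters assigned levels independently at their observed rates
    expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a.keys() | freq_b.keys())
    return (observed - expected) / (1 - expected)

# Hypothetical rubric levels (1 = unsatisfactory ... 4 = exemplary) for ten presentations
rater_1 = [4, 3, 3, 2, 4, 1, 3, 2, 4, 3]
rater_2 = [4, 3, 2, 2, 4, 1, 3, 3, 4, 3]
print(f"Cohen's kappa: {cohens_kappa(rater_1, rater_2):.2f}")  # about 0.71 for these scores
```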
Weakness. Triangulation is a method of establishing trustworthiness. It consists of using multiple strategies for data collection as well as multiple sources of data (Golafshani, 2003; Leydens, Moskal, & Pavelich, 2004). Trustworthiness is largely a qualitative measure of the quality of data and evidence (Leydens, Moskal, & Pavelich, 2004); due to the unique nature of the assessment data collected in this class (numeric scores are given through subjective means of evaluation), it is more appropriate to consider trustworthiness in this context. In the paper, while multiple sources of data were present (written reports, oral presentation, peer evaluation), each form of evidence was used separately rather than collectively to triangulate the claims made; this is considered a weakness of the assessment plan. At the course level, it cannot be determined whether triangulation was employed in assessing the attainment of course outcomes, as these were not discussed. Ideally, the attainment of course outcomes should be evaluated periodically using multiple sources of data; an example of this is the Course Assessment Package for courses offered in multiple sections at the Civil and Mechanical Engineering Departments of the United States Military Academy (West Point). This document is prepared in the spring semester by the designated course director and includes the course syllabi, summaries of course evaluations, student self-assessment instruments, and grade summaries (Bailey, Floersheim, & Ressler, 2002).
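As a rough illustration of what collective use of evidence could look like, the sketch below checks whether each outcome claim is supported by at least two independent sources before the claim is treated as triangulated. The outcome labels, sources, scores, and attainment threshold are hypothetical and are not taken from Iqbal & Anderson (2016).

```python
# Minimal sketch (hypothetical outcomes, sources, and scores): before claiming that an
# outcome has been attained, verify that the claim can be triangulated, i.e., that at
# least two independent sources of evidence point to the same conclusion.
evidence = {
    "3c": {"midterm report": 3.1, "final report": 3.4},
    "3d": {"peer evaluation": 3.8},                      # single source: cannot triangulate
    "3g": {"oral presentation": 3.5, "final report": 3.2},
}
target = 3.0  # hypothetical attainment threshold on a 4-point scale

for outcome, sources in evidence.items():
    attained = all(score >= target for score in sources.values())
    if len(sources) < 2:
        print(f"{outcome}: only one source of evidence; claim cannot be triangulated")
    else:
        print(f"{outcome}: {'attained' if attained else 'not attained'} across {len(sources)} sources")
```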

References
Angelo, T. A., & Cross, K. P. (1993). Classroom Assessment Techniques (2nd ed.). San Francisco, CA:
Jossey-Bass.
Bailey, M., Floersheim, R. B., & Ressler, S. J. (2002). Course assessment plan: A tool for integrated
curriculum management. Journal of Engineering Education, 91(4), 425-434.
Banta, T. W., & Palomba, C. A. (2015). Assessment essentials: Planning, implementing, and improving
student assessment in higher education. San Francisco, CA: Jossey-Bass.
Golafshani, N. (2003, December). Understanding Reliability and Validity in Qualitative Research. The
Qualitative Report, 8(4), 597-607.
Iqbal, K., & Anderson, G. T. (2016). Assessment of student outcomes in a distinctive engineering
program. 123rd Annual ASEE Conference and Exposition. New Orleans, LA: American Society
for Engineering Education.
Leydens, J. A., Moskal, B. M., & Pavelich, M. J. (2004, January). Qualitative Methods Used in the
Assessment of Engineering Education. Journal of Engineering Education, 93(1), 65-71.
Moskal, B. M., Leydens, J. A., & Pavelich, M. J. (2002). An Educational Brief of Validity, Reliability
and the Assessment of Engineering Education. Journal of Engineering Education, 351-354.
Ressler, S. J., & Lenox, T. A. (1996). A Structured Assessment System for an Undergraduate Civil
Engineering Program. 1996 ASEE Annual Conference Proceedings. Washington, D. C.:
American Society for Engineering Education.
Suskie, L. (2008). Understanding the nature and purpose of assessment. In J. E. Spurlin, S. A. Rajala, & J.
P. Lavelle (Eds.), Designing better engineering education through assessment: A practical
resource for faculty and department chairs on using assessment and ABET criteria to improve
student learning (pp. 3-23). Sterling, VA: Stylus Publishing, LLC.
The Taxonomy: Educational Objectives and Student Learning. (2001). In L. W. Anderson, D. R.
Krathwohl, P. W. Airasian, K. A. Cruikshank, R. E. Mayer, P. R. Pintrich, . . . M. C. Wittrock
(Eds.), A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of
Educational Objectives (pp. 1-26). Addison Wesley Longman, Inc.
University of Connecticut. (n.d.). Assessment Primer: How to Do Assessment. Retrieved December 5,
2016, from Assessment: http://assessment.uconn.edu/assessment-primer/assessment-primer-howto-do-assessment/
