C. Reliability and Validity
In order for assessments to be sound, they must be free of bias and distortion.
Reliability and validity are two concepts that are important for defining and
measuring bias and distortion.
Reliability refers to the extent to which assessments are consistent. Just as we
enjoy having reliable cars (cars that start every time we need them), we strive to
have reliable, consistent instruments to measure student achievement. Another
way to think of reliability is to imagine a kitchen scale. If you weigh five pounds
of potatoes in the morning, and the scale is reliable, the same scale should register
five pounds for the potatoes an hour later (unless, of course, you peeled and
cooked them). Likewise, instruments such as classroom tests and national
standardized exams should be reliable: it should not make any difference
whether a student takes the assessment in the morning or afternoon, one day or
the next.
Another measure of reliability is the internal consistency of the items. For
example, if you create a quiz to measure students' ability to solve quadratic
equations, you should be able to assume that if a student gets an item correct, he
or she will also get other, similar items correct. The following table outlines three
common reliability measures.

| Type of Reliability | How to Measure |
|---|---|
| Stability or Test-Retest | Give the same assessment twice, separated by days, weeks, or months. Reliability is stated as the correlation between scores at Time 1 and Time 2. |
| Alternate Form | Create two forms of the same test (vary the items slightly). Reliability is stated as the correlation between scores of Test 1 and Test 2. |
| Internal Consistency (Alpha, α) | Compare one half of the test to the other half. Or, use methods such as Kuder-Richardson Formula 20 (KR-20) or Cronbach's Alpha. |

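To make the measures in the table more concrete, here is a minimal sketch of how two of them might be computed. The function names (`pearson_r`, `cronbach_alpha`) and all of the scores are hypothetical and chosen only for illustration; the sketch also assumes sample variances are used for Cronbach's Alpha, which is one common convention.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def cronbach_alpha(item_scores):
    """Cronbach's Alpha; item_scores[s][i] is student s's score on item i."""
    k = len(item_scores[0])  # number of items on the test
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical test-retest data: the same six students at Time 1 and Time 2.
time1 = [78, 85, 62, 90, 71, 55]
time2 = [80, 83, 65, 92, 70, 58]
test_retest = pearson_r(time1, time2)

# Hypothetical quiz: six students, four items scored 1 (correct) or 0 (incorrect).
quiz = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
alpha = cronbach_alpha(quiz)

print(f"Test-retest reliability: {test_retest:.2f}")
print(f"Cronbach's Alpha:        {alpha:.2f}")
```

The same `pearson_r` call would serve for alternate-form reliability: pass in the scores from Test 1 and Test 2 instead of Time 1 and Time 2.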
The values for reliability coefficients range from 0 to 1.0. A coefficient of 0
means no reliability and 1.0 means perfect reliability. Since all tests have some
error, reliability coefficients never reach 1.0. Generally, if the reliability of a
standardized test is above .80, it is said to have very good reliability; if it is below
.50, it would not be considered a very reliable test.
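The rule of thumb above can be written as a small helper, which may be handy when explaining a reported coefficient to a parent. This is only a sketch: the function name `describe_reliability` is hypothetical, and the label for the middle band (.50 to .80) is an assumption, since the text names only the two outer bands.

```python
def describe_reliability(coefficient):
    """Label a reliability coefficient using the .80 / .50 rule of thumb."""
    if not 0.0 <= coefficient <= 1.0:
        raise ValueError("reliability coefficients range from 0 to 1.0")
    if coefficient > 0.80:
        return "very good reliability"
    if coefficient < 0.50:
        return "not a very reliable test"
    # Assumed wording: the text does not name the band between .50 and .80.
    return "moderate reliability (label assumed)"


for value in (0.89, 0.65, 0.45):
    print(value, "->", describe_reliability(value))
```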
Validity refers to the accuracy of an assessment, that is, whether or not it measures what
it is supposed to measure. Even if a test is reliable, it may not provide a valid
measure. Let's imagine a bathroom scale that consistently tells you that you
weigh 130 pounds. The reliability (consistency) of this scale is very good, but it is
not accurate (valid) because you actually weigh 145 pounds (perhaps you reset
the scale in a weak moment)! Since teachers, parents, and school districts make
decisions about students based on assessments (such as grades, promotions, and
graduation), the validity inferred from the assessments is essential, even more
crucial than the reliability. Also, if a test is valid, it is almost always reliable.
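The bathroom-scale example can be put into numbers. The sketch below uses hypothetical repeated readings: a small spread among the readings stands in for consistency (reliability), and the gap between the average reading and the true weight stands in for inaccuracy (lack of validity). It is an analogy only, not a formal validity statistic.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 145.0  # pounds, the actual weight in the example

# Hypothetical repeated weighings on the same (reset) scale.
readings = [130.0, 130.1, 129.9, 130.0, 130.0]

spread = pstdev(readings)            # small spread: the scale is consistent
bias = mean(readings) - TRUE_WEIGHT  # large offset: the scale is not accurate

print(f"Spread of readings: {spread:.2f} lb")
print(f"Average error:      {bias:.1f} lb")
```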
There are three ways in which validity can be measured. In order to have
confidence that a test is valid (and therefore the inferences we make based on the
test scores are valid), all three kinds of validity evidence should be considered.

| Type of Validity | Definition | Example / Non-Example |
|---|---|---|
| Content | The extent to which the content of the test matches the instructional objectives. | A semester or quarter exam that only includes content covered during the last six weeks is not a valid measure of the course's overall objectives; it has very low content validity. |
| Criterion | The extent to which scores on the test are in agreement with (concurrent validity) or predict (predictive validity) an external criterion. | If the end-of-year math tests in 4th grade correlate highly with the statewide math tests, they would have high concurrent validity. |
| Construct | The extent to which an assessment corresponds to other variables, as predicted by some rationale or theory. | If you can correctly hypothesize that ESOL students will perform differently on a reading test than English-speaking students (because of theory), the assessment may have construct validity. |

So, does all this talk about validity and reliability mean you need to conduct
statistical analyses on your classroom quizzes? No, it doesn't. (Although you may,
on occasion, want to ask one of your peers to verify the content validity of your
major assessments.) However, you should be aware of the basic tenets of validity
and reliability as you construct your classroom assessments, and you should be
able to help parents interpret scores for the standardized exams.
Try This
Reflect on the following scenarios.
1. A parent called you to ask about the reliability coefficient on a recent
standardized test. The coefficient was reported as .89, and the parent thinks
that must be a very low number. How would you explain to the parent that
.89 is an acceptable coefficient?
2. Your school district is looking for an assessment instrument to measure
reading ability. They have narrowed the selection to two possibilities: Test
A provides data indicating that it has high validity, but there is no
information about its reliability. Test B provides data indicating that it has
high reliability, but there is no information about its validity. Which test
would you recommend? Why?
This course was developed in partnership between the Pinellas School
District and the Florida Center for Instructional Technology at USF.