
Journal of Economics and Sustainable Development
ISSN 2222-1700 (Paper) ISSN 2222-2855 (Online)
Vol.4, No.15, 2013

www.iiste.org

Effect of Scoring Patterns on Scorer Reliability in Economics Essay Tests


Madu, B. C. (Ph.D)*, Ikeh, Elochukwu Francis
Department of Science Education, University of Nigeria, Nsukka, Nigeria.
*Email of the corresponding author: bcmadu4owa@gmail.com or barnabas.madu@unn.edu.ng

Abstract
The study investigated the effect of scoring patterns on scorer reliability in Economics essay tests. One research question was posed and one hypothesis was tested. The sample comprised all 20 Economics teachers and 120 Senior Secondary II (SS2) Economics students from the public secondary schools in Aguata Education Zone in Anambra State, Nigeria. An Economics Essay Test (EET) was used for data collection; it was administered to the 120 SS2 Economics students to generate the scripts used for the study. The instrument (EET) was trial-tested on SS2 Economics students who share the same characteristics with the study subjects in order to determine its reliability, and a reliability coefficient of 0.89 was obtained using the scorer reliability formula. The data obtained were analyzed using Kendall's coefficient of concordance (W) to answer the research question, while the chi-square test of significance of Kendall's coefficient was used to test the null hypothesis. The findings revealed that scoring an item across board was the more effective pattern for scoring Economics essay tests. Based on this finding, it was recommended that scoring an item across board be adopted to improve the scorer reliability of Economics essay tests in both internal and external examinations; it is also necessary that the recommended scoring pattern be incorporated into the curriculum of teacher training institutions, since the use of this scoring pattern in schools is not popular.
Key words: Measurement, Assessment, Scoring patterns, Examinations, Essay questions, Economics.

1. Introduction
Measurement of learning outcomes in education is carried out through assessment.
Assessment is the process of gathering information about students' abilities or behaviour for the purpose of making decisions about the students (Elliot, Kratochwill, Cook and Travers, 2000). Different assessment formats are utilized by classroom teachers depending on the objective of the measurement. These assessment formats for determining students' understanding of key course topics include multiple-choice questions, true-false, fill-in-the-blanks, short answer, problem-solving exercises and essay questions. Most of the alternatives to multiple-choice and true-false questions are described in the literature as constructed-response or essay test questions, meaning that they require students to create their own answers rather than select the correct one from a list of pre-written alternatives. The essay test is one of the assessment tools utilized by classroom teachers, especially when the teacher wants the students to originate, organize, express, and integrate ideas on a given problem (Agwagah, 1997). An essay test is described as one or more essay questions administered to a group of students under standard conditions for the primary purpose of collecting evaluation data. Essay questions are usually categorized into two types, namely extended response questions and restricted response questions (Mehrens and Lehmann, 1978). Extended response questions (ERQ) allow students the freedom to determine the content and to organize the format of their answer; the students decide which facts are pertinent and how to organize, synthesize, and evaluate them. Such questions are most appropriate when the objective is to test students' writing skills, including conceptualization, organization, analysis, synthesis, and evaluation, giving the student minimum or ample choice regarding the topic. Restricted response questions (RRQ) are questions that limit both the content and the form that the student's answer may take.
Restricted response questions are the appropriate form required for testing the content. For this study, the researchers adopted restricted response questions for writing the Economics essay test. Failure to establish adequate and effective limits for the student response to an essay question allows students to set their own boundaries for their response, meaning that students might provide responses that are outside the intended task. If students' failure to answer within the intended limits of the essay question can be ascribed to poor or ineffective wording of the task, the teacher is left with unavailable and invalid information about the students' achievement of the intended learning outcome and has little or no basis for grading the student responses. Essay tests have the following objectives (Cashin, 1987). They can be used to test learning outcomes not measurable by other means. They can test thought processes: the students' ability to select, organize, and evaluate ideas, and their ability to apply, integrate, think critically and solve problems. They require that students use their own writing skills; the students must select the words, compose the sentences and paragraphs, organize the sequence of exposition, and decide upon correct grammar and spelling.


They pose a more realistic task than multiple-choice and other "objective" items: most of life's questions and problems do not come in a multiple-choice format, and almost every occupation requires people to communicate in sentences and paragraphs, if not in writing. They cannot be answered correctly by simply recognizing the correct answer; it is not possible to guess. They can be constructed relatively quickly. These objectives indicate that educators choose essay questions over other forms of assessment because essay items challenge students to create a response rather than simply select one. Some educators use them because essays have the potential to reveal students' abilities to reason, create, analyze, synthesize, and evaluate. Hence, essay tests can be used to assess higher-order or critical thinking skills, evaluate students' thinking and reasoning skills, provide authentic experience, and test thought processes (Reiner, Bothell, Sudweeks, and Wood, 2002), and they take relatively little time to construct and minimize guessing. However, despite these advantages of the essay test, there is much subjectivity in its scoring, because students organize their responses to questions in different ways; allocating marks for essay-type or free-response questions can therefore be very unreliable (Baird, Greatorex, & Bell, 2004), and the essay test is said to have low scorer reliability (Onunkwo, 2002). Scorer reliability, also called inter-rater reliability, inter-rater agreement or concordance, is the degree of agreement among the raters (Gwet, 2010). Scorer reliability is the degree of correspondence or agreement among the scores given to students by different examiners (Abonyi, 2011). It gives a score of how much homogeneity or consensus there is in the ratings given by judges or raters. It is useful in refining the tools given to individual judges and for determining whether a particular scale is appropriate for measuring a particular variable.
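As an illustration of the inter-rater agreement idea described above (not code from the study), the two simplest agreement statistics, percent agreement and Cohen's kappa, can be sketched in Python. The scores below are hypothetical marks from two scorers:

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of scripts to which two scorers gave identical marks."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Chance-corrected agreement between two scorers."""
    n = len(a)
    po = percent_agreement(a, b)                      # observed agreement
    ca, cb = Counter(a), Counter(b)
    # expected agreement if the two scorers marked independently
    pe = sum((ca[k] / n) * (cb[k] / n) for k in set(a) | set(b))
    return (po - pe) / (1 - pe)

# Hypothetical marks awarded by two scorers to six scripts
s1 = [3, 2, 4, 2, 3, 4]
s2 = [3, 2, 4, 3, 3, 4]
print(percent_agreement(s1, s2))  # ~0.833
print(cohens_kappa(s1, s2))       # 0.75
```

Kappa is lower than raw agreement because it discounts the matches two scorers would produce by chance alone.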
It is common to observe variations among the scores received by different individuals. Some of this variation is due to actual differences in the characteristic being measured; however, there may be other factors that contribute to the observed variation. These other factors are sources of error that prevent an accurate assessment of the object of measure. Reliability is an expression of the proportion of the variation among scores that is due to the object of measure. As variation due to error goes to zero, the reliability of an assessment goes to 1. Factors that may serve as sources of error in an essay test include: variations in the students' writing proficiency; variations in the content knowledge (i.e. domain expertise) of those evaluating the essay test; and the consistency with which the essay tests are evaluated. This third factor depends in large part on the selection of the method or pattern by which essay tests are scored. The effect of the selection of a scoring pattern on scorer reliability is the primary concern of this study. The first two factors, subjects' writing proficiency and raters' domain expertise, were assumed to contribute little to the variation of essay test scores. Scoring patterns, according to Ebuoh and Okafor (2011), are the various methods that are employed by scorers to obtain the quantitative performance of students. Some of the scoring patterns reported by Ebuoh and Okafor (2011) are as follows: scoring all the items at the same time; scoring an item across board; ranking of all scripts before scoring all items; and division of the task of scoring into sections. Scoring all the items at the same time presupposes that the teacher scores all the questions for one student before picking another script. If in an Economics test, for instance, the students were requested to answer five questions, the teacher has to mark all five questions for one student before moving to another script.
This does not facilitate fast marking, and yet it is the popular marking style that most teachers adopt (Onunkwo, 2002). Scoring an item across the board is the pattern in which the teacher scores one question for all the students completely before moving to another question. If, for example, the teacher starts with question number one (1), the teacher has to finish marking this question for every student who took the examination before picking up another question. This makes scoring faster, since it enables the teacher to concentrate on one question and its marking scheme at a time. By the time the teacher has scored about twenty (20) students' responses to a particular question, the teacher must have become very familiar with all the points in the marking scheme and their associated marks regarding the question. Ranking all scripts before scoring all items is another pattern of scoring essay tests, in which the answers are not divided into points to which marks are awarded. There is just one standard answer to each question. The teacher (scorer) reads each student's response to a question, compares it with the standard answer, and awards a grade or mark he considers appropriate based on his pre-set criteria. The scorer may use numerical grades (e.g. 80%), letter grades (e.g. A, B, C) or comments (e.g. above average, superior quality, poor work, excellent work, below average, etc.). Division of the task of scoring into sections is also one of the scoring patterns employed, especially where large numbers of papers are to be scored. In this pattern, one scorer specializes in scoring a section (a part or number) of the


test. The scorer scores his section and passes the script on to the next scorer. Evidence abounds in support of this method; for example, Lovegrove (1984) advised that in an examination in which crucial decisions may be taken, two or more scorers should be allowed to score a section of the script independently. An attempt to compare these patterns and determine the one that will yield higher scorer reliability in scoring Economics essay tests is therefore the concern of this study, since little effort has been made to compare these scoring patterns and determine which one enhances higher scorer reliability when employed in scoring essay tests in Economics. The reliability of an essay test depends on how well it is scored (Okpala, 2003). This shows that there is a problem with scoring essay tests even when the same marking scheme is used, and this can affect students' academic achievement positively or negatively, in the sense that student responses can be over-scored or under-scored without reflecting the actual performance of the students being assessed. Though there may be other factors that affect the scorer reliability of essay tests, Mehrens and Lehmann (1978) observed that a carefully planned, constructed and administered essay test can be ruined by an improper scoring pattern and standards. This calls for the need for a proper scoring pattern in scoring essay tests, and Economics is no exception.

1.1 Purpose of the Study
The main purpose of this study is to investigate the effect of different scoring patterns on scorer reliability in Economics essay tests.

1.2 Research Question
The research question posed to guide this study is: What is the effect of scoring patterns on scorer reliability in Economics essay tests?

1.3 Research Hypothesis
The null hypothesis (H0) formulated and tested at the 0.05 level of significance for this study is: There is no significant difference in the correlation coefficients of scorers who scored Economics essay tests using different scoring patterns.

2. Methodology
2.1 Design
Under this design, the scorers in each condition (scoring all the items at the same time; scoring an item across board; division of the task of scoring into sections; ranking of the scripts before scoring) scored the full set of 120 written responses of Senior Secondary two Economics students. This resulted in a completely crossed design in which every scorer scored every paper, with a total of 20 raters equally distributed among the four scoring conditions. Each prompt was scored for both content and conventions. The final score was obtained by combining the content and convention scores so that the content score was given twice the weight of the convention score. The study design is shown in Figure 1.

Figure 1. Study design for scoring: each of the 120 scripts (rows) was scored by every scorer (columns) in each of the four scoring conditions (scoring all the items at the same time; scoring an item across board; ranking of all scripts before scoring all items; division of the task of scoring into sections).

We further examined the data from this study to better understand the trends in the findings, following Kreiman (2007). In our analysis we were interested in the differences between the ratings of the scorers in each condition; specifically, we calculated the mean difference (bias), the standard deviation of the differences, and the root mean square error, using the following:

Bias = (1/Np) * Σ (i = 1 to Np) Di

where Di = rater score for paper i − researcher score for paper i, and Np = total number of papers.

SDdiff = sqrt[ (1/Np) * Σ (i = 1 to Np) (Di − Bias)² ]

Root mean square error: RMSE = sqrt( Bias² + SDdiff² )
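The three error statistics defined above can be sketched in Python; this is an illustrative implementation, not the authors' code, and the scores below are hypothetical:

```python
import math

def error_stats(rater, researcher):
    """Bias, SD of the differences, and RMSE between one rater's scores
    and the researcher's consensus scores over Np papers."""
    d = [r - s for r, s in zip(rater, researcher)]   # Di for each paper
    n_p = len(d)
    bias = sum(d) / n_p
    sd_diff = math.sqrt(sum((x - bias) ** 2 for x in d) / n_p)
    rmse = math.sqrt(bias ** 2 + sd_diff ** 2)
    return bias, sd_diff, rmse

# Hypothetical marks for five papers
rater      = [12, 15, 9, 14, 10]
researcher = [13, 15, 10, 15, 10]
bias, sd, rmse = error_stats(rater, researcher)   # bias = -0.6
```

Note that with the population standard deviation used here, RMSE² = Bias² + SDdiff² is an exact identity: RMSE equals the root of the mean squared difference.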

2.2 Study Area
The study was conducted in Aguata Education Zone of Anambra State, Nigeria. Aguata Education Zone is made up of three (3) Local Government Areas, namely Aguata, Orumba South, and Orumba North, with 47 public secondary schools. The sample for this study consisted of all 20 Economics teachers and 120 SSII Economics students. All 20 Economics teachers in the public secondary schools in Aguata Education Zone were used for this study because the population is manageable. On the other hand, two secondary schools, which together had 120 SSII Economics students, were purposively selected. The instrument used for data collection was the Economics Essay Test (EET) developed by the researchers. The EET was based on the following Economics topics: demand and supply, the concept of money, agriculture, distributive trade, and production, drawn from the SS1 and SS2 syllabuses. The test was developed by preparing a table of specification based on the six levels of the cognitive domain of Bloom's taxonomy. The test contains five items (questions) with sub-questions in each item, and each item carries an equal mark of twenty (20), totalling one hundred (100) for the whole test. The researchers also developed a scoring guide for the Economics essay test. The instrument was face-validated by four experts in Economics. These experts were required to examine the items with respect to: whether the items constructed correspond to the table of specification; the structure and clarity of the items; whether the answers to the questions correspond or tally with those provided in the scoring guide; and whether the language used in constructing the questions is suitable for SSII Economics students. The comments and recommendations of these experts served as a guide to the modification of items in the EET. For content validity, a well-constructed and validated table of specification was used in constructing the EET items.
The instrument was trial-tested using fifty (50) SSII Economics students who share the same characteristics with the study subjects to obtain its internal consistency, which was found to be 0.89. To control extraneous variables in this study, the following measures were adopted:

2.3 Training of teachers (scorers)
There was a training programme for all the teachers (scorers) involved in the scoring. Scorers were recruited based on requirements developed; depending on the assessment, specific educational and experience requirements were met (i.e. degree, experience as a classroom teacher). Scorers were trained using comprehensive training materials developed by the researchers. During this period, the validated instrument and scoring guide were discussed. Teachers pay more attention to the processes and procedures utilized for constructed-response item scoring. Although specific constructed-response procedures vary slightly based on the teachers' conventions and requirements, certain components are universally addressed, including pattern development, range finding, scorer selection procedures, scorer training and qualification, and scorer monitoring. Pattern development for constructed-response items is carried out at the beginning of a programme. Depending upon the content and type of pattern, either a single rule may be developed and applied to all items, or item-specific rules may be developed; in some cases, a holistic rule may be developed for a content area with item-specific rules for each item. Once constructed-response items are developed and field-tested, range finding is carried out. Range finding is the process used to determine how to apply the rule to students' papers, and therefore to determine the standards that are used to score constructed essay response items. The process may also define both a range of response types or performance levels within a score point on the rule and the threshold between score points.
This implies that the researchers determine where one score point ends and another begins. The range-finding process defines the papers that are characteristic of the various score points represented by the rule. The researchers look at a pool of responses which cover the range of score points for a particular item and, through scoring and discussion, come to an agreed score on each response. The researchers made notes from the discussion and used these notes for interpreting the responses used to train scorers. The researchers ensured that the scorers were scoring based on the standards set when responses with consensus scores were used as anchor and training papers. For instance, some samples of constructed responses were scored twice, providing the basis for a number of statistics related to inter-rater reliability (for instance, perfect agreement, perfect plus adjacent


agreement, Spearman correlation, kappa statistics, etc.). In addition, papers previously scored by the researchers were distributed to scorers and formed the basis for validity indices that are similar to the inter-rater reliability statistics. Validity papers were chosen specifically because of certain features that can test, for instance, whether scorers are consistently applying the consensus-based logic to borderline papers. Scorers were monitored in a variety of ways. For instance, scorers were monitored using back-reading, whereby a research leader rescored papers from a certain scorer or scorers performing at a marginal level of reliability. Through this process, the research leader provided specific feedback or additional training in real time. There was also trial scoring of dummies by the scorers during the training exercise. For data collection, the instrument (EET) was administered to the 120 SSII Economics students by the researchers to generate the scripts, which were distributed to the Economics teachers (scorers), who scored the scripts using the different scoring patterns after having been randomly assigned to them. After the scoring of the scripts, the researchers personally collected them for recording and analysis. The data collected were analyzed using Kendall's coefficient of concordance (W) to answer the research question, while the chi-square test of significance of Kendall's coefficient was used to test the null hypothesis. Trends in the findings were also analyzed by determining the bias, standard deviation of differences and root mean square error.

3. Results
Research Question 1: What is the effect of scoring patterns on scorer reliability in Economics essay tests?

Table 1. Summary of Kendall's coefficient of concordance (W) for the four scoring patterns

Scoring pattern                                    No. of scorers    Kendall's coefficient (W)
Scoring all the items at the same time             20                .23
Scoring an item across board                       20                .78
Ranking of all scripts before scoring all items    20                .65
Division of the task of scoring into sections      20                .71

Table 1 shows that scoring an item across board had a positive relationship with a high coefficient of .78, while division of the task of scoring also had a positive relationship with a coefficient of .71. Ranking before scoring recorded a coefficient of .65, while the scoring-all-the-items-at-the-same-time pattern had a low coefficient of .23. These coefficients indicate the level of agreement among the raters in each pattern.

Hypothesis: There is no significant difference in the correlation coefficients of scorers who scored Economics essay tests using the four different scoring patterns.

Table 2. Test of significance of Kendall's coefficient for the scores awarded by scorers in the four different scoring patterns

Scoring pattern                                    N     W      df    χ²       P
Scoring all the items at the same time             20    .23    19    30.14    .15
Scoring an item across board                       20    .78    19    30.14    .00
Ranking of all scripts before scoring all items    20    .65    19    30.14    .00
Division of the task of scoring into sections      20    .71    19    30.14    .00

Table 2 shows that the correlation coefficients for scorers who scored the Economics essay test using scoring an item across board (SAIAB), dividing the task of scoring into sections (DSISP), and ranking all scripts before scoring all items (RASBSAI) were significant, because the exact probability value P is less than .05 in each case. For scoring all the items at the same time, in contrast, the exact probability value P is greater than .05. This shows that the use of SAIAB, DSISP and RASBSAI had a significant effect on scorer reliability in scoring the Economics essay test. Given these findings, any apparent pattern may be due simply to chance; this is further analyzed in Table 3, which summarizes the means of the bias, standard deviation of differences and RMSE values across all scorers in each condition.

Table 3. Mean bias, standard deviation of differences and RMSE across all scorers in each condition

S/N    Content score bias    SDdiff    RMSE
1      -.17                  .70       .74
2      -.25                  .67       .71
3      -.23                  .66       .70
4      -.23                  .68       .72
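Kendall's coefficient of concordance and its chi-square test, as used in the analysis above, can be sketched in Python. This is an illustrative implementation (not the study's code) for the tie-free case, with hypothetical rankings:

```python
def kendalls_w(ranks):
    """Kendall's W for m raters ranking n papers (no ties).
    ranks[j][i] is rater j's rank for paper i."""
    m, n = len(ranks), len(ranks[0])
    totals = [sum(r[i] for r in ranks) for i in range(n)]  # rank sum Ri per paper
    mean_r = sum(totals) / n
    s = sum((t - mean_r) ** 2 for t in totals)             # spread of rank sums
    return 12 * s / (m ** 2 * (n ** 3 - n))

def chi_square_for_w(w, m, n):
    """Test statistic for H0 'no agreement among raters'; df = n - 1."""
    return m * (n - 1) * w

# Hypothetical ranks: 3 raters ranking 4 papers
ranks = [[1, 2, 3, 4],
         [1, 3, 2, 4],
         [2, 1, 3, 4]]
w = kendalls_w(ranks)               # ~0.778: strong agreement
chi2 = chi_square_for_w(w, 3, 4)    # compare against the critical chi-square at df = 3
```

W ranges from 0 (no agreement) to 1 (perfect agreement); the computed chi-square is compared with the critical value at df = n − 1 to judge significance, as in Table 2.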



The scoring-an-item-across-the-board condition had the lowest overall mean scorer bias compared with the other three conditions across all scores. However, the scoring-all-the-items-at-the-same-time condition had the highest mean standard deviation of differences and the highest overall mean RMSE. This provides some additional detail about the results of the study. The distribution was artificial and represented a much higher mean than the national distribution of scores in the full sample of papers.

4. Discussion
The researchers evaluated the results of the study in terms of inter-rater agreement among the scorers. The results were not wholly conclusive, in that some statistically significant differences between the scoring conditions were found: whereas there was a suggestion of a pattern in the data, the study found no consistent statistically significant differences in reliability between distributed scoring and traditional local scoring. The findings of the study show that scoring patterns have an effect on scorer reliability in Economics essay tests. The study reveals that scorers who scored the Economics essay test using the scoring-an-item-across-board pattern recorded a high scorer reliability coefficient, followed by division of the task of scoring into sections, ranking all scripts before scoring all items, and scoring all the items at the same time. The essence of reliability is to ensure error-free measurement; the relative superiority of scoring an item across board over the other patterns in enhancing scorer reliability in Economics essay tests could therefore be attributed to the fact that, by the time the teacher has scored about 20 students' responses to a question, he must have become very familiar with the points and associated marks regarding the question. This is why Onunkwo (2002) argued that scoring an item across board promotes scorer reliability. This implies that Economics teachers are more consistent when scoring an item across board than with any other pattern.
The finding was further strengthened by the chi-square test of significance of Kendall's coefficient, which revealed that the relationship was significant for scorers who scored the Economics essay test using scoring an item across board (SAIAB), dividing the task of scoring into sections (DSISP), and ranking all scripts before scoring all items (RASBSAI). The results of this study disagree with the findings of Ebuoh and Okafor (2011), who reported that ranking before scoring was the more effective pattern for scoring Biology essay tests. However, this study has shown that scoring an item across board is more reliable for scoring Economics essay tests compared to the other scoring patterns. In many studies, comparisons of students responding to essay items on paper have been confounded by potential rater effects in rating handwritten essays. When essays are handwritten, scorers may give writers the benefit of the doubt if spelling or punctuation, for instance, is not clear, and this benefit of the doubt may vary from rater to rater. Just as the style, tidiness, size or any other characteristic of student handwriting can inappropriately influence the scores assigned by raters, raters must be constantly reminded to evaluate student responses based on the author's paper and the scoring guides rather than the ease or difficulty of reading the responses. That is, with reference to Table 3, it is likely that regression-to-the-mean effects contributed to the negative bias. A second reason was that the monitoring process (e.g. validity checks, back-reading, etc.) that typically occurs in operational scoring was suspended for the study; this was done because it was thought that the monitoring process would contaminate the comparison across conditions. The patterns of scorer error are very similar across the four conditions of the study.
Breland, Lee and Muraki (2005) speculated that the most plausible explanation for such findings is that typed essays are likely to be perceived by scorers as final drafts, so expectations are slightly higher and errors in grammar or spelling are judged more harshly than they are in handwritten essays.

5. Conclusion
On the basis of the findings of this study, scoring an item across board was found to be more effective in enhancing the scorer reliability of Economics essay tests, and there was no significant difference in the correlation coefficients of scorers who scored Economics essay tests using the scoring-all-the-items-at-the-same-time pattern. Hence, since scoring an item across board was found more effective in improving the scorer reliability of Economics essay tests, it is recommended for use in scoring Economics essay tests in both internal and external examinations, and it should be incorporated into the curriculum of teacher training institutions.

References
Abonyi, S. O. (2011). Instrumentation in Behavioural Research: A Practical Approach. Enugu: Timex Publisher.
Agwagah, U.N.V. (1997). Types of test: essay and objectives. In S.A. Ezeudu, U.N.V. Agwagah & C.N. Agbaegbu (eds.), Educational Measurement and Evaluation for Colleges and Universities. Onitsha: Cape Publishers International Limited.
Baird, J.A., Greatorex, J., & Bell, J.F. (2004). What makes marking reliable? Experiments with UK examinations. Assessment in Education, 11(3), 331-348.



Breland, H., Lee, Y.E. & Muraki, E. (2005). Comparability of TOEFL CBT essay prompts: response mode analysis. Educational and Psychological Measurement, 65(4), 577-595.
Cashin, W.E. (1987). Improving essay tests. IDEA Paper No. 17. Center for Faculty Evaluation and Development, Kansas State University.
Ebuoh, C.N. & Okafor, G.A. (2011). Effects of scoring by session, ranking and conventional patterns on scorer reliability in Biology essay tests. International Journal of Educational Research, 11(1), 242-252.
Elliot, S.N., Kratochwill, T.R., Cook, J.L., & Travers, J.F. (2000). Educational Psychology: Effective Teaching, Effective Learning (3rd ed.). Boston: McGraw Hill.
Gronlund, N.E. (1976). Measurement and Evaluation in Teaching (3rd ed.). New York: Macmillan Publishing Co., Inc.
Gwet, K.L. (2010). Handbook of Inter-Rater Reliability (2nd ed.). http://www.agreestat.com/book_excerpts.html, retrieved 20th September, 2012.
Kreiman, C. (2007). Investigating the effects of training and rater variables on reliability measures: A comparison of stand-up local scoring, online distributed scoring, and online local scoring. Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.
Lovegrove, M.N. (1984). Evaluating the results of learning. In B. O. Ukeje (ed.), Foundations of Education. Benin City: Ethiope Publishing Corporation.
Mehrens, W.A. & Lehmann, I.J. (1978). Measurement and Evaluation in Education and Psychology (2nd ed.). New York: Holt, Rinehart and Winston Inc.
Nworgu, B.G. (2006). Introduction to Educational Measurement and Evaluation: Theory and Practice (2nd ed.). Nsukka: Hallman Publisher.
Okpala, D. (2003). Standardization, test administration and scoring. In B. G. Nworgu (ed.), Educational Measurement and Evaluation: Theory and Practice (pp. 136-142). Nsukka: University Trust Publisher.
Onunkwo, G.I.N. (2002). Fundamentals of Educational Measurement and Evaluation. Owerri: Cape Publishers Int'l Ltd.
The West African Examinations Council (2011). Vetting sheet on Economics.


