Assessment in Education: Principles, Policy & Practice Vol. 15, No. 1, March 2008, 57–71

Teachers, schools and using evidence: Considerations of preparedness


Judy M. Parr* and Helen S. Timperley
Faculty of Education, University of Auckland, Auckland, New Zealand

Despite assessment being viewed as integral to practice, there are questions about schools' preparedness to engage in this process. Data from three studies conducted in New Zealand primary schools explore whether use of student achievement data is part of the professional canon or skill set. (1) When implementing new literacy materials, evidence of need did not necessarily inform choice, nor was achievement data used to make decisions about the effectiveness of the materials or their use. (2) When a classroom initiative designed to achieve literacy goals was implemented, few schools collected evidence adequate for its evaluation. In both cases, practitioners appeared to hold a theory about acceptable evidence at variance with current policy expectations. Further, they may have lacked the necessary skills. (3) Teachers, with practice, learned to interpret data accurately but this skill did not relate to student progress. A high level of pedagogical content knowledge may be needed to relate achievement information to teaching practice.

Introduction
The key to better learning for students is better teaching (Darling-Hammond 2000). Effective teaching is underpinned by evidence-informed and well-articulated knowledge about the content of what one is teaching, about how to teach and about one's students. Effective practice is not something absolute but, rather, is achieved by knowledgeable, committed teachers who tailor and adapt their practices to the ongoing needs of their learners in order to achieve outcomes of a high standard across heterogeneous groups of students (Alton-Lee 2003). Knowledge of the learner involves identifying patterns of strengths and weaknesses; looking backward at what has been done, to assess the effectiveness of instruction in terms of rate and extent of progress, and looking forward to work out what to teach next (Timperley and Parr 2004). This knowledge comes from ongoing assessment to inform and guide instruction (Crooks 1993; Tunstall and Gipps 1996; Black and Wiliam 1998; Torrance and Pryor 1998), allowing better or more accurate decisions to be made (Stoll et al. 2003).

Closely analysed evidence about the learning of students allows deliberate adjustments to a classroom teaching programme in order to meet the needs of students better. Knowing a student's present level allows a teacher to work on learning goals just a little ahead of independent performance, in the region of sensitivity where the skills and knowledge are perhaps in embryonic form, to build on what the learner is just starting to do (Vygotsky 1978). The teacher has to design graduated support so that the learning goal is attainable and so that there is a match among the task requirements, the skill of the learner and the support provided. Effective practitioners consider resources or programmes in relation to the needs of their students and, using evidence of achievement, test whether these work in their context. A careful matching of task and instruction to student competence levels was a feature of the outstanding first-grade teachers in Pressley et al.'s studies (2001).
*Corresponding author. Email: jm.parr@auckland.ac.nz
ISSN 0969-594X print/ISSN 1465-329X online © 2008 Taylor & Francis DOI: 10.1080/09695940701876151 http://www.informaworld.com


That the evidence from assessment will be used to inform decisions at the school and classroom level, to monitor the outcome of changes in practice, and that such actions will lead to positive change are underlying assumptions of accountability policies (Ingram et al. 2004). While it is suggested that accountability policies, with their associated central assessments, may have little influence on teachers' pedagogical practice (Firestone et al. 1999), there is evidence that collectively looking at student output in ways that are systematic, structured and in-depth assists teachers to consider their practice and modify their programme, and that this has benefits for teacher and student learning, including the raising of student achievement (Newman et al. 1997; Symes and Timperley 2003; Warren Little et al. 2003).

Despite the current notion of assessment as integral to school and class programmes, there are questions about the preparedness of schools and teachers to engage in the process (Earl and Katz 2002). Preparedness can be viewed in two senses. First, routine use of evidence, particularly of student achievement data, in relation to decision-making may not be a valued part of the professional canon. As Warren Little et al. (2003) note, introducing practices like the collective consideration of student work into the ongoing work of schools is a venture into difficult terrain, involving transforming longstanding traditions of privacy and non-interference. There is a complex culture surrounding evidence and decision-making, shown in a study of schools that were purportedly leading practitioners of Continuous Improvement practices, where data-based decision-making is a core focus (Ingram et al. 2004). While overall around 40% of the remarks from participants in interviews indicated a use of systematic data for decision-making, a similar proportion reflected the use of anecdotal information, experience or intuition to make decisions. Notably, both teacher and school effectiveness were judged by some indicator other than student achievement data (Ingram et al. 2004). A similar finding with respect to teacher effectiveness is also reported by Sinnema, where a consideration of student learning outcomes, not limited to standardised achievement measures, did not feature in teacher performance appraisals (Sinnema 2005; Robinson and Sinnema 2006).

At a fundamental level, schools and their teachers may hold a different theory about the form that evidence of effectiveness might take or how it should be used (Timperley and Parr 2005). While evidence that informs decisions in the classroom, particularly the moment-by-moment instructional decisions, is drawn from a variety of sources, notably the teacher's 'log in the head', the latter evidence is seldom recorded and available for systematic examination and reflection. Shifts in beliefs about the value of systematically collected evidence, and in professional norms and discourse around the consideration of such evidence, may be required (Annan et al. 2003).

There is, however, a further dimension to preparedness other than the issue of the prevailing professional culture. This is the largely unexamined possibility of whether teachers and their leaders possess the requisite knowledge and skills to work with systematically collected data on student performance. To interpret data from diagnostic measures, and from measures that have associated normative data, requires considerable knowledge and skill.
To utilise the information for teaching and learning is likely to require a high level of content knowledge about the subject, and knowledge of the subject from the point of view of teaching it to others (Parr et al. 2006). For management in schools, using the information for planning or goal-setting, or using the normative information to consider progress against derived expected rates of progress or against their stated goals, is not straightforward. The notion of considering school effectiveness through the lens of student outcomes data raises a host of issues, including how to view the extent to which any progress is attributable to the school's efforts (e.g. Gray 2004). As Saunders (2000) intimates, there are challenging technical, conceptual and, indeed, ethical issues. Further, it has been noted, more generally, that schools and the individuals in them may lack the knowledge, skills and personnel to implement any change agenda (McLaughlin 1990; Fullan 1991).


Three different research projects conducted in New Zealand provide data that illustrate our progressive and cumulative understanding of dimensions of the question concerning the preparedness of schools and teachers to engage in evidence-informed decision-making, in particular the use of evidence of student learning. The data were, in the case of each study, only one aspect of the larger study.

In New Zealand, there is no mandated requirement to provide data on student achievement to the Ministry of Education, but schools, under the National Assessment Guidelines, have to report annually against their self-chosen goals or targets with respect to student achievement. While students in New Zealand are not subjected to national testing until Year 11, teachers have access to numerous assessment tools that can be used both diagnostically and to view progress against national normative data. Some measures, like the Assessment Tools for Teaching and Learning (asTTle) (Ministry of Education 2001), are designed as curriculum-referenced diagnostic tools for Years 4 to 12, but they also have associated normative data to allow schools to make any of a number of comparisons should they so choose (e.g. by gender, with 'schools like us', by ethnic group, by year level, etc.). Tools like the Six Year Observation Survey (Clay 2006) function as diagnostic instruments to identify readers at risk. Other assessment tools, like the Progressive Achievement Tests and the Supplementary Test of Reading Ability (STAR) (New Zealand Council for Educational Research and Elley 2001), are standardised tests commonly administered and often used by schools to consider patterns of achievement over time. However, an instrument like STAR is intended to help identify readers at risk and its sub-tests function diagnostically. Thus the majority of measures of student achievement in New Zealand are designed for formative and diagnostic learning purposes. However, a number of these measures have associated normative data.

The data from the studies presented allow an investigation of normative practice; of the value placed on evidence-informed decision-making; of knowledge of the principles associated with the use of evidence; and of the extent to which teachers and management possess the requisite skills to interpret and use the student achievement data generated from such measures. The first study considers the process of decision-making viewed in the everyday context of the classroom. The data are from a research project that documented the implementation of commercially available literacy packages into junior classes in primary schools (Parr et al. 2003). In New Zealand, schools have complete autonomy in the selection of resource materials. Schools were tracked over two years as they selected, implemented and made choices about the ongoing use of the literacy materials. These data inform a research question concerning the nature of evidence that is collected to monitor the success of new materials. The second study also provides data to address this question but in a different context, one in which there was an expectation that there would be monitoring of progress towards goals. As part of a professional development in leadership project, participating schools set specific literacy goals and then, in an action research type cycle, designed and implemented a classroom initiative and evaluated progress towards these goals (Timperley et al. 2003).
In addition, data from this study address the research question of the extent to which schools, in particular their leadership, understand the principles operating when evidence is used to inform decisions about teaching and learning. Finally, the third study investigates preparedness in terms of capacity. Gathered as part of a much larger study of elements of a professional learning project that contribute to enhanced student progress and achievement in literacy (Parr et al. 2006), these data allow an examination of the nature and extent of the knowledge and skills teachers and their leaders have regarding interpretation and application of student achievement data. In addition, the data allow an exploration of the relationship between facility in data interpretation and use and student achievement in terms of progress. The method and findings are presented for each study individually; then the picture obtained from all three is considered for the purposes of the discussion.


Study 1: Use of evidence in decisions regarding selecting and implementing literacy resources
Method
Schools (N = 67) which met specific criteria in terms of location and need (the areas from which their students came were in the lowest third socio-economically) participated in this research. They were invited to select a literacy resource, that is, a package of published, commercially available materials, to help address the literacy learning needs of their students. The choice was from those most commonly requested in submissions by schools for special funding, like, for example, Jolly Phonics or Rainbow Reading (a tape-assisted reading resource). Schools were supplied with the materials and the research tracked the process of implementation and use, from decisions about selection to decisions about modification, continuation or up-scaling of use. The data presented here were obtained from all schools through written questionnaires. However, case study visits were made at various points to a purposive sample of schools (a total of 55 visits to 39 of the participating schools) and these data informed subsequent questionnaire construction and interpretation of responses.

As the packages arrived, schools completed a comprehensive two-part questionnaire, the first part by each teacher who would be using the materials (N = 143), while the second was a management questionnaire to be completed by the teacher responsible for oversight of the project in each school. Relevant to the focus of this paper were questions about the perceived needs of groups of students and how teachers found out about these needs. For the remaining three questionnaires, only one per school was completed. Each was sent to the school liaison person, often a designated literacy leader, who was asked to consult all of the teachers and any teacher aides associated with the materials, although there is no certainty that other views were incorporated in the responses. Response rates were uniformly high (95–98%). In these questionnaires, schools were asked, amongst other things, to report on the influence that the package had on learning outcomes and the sources of evidence that teachers used to determine the success of the materials on learning outcomes. Part-way through the project, schools were asked about any modifications to the way they were implementing the materials and, again, towards the end of the two years, schools were asked about methods of informing decisions about continuation and whether to alter the scale of use.

For each questionnaire, a trained and experienced primary school teacher was employed to code the open-ended responses. A random sample of approximately 15% of the returned questionnaires was read and a series of coding categories derived from the responses to each question. The coder, principal researcher and project manager met to review and amend these categories. Once the coding system was established, each question or subsection was systematically analysed and recorded using the coding categories. A coding reliability test was conducted on 10% of the questionnaires. There was a high degree of agreement between two coders, averaging 86% over five questionnaires (range from 79% to 95%).
Findings
Responses to the questionnaires provided a significant amount of data related to the routine use of evidence in decision-making about choice of the readymade resource package and about continuing, discontinuing or modifying use. We wanted to know, specifically, whether information about student learning and achievement was considered in making judgements about the resource. Through questions intended to relate reasons for choosing the package to the reported needs of the students, the type of evidence teachers reported using in general to find out about literacy needs was established. Table 1 shows the sources nominated by 30% or more of the teachers who were to use the materials. About three-quarters claimed that they found out about their students' needs through administering running records. The next most commonly nominated source was anecdotal records, which includes conferencing. This means that the 'log in the head', that is, a teacher inventory compiled from a variety of sources of observation and interaction, is also a sizeable category, mentioned by 54% of teachers. Nearly half mentioned using a basic word test or named the BURT word test; a third nominated writing samples and 30% the Six Year Observation Survey (Clay 2006). Clearly, teacher-controlled classroom sources predominated when finding out about the needs of students. Other than the BURT or the Observation Survey, no formal tests were named although, in junior classrooms, this is not unusual.

Table 1. Major sources teachers use to ascertain student literacy needs.

Source of evidence                             N replies   % teachers
Running records                                   112         76.7
Anecdotal records (including conferencing)         79         54.1
Basic word test or BURT                            70         48.0
Writing samples/sample analysis                    52         35.6
Observation Survey (Six Year Net)                  45         30.8

There was a different response, however, when the enquiry switched from the general finding out about needs to the influence of the package on learning outcomes and the basis for judgement. After nearly a year, schools were asked to rate the influence of the materials on learning outcomes for the students using them, on a 1 to 5 scale (from 'not much' to 'a lot'). Then the schools were asked, in an open question, what source or sources of evidence they had used to measure this influence, and to rate how important each source of evidence they had nominated was in helping them to decide on the overall rating they had given. Table 2 lists the sources mentioned by more than 10% of schools, how many schools nominated each source, and the ratings each source received in terms of its importance (1 = not important, 5 = very important). Considerable weight is given to teacher observation and teacher-taken running records. Teacher reports or comments, student motivational factors and students' alphabet or sound knowledge were also frequently used to inform decisions. While a number of schools reported using the Observation Survey (Six Year Net), they were not in agreement about its usefulness. For those schools that had in place some form of tracking system, this was reportedly useful.
Table 2. Number of schools nominating source of evidence by rating of weight in influence on student learning outcomes.

                                                              Rating
Source of evidence                                            1 (not much)    2     3     4    5 (a lot)
Observation                                                         2               7    10       20
Running records                                                     1               6     8       18
Teacher reports/comments                                            1               5    15        8
Alphabet/sound knowledge                                            2               3    11        7
Eagerness of students and their motivation                          1               7     7        7
Observation Survey (Six Year Net)                                   5               3     3        6
School-based tracking/progress (including entry/exit test)          1                     6        6
The basis for decisions about whether to continue, discontinue or modify the package, namely how they were reached and what they were based on, provides another lens through which to examine the use of evidence. Schools were asked whether they had decided to continue, discontinue or modify package use and then to state the basis for reaching the decision. Categories derived from the responses to this question show it as being primarily by consultation of staff (42% of schools). More significantly, only about 8% of schools said they came to these decisions by looking at students' results or by tracking students' progress. Again, this was followed by a question that asked for a weight to be allocated to each source in terms of its influence on such decisions, using a scale from 0 (none at all) to 5 (a lot). Table 3 shows the percentage of schools nominating each source of evidence and how many allocated each weighting in decisions to continue, discontinue or modify the materials. The highest mean weightings for sources of evidence were for teacher comments or feedback and observation of students using the materials (4.27 and 4.25, respectively), followed by attitude of students at 4.17. Lower again was teacher-collected data at 3.69, and the lowest weighting was given to student performance on more standardised measures at 2.95. An analysis of variance of the different sources showed that differential importance was attached to them (F(5, 332) = 9.21, p < .001). Post hoc tests (Tukey HSD) showed that data from student testing, as a source of information on which to base decisions about the package, was rated significantly lower than teacher comment and feedback, observation of students and attitude of students (p < .001).

This low weight given to student achievement from measures with normative data available mirrors the previously reported findings regarding the influence of the package on learning outcomes, where few schools reportedly drew conclusions from looking at students' results or by tracking their performance. While convergence from a number of sources of information strengthens the conclusions drawn, these data suggest that schools rely heavily on teacher opinion about resources. Where this opinion is informally and not systematically sought, it may focus on perceived proxies for student learning (like interest), proxies that may or may not relate to actual learning. To use information about student learning requires that leaders have put in place processes and systems that foster the tracking of learning, and then support collegial consideration of the learning outcomes from moves designed to enhance student achievement in literacy, like the use of new resource materials.

Table 3. Percentage of schools nominating source of evidence by rating of weight in decision-making.

                                                                Weight given to this evidence (% of schools)
Source of evidence                                              0 (none)    1      2      3      4    5 (a lot)   Not nominated
Teacher comments/feedback                                          1.5     0.0    4.6    4.6   26.2     46.2          12.3
Observation of students using the package                         1.5     1.5    1.5    4.6   40.0     40.0           6.2
Attitude of students using material                               3.1     1.5    1.5    4.6   30.8     40.0          13.8
Teacher-collected data (e.g. running records, PM Benchmark)       3.1     7.7    0.0   12.3   32.3     23.1          16.9
Performance on standardised measures of reading                  10.8    12.3    0.0    4.6   27.7     18.5          21.5
Discussion with colleagues/other schools                           7.7     1.5   10.8   13.8   16.9     20.0          24.6
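A short sketch in Python illustrates the kind of analysis reported above: a one-way analysis of variance across the six sources of evidence, followed by Tukey HSD post hoc comparisons. It is a minimal sketch only; the weightings and variable names below are ours for illustration, since the raw school-level ratings are not reproduced in the article.

# A minimal sketch of the analysis reported above: one-way ANOVA across the
# six sources of evidence, then Tukey HSD post hoc comparisons.
# The weightings below are invented for illustration; the study's raw
# school-level ratings (0-5) are not reproduced in the article.
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

weightings = {
    "teacher_comments": [5, 4, 5, 4, 5, 3, 5],
    "observation_of_students": [4, 5, 4, 4, 5, 4, 5],
    "student_attitude": [5, 4, 4, 3, 5, 4, 4],
    "teacher_collected_data": [4, 3, 4, 2, 5, 4, 3],
    "standardised_measures": [3, 2, 4, 1, 3, 5, 2],
    "colleague_discussion": [2, 4, 3, 5, 2, 3, 4],
}

# Omnibus test: do mean weightings differ across the six sources?
f_stat, p_value = f_oneway(*weightings.values())
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# Post hoc pairwise comparisons, as reported in the study (Tukey HSD).
scores = np.concatenate(list(weightings.values()))
groups = np.repeat(list(weightings.keys()), [len(v) for v in weightings.values()])
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))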

Study 2: Understanding of principles of evidence-based decision-making in evaluating a classroom initiative designed to meet literacy goals
Method


Thirty schools, a sample of those participating in a national professional development project concerning leadership in literacy, were selected as case studies. Facilitators working in the schools nominated, where possible, a school in each of three categories, according to their perception of the extent to which schools had been successful in the project. The selection thus represented a range in terms of perceived success. Each school had undertaken to instigate a classroom-based project to promote the achievement of their particular literacy goal. Student achievement data were requested from each school in relation to those students whose literacy achievement was targeted in the classroom project. Whether such data were available, and the nature of these data, allowed some further inferences about evidence-based decision-making. Where these data were provided, they were categorised in terms of whether they were collected at more than one point in time, thus allowing progress to be investigated; then in terms of whether they assessed the aspect of literacy targeted; and, finally, whether they would permit an independent judgement with respect to progress.

In addition, the principals and literacy leaders in these schools (N = 53) completed a questionnaire in which there were questions designed to establish their understandings with respect to the principles of evidence-informed decision-making. A scenario describing a school's efforts to implement a project to address student need in literacy was deliberately constructed so as to fail to meet certain criteria for evidence-informed practice on three major dimensions. The first dimension related to the identification of students' needs. In this scenario, the identification of need was based on teachers' expressed concerns about reading comprehension rather than on the use of achievement data to determine the extent to which reading comprehension was a problem. This basis for identifying needs clearly made comparison with outcomes difficult to determine. The second focused on the match between needs and programme. The basis for the decision to adopt a particular programme (in this case a peer tutoring reading programme) was that the students had enjoyed the programme in a neighbouring school. (This was considered to be insufficient evidence of a programme that was likely to address comprehension difficulties.) The final dimension focused on the match between the initial need (comprehension) and the assessment used to determine progress and make the decision to continue with the programme. The scenario specifically mentioned improvement in accuracy as the basis for the decision to continue, with no mention of data on comprehension, the originally identified need.

Respondents were asked to rate how effective they thought the school in the scenario seemed to be in its approach to the three dimensions outlined above. A rating scale of 1 to 7 was used, where 1 represented 'not effective/not appropriate', 4 represented 'neither effective/appropriate nor ineffective/inappropriate', and 7 represented 'highly effective/highly appropriate'. Reasons were asked for all ratings. These reason responses were open-ended and were analysed iteratively. Coding categories were devised for each question using theoretical concepts and the responses themselves. The categories encompassed common themes that derived from the principles violated in the scenario, like recognition of the need for data and recognition of the need for the instrument to measure what was being targeted.
The categories were modified until there was a high level of agreement between coders in placing a statement in a particular category. A 25% sample of questionnaires was then coded by two people independently. The percentage agreement obtained (number of decisions on which there was agreement divided by number of agreements + disagreements) was 85% or greater for each of the three questions generating an open-ended response.
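The percentage agreement calculation described above can be expressed in a few lines of Python. This is a minimal sketch assuming each coder's decisions are recorded as parallel lists of category labels; the labels below are illustrative, not the study's actual coding categories.

# Percentage agreement as defined above: decisions on which the two coders
# agree, divided by agreements plus disagreements. The category labels used
# here are illustrative only.
def percentage_agreement(coder_a, coder_b):
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must code the same set of statements")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * agreements / len(coder_a)

coder_1 = ["need_for_data", "instrument_match", "other", "need_for_data"]
coder_2 = ["need_for_data", "instrument_match", "need_for_data", "need_for_data"]
print(f"{percentage_agreement(coder_1, coder_2):.0f}% agreement")  # 75% in this example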


Findings
An indication of the status of evidence-informed decision-making came from the request for the achievement data of targeted students. Most schools were vague about this aspect of the initiative, with many unsure about the nature of the information requested. Some principals had not been directly involved in the initiative and referred to the literacy leaders for details of the assessment information, but this was not always helpful because the literacy leaders were also unsure. After considerable probing in many cases, 19 of the 29 schools were able to provide some achievement data on the targeted student population. Few schools, however, were able to provide student achievement information on their school-based project that allowed an evaluation of progress over time. Only 9 of the 19 collected data at more than one point in time. The other 11 relied on teacher judgement to establish needs and/or assess progress, with explanations such as, 'Teachers highlighted the students they felt they could move' and 'We just knew'. Of the nine who collected data at two points in time, with the potential to judge change, the data for five schools were unable to be analysed for a variety of reasons. For example, one school selected only one child per class, although the reason for selection was not clear (their initiative was targeted at four students in each class), while others collected data that did not relate to the focus of the initiative (e.g. reading accuracy data were collected when comprehension was targeted). With the remaining four schools, there were difficulties with interpreting the data and, from the data provided, in no case could it be stated unequivocally that achievement had improved. The data supplied were problematic to interpret, with conclusions difficult to substantiate. The school with the most promising set of data in terms of its completeness supplied pre-post data (over a three-month period) for five students in eight classes, listing reading ages, percentage of words read accurately and percentage of comprehension questions answered correctly. Detailed inspection of the data, however, revealed that different criteria were used to determine reading ages in the pre- and post-test for both reading accuracy and comprehension. The attempt to obtain student achievement data to establish the effectiveness of the various interventions was unsuccessful; there was, literally, no evidence of achievement.

Knowledge of how an evidence-informed assessment of, for example, a classroom intervention should proceed was tested by asking principals and literacy leaders to rate the three problematic aspects of the hypothetical scenario related to the use of achievement data: diagnosing students' needs on the basis of teacher perceptions, rather than using data; selecting a programme that did not address the identified need; and collecting follow-up data mismatched to the identified need. For the first dimension, identifying needs through teacher perception rather than using achievement data, low and high ratings were fairly evenly split (see Table 4). Those who gave low ratings recognised the problem and rated this aspect of the scenario accordingly, showing in their reasons that they had the appropriate knowledge to make this judgement. High ratings were typically justified on the basis that the teachers did identify a need, but six of these respondents also noted that there was no achievement data to confirm this need.
This absence did not concern them sufficiently to downgrade their ratings, suggesting that although they had the knowledge and skill to recognise the issue, they did not perceive it as relevant to apply. The second dimension, the adoption of a programme that was not designed to address the identified need, received more low than high ratings (see Table 4). The main reason for low ratings was the need/programme mismatch (17 respondents) or that the teachers did not find out enough about the programme (6 respondents), again suggesting that most had the requisite knowledge of this principle. The third aspect, related to using an inappropriate assessment to identify whether the need was met, was rated more highly than the other aspects. Twenty-nine of the 30 high ratings were positive because respondents perceived that the data showed that the programme was successful. Again, five of these respondents commented on the lack of a match between identified need and outcome measure, but they were not sufficiently concerned about it to lower their ratings.

Table 4. Principals' and literacy leaders' ratings of aspects of the scenario.

Aspect of scenario                                                 No. of low ratings (1–3)   No. of neutral ratings (3.5–4.5)   No. of high ratings (5–7)
Teacher perception and discussion as basis to adopt intervention              21                            11                              20
Needs/programme match                                                         23                            11                              18
Inappropriate assessment to identify if need met                              10                            11                              30

If detecting problems with the scenario can be used as a measure of relevant knowledge of evidence-based decision-making principles, then the results indicated that principals and literacy leaders showed a mix of expertise. Although some clearly had difficulties identifying problems with the scenario, many others readily identified the problems but were differentially concerned about the importance of a data-based decision-making approach.
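As a small illustration of how the counts in Table 4 above could be derived, the sketch below bins ratings on the 1 to 7 scale into the low, neutral and high bands used in the table. The example ratings are invented, not the study's responses.

# A sketch of how the rating bands in Table 4 could be tallied: ratings of
# 1-3 are counted as low, 3.5-4.5 as neutral and 5-7 as high. The example
# ratings below are invented; they are not the study's responses.
from collections import Counter

def band(rating):
    if rating <= 3:
        return "low (1-3)"
    if rating <= 4.5:
        return "neutral (3.5-4.5)"
    return "high (5-7)"

ratings_for_dimension = [2, 6, 5, 3, 4, 7, 1, 5, 6, 2]
print(Counter(band(r) for r in ratings_for_dimension))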

Study 3: Knowledge of use of student achievement data in an evidence-informed, needs-based professional development project
Method
These data were part of the research accompanying a professional development project that was school-wide, so principals and senior management as well as classroom teachers (Years 1 to 8) participated (N = 117). The primary source of data for teacher knowledge was a questionnaire which all present completed at a staff meeting. A scenario aimed to examine the specific knowledge required to interpret and then use data related to student achievement in literacy. Similar questionnaires were administered at two points, at the beginning of the project and one year into the project. The scenario asked respondents to interpret hypothetical data from a small class of students (N = 20) from either a reading or a writing assessment. The data were similar to those that would be obtained from two assessments familiar to many New Zealand primary teachers, namely the Supplementary Test of Reading Ability and asTTle: Writing. Basically, the data contained sub-test results and total raw scores for individuals and the class. For each sub-test and the total score, the national mean and expected range and the class mean and range were given (both tools that these data resemble have normative data). Respondents were asked to interpret the data for a junior colleague. The questions concerned what the data showed with respect to a specific aspect of reading comprehension or writing; what the main points were that could be taken from these data; and what advice they would give about what the colleague should do, given these data.

Responses to the three questions were coded using the same categories. This allowed us to see the extent to which responses were accurate and specific, in that they did not simply refer in a generalised way to a trend but were written in a manner that showed the interpretation was based on evidence. A total score was calculated. For each question, accurate but non-specific answers, or answers that used information selectively, neglecting perhaps to mention major points, scored one point, while an accurate, data-referenced point scored two points (maximum score 3 points). Reliability of coding was established by calculating the percentage of codes that were the same between coders. Where one coder allocated a code to a point and the other did not, this was counted as a disagreement. Agreement ranged from 71% to 89%. For each question where the level of agreement on a sample of 20% of questionnaires fell below 80%, all responses to that question were double coded and disagreements between coders discussed and resolved.

Student achievement results in reading were from a standardised test of reading, the Supplementary Test of Reading Ability (STAR) (New Zealand Council for Educational Research and Elley 2001). In writing, data were obtained from a criterion-referenced (to the national curriculum) measure of writing (Assessment Tools for Teaching and Learning Project asTTle, Ministry of Education 2001) that has associated national normative data (Years 4–8). Student achievement progress scores were calculated for each teacher's class and expressed as an effect size gain to equate progress for the reading and writing groups (gains in raw score for the STAR test were proportional to the total score for the test version, as versions for year groupings have different maximum scores).
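The class-level progress calculation described above can be sketched as follows. The article does not give the exact formula for the effect size gain, so the sketch uses one common convention (mean gain divided by the pooled standard deviation of the two occasions) and scales STAR raw scores by the maximum for the test version, as described; all scores and names are illustrative.

# A sketch of the class-level progress calculation described above. The exact
# effect-size formula is not stated in the article; mean gain divided by the
# pooled standard deviation of the two occasions is used here as one common
# convention. STAR raw scores are scaled by the test version's maximum score,
# as described. All values below are invented for illustration.
import numpy as np

def effect_size_gain(pre, post, max_score=1.0):
    pre = np.asarray(pre, dtype=float) / max_score
    post = np.asarray(post, dtype=float) / max_score
    pooled_sd = np.sqrt((pre.std(ddof=1) ** 2 + post.std(ddof=1) ** 2) / 2)
    return (post - pre).mean() / pooled_sd

# One class's hypothetical STAR raw scores at the two time points,
# on a test version with a maximum possible score of 60.
pre_scores = [22, 31, 18, 40, 27, 35, 24, 29]
post_scores = [30, 38, 25, 46, 33, 41, 28, 37]
print(round(effect_size_gain(pre_scores, post_scores, max_score=60), 2))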

Findings
The major research question concerned the ability of teachers to give an accurate interpretation, and one that was referenced to data, with an added part to determine whether this skill changes over a period when there is a school-based focus on working with student achievement data. The scenario questions are considered separately, then overall patterns noted. Respondents were first asked what the data for a specific aspect of reading or writing showed. Around a third of respondents were able to reference their points accurately to data at Time 1 and by Time 2 this had risen to nearly two-thirds (see Figure 1). Conversely, the percentage responding with an answer that was either inaccurate or muddled fell by half between Time 1 and Time 2, and the percentage of teachers giving a generalised interpretation with no reference to data fell from 26% to 10%. A similar percentage did not respond at both points in time.

Figure 1. Specific aspect of data: percentage of teachers responding by category at Time 1 and Time 2.

Responses concerning the main points that should be taken from the data showed a similar pattern. Less than a third of the points were related to data initially but, at Time 2, this was two-thirds (indicated in Figure 2). The inaccurate or muddled responses declined more markedly (from 23% to 3%). The generalised responses also declined, but less so than in response to the previous question regarding a specific aspect of the data.

Figure 2. Main points in data: percentage of teachers responding by category at Time 1 and Time 2.

The final question asked what the young colleague should do, given these results. The interesting aspect of the responses to this question is that, at Time 1, over half of the teachers were able to give advice that was referenced to data, higher than in their replies to either of the other two questions. And, by Time 2, this level of data-referenced response had risen to 70%, the highest rate for any category in response to any of the questions (Figure 3 shows these trends graphically). This apparent facility to use data when giving advice is reinforced by the fact that only 10% of responses were judged inaccurate initially, while none was at Time 2. The level of generalised response, initially at around 20%, while lower, did not fall off as much at Time 2 as it did in responses to the other two questions. The 'no response' category was also somewhat smaller than for the other questions.

Figure 3. Advice to colleague: percentage of teachers responding by category at Time 1 and Time 2.

A total score for each teacher was obtained according to how their responses were categorised. At Time 1 the average score was 3.62 while, at Time 2, this had increased to 4.85. There was an overall significant increase in knowledge of data interpretation and use (t = 5.089, p < .01), an increase that yielded an effect size (Cohen's d) of 0.51. Each teacher's level of knowledge in data interpretation was then considered in relation to the progress of students in their class, to provide an initial indication of the relationship between skill in interpreting data and student progress. The correlation coefficient indicated that the level of competence of a teacher with respect to data interpretation, at either time point, was not significantly related to the degree of progress of students in that teacher's class.
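The Time 1 to Time 2 comparison and the follow-up correlation reported above can be illustrated with a short sketch. All values below are invented; the article reports t = 5.089, p < .01 and Cohen's d = 0.51 for the actual sample, and does not state which formulation of d was used, so a pooled-standard-deviation version is assumed here.

# A sketch of the analyses reported above: a paired comparison of teachers'
# data-interpretation scores at Time 1 and Time 2, an effect size for the
# change, and the correlation between interpretation skill and class progress.
# All data below are invented for illustration.
import numpy as np
from scipy.stats import ttest_rel, pearsonr

time1 = np.array([3, 4, 2, 5, 3, 4, 3, 2, 4, 6])
time2 = np.array([5, 4, 4, 6, 4, 5, 5, 3, 5, 6])
class_progress = np.array([0.4, 0.6, 0.3, 0.5, 0.7, 0.2, 0.5, 0.4, 0.6, 0.3])

t_stat, p_value = ttest_rel(time2, time1)

# Cohen's d computed as mean gain over the pooled SD of the two occasions;
# the article does not specify which formulation it used.
pooled_sd = np.sqrt((time1.std(ddof=1) ** 2 + time2.std(ddof=1) ** 2) / 2)
cohens_d = (time2 - time1).mean() / pooled_sd

r, r_p = pearsonr(time2, class_progress)

print(f"paired t = {t_stat:.3f}, p = {p_value:.4f}, Cohen's d = {cohens_d:.2f}")
print(f"interpretation score vs class progress: r = {r:.2f}, p = {r_p:.2f}")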

Discussion
The findings from these three studies with respect to the use of evidence in decision-making do indeed support the notion that the picture is a complex one. In the first study, while teachers report that they use a number of tools in finding out about student needs, tools that would assist the systematic gathering of evidence of student learning and achievement, they do not generally look to such evidence when making decisions about the everyday practices of the classroom, in this case about the efficaciousness of commercial materials.
In Study 2, responses suggest that, for many, there was not a well-developed understanding of the principles and processes underlying evidence-informed decision-making. But, clearly, leaders had not sought to acquire such knowledge in the course of the professional development project, with its associated supports for diagnosis, goal-setting and the subsequent classroom interventions. This is understandable when responses to some of the other questions in the questionnaire are considered. These responses suggested a theory of leadership as facilitative and collegial rather than one where there was a strong instructional focus to leadership, which would involve making judgements about the success of instruction. As in Study 1, teacher discussion, feedback and collegial processes were more highly valued. On the other hand, there was a group of school leaders who demonstrated understanding, in theory, of the principles underpinning evidence-based decision-making with respect to the effect of classroom interventions. They could, as it were, 'talk the talk' but there was, for some, a lack of 'walking the talk'. That is, either they did not choose to utilise student achievement data because they perceived it not to be relevant, or they lacked the specific skills needed to interpret and use it. Conceivably, these two moderating variables may have interacted.

In addition, the findings in Study 2 with respect to the failure to collect systematic student achievement data to evaluate the classroom intervention suggest that the explanation may be, as Study 1 highlights, that achievement data is not considered by schools to be an important source of information on which to base decisions about efficaciousness. But, as the problematic nature of the student achievement data that was collected indicates, it may be that teachers and their management leaders do not possess the level of knowledge, skill and experience necessary to collect and interpret such data. This lack of preparedness is shown in Study 3 where, initially, teachers and their leaders lacked skill in data-referenced inference. While they gained considerable knowledge and skill as a result of their participation in the professional development that focused on professional learning to meet student and teacher learning needs, and while this learning was accompanied by enhanced student achievement, this level of knowledge, in and of itself, did not relate to subsequent gains in student achievement.

This latter finding needs to be viewed in relation to other recent studies focusing on identifying student needs and honing teaching in the light of these needs. Receiving targeted, specific feedback about their students' reading ability led to significant shifts in student achievement in schools, even before the commencement of professional development intended to help teachers hone their practice to meet the needs of the students as shown in the data (McNaughton et al. 2004). Informed of the strengths and needs of their students, these teachers were seemingly able to make shifts in practice that were associated with increased student achievement. Similarly, teachers who examined student data together, worked out as a group what its implications were for teaching, and collectively took responsibility for deciding how best to help those underachieving, difficult-to-move students, had higher achieving students than schools where such a collective examination, diagnosis and problem-solving cycle did not operate (Timperley 2005). These two research studies suggest the need for the application of particular knowledge in order to use information gained from an examination of evidence.

The argument advanced here is that there are several levels on which particular beliefs and knowledge operate in relation to the use of evidence, specifically that of student achievement, to inform decisions. It is likely that there is an iterative relationship over time between changes in beliefs in the value of evidence and changes in knowledge and skills. A teacher or leader has to believe in the utility of evidence as well as have a sense of the principles that guide evidence-informed decision-making. On another level, practitioners need to have the skill to draw inferences from the data, and need to understand what it shows. There is, however, an additional and significant element of knowledge required to apply the information gained from interpreting the data to classroom instructional practice. Clearly, it may be difficult for individual teachers or some groups of teachers to take the information from a diagnostic test and be able to make teaching decisions from it if they have insufficient pedagogical content knowledge (the knowledge about the subject from the point of view of how to teach it) (Shulman 1986). The diagnostic information that students score poorly on the dimension of structure when writing for one of the various purposes for writing (say, an argument) is not particularly useful if the teacher does not know how language works at a text and local level to provide coherence and cohesion, or how particular language features support argument. Then, when they possess this content knowledge about language, they need to know how best to teach it to their students. This is the likely explanation for the lack of a direct relationship between scores on knowledge of data interpretation and student progress. Data from these same teachers show that the progress of students is significantly related to the level of pedagogical content knowledge their teachers have (Parr and Timperley 2006). Teachers with higher levels of knowledge about reading or writing and how to teach it to their students had students who made more progress.
We therefore suggest that using evidence of student achievement to make better decisions, decisions that are likely to result in enhanced achievement, involves more than preparedness in terms of valuing such evidence. It involves more than an understanding of the principles governing how evidence-informed decisions operate, or more than simply understanding what the data show. To apply this knowledge to teaching practice requires considerable knowledge of the subject from the point of view of teaching it. Up-skilling of practitioners to participate in evidence-informed decision-making with respect to practice requires professional learning on two fronts: understanding and skill in gathering and interpreting evidence, and knowledge of the content to which the data refer and how to teach this, in order to apply the information gained from evidence.


Notes on contributors
Judy Parr is associate professor in the Faculty of Education, University of Auckland, and has research and teaching interests broadly in developmental and educational psychology; these interests particularly concern literacy, with an emphasis on writing and the interface between literacy and technology. She has researched extensively in areas that impact on educational practice and published in a range of international journals. Recent articles appear in the Journal of Technology and Teacher Education, Leadership and Policy in Schools, Journal of Educational Change and Journal of Adolescent and Adult Literacy.

Helen Timperley is professor in the Faculty of Education, University of Auckland, and has extensive research experience in ways to promote professional and student learning in order that it impact positively on student outcomes. She has had numerous research articles published in international journals and has written four books on her specialty research areas. Most recently, with Associate Professor Parr, she published the book Using evidence in teaching practice (Hodder Moa Beckett, 2004).

References
Alton-Lee, A. 2003. Quality teaching for diverse students in schooling: Best evidence synthesis. Wellington, NZ: New Zealand Ministry of Education. http://www.minedu.govt.nz/index.cfm?layout=document&documentid=8646&data=1
Annan, B., M.K. Lai, and V.M.J. Robinson. 2003. Teacher talk to improve teaching practices. SET: Research Information for Teachers 1: 31–35.
Black, P., and D. Wiliam. 1998. Assessment and classroom learning. Assessment in Education 5, no. 1: 7–75.
Clay, M.M. 2006. An observation survey of early literacy achievement. Revised 2nd ed. Sydney: Heinemann.
Crooks, T. 1993. Principles to guide assessment practice. New Zealand Principal 8, no. 3: 14–16.
Darling-Hammond, L. 2000. Teacher quality and student achievement: A review of state policy evidence. Education Policy Analysis Archives 8, no. 1. http://epaa.asu.edu/epaa/v8n1/
Earl, L., and S. Katz. 2002. Leading schools in a data-rich world. In Second international handbook of educational leadership and administration, ed. K. Leithwood and P. Hallinger, 1003–1022. Dordrecht: Kluwer Academic Publishers.
Firestone, W.A., J. Fitz, and P. Broadfoot. 1999. Power, learning, and legitimation: Assessment implementation across levels in the United States and the United Kingdom. American Educational Research Journal 36, no. 4: 759–793.
Fullan, M. 1991. The new meaning of educational change. New York: Teachers College Press.
Gray, J. 2004. Frames of reference and traditions of interpretation: Some issues in the identification of underachieving schools. British Journal of Educational Studies 52, no. 3: 293–309.
Ingram, D., K.R. Seashore Louis, and R. Schroeder. 2004. Accountability policies and teacher decision-making: Barriers to the use of data to improve practice. Teachers College Record 106, no. 6: 1258–1287.
McLaughlin, M.W. 1990. The RAND change study revisited: Macro perspectives and micro realities. Educational Researcher 19, no. 9: 11–16.
McNaughton, S., M. Lai, S. Ferry, and S. MacDonald. 2004. Designing more effective teaching of comprehension in culturally and linguistically diverse classrooms in New Zealand. Australian Journal of Language and Literacy 27, no. 3: 184–197.
Ministry of Education. 2001. Assessment tools for teaching and learning: Project asTTle. Ministry of Education and the University of Auckland. www.asTTle.org.nz
Newman, F.M., M.B. King, and M. Rigdon. 1997. Accountability and school performance: Implications from restructuring schools. Harvard Educational Review 67, no. 1 (Spring): 41–69.
New Zealand Council for Educational Research and W. Elley. 2001. The supplementary test of reading. Wellington: NZCER.
Parr, J.M., and H. Timperley. 2006. A needs based approach to enhancing achievement in reading and writing. Paper presented to the Annual Meeting of the American Educational Research Association, April 7–11, in San Francisco, USA.
Parr, J.M., M. Aikman, E. Irving, and K. Glasswell. 2003. The readymade literacy materials project: An investigation of the use of package materials in junior classrooms. Wellington, New Zealand: Ministry of Education.
Parr, J.M., H.S. Timperley, P. Reddish, R. Jesson, and R. Adams. 2006. Literacy professional development project: Identifying effective teaching and professional development practices for enhanced student learning. Wellington, New Zealand: Ministry of Education.


Pressley, M., R. Allington, R. Wharton-McDonald, C.C. Block, and L.M. Morrow. 2001. Learning to read: Lessons from exemplary first grades. New York: Guilford.
Robinson, V.M.J., and C. Sinnema. 2006. The leadership of teaching and learning: Implications for teacher evaluation. Paper presented to the Annual Meeting of the American Educational Research Association, April 7–11, in San Francisco, USA.
Saunders, L. 2000. Understanding schools' use of value added data: The psychology and sociology of numbers. Research Papers in Education 15, no. 3: 241–258.
Shulman, L.S. 1986. Those who understand: Knowledge growth in teaching. Educational Researcher 15, no. 2: 4–14.
Sinnema, C. 2005. Teacher appraisal: Missed opportunities for learning. Unpublished PhD thesis, The University of Auckland.
Stoll, L., D. Fink, and L. Earl. 2003. It's about learning and it's about time. London: Routledge Falmer.
Symes, I., and H. Timperley. 2003. Using achievement information to raise student achievement. SET: Research Information for Teachers 1: 36–39.
Timperley, H. 2005. Distributed leadership: Developing theories from practice. Journal of Curriculum Studies 37, no. 6: 395–420.
Timperley, H., and J.M. Parr. 2004. Using evidence in teaching practice. Auckland, NZ: Hodder Moa Beckett.
Timperley, H., and J.M. Parr. 2005. Theory competition and the process of change. Journal of Educational Change 6, no. 3: 227–251.
Timperley, H., J. Parr, and R. Higginson. 2003. An evaluation of the literacy leadership enhancement programme. Wellington, New Zealand: Ministry of Education.
Torrance, H., and J. Pryor. 1998. Investigating formative assessment: Teaching, learning and assessment in the classroom. Buckingham: Open University Press.
Tunstall, P., and C. Gipps. 1996. Teacher feedback to young children in formative assessment: A typology. British Educational Research Journal 22, no. 4: 389–404.
Vygotsky, L.S. 1978. Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
Warren Little, J., M. Gearhart, M. Curry, and J. Kafka. 2003. Looking at student work for teacher learning, teacher community, and school reform. Phi Delta Kappan, November, 185–192.
