THREE LANGUAGE ASSESSMENT READING LOGS BY JASON BEALE, MONASH UNIVERSITY 2001

Reading Log #1: Bachman, L. F. and A. S. Palmer (1996) Language Testing in Practice (pp. 3 - 15) Oxford: Oxford University Press
___________________________________________________________________________

This chapter begins by discussing some common misconceptions regarding language tests. The most common is that there is an 'ideal' test that is applicable in all situations, or even in one specific situation. This view regards language proficiency as "a set of finite components - grammar, vocabulary, pronunciation, spelling" (p. 4), which can be tested in a similar way for all testees.

It cannot be assumed that a testee's results on one test provide a universally valid indicator of language knowledge or skill. There are many contexts for language learning and language use, each of which draws on different language knowledge and skills.

Language learners also have many different reasons for studying and using English. For a test to be valid, especially in terms of predicting future performance, it needs to be designed in terms of the language learner's specific future language needs.

Even a single test designed for a specific group is usually unable to provide a sufficiently detailed assessment of general ability. The example given in this chapter (p. 6) is that of a test for university teachers. Its results grouped the testees mainly in terms of reading ability, regardless of variations in speaking and listening abilities.

In the above example, the addition of a dictation test attempted to assess listening skills. It was unsuccessful in terms of face validity (test takers and users complained that it was artificial), as well as operational validity (test performance was not representative of interactive, conversational listening).

The chapter presents these issues of assessment through an interesting anecdotal example. This stresses the main point: that language testing is not an abstract science, but a practical skill requiring informed judgements.

The chapter goes on to briefly list ways in which language testing contributes to course management at various stages. These are:
- clarifying and evaluating instructional objectives
- providing information on students' strengths and weaknesses
- determining suitable materials and activities
- determining student readiness for a further stage of instruction
- assigning grades based on achievement
- providing feedback on a teaching program's effectiveness

The remainder of the chapter focuses on one fundamental principle of language testing, that is, the correspondence between language test performance and language use outside the test situation.

A framework is presented which identifies two sets of characteristics common to both language test performance and real-life language use. The first concerns the specific characteristics of the language tasks to be tested. The second concerns the characteristics of the individual language users and test takers, including their topical knowledge, affective schemata and language ability. When using this framework to develop or select language tests, it is important to "demonstrate the correspondences between both the characteristics of the language use situation and those of the test situation and tasks." (p. 11)

Comments

This chapter is mainly concerned with issues of test validity - how to ensure a test measures what it says it does. More specifically, it provides a useful framework for approaching the issue of operational validity, or the extent to which test performance represents language use in the target situation. This framework counters an approach that sees language as an abstract set of universal skills. It gives the characteristics of individuals equal weight, not only in terms of language ability, but also with regard to feelings and emotions ("affective schemata"). In other words, the conditions of language use in the target situation (e.g. business negotiation, university study) need to be accounted for in the test situation. The individual characteristics of the test takers are also important with regard to bias for best in test design - making sure that tests "facilitate, rather than impede, test takers' performance." (p. 12)

Reading Log #2: Swain, M. (1984) Teaching and Testing Communicatively, TESL Talk, pp. 7 - 18.
___________________________________________________________________________

This article was published seventeen years ago, at a time when "communicative" was still a buzz-word in language teaching circles. It points out a predominance of non-communicative tests, and stresses the need for tests that will complement the communicative approach to language teaching.

Three general principles of communicative testing are proposed. The first is to "start from somewhere". This theoretical ground is provided by communicative competence, a model composed of four areas of knowledge/skill: grammatical, sociolinguistic, discourse, and strategic.

The second principle is to "concentrate on content". The four main characteristics of test content are that it should be motivating, substantive, integrated and interactive. Each of these is a key aspect of communicative teaching methodology, and shows how closely teaching and testing are connected in Swain's model.

The third principle is to "bias for best". This means eliciting a testee's best performance. In the communicative test model developed by Swain it involves providing the following:
- more than adequate time to complete a task
- opportunity to review and change work
- access to reference materials
- checking that testees are on task
- clear instructions, including what is being tested
- useful suggestions about how to do the task

The scoring procedures that Swain developed with her test included both "objective counts and subjective judgements". The scoring criteria were based around the four components of communicative competence:
- Grammatical competence: producing a structured, comprehensible utterance (including grammar, vocabulary, pronunciation and spelling).
- Sociocultural competence: using socially-determined cultural codes in meaningful ways, often termed 'appropriacy' (e.g. formal or informal ways of greeting).
- Discourse competence: shaping language and communicating purposefully in different genres (text types), using cohesion (structural linking) and coherence (meaningful relationship).
- Strategic competence: enhancing the effectiveness of communication (e.g. deliberate speech), and compensating for breakdowns in communication (e.g. comprehension checks, paraphrase, conversation fillers).
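Swain's mix of "objective counts and subjective judgements" can be imagined as a simple rubric aggregator. The Python sketch below is purely illustrative: the criterion names follow the four competences of communicative competence, but the weights, band scale, and error-density formula are invented for the example and are not taken from Swain's actual scoring procedures.

```python
# Hypothetical rubric combining an objective error count with
# subjective band judgements (0-5) for four competences.
# All weights and scales here are invented for illustration.

def score_performance(word_count, error_count, bands):
    """Return a 0-10 score from one objective and one subjective component.

    bands: dict mapping each competence name to a band score 0-5.
    """
    criteria = ["grammatical", "sociocultural", "discourse", "strategic"]
    if set(bands) != set(criteria):
        raise ValueError("one band score required per criterion")
    # Objective component: errors per 100 words, subtracted from 5, floored at 0.
    accuracy = max(0.0, 5.0 - (error_count / word_count) * 100)
    # Subjective component: mean of the four band judgements.
    judgement = sum(bands[c] for c in criteria) / len(criteria)
    # Equal weighting of the two components gives a 0-10 scale.
    return round(accuracy + judgement, 1)

result = score_performance(
    word_count=200, error_count=6,
    bands={"grammatical": 4, "sociocultural": 3,
           "discourse": 4, "strategic": 3},
)  # 2.0 objective + 3.5 subjective = 5.5
```

The point of the sketch is only that the two kinds of evidence are kept separate until the final aggregation, so a marker can see how much of a score came from countable errors and how much from judgement.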

Swain states that her model of testing is directly inspired by communicative teaching in the classroom setting. The only significant difference between the two is that in a testing situation the teacher will "step back as a participant" in order to assess the students' performance.

Comments

Swain's model is similar in many ways to the way VCE Outcomes are assessed: the student works on a piece of writing over time, drafting and rewriting. A drawback of the previous CATs (common assessment tasks) was that students were able to hand in work done outside the classroom setting. This took away the stress of the examination setting, but made it impossible to monitor how much assistance the student had obtained from outside sources.

One of the conditions of Swain's communicative testing model is in fact to give testees suggestions about how to do the task. Surely, though, the actual amount and kind of help need to be decided in advance, and provided to all testees in a similar way. If students are assisted in different ways, in different contexts, then such a test has limited validity as a comparative measure of achievement. This is quite important, given that one of the main reasons for testing is the selection of candidates for courses, jobs, advancement or awards.

Another factor that needs to be taken into account is the unpredictability of communicative tasks that involve exchange of information and negotiation of meaning. On the other hand, the more the responses available in a communicative task are controlled by the testing material and situation, the closer such a test comes to a traditional non-communicative test. A balance is needed to ensure that all candidates provide comparable responses that can be assessed using the same assessment criteria.

Testing communicatively in this way aims at reproducing real-life conditions of communication in the test situation. The major drawback of this approach is in the area of practicality. To ensure bias for best, it requires sufficient time and testee support at all stages. As Swain makes clear at the end of the article, this is best done in the classroom itself, as an extension of a communicative course of instruction.

Reading Log #3: Weir, C. J. (1993) Understanding and Developing Language Tests, New York: Prentice Hall. (Chapter 2, Testing Spoken Interaction, pp. 30 - 63)
___________________________________________________________________________

In order to test speaking we need to ask this question: What are the features of spoken language? Weir presents a three-part framework consisting of: operations, conditions, and output quality.

The operations of speaking are categorised as either routine (standard ways of presenting information and of interacting) or improvisational (negotiation of meaning, as well as interaction management such as turn-taking and agenda management). These categories are borrowed from the work of Bygate, and provide a dual approach to the testing of speaking. In my own opinion it would be desirable to design tests with different ratios of routine and improvisation: an emphasis on structured routines for lower-level testees, and greater opportunity for improvisation for higher-level testees.

The conditions of speaking are next on Weir's list of features. There would seem to be an infinite range of conditions under which speaking takes place - time constraints, noise interference, and of course social context, including the purpose of communication and the relative status of the interlocutors. As Weir points out, there is a balance to be struck between authenticity and practicality here. Ideally a test would consist of hidden cameras and microphones in the style of a "Big Brother" reality show, and in fact, with technology advancing constantly, this may well be an option sometime this century.

Despite this imaginary future, we still have to struggle with the enduring concept of "test conditions" that is at the heart of our educational system. An important part of this concept is that conditions must be equal for all candidates, so that test results are reliably consistent between candidates and over time. Without consistency of test conditions the very nature of the test as a measurement of ability and progress becomes meaningless. And without such measurement, assessment cannot contribute to the system of awards and advancement that the education system serves.

It is in this context that the communicative approach to testing must tread very softly. Weir states that "every attempt should be made to simulate reality as closely as possible." (p. 38) The important word to be stressed here is "simulate". An assessment activity must necessarily be different from reality because the conditions of real life are not consistent, nor are they biased to encourage best performance.

The third feature of spoken interaction is output quality. This requires us to ask: what criteria do we use to judge spoken performance? As Weir points out, real speech is characterised by "self-correction, false starts, repetition, rephrasing and circumlocution." (p. 40) No one is a "perfect speaker". In Weir's opinion such "compensation features" should not be included in assessment "to the detriment of candidates." Weir's suggested assessment criteria instead require us to focus solely on the positive features of speech - fluency, appropriateness, organisation, management, and range of language.

In my own opinion, it is probably the compensation features mentioned above that are the real indicators of communicative language ability. The absence of these features would imply that the test conditions are not at all communicative, but simply mechanical. Accordingly, I would like to see Weir include these features in his marking scheme in a positive sense, as flags of increased flexibility and ease with language.

The section of this chapter examining test formats (pp. 47 - 63) is extremely well set out and provides a wealth of observations on the various characteristics of different formats. My own feeling is that many of these differences are simply academic. Every format is an artificial construct. As such, it comes with a set of rules that must be made clear to the testee. The testee's sole motivation will always be to play by the rules of the particular format in order to succeed in the test. Viewed in this way, all formats of spoken test are equal: they are all constructs that isolate and simulate a particular aspect of communicative language use.

Testing spoken language is ultimately a game - it needs clear boundaries in which to play, and clear criteria by which to judge success. And if the philosopher Ludwig Wittgenstein was right, language assessment is simply one game within the larger game of language itself.
