
BULATS Test Specification:

A Guide for Clients


November 2010

University of Cambridge ESOL Examinations
1 Hills Road Cambridge
CB1 2EU
United Kingdom
Tel +44 1223 553997
Fax +44 1223 553621
Email ESOLHelpdesk@CambridgeESOL.org

www.CambridgeESOL.org

© Cambridge ESOL 2010

Contents
1 Introduction
1.1 What is BULATS?
1.2 What levels of language ability does BULATS test?
1.3 How are BULATS results reported?
1.4 Who is BULATS suitable for?
1.5 What topics and situations are covered?
1.6 How should candidates prepare for BULATS?
2 BULATS Online and CD-ROM Reading and Listening tests
3 BULATS Standard test
4 BULATS Online Writing test
5 BULATS Writing test (paper-based)
6 BULATS Online Speaking test
7 BULATS Speaking test (face-to-face)
8 Test results
9 Statistical characteristics of the test
9.1 How accurately do the BULATS tests measure?
9.2 Are different versions of the tests equivalent?
9.3 Are the computer-based test and the Standard test equivalent?
9.4 On-going validation of BULATS
10 The development of BULATS
10.1 The history of BULATS
10.2 The development of the BULATS computer-based tests
10.3 The revision of BULATS in 2002
10.4 The Production Cycle for question papers


BULATS Test Specification: A Guide for Clients i

1 Introduction
1.1 What is BULATS?
BULATS is a suite of language tests designed specifically for companies and organisations which
need a reliable way of assessing the language ability of groups of employees or trainees.
BULATS is designed to test the language proficiency of employees who need to use a foreign
language in their work, and of students and employees on language courses or on
professional/business courses where foreign language ability is an important element of the course.
BULATS provides:
• relevant, useful and reliable language tests in work contexts
• reports on candidates' performance in terms of internationally understood standards
• test administration to suit the client company's individual requirements
• rapid turnaround of test results
• information to help the interpretation of test results
• advice to companies on appropriate strategies for language testing, assessing language needs (language auditing) and training.

BULATS is a multilingual service, offering tests in English, French, German and Spanish. The
following tests are offered:
• BULATS Online Reading and Listening test
• BULATS CD-ROM Reading and Listening test
• BULATS Standard test (a paper-based reading and listening test)
• BULATS Online Speaking test (English only)
• BULATS Speaking test (a face-to-face Speaking test)
• BULATS Online Writing test (English only)
• BULATS Writing test (a paper-based writing test)
Tests can be combined to test all four language skills or they can be used independently.
Tests of English are produced by the University of Cambridge ESOL Examinations, French by the
Alliance Française, German by the Goethe-Institut and Spanish by the Universidad de Salamanca.
These four institutions are members of the Association of Language Testers in Europe. BULATS is
co-ordinated by the University of Cambridge ESOL Examinations, who have been producing language
examinations since 1913 and deliver over 1.5 million tests a year in over 130 countries. Cambridge
ESOL is a part of Cambridge Assessment, one of the world's largest assessment agencies.
1.2 What levels of language ability does BULATS test?
BULATS tests are suitable for all learners who need a language test with a business or workplace
focus. There is no pass mark; rather candidates are placed in one of six levels or bands. These are
expressed as levels on the Council of Europe's Common European Framework of Reference for
Languages (CEFR).

The CEFR provides a series of levels of language ability from Beginner to Upper Advanced and is the
standard framework used in Europe for comparing candidates who have sat different tests in different
languages. These levels describe ability in terms of what people can do in real situations. For
example, a typical BULATS computer-based test or Standard test candidate who receives a CEFR
level B2 would be expected to be able to understand most business reports and non-routine letters
with the aid of a dictionary. A list of the functional-situational Can-do statements at each level can be
printed on the back of all BULATS candidate reports and examples of these can be found in section 6.
CEFR Level   Level description    Cambridge ESOL certificated examinations at these levels
C2           Upper Advanced       CPE
C1           Advanced             CAE, BEC Higher
B2           Upper Intermediate   FCE, BEC Vantage
B1           Intermediate         PET, BEC Preliminary
A2           Elementary           KET
A1           Beginner             -

Candidates taking the Standard test or a computer-based Reading and Listening test are placed into a
CEFR band based on their overall score on the module. They also receive scores for Listening, and
Reading and Language Knowledge.
Candidates taking a Speaking test or a Writing test are placed into a CEFR band based on examiner
judgement of their performance. Different levels of performance within a CEFR band are also
reported.
1.3 How are BULATS results reported?
The exact format of the reporting of candidates' test results is decided by the organisation they work
for or study with. Typically, candidates receive a Test Report Form (TRF), which includes information
about their level. On the reverse of the form there is a summary of the Can-do statements and a guide
to the interpretation of scores. Group reports can also be produced for organisations who have
entered several candidates. See section 8 for more details.

1.4 Who is BULATS suitable for?
BULATS is suitable for any learner who needs to use English, French, German or Spanish at work.
The test is designed to be suitable for a wide range of people at work ranging from technicians to
secretaries or managers who may work in banking, education, manufacturing, administration,
research, marketing or other sectors. It does not require any previous business experience, so it is
also suitable for candidates who may need to use the foreign language in a work context in the future.
1.5 What topics and situations are covered?
To ensure that BULATS tests are representative of the language used in business situations, a wide
range of different functions and situations are covered. Below are some of the areas that candidates
can expect to meet in a BULATS test.

Personal information
• Asking for and giving personal details (name, occupation, etc.)
• Asking about and describing jobs and responsibilities
• Asking about and describing an organisation and its structure

The office, general business environment and routine
• Arranging appointments/meetings
• Planning future events and tasks
• Asking for and giving permission
• Giving and receiving instructions
• Predicting and describing future possibilities
• Asking for and giving opinions
• Agreeing and disagreeing
• Making, accepting and rejecting suggestions
• Expressing needs and wants
• Discussing problems
• Making recommendations
• Justifying decisions and past actions

Entertainment of clients, free time, relationships with colleagues and clients
• Discussing interests and leisure activities
• Inviting, accepting and refusing offers and invitations
• Thanking and expressing appreciation
• Apologising and accepting apologies

Travel
• Making enquiries, reservations, requests and complaints

Health
• Health and safety rules in the workplace
• Leisure activities, interests and sports

Buying and selling
• Understanding and discussing prices and delivery dates, offers and agreements

Products and services
• Asking for and giving information about a product or service
• Making comparisons, expressing opinions, preferences, etc.
• Making and receiving complaints

Results and achievements
• Descriptions and explanations of company performance and results, trends, events and changes
Other topic areas
A number of other topics in areas of general interest, such as food and drink, education (training,
courses), consumer goods, shopping and prices, politics and current events, places, weather, etc.
may be included.

1.6 How should candidates prepare for BULATS?
BULATS tests candidates' ability to use foreign languages in real-life situations. So the best way to
prepare for BULATS is to practise using the language in realistic situations.
Advice to candidates on how to prepare for BULATS is given in the BULATS Information for
Candidates Handbook. This is available from Agents for a small fee, or can be downloaded free of
charge from the BULATS website.
The skills tested in the Online and CD-ROM Reading and Listening tests are the same as in the
Standard test. Candidates should familiarise themselves with the format of the computer-based test
and the way they need to answer questions. A demonstration version of the computer-based test is
available on the BULATS website (www.bulats.org).
Teaching resources for tutors who are helping candidates to prepare for BULATS are available on the
Cambridge ESOL website at www.CambridgeESOL.org/teach/bulats/
Cambridge ESOL has also developed two types of BULATS Online Course:
• The blended learning course, which is delivered as a mix of online and classroom learning.
• The self-study course, which will be delivered entirely online.
More information on these courses can be found here:
http://bulats.org/BULATS-Training-and-Learning-Courses/Overview.html


2 BULATS Online and CD-ROM Reading and Listening tests
The BULATS Online Reading and Listening test and the BULATS CD-ROM Reading and Listening
test are computer-based tests delivered in one of two modes: online or CD-ROM. The task
types in both modes are identical, and the candidate experience is the same. The main differences
between the CD-ROM test and the online test are technical and administrative.
The tests assess candidates' ability to use the foreign language by presenting questions via a
computer. Questions appear on screen and candidates answer them by clicking on a particular option
or by typing in words or phrases.
The tests are adaptive, which means they change in level of difficulty to match the language level
of the candidate. If a candidate gets questions right, the program gives the candidate more difficult
questions; if they get questions wrong, it gives easier ones. The tests are supported by a large,
secure bank of tasks, which allows a quick and accurate assessment of a candidate's language skills.
There are eight types of question, assessing Reading and Listening skills, including grammar
and vocabulary knowledge. The tests start by testing a candidate's Reading proficiency before starting
the second section, which tests Listening. The task types can come in any order within each part of the
test. As the test is adaptive, the length of the test depends on the candidate's level of ability, but it is
usually approximately 60 minutes.
Tasks in the Online and CD-ROM Reading and Listening tests

Reading and Language Knowledge
• Multiple choice. Reading to understand e.g. notices, messages, timetables, adverts, leaflets, graphs.
• Multiple choice. Reading a longer text for understanding, e.g. a newspaper or magazine article, advert, leaflet.
• Multiple-choice cloze. Medium-length gapped text focusing on lexis and lexico-grammar.
• Open cloze. Medium-length gapped text focusing on grammar and lexico-grammar.
• Multiple choice. Gapped sentences focusing on grammar and vocabulary, e.g. semantic precision, collocations, fixed phrases, linking words.

Listening
• Multiple choice. Understanding short conversations or monologues. Candidates listen and select the correct answer.
• Multiple choice. Understanding short conversations or monologues. Candidates listen and select the correct picture or graphic.
• Multiple choice. Listening to extended speech for detail and inference. One monologue and one dialogue.

Candidates can hear the listening recordings twice.


How are the results reported?
Candidates receive an overall BULATS score and scores for Listening, and Reading and Language
Knowledge. Each score is out of 100. The CEFR and BULATS scores are shown in the table below.
CEFR Level   BULATS score   Level description
C2           90–100         Upper Advanced
C1           75–89          Advanced
B2           60–74          Upper Intermediate
B1           40–59          Intermediate
A2           20–39          Elementary
A1           10–19          Beginner

Candidates scoring 0–9 in the online test are indicated as being pre-A1.


3 BULATS Standard test
The Standard test lasts 110 minutes and tests listening and reading skills, and knowledge of grammar
and vocabulary. The sections, task format and focus, and the number of questions in each part or
section are given in the tables below.
Listening

Part 1 (10 questions): Graphical and written prompts with a short conversation or monologue. Candidates have to choose from 3 options. The main focus is listening for specific information.

Part 2 (12 questions): Forms and notes with gaps for missing information. Candidates have to listen to phone messages, orders, etc. and complete the missing information. The main focus is listening for specific information and completing notes and forms.

Part 3 (10 questions): Written prompts with short recorded sections or snippets, usually monologues. Candidates have to match the text to the most appropriate prompt. The main focus is listening for global meaning.

Part 4 (18 questions): Multiple-choice questions with extended speech in the form of monologues or dialogues. Candidates have to choose from 3 options. The main focus is listening for specific information.

Subtotal: 50 questions

Reading and Language Knowledge

Part 1 Section 1 (7 questions): Graphical and written prompts, for example notices, messages, timetables, adverts, leaflets, graphs, etc. Candidates have to choose from 3 options. The main focus is reading for specific information.

Part 1 Section 2 (6 questions): Sentences where one word is gapped. Candidates have to choose from 4 options. The main focus is knowledge of grammar and vocabulary.

Part 1 Section 3 (6 questions): A longer text, for example a newspaper or magazine article, advert, leaflet, etc. with multiple-choice questions. Candidates have to choose from 3 options. The main focus is reading for specific information.

Part 1 Section 4 (5 questions): A medium-length text with 5 words gapped. Candidates have to provide the missing words for the gaps. The main focus is knowledge of grammar.

Part 2 Section 1 (7 questions): Four short texts on a similar theme or topic with prompts. Candidates have to choose the prompt which refers to each text. The main focus is reading for specific information.

Part 2 Section 2 (5 questions): A medium-length text with 5 multiple-choice gapped items. Candidates have to choose from 4 options. The main focus is knowledge of vocabulary.

Part 2 Section 3 (5 questions): A medium-length text with 5 words gapped. Candidates have to provide the missing words for the gaps. This task is the same format as Part 1 Section 4. The main focus is knowledge of grammar.

Part 2 Section 4 (6 questions): Sentences where one word is gapped. Candidates have to choose from 4 options. This task is the same format as Part 1 Section 2. The main focus is knowledge of grammar and vocabulary.

Part 2 Section 5 (6 questions): A longer text, for example a newspaper or magazine article, report, etc. with multiple-choice questions. Candidates have to choose from 4 options. The main focus is reading for specific information and general meaning.

Part 2 Section 6 (7 questions): A medium-length text with a wrong word in some of the lines. Candidates have to identify and correct the wrong word or indicate that the line does not contain a wrong word. The main focus is grammar.

Subtotal: 60 questions
Total: 110 questions

How are the results reported?
Candidates receive an overall BULATS score and scores for Listening, and Reading and Language
Knowledge. Each score is out of 100. The CEFR and BULATS scores are shown in the table below.
CEFR Level   BULATS score   Level description
C2           90–100         Upper Advanced
C1           75–89          Advanced
B2           60–74          Upper Intermediate
B1           40–59          Intermediate
A2           20–39          Elementary
A1           0–19           Beginner

A1 includes candidates who are working towards A1.


4 BULATS Online Writing test
The BULATS Online Writing test is a separate, stand-alone test and can be taken on its own or in
conjunction with other BULATS tests. It is available in English.
There are two parts to the BULATS Online Writing test. These are described in the table below:
Part 1: Short message, email or letter (50–60 words). Time: approx. 15 minutes. Candidates write a short message, email or letter using information given in written input.

Part 2: Report or letter (180–200 words). Time: approx. 30 minutes. Candidates write a short report or letter following brief instructions. For this part, candidates choose a task from two alternatives.

How are candidates assessed?
Candidates are assessed by trained examiners.
They are assessed on:
• accuracy and appropriacy of language
• organisation of ideas
• task achievement.
Examiners undergo a process of training and certification. This helps to ensure that different
examiners award standardised scores that are reliable across candidates at different levels of ability.
How are the results reported?
The test report will indicate which CEFR level the candidate has achieved. A strong performance
within a level is shown by the word 'high'. For example, a strong B1 level candidate will receive a
result of B1 High.

5 BULATS Writing test (paper-based)
The BULATS Writing test is a separate, stand-alone test and can be taken on its own or in conjunction
with other BULATS tests. It is available in English, French, German and Spanish.
There are two parts to the BULATS Writing test. These are described in the table below:
Part 1: Short message, email or letter (50–60 words). Time: approx. 15 minutes. Candidates write a short message, email or letter using information given in written input.

Part 2: Report or letter (180–200 words). Time: approx. 30 minutes. Candidates write a short report or letter following brief instructions. For this part, candidates choose a task from two alternatives.

How are candidates assessed?
Candidates are assessed by trained examiners.
They are assessed on:
• accuracy and appropriacy of language
• organisation of ideas
• task achievement.
Examiners undergo a process of training and certification. This helps to ensure that different
examiners award standardised scores that are reliable across candidates at different levels of ability.
How are the results reported?
The test report will indicate a candidate's CEFR level. The report will also state whether they are high,
middle, or low within that band.

6 BULATS Online Speaking test
The BULATS Online Speaking test is a separate, stand-alone test and can be taken on its own or in
conjunction with other BULATS tests. It is available in English.
The BULATS Online Speaking test has five parts. These are described in the table below:
Part 1: Interview (approx. 3 minutes). The candidate answers eight questions about him/herself, his/her work, background, future plans and interests.

Part 2: Reading Aloud (approx. 2 minutes). The candidate reads aloud eight sentences.

Part 3: Presentation (approx. 2 minutes). The candidate is given a work-related topic (e.g. The Perfect Office) to talk about for one minute. The candidate is given prompts about the topic and 40 seconds in which to prepare.

Part 4: Presentation with Graphic (approx. 2 minutes). The candidate is given one or more graphics (e.g. pie charts, line graphs) with a business focus (e.g. Company Exports) to talk about for one minute. The candidate has one minute in which to prepare.

Part 5: Communication Activity (approx. 3 minutes). The candidate gives his/her opinion on five questions related to one scenario (e.g. Planning a Conference).
How are candidates assessed?
Candidates are assessed by trained examiners.
For Parts 1, 3, 4 and 5 candidates are assessed on:
• task achievement
• management of discourse
• pronunciation
• use of grammar and vocabulary, and extent of hesitation.
For Part 2 candidates are assessed on:
• overall intelligibility
• pronunciation of individual sounds
• stress and intonation.
Examiners undergo a process of training and certification. This helps to ensure that they award
standardised marks that are reliable across candidates at different levels of ability and over time.
How are the results reported?
The test report will indicate which CEFR level the candidate has achieved. A strong performance
within a level is shown by the word 'high'. For example, a strong B2 level candidate will receive a
result of B2 High.

7 BULATS Speaking test (face-to-face)
The BULATS face-to-face Speaking test is a separate, stand-alone test and can be taken on its own
or in conjunction with other BULATS tests. It is available in English, French, German and Spanish.
The BULATS face-to-face Speaking test is in three parts. These are described in the table below:
Part 1: Interview (approx. 4 minutes). The examiner asks candidates questions about themselves, their work and interests.

Part 2: Presentation (approx. 4 minutes). The examiner gives candidates a sheet with three topics on it. Candidates choose a topic and have one minute to prepare a short presentation. They speak on the topic for one minute. Afterwards, the examiner asks candidates one or two questions about their presentation.

Part 3: Information exchange and discussion (approx. 4 minutes). The examiner gives candidates a sheet with a role-play situation. Candidates ask the examiner questions to get the required information. This leads to a discussion on a related topic.

How are candidates assessed?
The test is conducted and marked by one examiner. The Speaking test is recorded and the recording
is sent to a second examiner, who assesses the candidate's speaking ability independently.
Examiners use a set of scales to assess candidates' ability in the language being tested. These scales
focus on particular areas of language ability. They are:
• accuracy of language
• range of grammar and vocabulary
• pronunciation
• management of discourse.
Examiners undergo a process of training and certification. This helps to ensure that they award
standardised marks that are reliable across candidates at different levels of ability and over time.
How are the results reported?
The test report will indicate a candidate's CEFR level. The report will also state whether they are high,
middle, or low within that band.


8 Test results
There are two types of test report:
• The group report, which can be provided to organisations who have entered a number of candidates for a BULATS session. This report lists candidates and their scores in each part, their overall score, and their CEFR level.
• The candidate report, which can be printed onto pre-printed BULATS stationery. This report is normally referred to as the Test Report Form or TRF.
Both kinds of report are normally provided by the Agent or organisation which arranges the test
session. Test reports can be printed in English, French, German or Spanish.
Below is an example of the front of a Test Report Form (TRF) in English for a candidate who took the
BULATS Online Reading and Listening test. Apart from some minor details this is identical to the TRF
for the CD-ROM Reading and Listening test and the Standard test: the candidate's name, company
and the date of the test are on the front, with the candidate's overall CEFR level and the BULATS score
(out of 100). In addition, the candidate's BULATS scores for Listening, and for Reading and Language
Knowledge (both out of 100), are provided. Note that as Listening and Reading and Language
Knowledge are not weighted equally, the overall score is not necessarily the mean of the section
scores; for example, a candidate who receives a Listening score of 50 and a Reading and Language
Knowledge score of 70 will not necessarily receive an overall score of 60.
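A small sketch makes the weighting point concrete. The actual BULATS section weights are not published, so the 0.6/0.4 split below is an invented illustration, not the operational formula.

```python
# Hypothetical illustration of a weighted overall score. The real
# BULATS section weights are not published; 0.6/0.4 here is invented.
def overall_score(listening, reading_lk, w_listening=0.6, w_reading=0.4):
    """Weighted combination of the two section scores (each out of 100)."""
    return round(w_listening * listening + w_reading * reading_lk)

# With unequal weights, Listening 50 and Reading 70 do not average to 60:
print(overall_score(50, 70))  # prints 58, not the simple mean of 60
```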


Writing and Speaking Test Report Forms have a very similar format to that shown here and contain
information about how strongly candidates performed within a CEFR level.
Below is an example of what is usually found on the reverse of a BULATS Test Report Form. The first
table shows a list of Can-do statements corresponding to different CEFR levels. Can-do statements
are functionally and situationally oriented statements that have been shown to describe what we would
expect a candidate at a specific CEFR level to be able to do in the language they are being tested in.
For example, a candidate who receives a CEFR level of B2 is expected to be able to understand most
reports and non-routine letters, with dictionary help.
Also included on the reverse is an explanation of the scores and further information regarding the
scoring of the Speaking and Writing tests.


9 Statistical characteristics of the test
Two key qualities of an exam are validity and reliability.
Validity relates to the usefulness of a test for a purpose: does it enable well-founded inferences about
candidates' ability? Can performance in the test be interpreted in terms of ability to perform in the real
world?
Reliability relates to the accuracy of the measurement of the exam: does it rank-order candidates
similarly in repeated uses? Can we expect a candidate to achieve the same score in two versions of
the same test or in the computer-based and the Standard tests?
This section presents evidence for the validity and reliability of BULATS.
9.1 How accurately do the BULATS tests measure?
It is important for candidates and exam users to be confident that an examination produces scores
that are accurate in that 1) the scores within one test are not significantly different for candidates who
are at the same ability and 2) if a candidate sat two versions of the same test (and no increase in
candidate ability occurred between the tests) he or she would get the same or nearly the same score
on both tests.
Language testers use the concept of Measurement Error to describe this. Error does not
mean that the test contains mistakes but rather that candidates' scores are not completely consistent
across the test or between different versions of the test. Imagine a group of candidates who are all at
the same level of language proficiency. If they sat a test they would not all get exactly the same score,
no matter how accurate or long the test was. This difference in scores could be due to a number of
factors, such as different levels of motivation, misinterpretation of a question, or some candidates
meeting questions that tested a particular area of language in which they were weak. This difference in
scores is an example of Measurement Error.
Reliability
A common way of measuring the error and consistency of test scores is to use a correlation coefficient
called Cronbach's Alpha. This operates by dividing the test into halves and correlating candidates'
scores in one half of the test with their scores in the other half. It then adjusts the correlation to take
account of the full number of items in the test as a whole. In theory, reliability coefficients can range
from 0 to 1, but in practice we can expect them to be between 0.6 and 0.95, with the higher number
indicating a more reliable test.
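The calculation can be illustrated directly. The sketch below uses the standard item-level formula for Cronbach's Alpha, which under the usual assumptions is equivalent to averaging over all possible split-half estimates; the response data are invented for illustration, not BULATS data.

```python
# Cronbach's Alpha from an item-by-candidate score matrix.
# alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)
def cronbach_alpha(scores):
    """scores: list of candidates, each a list of per-item scores
    (here 1 = correct, 0 = wrong)."""
    k = len(scores[0])                      # number of items
    def variance(xs):                       # sample variance (n - 1)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    item_vars = [variance([cand[i] for cand in scores]) for i in range(k)]
    total_var = variance([sum(cand) for cand in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Toy data: four candidates, three items
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(cronbach_alpha(data), 2))  # prints 0.75
```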
BULATS Agents are requested to return candidate answer sheets from the Standard test to
Cambridge ESOL, where they are used to calculate the reliability of the BULATS Standard test. Table
1 below shows the reliability (Cronbach's Alpha) for the most recent versions of Standard BULATS,
based on a random sample of the live BULATS population.

Table 1: Reliability (Cronbach's Alpha) for most recent versions of Standard BULATS, by component and as a whole

Version   Sample size   Listening reliability   Reading and Language Knowledge reliability   Overall reliability
EN21      520           0.93                    0.95                                         0.97
EN22      468           0.92                    0.92                                         0.96
EN23      959           0.92                    0.93                                         0.96
EN24      1446          0.95                    0.94                                         0.97
EN25      789           0.91                    0.92                                         0.95


However, correlations depend on the rank ordering of candidates: consistent rank ordering is easier to
achieve with a group of candidates with a wide range of abilities. Therefore, measures of reliability
such as Cronbach's Alpha are as dependent on the spread of ability in the candidate population
as on the accuracy of the test. This means that direct comparison of reliability across different tests with
different populations and ranges of item difficulty can be misleading. In judging the adequacy of the
reliability of a test we need to take into account the type of candidates taking the test and the purpose
of the test.
The BULATS computer-based test is adaptive, which means that candidates receive items appropriate
to what the test calculates as the candidate's ability. Therefore different candidates will not be
presented with the same items. This means that split-half methods of calculating reliability, such as
Cronbach's Alpha, cannot be used. An analogous measure, the Rasch reliability, is used instead;
rather than raw scores, this reliability measure uses candidates' ability estimates. Table 2 below
shows the reliability (Rasch) of computer-based BULATS based on a sample of the live BULATS
population.
Table 2: Reliability (Rasch) of computer-based BULATS, by component and as a whole

Sample size   Listening reliability   Reading and Language Knowledge reliability   Overall reliability
1407          0.92                    0.89                                         0.94

The reliability of the overall test is 0.94, which is very high. At first glance, it may be surprising that the
reliability of the Listening sub-section (0.92) is higher than that of the Reading and Language
Knowledge sub-section (0.89), since the Listening section is shorter. However, the Listening section
also shows a higher standard deviation of ability estimates, which helps to increase the reliability,
since reliability improves as scores spread more widely around the mean.
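Rasch person reliability is commonly computed from candidates' ability estimates and the standard errors of those estimates: it is the proportion of the observed variance in the estimates that is not attributable to measurement error. The sketch below applies this standard formula to invented data; the figures are not BULATS candidate data.

```python
# Rasch (person) reliability from ability estimates and their standard
# errors. All data here are invented for illustration.
def rasch_reliability(abilities, std_errors):
    """reliability = (observed variance - mean error variance) / observed variance"""
    n = len(abilities)
    mean = sum(abilities) / n
    observed_var = sum((a - mean) ** 2 for a in abilities) / (n - 1)
    error_var = sum(se ** 2 for se in std_errors) / n
    return (observed_var - error_var) / observed_var

thetas = [-1.2, -0.4, 0.1, 0.8, 1.5, 2.0]   # ability estimates (logits)
ses = [0.30, 0.28, 0.27, 0.28, 0.31, 0.35]  # standard errors of those estimates
print(round(rasch_reliability(thetas, ses), 2))  # prints 0.94
```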
Standard Error of Measurement
Another way of describing the accuracy of a test is in terms of candidates' individual scores and the
likely variation in those scores from their real or true scores, that is, their scores if the test contained no
Measurement Error whatsoever. (A true score can be defined as the mean score if a candidate were
to take the test repeatedly.) This is what the Standard Error of Measurement (SEM) provides.
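In classical test theory the SEM can be estimated from a test's standard deviation and reliability as SD × √(1 − reliability), and an observed score plus or minus two SEMs gives a range that almost certainly contains the true score. The figures below are illustrative, not BULATS operational values.

```python
# Classical-test-theory SEM and a rough 95% band around an observed
# score. The sd, reliability and score used here are invented.
import math

def sem(sd, reliability):
    """Standard Error of Measurement: sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

def score_band(observed, sd, reliability):
    """Observed score +/- 2 SEM: the range the true score almost
    certainly falls in, under normal-error assumptions."""
    e = sem(sd, reliability)
    return observed - 2 * e, observed + 2 * e

e = sem(sd=15, reliability=0.96)        # 15 * sqrt(0.04), about 3.0
low, high = score_band(62, 15, 0.96)    # about (56.0, 68.0)
```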

The transformation of raw scores to BULATS scale scores is non-linear. Therefore the form of SEM
that is most meaningful to report is the Conditional Standard Error of Measurement, which relates to a
particular score in the test. In the case of BULATS, the Conditional SEM is most useful when used to
estimate the error associated with each band cut-off, that is, the lowest score in a band. The
Conditional SEM will vary slightly according to test version and band cut-off, depending on the precise
difficulty of the items. However, the values reported in the table below for a sample calibration version
are typical.
Table 3: Conditional SEM in BULATS Standard scores for a sample calibration version

  At band cut-off | Overall SEM | Listening / Reading and Language Knowledge SEM
  5               | ±3          | ±4
  4               | ±4          | ±5
  3               | ±4          | ±5
  2               | ±4          | ±5
  1               | ±3          | ±4

The table above shows that candidates at Band 1 or 5 are likely to get a score that is within 3 points of
their true score. They are almost certain to get a BULATS score within 6 points (two Standard Errors
of Measurement) of their true score. For candidates at Bands 2–4 these numbers are 4 and 8
respectively.
It is always possible that a candidate will be at the borderline between two levels, but for the majority of
candidates who take the BULATS Standard test, the Standard Errors of Measurement reported above
show that they will receive a band that is an accurate reflection of their true ability. It is extremely
unlikely for candidates to receive a band that is more than one band higher or lower than their ability
warrants.
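The relationship between an observed score, its SEM and the band awarded can be sketched as follows. The cut-off scores here are hypothetical, since the real transformation tables vary by test version; integer scores and SEMs are assumed for simplicity.

```python
def band(score, cutoffs):
    """Map a BULATS Standard score (0-100) to a band, where each
    cut-off is the lowest score at which that band is awarded."""
    awarded = 0
    for band_number, cutoff in sorted(cutoffs.items()):
        if score >= cutoff:
            awarded = band_number
    return awarded

def plausible_bands(score, sem_points, cutoffs):
    """Bands consistent with a +/- 2 SEM interval around the observed
    score (integer scores and SEM assumed)."""
    lo, hi = score - 2 * sem_points, score + 2 * sem_points
    return {band(s, cutoffs) for s in range(max(0, lo), min(100, hi) + 1)}

# Hypothetical cut-offs: lowest score at each band (invented for the sketch)
cutoffs = {1: 20, 2: 40, 3: 60, 4: 75, 5: 90}
print(band(62, cutoffs))                         # → 3
print(sorted(plausible_bands(62, 4, cutoffs)))   # → [2, 3]
```

A score near a cut-off (62, just above the hypothetical Band 3 threshold) is the borderline case described above: the ±2 SEM interval spans two bands, but never more than one band away from the observed one.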
If we want to compare candidates' scores, either those of two different individuals or the same
candidate's performance over time, it is necessary to take into account the Standard Error of
Measurement of both scores. This is higher than that for a single score. We can calculate from the
table above that a difference of 7 BULATS points between candidates probably indicates a real
difference in language ability, and candidates with a difference of 14 BULATS points are almost
certainly at different levels of language ability. However, when comparing candidate performance over
time it is also necessary to take into account another aspect of Measurement Error known as
Regression to the Mean. This is a statistical phenomenon whereby candidates who score well below or
above the mean will tend to score nearer the mean if they sit the test again, regardless of any
improvement in language ability. Regression to the Mean is a feature of all tests.
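The comparison rule above follows from the fact that the standard error of the difference between two independent scores is the square root of the sum of the squared SEMs. A short sketch with a per-score SEM of 5 points (an illustrative value):

```python
import math

def sem_of_difference(sem_a, sem_b):
    """Standard error of the difference between two independent scores."""
    return math.sqrt(sem_a ** 2 + sem_b ** 2)

# With a per-score SEM of about 5 BULATS points, a difference of roughly
# 7 points (one SE of the difference) suggests a real ability difference,
# and about 14 points (two SEs) makes one very likely.
d = sem_of_difference(5, 5)
print(round(d), round(2 * d))  # → 7 14
```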
The SEM of the computer-based tests can be seen in the table below.
Table 4: SEM (Rasch) of BULATS scores in CB BULATS Version 6.1 (sample size = 1407)

  Overall SEM | Reading and Language Knowledge SEM | Listening SEM
  ±5          | ±6                                 | ±7


As with the Standard test SEM figures, the table above shows that the majority of candidates will
receive a band that is an appropriate reflection of their true language proficiency level. It is highly
improbable that a candidate will receive a band that is more than one level different from their true
language proficiency level.
Reliability in the Speaking and Writing tests
BULATS Speaking and Writing tests contain tasks that are authentic in that they resemble tasks that
candidates might be expected to perform in a business environment, such as writing a report or giving
a short presentation. These types of tasks require assessment by trained examiners. Reliability in
BULATS Speaking and Writing tests is centred on the need to ensure that examiners mark
consistently over time (intra-rater reliability) and in line with other examiners (inter-rater reliability). This is
maintained in the examiner certification and training process as detailed below.
The Writing test assesses writing skills in relation to the workplace, and takes 45 minutes. The test
is assessed by up to two trained language specialists.
Writing examiners undergo a rigorous training programme in order to qualify and are required to
re-certify every two years. In addition, sample scripts are regularly monitored by examiner
monitors to ensure that examiners in the field are marking to accepted standards.

The Speaking test assesses speaking skills in the workplace. For the online test, candidates'
responses are recorded and marked by up to five trained examiners. For the face-to-face test the
test is recorded and assessed by the examiner conducting the test and then by a second trained
examiner. Provision is made for a third examiner to assess the interview in cases where examiner
1 and examiner 2 differ in their grading by more than two sub-bands.
Oral examiners undergo a rigorous training programme in order to qualify and are required to re-
certify every two years. In addition, sample performances are regularly monitored by examiner
monitors to ensure that examiners in the field are marking to accepted standards.
9.2 Are different versions of the tests equivalent?
Organisations often use BULATS over an extended period of time. Therefore they are likely to use
more than one version of the Standard or computer-based test. It is essential that candidates and
exam users are confident that different versions of the test are equivalent in that they produce scores
and bands that are at the same level of language proficiency.
The equivalence of different versions of the tests is promoted by the Examination Production Cycle,
where item writers are trained and items are trialled and checked for suitability and difficulty. This
process is explained in detail in section 10.4. Care is taken to ensure that the trial sample is
representative of the BULATS candidate population in terms of first language background and
language level. This allows us to check to see if any of the items show bias, that is whether they are
particularly difficult for a specific group of candidates for non-linguistic reasons. Any such items are
excluded from the final test. Items which appear adequate then enter an item bank, from which new
BULATS versions are constructed that conform to set targets of Item Difficulty and Discrimination.
Item Difficulty is a measure of the likelihood that a candidate of a given ability will answer the item
correctly. Discrimination is a measure of the capacity of an item to distinguish, or discriminate, between
weak and strong candidates. Item banking also allows tests to be constructed that contain items that
test a representative sample of the grammatical structures, functions and topics associated with
language used in a business environment.
For each new standard BULATS version, transformation tables are produced which convert raw
scores to BULATS standardised scores and BULATS bands. These tables are produced for Reading
and Language Knowledge, and Listening, and for all items separately, so as to provide the component
BULATS score and overall BULATS score and band. Items for the online and CD-ROM versions come
directly from the calibrated item bank.

9.3 Are the computer-based test and the Standard test equivalent?
Some organisations use both the computer-based test and the Standard test. In these situations it is
essential that exam users are confident that both tests produce scores that are comparable.
Whenever a new computer-based test is designed its items are taken from calibrated items in the item
bank. To investigate the effect of the mode of the test (computer or paper based), a sample of
candidates from a number of different language backgrounds is asked to take both the computer-based
test and a Standard test. Their results are correlated, and bands and scores in each test are compared
to ensure that candidates receive similar scores in the computer-based and Standard tests, taking into
account the SEM for both tests (see Jones 2000).
Below is a table showing the correlations of scores in recent computer-based and Standard versions of
the test. The measurement error of the tests, as discussed earlier, causes the observed correlation to
underestimate the true correlation; adjusting for this is known as Correction for Attenuation. The table
shows that the mode of the test (paper or computer-based) has, in most cases, little effect on a
candidate's band and overall score. However, for the minority of candidates who are uncomfortable
using a computer, we cannot expect scores to be the same in each mode, although the overall band
should be very close.


                        Correlation (corrected for attenuation)
  Band score            0.86
  Overall BULATS score  0.95
  Sample size           62
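The Correction for Attenuation mentioned above divides the observed correlation by the geometric mean of the two tests' reliabilities. A minimal sketch with illustrative values (not the actual BULATS figures):

```python
import math

def correct_for_attenuation(r_observed, rel_x, rel_y):
    """Disattenuated correlation: observed correlation divided by the
    geometric mean of the two tests' reliabilities."""
    return r_observed / math.sqrt(rel_x * rel_y)

# Illustrative: an observed correlation of 0.85 between two tests whose
# reliabilities are 0.90 and 0.89 (hypothetical values)
print(round(correct_for_attenuation(0.85, 0.90, 0.89), 2))  # → 0.95
```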


9.4 On-going validation of BULATS
Cambridge ESOL places emphasis on the maintenance of quality by regular monitoring of candidate
and examination performance. Re-appraisal and, where necessary, revision to ensure that
examinations provide the most accurate, fair and useful means of assessment are key strengths of the
organisation. This work is supported by the Research and Validation Group at Cambridge ESOL, the
largest dedicated research team of any UK-based provider of English language assessment.
Currently a number of projects are in progress related to BULATS and Business English. Volume 17 in
the Studies in Language Testing series, Issues in Testing Business English by Barry O'Sullivan, deals
with the recent revision of the Business English Certificate (BEC) examinations and outlines
Cambridge ESOL's understanding of the Business English construct and model of communicative
ability.
On occasion, organisations and candidates are requested to help in providing data and feedback for
research projects; their co-operation is welcomed.
More information on Cambridge ESOL and its research and validation work can be found on the
Cambridge ESOL website: www.CambridgeESOL.org

Cambridge ESOL produces Research Notes, a quarterly journal dealing with current issues in
language testing and Cambridge ESOL examinations. Issues can be accessed on the website, which
can be searched for articles related to BULATS or other themes. A selection of recent articles on
BULATS and Business English is given below.
BULATS: A case study comparing computer based and paper-and-pencil tests
Neil Jones Research Notes Issue 3 (November 2000)
CB BULATS: Examining the reliability of a computer based test using test-retest method
Ardeshir Geranpayeh Research Notes Issue 5 (July 2001)
Revising the BULATS Standard Test
Ed Hackett Research Notes Issue 8 (May 2002)
Some theoretical perspectives on testing language for business
Barry O'Sullivan Research Notes Issue 8 (May 2002)
Analysing domain-specific lexical categories: evidence from the BEC written corpus
David Horner and Peter Strutt Research Notes Issue 15 (February 2004)
Using simulation to inform item bank construction for the BULATS computer adaptive test
Louise Maycock Research Notes Issue 27 (February 2007)
Using the CEFR to inform assessment criteria development for Online BULATS speaking and writing
Lucy Chambers Research Notes Issue 38 (November 2009)
CB BULATS: Examining the reliability of a computer-based test
Laura Cope Research Notes Issue 38 (November 2009)


10 The development of BULATS
10.1 The history of BULATS
The BULATS Standard test was first launched in 1997 in a limited number of countries. Since then the
network has grown to more than 300 Agents in over 40 countries. The Speaking and Writing tests
became available in 1998 and the computer-based test was launched in 1999. In mid-2002 a revised
version of the Standard test was launched. This version was slightly longer than the original version
and reported results in a slightly different way. At the same time the computer-based test was revised
and the software upgraded. The computer-based test continues to be upgraded with the regular
release of updated versions. In 2008 an online version of the computer-based test was launched; this
is adaptive in the same way as the CD-ROM version but offers increased flexibility in that no installation
is needed. To extend the test provision, online Speaking and Writing tests have recently been
developed which are easy and cost-effective to administer.
10.2 The development of the BULATS computer-based tests
The first trial version of the BULATS computer-based test was released in 1998 in English and this
was closely followed by a full range of tests in all languages which were available from 1999.
The test is adaptive and is supported by a large bank of encrypted, secure tasks. An adaptive
algorithm chooses items as the test progresses according to how the candidate performed on previous
items. It allows candidates to face items at an appropriate level of difficulty and provides more
accurate assessment of candidate ability than a non-adaptive test with a similar number of items.
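The adaptive principle can be sketched very roughly as follows. This is a toy illustration only: the actual CB BULATS algorithm is proprietary and uses Rasch-based ability estimation rather than the fixed up/down step shown here, and the item bank and answers below are invented.

```python
def next_item(item_bank, ability, used):
    """Pick the unused item whose difficulty is closest to the current
    ability estimate -- the core idea of an adaptive test."""
    candidates = [i for i in item_bank if i not in used]
    return min(candidates, key=lambda item: abs(item_bank[item] - ability))

def run_adaptive_test(item_bank, answers, step=0.5, n_items=4):
    """Very simplified adaptive loop: step the ability estimate up after a
    correct answer and down after an incorrect one."""
    ability, used = 0.0, set()
    for _ in range(n_items):
        item = next_item(item_bank, ability, used)
        used.add(item)
        ability += step if answers[item] else -step
    return ability, used

# Hypothetical bank: item id -> difficulty (logits); answers: id -> correct?
bank = {"q1": -1.0, "q2": -0.3, "q3": 0.0, "q4": 0.4, "q5": 1.1, "q6": 1.8}
answers = {"q1": True, "q2": True, "q3": True, "q4": True, "q5": False, "q6": False}
ability, used = run_adaptive_test(bank, answers)
print(round(ability, 1), sorted(used))
```

Because each item is chosen near the candidate's current estimate, every answer carries close to the maximum possible information, which is why an adaptive test reaches a given precision with fewer items than a fixed-form test.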
The software used in the BULATS computer-based test was developed for Cambridge ESOL by
Homerton College, University of Cambridge, which is an acknowledged centre for the development of
educational technology.
Development has continued with the production of new versions of the test which combine revised
item banks and improved software. Many of the changes are in response to customers' requirements
to make the application more customisable. These changes have included: an export facility so
customers can export data to other applications, and changes to the registration screen to include
more than one ID number and other company-specific information. Alongside this, there have been
improvements to security to maintain the integrity of the test, and improvements to software design to
ensure the test runs effectively on a wide range of hardware specifications.
The computer-based test is designed to run in both stand-alone and networked mode. There is a
comprehensive manual available in six languages.
Online BULATS was launched in 2008; this is also an adaptive test and is available in four languages.
Online delivery means that the tests can be administered with no installation; tests can be taken at
any computer which has a sound card and is accessed by Internet Explorer. Results from the test are
available immediately and allow flexible reporting so that groups or individuals can be scored.
Online Speaking and Writing tests were launched in 2010. These allow candidates to take a speaking
or writing test online on any computer which has a sound card and is accessed by Internet Explorer.
The candidate responses are recorded and uploaded: agents can then arrange for them to be marked
on screen via the online system. These tests are easy to administer and allow great flexibility for
agents, examiners and candidates.
The new online BULATS Speaking test covers the same CEFR levels, and measures to the same
precision, as the face-to-face Speaking test. However, it is important to note that the online and
face-to-face Speaking tests are different types of tests; they contain different task types, have different
assessment scales and measure different aspects of spoken performance.


Support for all computer-based products is offered through a dedicated helpline, local agent or office
and our website.
10.3 The revision of BULATS in 2002
As with any test, candidature and usage change over time, and BULATS is reviewed regularly to
ensure the tests are fair, make the best use of modern technology and meet customers' needs and
expectations. In the first few years of the test a more detailed picture emerged of the BULATS
population and the needs of BULATS clients. A major study which included a questionnaire to existing
BULATS Agents was carried out. As a result of this and other concurrent validation studies it was
decided to revise some elements of the Standard test. The main revisions were:
The length of the Listening section was increased from 30 to 50 items. As part of this change, two
sections of the Listening test were revised, allowing candidates just one chance to listen to each
Listening text.
A number of changes to both the format and task types were made in the newly titled Reading and
Language Knowledge section. Reading tasks and Language Knowledge tasks were alternated to
avoid the disproportionately negative effects of possible candidate lack of time and fatigue on the
grammar and vocabulary tasks, which previously had been at the end of the paper.
Amended tasks were extensively trialled to ensure the measurement characteristics of the test
were maintained or improved and that validity and fairness issues had been accounted for.
Feedback on the revised format has been positive.
A more detailed treatment of the revision process can be found in Issue 8 of Cambridge ESOLs
publication Research Notes, available online at www.CambridgeESOL.org/rs_notes/
10.4 The Production Cycle for question papers


Cambridge ESOL employs teams of item writers to produce examination material, and throughout the
writing and editing process strict guidelines are followed in order to ensure that the materials conform
to the test specifications. Topics or contexts of language use which might introduce a bias against any
group of candidates of a particular background (i.e. on the basis of sex, ethnic origin, etc.) are
avoided.
After selection and editing, the items are pretested. Pretesting plays an important role as it allows for
questions and materials with known measurement characteristics to be banked so that new versions
of question papers can be produced as and when required. The pretesting process helps to ensure
that all versions conform to test requirements in terms of content and level of difficulty.
For English Reading and Listening, items are pretested during the live administration of the online test
or following the procedure for foreign language tests described below. A sample of Online candidates
are given a small number of extra items during their tests; these items play no part in the candidates'
results and do not increase the length of the test significantly. These extra items will be taken by
many different candidates and the data compared to items with known measurement characteristics
resulting in the items being calibrated and linked to a common scale of difficulty.
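The anchoring idea can be sketched with a simple mean-shift linking, in which the average gap between the anchor items' pretest and bank difficulties gives a constant that places every new pretest item on the bank's common difficulty scale. This is an illustration of the principle only: the item names and values are invented, and the operational procedure uses full Rasch calibration rather than a mean shift.

```python
import statistics

def link_to_bank(pretest_difficulties, bank_anchor_difficulties):
    """Mean-shift linking: anchor items have known bank difficulties, so the
    average gap between their pretest and bank values gives a constant that
    places every new pretest item on the bank's common difficulty scale."""
    shift = statistics.mean(
        bank_anchor_difficulties[item] - pretest_difficulties[item]
        for item in bank_anchor_difficulties
    )
    return {item: d + shift for item, d in pretest_difficulties.items()}

# Hypothetical pretest calibration: two anchor items plus two new items
pretest = {"anchor1": -0.2, "anchor2": 0.6, "new1": 0.1, "new2": 1.0}
bank_anchors = {"anchor1": 0.3, "anchor2": 1.1}
linked = link_to_bank(pretest, bank_anchors)
print(round(linked["new1"], 2), round(linked["new2"], 2))  # → 0.6 1.5
```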
For foreign language versions, pretest items are compiled into pretest papers and these are supplied
to candidates. The tests include anchor items which are carefully chosen on the basis of their known
measurement characteristics and their inclusion means that all new items can be linked to a common
scale of difficulty. These pretest papers are despatched to a wide variety of organisations which have
offered to administer the pretests to candidates of a suitable level. After the completed pretests are
returned to the Pretesting Unit at Cambridge ESOL, a score for each student is provided to the centre.
The items are marked and analysed, and those which are found to be suitable are put into an item
bank. BULATS question papers then go through an additional process of Standards Fixing which
confirms the measurement characteristics of the tasks and items.
Materials for the productive tests (Speaking and Writing) are trialled directly with candidates to assess
their suitability for inclusion in the item bank.
For further information, or to enquire about participating in BULATS Pretesting, please email:
BULATSPretesting@CambridgeESOL.org