
Test Review Running head: TOLD PRIMARY 3 TEST REVIEW

Test Review: Test of Language Development: Primary, Third Edition
Lynne Cox
Meagan Keashly
University of Calgary

Description

The Test of Language Development-Primary, Third Edition (TOLD-P:3) was authored by Phyllis L. Newcomer and Donald D. Hammill and published in 1997 by PRO-ED. The complete kit includes an examiner's manual, a picture book, and profile/examiner record booklets, all of which come in a cardboard storage box (Newcomer & Hammill, 1997). This edition of the kit has been replaced by a newer edition, the Test of Language Development-Primary, Fourth Edition, which is available for purchase at a cost of $314.00 US (PRO-ED, 2008).

Purpose: Introduction & Recommended Uses

The TOLD-P:3 is an individually administered test designed to assess spoken language in children aged 4 years 0 months to 8 years 11 months. This test can be used in clinical, research, and educational settings (Newcomer & Hammill, 1997). The test has four main uses:

(a) to identify children who are significantly below their peers in language proficiency, (b) to determine children's specific strengths and weaknesses in language skills, (c) to document children's progress in language as a consequence of special intervention programs, and (d) to measure language in research studies. (Newcomer & Hammill, 1997, p. 12)

Examiners wishing to give and/or interpret the test should have a basic understanding of testing statistics; knowledge of test administration, scoring, and interpretation; and familiarity with the evaluation of mental ability. They are encouraged to have supervised practice in administering and scoring mental ability tests. Before using the TOLD-P:3, examiners are expected to become familiar with the examiner's manual and the other parts of the kit, and should have practiced administering the test at least three times. It takes approximately 30 minutes to 1 hour to administer the core subtests of the TOLD-P:3, and the supplemental subtests require an additional 30 minutes. The supplemental and core subtests should not be administered at the same time (Newcomer & Hammill, 1997).

In developing the TOLD-P:3, the authors did not follow a specific theoretical perspective and instead incorporated the contributions of a variety of linguists and psycholinguists, such as Chomsky; Jakobson, Fant, and Halle; Brown; Bloom and Lahey; and Vygotsky (Hayward, Stewart, Phillips, Norris, & Lovell, 2008). The TOLD-P:3 was developed using a two-dimensional model of linguistic features and linguistic systems. It has six core and three supplemental subtests. [Comment: How do these fit with the two-dimensional model? 2 versus 6 3?] The purpose of this separation is to distinguish speech competence, measured primarily by the supplemental subtests, from language competence. This allows for easier identification of disorders that lie specifically in one area versus the other. While the core subtests are designed to be administered to all children taking the TOLD-P:3, the three supplemental subtests are generally given only to children who are suspected of, or known to have, phonological issues (Newcomer & Hammill, 1997).

The Test of Language Development has gone through a number of revisions since it was first published in 1977. It was one of the first tests that focused only on the development of language (Hayward et al., 2008). The TOLD-P was published in 1982 with test items identical to those of the original TOLD but with a changed testing manual. This version had an increased normative sample; allowed for the creation of four composite scores (Listening, Speaking, Semantics, and Syntax); and provided more information about the rationale for the test and the interpretation of results. In 1988, the TOLD-P:2 was developed with an increased normative sample for which ethnicity information was presented, and Phonology was added as the fifth composite. Additionally, items were added to some subtests in order to strengthen reliability, and confusing items were deleted or changed. In the third edition, the Phonemic Awareness and Relational Vocabulary subtests were added, and the phonological subtests became supplemental so that their results would not be part of the overall composite scores (Hayward et al., 2008). New normative data were added and the sample was linked to the 1990 census data. The authors included reliability coefficients for subgroups of the normative sample. Also, items on the test were re-evaluated to eliminate biased ones (Newcomer & Hammill, 1997). Of note, however, is a study by Hammer, Pennock-Roman, Rzasa, and Tomblin (2002) comparing the performance of African American and White kindergarteners in the U.S. on the core subtests, excluding Relational Vocabulary, which found that 15% of the items showed differential item functioning. There is currently a fourth edition of the test available, which includes normative data representative of 2005 United States census data and expanded studies of item bias (PRO-ED, 2008).

Major Features of the Test

The nine subtests of the TOLD-P:3 are based on a two-dimensional model designed to measure a child's spoken language. [Comment: You want to put here in the major features section the core and supplementary subtests.] Each subtest has a linguistic feature (semantics, syntax, or phonology) and a linguistic system (listening, organizing, or speaking) (Newcomer & Hammill, 1997). [Comment: This is more clearly presented here than above.] Newcomer and Hammill (1997) note that "while the model presents these aspects of language as discrete entities, we recognize that they are in fact highly interrelated" (p. ). [Of note, in the examiner's manual the authors cite numerous other linguistic assessment instruments which successfully utilize a similar format to each of the subtests.] [Comment: omit.] [Comment: Linking statement.] Each of these subtests is further described below.

Semantic Core Subtests

The Picture Vocabulary (PV) subtest has 30 items and assesses how well a child understands the meaning of spoken English words. The child is presented with four pictures and is required to point to the picture that best represents the meaning of a word said by the examiner. The 30-item Relational Vocabulary (RV) subtest examines a child's ability to understand and orally express how two words are related. There are no pictures involved in this subtest, as it is completely verbal. Each of the 28 items on the Oral Vocabulary (OV) subtest presents a child with a common English word spoken by the examiner and requires the child to give an oral definition of the word. As with the RV subtest, there are no pictures involved.

Syntax Core Subtests

The Grammatic Understanding (GU) subtest contains 25 items that examine the child's ability to understand the meaning of sentences. This subtest does not require a verbal response from the child, as the child is presented with pictures to choose from. The examiner states a sentence, and the child chooses, from three pictures, the one that most accurately represents it. The Sentence Imitation (SI) subtest has 30 items and examines the child's ability to state English sentences correctly. A sentence is spoken by the examiner, and the child is expected to repeat the sentence verbatim. There are 28 items on the Grammatic Completion (GC) subtest, which are designed to examine the child's ability to recognise, understand, and use common English morphological forms (Newcomer & Hammill, 1997, p. 9). Included is an understanding of plurals, verb tenses, adjectives, and possessives. The examiner reads unfinished sentences and the child is to supply the morphological form that is missing.

Phonological Supplemental Subtests

The Word Discrimination (WD) subtest contains 20 items and examines a child's ability to identify significant speech sound differences. The examiner orally presents a pair of words to the child, and the child must then identify the words as being either the same as or different from each other. The 14-item Phonemic Analysis (PA) subtest examines a child's auditory processing skill of breaking words into smaller phonemic units. The examiner presents a word, has the child repeat the word, and then asks the child to say the word again, this time eliminating a specific part of the word. The 20 items on the Word Articulation (WA) subtest examine the child's ability to say important speech sounds in English. The examiner presents a picture and says a sentence to the child, which together are designed to elicit specific words that contain certain speech sounds. The examiner then identifies whether or not the child accurately said the speech sounds.

Subtests can be grouped together based on common features to create six composites: Listening (PV+GU); Organizing (RV+SI); Speaking (OV+GC); Semantics (PV+RV+OV); Syntax (GU+SI+GC); and Spoken Language (PV+RV+OV+GU+SI+GC) (Newcomer & Hammill, 1997, pp. 11-12). The phonological subtests do not contribute to composite scores. This decision was made in response to test users who felt that scores on these subtests, particularly the Word Articulation subtest, would often skew the composite scores, as most children master these skills by age 7 (Newcomer & Hammill, 1997). [Comment: Excellent and comprehensive section.]

Administration

The test is appropriate only for children in the age range of 4 years 0 months to 8 years 11 months. The authors state that the test can be used with children with a wide variety of English speaking styles, such as those with speech articulation difficulties. However, they also note that examiners should be cautious when using the test with children who speak non-standard English or are bilingual. The test is not appropriate for non-English speakers or those who are deaf (Newcomer & Hammill, 1997).
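The way the six core subtests map onto the composites listed above can be made concrete with a short sketch. It is only an illustration of the grouping described in this review: the function names and the example standard scores are hypothetical, and the conversion of each sum of standard scores into a quotient relies on tables in the examiner's manual that are not reproduced here.

```python
# Linguistic feature and linguistic system of each core subtest,
# as implied by the subtest and composite descriptions in this review.
CORE_SUBTESTS = {
    "PV": ("Semantics", "Listening"),
    "RV": ("Semantics", "Organizing"),
    "OV": ("Semantics", "Speaking"),
    "GU": ("Syntax", "Listening"),
    "SI": ("Syntax", "Organizing"),
    "GC": ("Syntax", "Speaking"),
}

COMPOSITES = [
    "Listening", "Organizing", "Speaking",
    "Semantics", "Syntax", "Spoken Language",
]


def composite_members(composite):
    """Return the core subtests that contribute to a composite."""
    if composite == "Spoken Language":
        # The overall composite uses all six core subtests.
        return list(CORE_SUBTESTS)
    return [code for code, labels in CORE_SUBTESTS.items() if composite in labels]


def composite_sums(standard_scores):
    """Sum the subtest standard scores for each composite.

    Supplemental subtests (WD, PA, WA) are ignored if present, since
    they do not contribute to any composite.
    """
    return {
        name: sum(standard_scores[code] for code in composite_members(name))
        for name in COMPOSITES
    }


if __name__ == "__main__":
    # Hypothetical standard scores, for illustration only.
    scores = {"PV": 10, "RV": 9, "OV": 8, "GU": 11, "SI": 10, "GC": 12, "WA": 7}
    for name, total in composite_sums(scores).items():
        print(f"{name}: {'+'.join(composite_members(name))} = {total}")
```

Deriving the composite membership from the feature and system labels, rather than listing it by hand, shows that each feature or system composite is one row or column of the two-dimensional model.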

Core subtests must be administered in the following order: PV (I), RV (II), OV (III), GU (IV), SI (V), and GC (VI). Those administering the supplemental subtests should administer the core subtests in the prescribed order first, and then administer the supplemental subtests in the following order: WD (VII), PA (VIII), and WA (IX). Occasionally, an examiner may choose to omit one or more subtests, such as when a child is unable to complete a subtest due to a disability; however, the remaining subtests must be administered in the same order. Regardless of the child's age, administration of each subtest should begin with the first item. Testing on each core subtest is stopped when the examinee misses five items in a row. On the supplemental subtests, all items should be administered, regardless of the number of items missed (Newcomer & Hammill, 1997).

Clear directions on what to say to the child are provided in both the examiner's manual and the record booklet. Each subtest contains at least one practice item, which is not scored, so that the examinee can become familiar with the types of items and responses required for the subtest. Examiners can repeat the examples with the child. If the child cannot complete the example item(s), the subtest should not be administered. Items are scored as either 1 (correct) or 0 (incorrect). Answers to all test items are in the record booklet (Newcomer & Hammill, 1997).

Scoring and Interpretation

The front page of the protocol is divided into three sections: (a) identifying information about the examinee, examiner, and the test administration; (b) record of scores; and (c) profile of scores (Newcomer & Hammill, 1997). Test results are given as raw scores, age equivalents, percentiles, and standard scores. Standard scores for each composite are recorded as sums of standard scores, which are then converted into a quotient (Newcomer & Hammill, 1997). Age equivalents are provided, with the manual clearly stating that the American Psychological Association (1974) does not endorse the use of age equivalents. Percentiles are provided for each subtest, but Sattler (2008) notes that percentile ranks cannot be used in statistical tests unless they are converted to another scale. [Comment: However, it is usually only for research purposes that you would conduct statistical tests on the results. Percentile ranks are fine for the routine practice of assessment and interpretation.] Standard scores provide the clearest indication of the student's performance. [Comment: There are limitations with standard scores; see Sattler.] As there are equivalent indexes for each subtest, the standard scores can be compared across subtests and provide the best means for evaluating the child's strengths and weaknesses (Newcomer & Hammill, 1997). [Comment: OK.]

Various combinations of subtests make up the composite quotients, which can be used to identify strengths and weaknesses in language. The quotients are reliable as they comprise more than one subtest and are constructed to have a mean of 100 and a standard deviation of 15. [Comment: This does not make them reliable; see the reliability section in Sattler.] Although the quotients used in this test are not calculated through the use of division, the term is still accepted through the long-standing psychometric custom associated with this mean and standard deviation (Newcomer & Hammill, 1997). [Comment: What does this mean?]

The TOLD-P:3 gives the examiner the option to prorate a score when a subtest score needed for a composite score could not be obtained. By adding the standard scores for the subtests given and dividing by the number of subtests actually given, the examiner is able to record the average score in the space corresponding to the missing subtest standard score (Newcomer & Hammill, 1997). [Comment: Why would you not be able to get a score? What do you see as a possible disadvantage of this method?]

Standardization of the Test

The norm sample for the TOLD-P:3 was 1,000 people from 28 states. The sample was tested in the spring of 1996, with experienced examiners randomly chosen using the PRO-ED customer files to locate professionals who had purchased the TOLD-P:3 within the past two years. 1,519 children were tested and 1,000 were used to norm the TOLD-P:3, with the other 519 used in reliability and validity studies (Newcomer & Hammill, 1997). [Comment: See APA re: beginning a sentence with a number.] The demographic characteristics in regard to geographic region, gender, race, residence, ethnicity, family income, educational attainment of parents (i.e., less than Bachelor's degree, Bachelor's degree, Master's, Professional, Doctorate degrees), disabling conditions, and age are reported in percentages to ensure the normative sample was a national representation. [Comment: Run-on sentence.] This demographic information was then stratified by age to show that the variables conform to national expectations at each age (Newcomer & Hammill, 1997). [Comment: Do you think it is a representative sample? What is your evaluation?]

Technical Characteristics/Psychometric Properties

Three sources of test error, (a) content, (b) time, and (c) scorer, were used to provide evidence that the TOLD-P:3 can be used with confidence (Newcomer & Hammill, 1997). Cronbach's (1951) coefficient alpha method was applied to investigate content sampling error. Coefficient alphas for the subtests and composites were calculated at five age intervals using data from the normative sample. Coefficient alphas for the composites were derived using Guilford's (1954, p. 393) formula designed for estimating the alphas of composites (Newcomer & Hammill, 1997, p. 56). [Comment: See APA for citing secondary sources.] Bracken's (1987) recommendation of .80 or higher was used as the criterion for acceptable subtest reliability. Coefficients for the subtests equalled or exceeded .80 (Turner, 2006). [Comment: For all subtests? Clarify this.] In seven of nine instances they round to .90. Composite coefficients for the six composite scores (i.e., Listening, Organizing, Speaking, Semantics, Syntax, and Spoken Language) are all greater than .90, indicating that the TOLD-P:3 is a highly reliable test that can be used with confidence (Newcomer & Hammill, 1997).

Subgroups representing a broad spectrum of mainstream and minority populations, gender, racial, ethnic, linguistic, and disability categories were reported as percentages. [Comment: Clarify? What does this mean? Why is this helpful?] The U.S. 1990 census figures were used as a guide to ensure minority populations were adequately represented, showing that the test contains little or no bias relative to those groups (Newcomer & Hammill, 1997). [Comment: Discuss this when you first introduce the standardization of the test.] The standard errors of measurement are 1 point for the subtest standard scores, 4 points for the composites representing the linguistic features and systems, and 3 points for the Spoken Language Quotient. [Comment: This is somewhat confusing.] These small values support adequate content sampling reliability (Newcomer & Hammill, 1997).

Time sampling reliability was determined using the test-retest method. Thirty-three students enrolled in kindergarten, first grade, and second grade in an elementary school in Austin, Texas, were tested and then retested four months later. The correlation coefficients (r) show all subtests, with the exception of Word Discrimination, meeting the criterion for moderately high or good reliability. [Comment: Look at other test reviews to see how this information is typically presented.] With an r of .77, Word Discrimination meets the set criterion for moderate or fair reliability. Using Guilford's formula, four of the six composites meet the criterion for moderately high or good reliability, while Organizing and Spoken Language rate high or excellent reliability, with coefficients of .91 and .92, respectively. Using the t-test method, the two testings were compared, with no significant differences at the .05 level of confidence (Newcomer & Hammill, 1997).

Interscorer reliability was determined by having two staff persons in PRO-ED's research department independently score a set of 50 protocols. The coefficients for the subtests and composites were all .99, providing evidence supporting the test's scorer reliability (Newcomer & Hammill, 1997).
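The arithmetic behind three points described in this review (the five-in-a-row ceiling on core subtests, the prorating of a missing subtest score, and the use of a standard error of measurement) can be sketched as follows. This is a minimal illustration under stated assumptions, not the manual's scoring procedure: the function names and example values are hypothetical, the review does not say whether a prorated average is rounded (so it is left unrounded here), and the band of plus or minus one SEM is the conventional interpretation rather than something taken from the manual.

```python
from typing import Optional, Sequence


def raw_score(responses: Sequence[int], ceiling: Optional[int] = 5) -> tuple[int, int]:
    """Return (raw score, items administered) for a list of 0/1 item scores.

    Testing starts at the first item and stops once `ceiling` items in a
    row are missed. Pass ceiling=None for the supplemental subtests, on
    which every item is administered.
    """
    score = 0
    misses_in_a_row = 0
    administered = 0
    for response in responses:
        administered += 1
        if response == 1:
            score += 1
            misses_in_a_row = 0
        else:
            misses_in_a_row += 1
            if ceiling is not None and misses_in_a_row == ceiling:
                break  # ceiling reached; later items are not given
    return score, administered


def prorate(standard_scores: dict[str, Optional[float]]) -> dict[str, float]:
    """Fill each missing subtest score with the mean of the scores given."""
    given = [s for s in standard_scores.values() if s is not None]
    if not given:
        raise ValueError("at least one subtest score is required")
    mean = sum(given) / len(given)
    return {name: (mean if s is None else s) for name, s in standard_scores.items()}


def sem_band(score: float, sem: float, z: float = 1.0) -> tuple[float, float]:
    """Return score +/- z * SEM (z=1.0 corresponds to roughly 68% confidence)."""
    return score - z * sem, score + z * sem


if __name__ == "__main__":
    # Hypothetical responses and scores, for illustration only.
    print(raw_score([1, 1, 0, 0, 0, 0, 0, 1, 1]))     # (2, 7)
    print(prorate({"PV": 10, "RV": 8, "OV": None}))   # OV filled with 9.0
    print(sem_band(100, 3))                           # (97.0, 103.0)
```

With the reported SEM of 3 points, for example, a Spoken Language Quotient of 100 would be read as a band from 97 to 103 rather than as a single exact value.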

To support validity, evidence of three types was provided: (a) content validity, (b) criterion-related validity, and (c) construct validity. A detailed rationale for the items and testing formats of each subtest provides qualitative evidence of content validity. Item analysis procedures were used to choose items during the developmental stages of test construction, and differential item functioning analyses were used to show the absence of item bias (Newcomer & Hammill, 1997). The TOLD-P:3's scores were correlated with the Bankson Language Test-Second Edition (Bankson, 1990), with results showing correlations statistically significant beyond the .05 level of confidence, showing strong evidence for criterion-related validity. Evidence for age differentiation, group differentiation, subtest interrelationships, factor analysis, and item validity was evaluated using statistical analysis (Newcomer & Hammill, 1997). [Comment: Findings?] The TOLD-P:3 uses the three-step procedure offered by Gronlund and Linn (1990) to test construct validity: "First, several constructs presumed to account for test performance are identified. Second, hypotheses are generated that are based on the identified constructs. Third, the hypotheses are verified by logical or empirical methods" (Hayward et al., 2008, p. 7). The authors provide empirical research for each of the individual subtests. [Comment: What is your evaluation of the technical qualities?]

Evaluation

While the TOLD-P:3 is applicable to a wide variety of English language speakers, Hammer et al. (2002) remark that caution should be used when utilizing this instrument with African American children. However, they could not conclusively state that the test was biased against them. [Comment: Speak more generally to this issue? Is caution warranted when using it with other groups? Why / why not?] There are also some things to note when utilizing certain subtests. The Word Articulation subtest allows for imitation if the child does not know the target word, but Hayward et al. (2008) note that in speech pathology imitation is generally thought to alter the child's production. W. Hetherington, a Speech and Language Pathologist (personal communication, May 27, 2009), does not find the supplemental subtests particularly useful and recommends that the TOLD-P:3 be given in conjunction with other language tests that look at phonological awareness.

Laing and Kamhi (2003) recognize the efforts the authors have made to include diverse populations in the norm group. This aids in making the test more applicable to culturally diverse populations. [Comment: Link to first part of this section.] W. Hetherington confirmed that the test is quick and easy to give. The test goes quickly, so the examiner does not lose the child's attention. Instructions and ceilings placed directly above the subtests allow the examiner to easily direct the student towards the task. Results from the TOLD-P:3 will indicate whether more language testing is needed (W. Hetherington, personal communication, May 27, 2009). The TOLD-P:3 is an extensive test that has a solid psychometric basis and can be used with confidence.

References

Hammer, C. S., Pennock-Roman, M., Rzasa, S., & Tomblin, J. B. (2002). An analysis of the Test of Language Development-Primary for item bias. American Journal of Speech-Language Pathology, 11(3), 274-284. Retrieved June 1, 2009, from http://web.ebscohost.com.ezproxy.lib.ucalgary.ca/ehost/detail?vid=1&hid=106&sid=4e613127-9b60-4105-97e5493cfbf17013%40sessionmgr102&bdata=JnNpdGU9ZWhvc3QtbGl2ZQ%3d%3d#db=a9h&AN=7252096#db=a9h&AN=7252096

Hayward, D. V., Stewart, G. E., Phillips, L. M., Norris, S. P., & Lovell, M. A. (2008). Test review: Test of language development-primary 3rd edition (TOLD-P:3). Language, Phonological Awareness, and Reading Test Directory. Edmonton, AB: Canadian Centre for Research on Literacy. Retrieved May 28, 2009, from http://www.uofaweb.ualberta.ca/elementaryed/ccrl.cfm

Laing, S. P., & Kamhi, A. (2003). Alternative assessment of language and literacy in culturally and linguistically diverse populations. Language, Speech, and Hearing Services in Schools, 34(1), 44-55. Retrieved June 3, 2009, from http://www.cckm.ca/CPSLPR/pdf/Laing2003.pdf

Newcomer, P. L., & Hammill, D. D. (1997). Test of language development-primary 3 (TOLD-P:3). Austin, TX: PRO-ED.

PRO-ED. (2008). TOLD-P:4- Test of language development-primary-fourth edition. Retrieved May 28, 2009, from http://www.proedinc.com/customer/ProductView.aspx?ID=4233&sSearchWords=told

Sattler, J. M. (2008). Assessment of children: Cognitive foundations (5th ed.). La Mesa, CA: Jerome M. Sattler, Publisher, Inc.

Turner, H. C. (2006). Test review: Young children's achievement test. Journal of Psychoeducational Assessment, 24(3), 272-277. Retrieved June 1, 2009, from https://blackboard.ucalgary.ca/webapps/portal/frameset.jsp?tab_id=_2_1&url=%2fwebapps%2fblackboard%2fexecute%2flauncher%3ftype%3dCourse%26id%3d_66569_1%26url%3d
