BRENDA D. TOWNES
Department of Psychiatry and Behavioral Sciences
University of Washington
Seattle, Washington, USA
ALEXANDRE CASTRO-CALDAS
Institute of Health Sciences
Universidade Catolica Portuguesa
Lisbon, Portugal
GAIL ROSENBAUM
Regional Epilepsy Center
University of Washington
Seattle, Washington, USA
80 B. D. TOWNES ET AL.
TIMOTHY DEROUEN
Department of Dental Public Health Sciences
Department of Biostatistics
University of Washington
Seattle, Washington, USA
INTRODUCTION
Neurobehavioral tests are regularly used to assess changes in neuropsychological
function over time, both at the individual and at the group level. Research
studies often report repeated measurements of the same sample on several
occasions, although reports of repeated testing over a number of years are rare.
Most of the better-validated and widely used neurobehavioral tests were
developed in English in the United States and were normed on U.S. samples.
Because they are well validated and widely used, they are often administered to
non-English-speaking samples, where the absence of language- and culture-specific
norms poses problems in their interpretation (Townes et al., 2003).
The present report provides data relevant to both of these issues. First, it
provides repeat test data over an eight-year period on a standard battery of
neurobehavioral tests; second, it provides
neurobehavioral data on a substantial sample of Portuguese children that can
be used for comparative and clinical purposes. It also provides evidence for the
comparison between adult and child versions of the same or similar tests. This
is relevant to the issue of identifying continuities and discontinuities during
neurobehavioral development.
In 1997 the authors enrolled students of seven primary schools in Lisbon,
Portugal, in a longitudinal study of the effects of dental mercury amalgams
REPEAT NEUROBEHAVIORAL TEST MEASURES 81
METHODS
Participants
With Institutional Review Board and parental or guardian approval, 507
children ages 8 through 12 years were enrolled in the study. In this aspect
of the study the 4 children age 12 were dropped and data from the amalgam
and composite groups were combined for a total sample of 503 children age
8 through 11 years. Because significant gender differences were found in
neurobehavioral performance at enrollment (Martins et al., 2005), participants
were divided into 4 groups based on gender and age at baseline: younger males
(8–9.9 years), older males (10–11.9 years), younger females (8–9.9 years), and
older females (10–11.9 years). The number of participants in each group, their
age, and ethnic origin (approximately 69% Caucasian; 29% Afro-Portuguese
and 1% Asian Portuguese) are shown in Table 1.
Procedures
Upon enrollment in the study, participants were individually administered
a battery of neurobehavioral tests by psychometrists trained by a U.S.
psychometrist (GR), who had extensive experience in test administration and
scoring as well as cross-cultural testing. The participants were re-tested at
yearly intervals for seven subsequent years. Each psychometrist was continually
monitored and their work calibrated over the 8.5-year testing period by review of
videotaped testing sessions using ratings on a 136-item checklist (with 94.5%
to 97.8% accuracy). Tests were double scored and data were corrected when errors
were identified (no severe violations of protocol were observed that required
discarding data).

Table 1
                            Testing occasion
Variable                        B/L      1      2      3      4      5      6      7
Gender
  Males                   N     276    264    260    257    246    233    210    189
                          %     55%    55%    56%    56%    55%    54%    55%    54%
  Females                 N     227    214    205    204    198    196    172    160
                          %     45%    45%    44%    44%    45%    46%    45%    46%
Ethnicity
  White                   N     355    336    327    323    309    306    269    242
                          %     71%    70%    70%    70%    70%    71%    70%    69%
  Non-White               N     148    142    138    138    135    123    113    107
                          %     29%    30%    30%    30%    30%    29%    30%    31%
Age at inception
  Males
    Younger (8–9.9 yrs)   N     113    108    106    104    100     91     87     78
    Older (10–11.9)       N     163    156    154    153    146    142    123    111
  Females
    Younger (8–9.9 yrs)   N      95     89     87     85     85     82     71     67
    Older (10–11.9)       N     132    125    118    119    113    114    101     93
Age per year
  Mean                        10.08  11.10  12.12  13.09  14.08  15.01  15.99  17.03
  SD                           0.94   0.95   0.96   0.96   0.96   0.95   0.95   1.01
Instruments
The initial battery of tests included: Rey Auditory Verbal Learning test (RVLT);
Visual Learning and Finger Windows (Finger Win) subtests from the Wide
Range Assessment of Memory and Learning; Digit Span, Coding and Symbol
Search subtests from the Wechsler Intelligence Scale for Children-III; Trail
Making Test (Trails); Stroop Color Word Test (Stroop); Standard Reaction Time
(Reaction Time); Finger Oscillation Test (Ftap); Pegboard (Pegs), Matching
Figures (Matching), and Drawing subtests from the Wide Range Assessment
of Visual Motor Abilities. These measures are described and referenced in
Townes et al. (2003).
In follow-up year four the Drawing subtest was dropped and adult tests
were added as described in Tables 2 and 3. Both the child and adult versions
were administered in follow-up year four, with adult tests replacing the child
versions in years five through seven.
The Comprehensive Test of Nonverbal Intelligence (CTONI) (Hammill
et al., 1997) was administered at baseline and in the final year of the study.
The Block Design subtest of the Wechsler Abbreviated Scale of Intelligence
(WASI) (The Psychological Corporation, 1999) was administered at the final
follow up. Scores from the Block Design were combined with those from the
Matrix Reasoning subtest of the WASI to provide a WASI Performance Scale
IQ score.
STATISTICAL ANALYSES
Means and standard deviations were calculated for demographic data and within
groups for each test measure across the eight year test–retest span. Anovas
were used to test for differences between “completers” and “drop-outs.” To
investigate the comparability of child and adult tests, Pearson product moment
correlation coefficients were calculated between scores on the child and adult
tests at the fourth year follow up assessment.
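As an illustration of the child/adult comparability analysis described above, the following is a minimal sketch of a Pearson product-moment correlation between paired scores. The function name `pearson_r` and the five score pairs are hypothetical and are not study data; the sketch only shows the calculation, not the software the authors used.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scaled scores for five participants (not study data):
# a child test and its adult counterpart given at the same assessment.
child_scores = [7, 9, 10, 12, 8]
adult_scores = [5, 6, 8, 9, 6]
r = pearson_r(child_scores, adult_scores)
print(f"r = {r:.2f}")
```

A coefficient near 1 would suggest that the child and adult versions rank participants similarly, even when their raw-score scales differ.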
RESULTS
Demographic Characteristics
Repeated test scores on neurobehavioral measures were obtained in a sample
of Portuguese children. Table 1 presents basic demographic information on the
sample, which was further subdivided by gender and age at inception into the
study (younger versus older). There were slightly more males than females in
the sample; the ethnicity of participants was typical of the population
of Portugal (Martins et al., 2005).
During selection of participants a lower IQ boundary of 67 on the CTONI
was chosen. Although truncated at the lower end, the CTONI IQ at baseline
was otherwise normally distributed with a mean of 85.06 (SD = 9.9). These
scores are consistent with clinical experience suggesting that the CTONI
underestimates intelligence in non-U.S. populations by up to one standard
deviation (Martins et al., 2005) and the fact that CTONI IQ is known to
underestimate fluid intelligence (Lassiter et al., 2001).
Table 2. Neurobehavioral test scores (raw) for younger (8–9.9 years) and older (10–11.9 years)
male subgroups at inception, by year of study
Testing occasion
Measure B/L 1 2 3 4 5 6 7
RVLT total
Younger M 35.03 35.29 38.70 40.48 43.58 44.38 45.05 44.43
sd 9.46 7.02 8.78 8.02 8.79 9.34 8.28 9.69
Older M 40.82 39.19 40.66 42.03 44.15 46.18 45.59 46.21
sd 9.96 7.57 8.26 8.77 8.47 8.34 8.96 9.00
RVLT delay
Younger M 7.29 7.21 8.08 8.24 8.93 9.42 9.16 9.11
sd 2.99 2.42 2.72 2.79 2.84 3.02 2.60 2.76
Older M 8.90 7.99 8.23 8.67 9.21 9.50 9.66 9.50
sd 3.03 2.70 2.79 2.60 2.73 2.63 2.81 3.02
Visual learn
Younger M 21.53 23.07 23.29 24.83 26.12 33.89 35.47 34.93
sd 8.51 8.02 9.32 9.20 10.25 4.60 3.91 4.52
Older M 22.60 23.93 24.12 24.62 25.65 34.04 34.87 34.51
sd 9.01 9.11 9.74 9.80 9.95 4.11 4.76 4.71
Visual recall
Younger M 6.60 6.94 7.17 7.46 7.67 29.37 32.85 32.05
sd 2.91 3.13 3.53 3.23 3.48 6.06 5.34 6.48
Older M 7.23 7.19 7.32 7.35 7.65 30.10 31.48 31.54
sd 3.25 3.06 3.28 3.55 3.35 6.82 7.22 7.55
Ftap dom
Younger M 35.12 37.21 41.34 43.68 45.96 48.46 50.87 52.55
sd 5.76 5.35 5.19 5.92 5.71 6.67 6.48 6.48
Older M 38.59 41.08 44.13 46.29 48.27 50.09 51.86 52.28
sd 6.09 5.80 5.73 6.06 6.49 6.34 6.92 6.70
Ftap NDom
Younger M 30.63 32.66 36.38 37.85 40.09 42.51 43.88 46.15
sd 5.19 4.92 5.41 5.37 5.56 5.98 5.91 5.98
Older M 34.22 35.89 38.45 40.85 42.82 44.46 45.80 46.72
sd 5.12 5.48 5.54 5.52 6.17 6.16 6.58 6.53
Pegs dom
Younger M 34.48 38.57 41.87 43.71 45.85 45.70 45.37 46.67
sd 5.81 5.58 6.16 5.69 6.31 6.22 6.07 7.44
Older M 38.20 41.19 44.12 44.95 45.49 45.90 45.91 46.99
sd 5.91 5.47 6.62 6.20 6.62 6.70 7.68 8.31
Pegs ND
Younger M 32.32 35.14 37.90 40.30 41.54 41.98 43.07 43.83
sd 5.08 4.77 5.67 4.95 5.68 5.75 5.68 6.52
NB. In follow-up years 5–7, adult tests were used in the place of child tests, as follows:
WMS-R (Wechsler, 1997b) Visual Reproduction (immediate and delayed) replaced WRAML
Visual Learning and Visual Memory; WASI Matrix reasoning replaced WRAVMA Matching
Pictures; WAIS-III (Wechsler, 1997a) Digit Symbol, Symbol Search and Digit Span replaced
WISC-III Coding, Symbol Search, and Digit Span; WMS-III Spatial Span replaced WRAML
Finger Windows; Adult Trails A and B (Reitan, 1958) replaced intermediate Trails A and Trails
B. Substitute test scores are shown in italics. ğ Lower values represent better performance. For
all other tests, higher values represent better performance.
Table 3. Neurobehavioral Test Scores (raw) for younger (8–9.9 years) and older (10–11.9 years)
female subgroups at inception, by year of study
Testing occasion
Measure B/L 1 2 3 4 5 6 7
RVLT total
Younger M 35.86 36.77 39.41 42.94 44.56 46.56 47.52 46.63
sd 9.17 9.48 8.59 8.84 8.47 8.96 7.37 9.18
Older M 40.63 38.68 41.67 44.25 47.03 48.24 49.25 49.31
sd 8.98 8.41 9.34 8.08 8.54 8.28 7.62 9.00
RVLT delay
Younger M 7.65 7.30 8.21 8.94 9.27 9.71 10.09 9.87
sd 2.76 2.43 2.71 2.95 2.69 2.53 2.39 2.61
Older M 8.61 8.16 8.70 9.21 9.72 9.73 10.46 10.30
sd 2.81 2.50 2.85 2.61 2.84 2.68 2.52 2.70
Visual learn.
Younger M 18.68 21.89 23.71 22.89 25.36 34.50 35.61 36.06
sd 7.62 8.36 7.77 8.73 9.44 4.97 4.65 3.28
Older M 21.48 22.48 23.34 24.17 23.37 35.41 36.25 36.69
sd 8.50 8.62 8.63 8.89 8.70 3.90 3.62 3.08
Visual recall
Younger M 5.67 6.80 6.95 7.08 7.88 30.78 34.67 33.97
sd 2.95 3.30 3.14 3.10 3.13 6.47 4.40 4.38
Older M 6.32 6.60 6.87 7.45 6.95 32.52 34.01 35.00
sd 2.94 2.80 2.78 3.21 3.17 6.02 4.86 3.91
Ftap dom
Younger M 33.61 36.73 38.98 42.10 43.52 45.98 47.54 48.46
sd 5.65 5.66 5.36 5.08 5.49 5.08 5.23 5.72
Older M 36.98 39.72 41.60 44.11 44.29 46.93 47.78 48.28
sd 5.72 5.59 5.47 5.19 5.41 5.65 5.51 5.46
Ftap NDom
Younger M 28.66 31.05 32.98 35.56 37.39 39.74 40.91 42.01
sd 4.36 4.49 5.53 5.10 4.94 5.21 5.10 5.64
Older M 31.45 33.83 35.75 38.18 39.26 41.25 41.68 42.29
sd 5.01 4.75 4.62 4.87 5.37 5.84 5.64 5.54
Pegs dom
Younger M 34.84 39.27 41.81 44.87 46.36 46.99 49.43 49.49
sd 5.51 5.98 5.65 5.11 5.54 5.18 5.50 5.67
Older M 37.58 42.05 45.06 46.39 48.69 48.80 49.65 50.10
sd 6.58 6.00 5.02 5.46 5.77 5.73 6.56 6.76
Pegs ND
Younger M 31.63 34.95 37.33 39.45 40.76 42.78 43.38 44.88
sd 5.55 5.48 6.05 5.64 5.73 5.84 6.12 5.63
NB. In follow-up years 5–7, adult tests were used in the place of child tests, as follows:
WMS-R Visual Reproduction (immediate and delayed) replaced WRAML Visual Learning and
Visual Memory; WASI Matrix reasoning replaced WRAVMA Matching Pictures; WAIS-III Digit
Symbol, Symbol Search, and Digit Span replaced WISC-III Coding, Symbol Search, and Digit
Span; WMS-III Spatial Span replaced WRAML Finger Windows; Adult Trails A and Adult
Trails B replaced Trails A and Trails B. Substitute test scores are shown in italics. ğ Lower values
represent better performance. For all other tests, higher values represent better performance.
The remaining 67 (13%) dropped out at some time during years 1–6 and were
labelled as “drop-outs.”
“Completers” and “drop-outs” were first compared at baseline on the
three neurobehavioral endpoints (z-memory, z-visual-motor, and z-attention)
by means of ANOVAs. None of the F ratios approached statistical
significance. The two groups were then compared on the same three endpoints
at years 2, 3, 4, and 5. Again, no statistically significant differences were
found between “completers” and “drop-outs” for any of these endpoints at
any of these repeat testing occasions. These results suggest that participants
who dropped out of the study before year six did not do so because of
poor neurobehavioral performance.
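The group comparison described above can be sketched as a one-way ANOVA F ratio. This is a minimal illustration under stated assumptions: the function name `one_way_anova_f` and the z-scores are hypothetical, not study data, and no p-value is computed (a statistics package such as SciPy's `scipy.stats.f_oneway` would normally supply one).

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within-groups sum of squares: squared distance of each score
    # from its own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical baseline z-scores on one endpoint (not study data).
completers = [0.1, -0.2, 0.3, 0.0, -0.1]
drop_outs = [0.0, 0.2, -0.3, 0.1]
f_ratio, df_b, df_w = one_way_anova_f(completers, drop_outs)
print(f"F({df_b}, {df_w}) = {f_ratio:.3f}")
```

With two groups, as here, the F ratio equals the square of the independent-samples t statistic, so either test would lead to the same conclusion.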
Raw Scores
Visual Learn/Vis.Rep.Imm. .26∗∗ 25.10 9.66 32.36 4.51
Visual Memory/Vis.Rep.Del. .31∗∗ 7.51 3.30 26.87 7.18
Matching/Matrix .52∗∗ 36.02 3.69 19.63 6.39
Trails A secs/Ad.Trls A secsğ .53∗∗ 16.00 6.39 29.83 11.57
Trails B secs/Ad.Trls B secsğ .57∗∗ 31.85 13.83 74.86 32.38
Scaled Scores
Coding/Digit Sym. .85∗∗ 8.95 3.55 6.45 2.27
Sym.Search/Ad.Sym.Search .68∗∗ 10.39 3.29 7.31 2.72
Digit Span/Ad.Dig.Span .69∗∗ 7.26 2.60 6.75 2.10
Finger Windows/Spatial Span .53∗∗ 7.37 2.51 7.21 3.30
found between the Coding and Digit Symbol tasks (0.85). Because the child
tests were administered in all cases before their adult counterparts, it is possible
that a fatigue effect took place resulting in variable performance on adult tests
for some participants.
DISCUSSION
Results of this investigation provide guidelines for clinicians in Portugal against
which to evaluate the neurobehavioral development of individual children
across puberty, a period of rapid neurobehavioral development. The question
remains to what extent the sample of 503 children is representative of the
ability levels of children in Portugal and other countries. The Block Design
subtest of the WASI at follow-up year 7 was normally distributed around
a mean of 45.51 (SD 10.3), and the WASI IQ mean was 91.08 (SD 17.94);
American norms are 50 (SD 10) and 100 (SD 15), respectively. Differences
between the present Portuguese sample and U.S. samples of the same age
may result from cultural differences and from an overrepresentation of a lower
socioeconomic status in the present group. The latter issue was discussed
in detail in the original description of the current sample (Martins et al.,
2005). As any random U.S. sample in this age bracket would vary above and
below the mean, the participants in this investigation are considered reasonably
representative of a normal school-age population in Portugal, with the exception
of an overrepresentation of lower socioeconomic status.
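To make the size of the gap from the American norms concrete, the sample means reported above can be expressed in norm standard deviation units. The sketch below uses only the means and norms quoted in the text; the function name `standardized_difference` is hypothetical, and the calculation is an illustration rather than an analysis reported by the authors.

```python
def standardized_difference(sample_mean, norm_mean, norm_sd):
    """Distance of a sample mean from the norm mean, in norm SD units."""
    return (sample_mean - norm_mean) / norm_sd


# Values quoted in the text: WASI Block Design (norm 50, SD 10)
# and WASI IQ (norm 100, SD 15) at follow-up year 7.
block_design_z = standardized_difference(45.51, 50.0, 10.0)
wasi_iq_z = standardized_difference(91.08, 100.0, 15.0)
print(f"Block Design: {block_design_z:.2f} SD")
print(f"WASI IQ: {wasi_iq_z:.2f} SD")
```

On these figures both sample means fall roughly half a standard deviation below the American norm means.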
One particular anomaly was found in the individual neurobehavioral
developmental curves. The data on the Stroop Color-Word test (the interference
task) showed that improvement was not progressive over time. Rather there was
a dip in performance over the first few years of repeat testing (or no noticeable
improvement) followed by improvement only toward the end of the testing
period. One possible explanation for this finding is that the “interference effect”
depends on the children’s ability to read, and some of the younger children
were poor readers at inception into the study. They would therefore
not be likely to show the interference effect. Over subsequent years, however,
their reading improved, which rendered them more sensitive
to the color/word “interference effect” and led to poorer performance on
the test. This ad hoc explanation is in line with observation and deserves to be
systematically addressed in subsequent studies.
Repeated measurement on the same tests is a common practice in certain
clinical settings, particularly in patients undergoing epilepsy surgery or other
types of comparable medical intervention. It is an established method of
demonstrating improvement or, at the very worst, the absence of any clinically
significant neurobehavioral deterioration. The interpretation of repeat test
measures in such cases is hampered by the lack of information on the relative
contributions of practice effects and age developmental factors. A literature
review, combined with a careful study of the three main neuropsychological
texts (Lezak et al., 2004; Mitrushina et al., 1999; Spreen & Strauss, 1998),
failed to reveal any study that had used the same battery of tests repeatedly over
an eight-year period. Although the present report does not attempt to separate
out the factors of practice and development, it does provide useful normative
data on their combined effects over the adolescent period, on a sizable sample
of Portuguese children/adolescents.
The applicability of these norms to other cultures remains an open question.
Although replications in other non-English speaking countries are needed, the
time and expense to do so would be prohibitive. The authors hope that future
collaborations will occur between investigators using similar neurobehavioral
measures. Pooling of results from medical and educational studies in different
countries potentially would provide normative data that would be robust for
use in culturally and linguistically diverse settings.
REFERENCES
DeRouen, T. A., Leroux, B. G., Martin, M. D., Townes, B. D., Woods, J. S., Leitao,
J., Castro-Caldas, A., & Braveman, N. (2002). Issues in design and analysis of
a randomized clinical trial to assess the safety of dental amalgam restorations in
children. Controlled Clinical Trials, 23(3), 301–320.
DeRouen, T. A., Martin, M. D., Leroux, B. G., Townes, B. D., Woods, J. S., Leitao,
J., Castro-Caldas, A., Luis, H., Bernardo, M., Rosenbaum, G., & Martins, I. P.
(2006). Neurobehavioral effects of dental amalgam in children: a randomized
clinical trial. Journal of the American Medical Association, 295(15), 1784–
1792.
Hammill, D. D., Pearson, N. A., & Wiederholt, J. L. (1997). Comprehensive Test of
Nonverbal Intelligence—manual. Austin, TX: Pro-ED.
Lassiter, K. S., Harrison, T. K., Matthews, T. C., & Bell, N. L. (2001).
The validity of the Comprehensive Test of Nonverbal Intelligence as a measure of
fluid intelligence. Assessment, 8(1), 95–103.
Lezak, M. D., Howieson, D. B., & Loring, D. W. (2004). Neuropsychological assessment
(Fourth Edition). New York: Oxford University Press.
Martins, I. P., Townes, B. D., Ferreira, G., Rodrigues, P., Marques, S., & Rosenbaum,
G. (2005). Age and sex differences in neurobehavioral performance: A study of
Portuguese elementary school children. International Journal of Neuroscience,
115, 1687–1709.
Mitrushina, M. N., Boone, K. B., & D’Elia, L. F. (1999). Handbook of normative data
for neuropsychological assessment. New York: Oxford University Press.
Psychological Corporation. (1999). Wechsler Abbreviated Scale of Intelligence—
Manual. San Antonio: The Psychological Corporation.
Spreen, O., & Strauss, E. (1998). A compendium of neuropsychological tests:
Administration, norms, and commentary (Second Edition). New York: Oxford
University Press.
Townes, B. D., Rosenbaum, G., & Martins, I. P. (2003). Neurobehavioral assessment
of children. A cross-cultural perspective. In M. R. Simoes & A. C. Castro-Caldas
(Eds.), Neuropsychological Assessment, Psychologia, 24. (pp. 177–185).
Wechsler, D. (1997a). Wechsler Adult Intelligence Scale-III. San Antonio: The
Psychological Corporation.
Wechsler, D. (1997b). Wechsler Memory Scale. Third edition manual. San Antonio: The
Psychological Corporation.