Research Article
Purpose: There is currently minimal information on the impact of dysphonia secondary to phonotrauma on listeners. Considering the high incidence of voice disorders among professional voice users, it is important to understand the impact of a dysphonic voice on their audiences.
Methods: Ninety-one healthy listeners (39 men, 52 women; mean age = 23.62 years) were presented with speech stimuli from 5 healthy speakers and 5 speakers diagnosed with dysphonia secondary to phonotrauma. Dependent variables included processing speed (reaction time [RT] ratio), speech intelligibility, and listener comprehension. Voice quality ratings were also obtained for all speakers by 3 expert listeners.
Results: Statistical results showed significant differences in RT ratio and number of speech intelligibility errors between healthy and dysphonic voices. There was no significant difference in listener comprehension errors. Multiple regression analyses showed that voice quality ratings from the Consensus Assessment Perceptual Evaluation of Voice (Kempster, Gerratt, Verdolini Abbott, Barkmeier-Kraemer, & Hillman, 2009) were able to predict RT ratio and speech intelligibility but not listener comprehension.
Conclusions: Results of the study suggest that although listeners require more time to process and have more intelligibility errors when presented with speech stimuli from speakers with dysphonia secondary to phonotrauma, listener comprehension may not be affected.
a Towson University, MD
b Stanford University School of Medicine, CA
c Johns Hopkins University School of Medicine, Baltimore, MD
Correspondence to Paul M. Evitts: pevitts@towson.edu
Editor: Krista Wilkinson
Associate Editor: Preeti Sivasankar
Received October 20, 2014
Revision received April 11, 2015
Accepted March 31, 2016
DOI: 10.1044/2016_AJSLP-14-0183
Disclosure: The authors have declared that no competing interests existed at the time of publication.
American Journal of Speech-Language Pathology • Vol. 25 • 561–575 • November 2016 • Copyright © 2016 American Speech-Language-Hearing Association
It has been estimated that nearly 25% of the work force in the United States have jobs where vocal use is considered to be critical to their employment (Williams, 2003), and another 3% of the working population have jobs where vocal use is required for public safety (Titze, Lemke, & Montequin, 1997). In addition, an estimated one third of the workforce relies on their voice for work (Vilkman, 2004). These professional voice users include teachers, air traffic controllers, and attorneys among others, and research has repeatedly shown that they are at risk for developing voice disorders (e.g., Ramig & Verdolini, 1998; Roy, Merrill, Thibeault, Gray, & Smith, 2004). In fact, a recent study of university teachers found that 52% of female and 33% of male instructors reported hoarse voices (Korn, de Lima Pontes, Abranches, & de Lima Pontes, 2015). Other research has found that 39% of fitness instructors report chronic hoarseness (Rumbach, Khan, Brown, Eloff, & Poetschke, 2015). For those professional voice users with voice disorders, there are a number of consequences in terms of economic, psychosocial, and voice-related quality of life. For example, Van Houtte, Claeys, Wuyts, and Van Lierde (2011) reported that 20% of teachers had missed a minimum of 1 day of work due to voice problems. Roy et al. (2004) found that teachers experienced reduced activities or interactions with others as a result of their voice disorder. In addition, Chen and colleagues (2010) found that teachers with a voice disorder reported decreased ability to communicate effectively, had reduced social activities, made fewer phone calls, and that the voice disorder had a significant impact on their overall emotional state. Although this is just a sample of the literature, there is a growing
body of research showing the negative impact that voice disorders have on professional voice users.
Furthermore, there is also a large body of research on the treatment efficacy of voice disorders (see Ramig & Verdolini, 1998, for a review) and the impact of the voice disorder on the person (e.g., Roy et al., 2004; Van Houtte et al., 2011). However, there is minimal information on the impact of the resultant disordered signal on the listener. Considering the types of employment for many of these speakers, a better understanding of the impact of the dysphonic voice is clearly warranted. The following section contains a brief review of commonly used objective measures that have been used to investigate the impact of a disordered acoustic signal on a listener.
Listener Reaction Times
Reaction times (RTs) have long been used as an index of cognitive workload placed on a listener (Gough, 1965), where longer RTs are associated with increased cognitive workload (e.g., Duffy & Pisoni, 1992; Gatehouse & Gordon, 1990; Gough, 1965; Pisoni, 1981). Cognitive workload is referred to here as the amount of mental demand imposed on a listener’s cognitive system when performing a specific task (Paas, Tuovinen, Tabbers, & Van Gerven, 2003) and has been used in a variety of patient populations. For example, Jones, Fox, and Jacewicz (2012) used a dual-task paradigm and reported increased cognitive workload as measured by RTs in adults who stutter compared to adults who did not stutter. Kraiuhin et al. (1989) used a discrimination task and also reported longer RTs in adults with Alzheimer’s disease relative to healthy controls. More recently, RTs have been used in the dysphagia literature as a means of measuring the amount of mental resources needed to complete a motor task (Brodsky, McNeil, et al., 2012; Brodsky, Verdolini Abbott, et al., 2012).
Although the literature is replete with information on the cognitive workload and RTs (i.e., processing speeds) of various populations, there is much less information on the impact of a disordered acoustic signal on the RTs of typical, healthy listeners. In the alaryngeal speech literature, Evitts and Searl (2006) used a single-task paradigm by presenting stimuli from speakers who used different modes of alaryngeal speech (i.e., tracheoesophageal, esophageal, electrolaryngeal) to healthy listeners and found that listeners experienced greater cognitive workload when presented with alaryngeal speech stimuli. Munro and Derwing (1995) reported that healthy listeners had significantly longer RTs when presented with foreign-accented speech compared to native English. In a similar manner, Wilson and Spaulding (2010) measured the RTs of healthy listeners presented with Korean-accented speech and native English speech stimuli. Results showed that listeners had the longest RTs when presented with moderately intelligible Korean-accented speech, followed by highly intelligible Korean-accented speech. Listeners had the shortest RTs when presented with native English speech stimuli (Wilson & Spaulding, 2010).
The synthetic speech literature has long embraced the use of RTs to measure the impact of an altered acoustic signal on healthy listeners. In general, this body of research has shown that listeners require more time to process synthetic speech compared to normal speech. For example, Pisoni (1981) used a lexical decision task to measure RTs for both normal and synthetic speech and found that listeners required more time to process the synthetic speech. Reynolds and Fucci (1998) used a true/false paradigm with normal and synthetic speech stimuli of equal intelligibility and found that listeners had significantly longer RTs when presented with synthetic speech. Evitts and Searl (2006) also included synthetic speech stimuli in their study on alaryngeal speakers and likewise found that listeners had significantly longer RTs when presented with synthetic speech than normal speech. These authors attributed the longer RTs and increased cognitive workload to the listener requiring more time to extract basic acoustic-phonetic information from the synthetic speech due to the degraded or impoverished nature of the acoustic signal (Evitts & Searl, 2006; Pisoni, 1981).
Although there are currently no studies on the RTs of listeners when presented with dysphonic voices, there is reason to believe that listeners would require additional processing time compared to typical speech. Support for this may be drawn from research on the acoustics and voice quality of dysphonia. For example, dysphonic voices have been found to have increased perturbation (Niebudek-Bogusz, Kotylo, Politanski, & Sliwinska-Kowalska, 2008) and increased spectral noise levels (Emanuel & Sansone, 1969). Subjective measurements of voice quality have also shown significant differences between normal and dysphonic voices (Murry, Medrado, Hogikyan, & Aviv, 2004). As with alaryngeal, foreign-accented, and synthetic speech, listeners may need additional processing time when presented with a dysphonic voice due to the altered nature of the signal. Given the relationship between RTs and cognitive workload, it is important to understand how a dysphonic voice impacts listener RTs and, ultimately, the amount of cognitive workload placed on the listener.
Speech Intelligibility
Aside from RTs, speech intelligibility is perhaps the most commonly used measure to determine the impact of a disordered signal on healthy listeners. Although numerous definitions exist, speech intelligibility is generally considered to be an index of how well a listener is able to retrieve a speaker’s intended message (Hustad, 2008; Kent, Weismer, Kent, & Rosenbek, 1989; Yorkston & Beukelman, 1980). This is a useful tool as it serves as an “index of the severity of the overall functional limitation” (Yorkston, Beukelman, Strand, & Bell, 1999, p. 237). Speech intelligibility is typically measured by presenting a series of sentences to a listener and calculating the number of orthographic transcription errors. Percent intelligibility scores can then be calculated for different speakers with a variety of speech disorders.
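The scoring described above (counting orthographic transcription errors and converting them to a percent intelligibility score) can be illustrated with a minimal Python sketch. The scoring rule is an assumption made for illustration: errors are counted as a word-level edit distance between the target sentence and the listener's transcript, and the function names and example sentences are hypothetical. The sources cited above may use different scoring conventions.

```python
PUNCTUATION = ".,;:!?\"'"


def normalize(sentence):
    """Lowercase the sentence and strip punctuation from each word."""
    words = (w.strip(PUNCTUATION).lower() for w in sentence.split())
    return [w for w in words if w]


def word_errors(target, transcript):
    """Count word-level errors (substitutions, omissions, insertions).

    Assumption: errors are scored as the edit distance between the
    target words and the transcribed words.
    """
    ref, hyp = normalize(target), normalize(transcript)
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # omitted target word
                            curr[j - 1] + 1,      # inserted extra word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def percent_intelligibility(pairs):
    """Percent of target words transcribed correctly across all sentences."""
    total_words = sum(len(normalize(target)) for target, _ in pairs)
    if total_words == 0:
        raise ValueError("no target words to score")
    total_errors = sum(word_errors(t, h) for t, h in pairs)
    return 100.0 * max(total_words - total_errors, 0) / total_words


# Hypothetical target/transcript pairs, not data from the study.
pairs = [
    ("The boy ran home.", "the boy ran home"),
    ("She bought fresh bread.", "she brought bread"),
]
print(percent_intelligibility(pairs))  # prints 75.0
```

In this example the second transcript contains one substitution and one omission, so 6 of the 8 target words are scored as correct.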
Listener Comprehension
When presented with a standard reading passage of a smaller subset of the speakers and asked to answer 16 yes
Note. Mean values are measured in millimeters from a 100-mm visual-analog scale. Higher values are associated with increasingly disordered voices within each rating category.
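Model 1 (R² = 0.013; F for change in R² = 242.97*)
Variable           B        SE B    β
Constant           0.733    0.002
Breathiness        −0.002   0       −0.116*
Model 2 (R² = 0.016; F for change in R² = 42.26*)
Variable           B        SE B    β
Constant           0.743    0.003
Breathiness        −0.002   0       −0.09*
Roughness          −0.001   0       −0.055*
Model 3 (R² = 0.023; F for change in R² = 141.05*)
Variable           B        SE B    β
Constant           0.743    0.003
Breathiness        −0.003   0       −0.162*
Roughness          −0.001   0       −0.135*
Strain             0.002    0       0.158*
Model 4 (R² = 0.027; F for change in R² = 70.20*)
Variable           B        SE B    β
Constant           0.748    0.003
Breathiness        −0.003   0       −0.145*
Roughness          −0.002   0       −0.227*
Strain             0.02     0       0.154*
Pitch              0.002    0       0.107*
Model 5 (R² = 0.029; F for change in R² = 29.35*)
Variable           B        SE B    β
Constant           0.751    0.003
Breathiness        −0.002   0       −0.14*
Roughness          −0.001   0       −0.128*
Strain             0.003    0       0.189*
Pitch              0.001    0       0.102*
Overall severity   −0.001   0       −0.133*
Note. Dependent variable = RT ratio. Variables entered = age (constant), overall severity, roughness, breathiness, strain, and pitch.
*p < .001.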
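The R2 values reported for the RT ratio regression are cumulative, so the variance added at each step is the difference between successive values. The short Python sketch below works through that arithmetic. The order in which the predictors entered (breathiness, roughness, strain, pitch, overall severity) is an assumption inferred from the layout of the table and is not stated explicitly in the note; the function name is hypothetical.

```python
# Cumulative R2 after each step of the RT ratio regression, taken from the
# table above. Assumption: the entry order is inferred from the table layout.
CUMULATIVE_R2 = [
    ("breathiness", 0.013),
    ("roughness", 0.016),
    ("strain", 0.023),
    ("pitch", 0.027),
    ("overall severity", 0.029),
]


def r2_increments(steps):
    """Return (predictor, change in R2) for each step of a stepwise model."""
    increments = []
    previous = 0.0
    for name, r2 in steps:
        # Round to three decimals to match the precision reported in the table.
        increments.append((name, round(r2 - previous, 3)))
        previous = r2
    return increments


for name, delta in r2_increments(CUMULATIVE_R2):
    print(f"{name}: {delta:.3f}")
```

Under these assumptions, the smallest step adds about .002 and the largest about .013 to the explained variance, and all five predictors together account for about 2.9% of the variance in RT ratio.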
each voice quality contributed a range of .002 to .013 to the variance. For intelligibility, there were two different models that were predictive of speech intelligibility (see Table 4). The model that included overall severity and strain was able to predict 36% of the variance associated with intelligibility errors: R² = .36, F(1, 174) = 81.3, p < .001. Overall severity contributed 32% to the variance and strain contributed 4% to the variance. Beta coefficients for voice quality ratings ranged from −0.349 to 0.849 for the speech intelligibility multiple regression and −0.09 to −0.227 for the RT ratio multiple regression. This indicates that for every unit increase in voice quality ratings, the number of speech intelligibility errors changed by 0.34 to 0.85 and the RT ratio changed by 0.09 to 0.227. Last, regression analysis for listener comprehension showed none of the voice qualities from the CAPE-V were predictive of comprehension errors.
Relationship Among Dependent Variables
The final research question investigated the relationship among the three dependent variables: RT ratio, number of speech intelligibility errors, and number of listener comprehension errors. For this, a Pearson correlation was calculated, which showed a weak but significant correlation (Cohen, 1988) between RT ratio and number of intelligibility errors (r = −.152, p = .045). No other correlations among the dependent variables were found to be significant with r values of .015 and −.090 and p values of .409 and .888.
Table 4. Results of the stepwise multiple regression analysis for speech intelligibility.
                      Model 1                     Model 2
Variable              B        SE B    β          B        SE B    β
Constant              4.664    0.682              4.425    0.667
Overall severity      0.161    0.018   0.564*     0.242    0.03    0.849*
Strain                                            −0.145   0.043   −0.349*
R²                    0.318                       0.36
F for change in R²    81.3*                       11.11*
Note. Dependent variable = number of speech intelligibility errors. Variables entered = overall severity, roughness, breathiness, strain, and pitch.
*p < .001.
Discussion
The overall purpose of this study was to investigate the impact of a dysphonic voice on healthy listeners using measures of RT, intelligibility, and comprehension. Dysphonia secondary to phonotrauma was targeted due to the high prevalence of voice disorders in professional speakers (e.g., Roy et al., 2004) and the need to understand the impact of the dysphonic voice on their audiences. Results of the current study suggest that when healthy listeners are presented with speech stimuli produced by speakers with dysphonia, listeners require significantly greater processing time and have significantly more errors in intelligibility, but do not have more errors in comprehension. In addition, results provided insight into the impact of different voice qualities (e.g., breathy, strain) on measures of listener RT, speech intelligibility, and listener comprehension. Specific research questions are discussed in more depth below.
RTs and Dysphonia
Results of the study indicate that when presented with a dysphonic voice, listeners require more time to process the signal than when presented with a typical, healthy voice. This increased processing time is equated with increased cognitive workload on the part of the listener. Although the difference was slight (7% increase when listeners were presented with dysphonic voices), 84% of the variance of RT ratio could be attributed to the presence or absence of dysphonia. Although direct comparison to other RT studies may not be prudent given different tasks, the magnitude of difference in the current study is smaller than in other studies. For example, results from Munro and Derwing