
Speech/Text Software

The following piece was produced independently, although it followed a collaborative presentation with my colleague, Allison Blumanis. For this presentation, we created a brochure and a poster to demonstrate key features of speech-to-text and text-to-speech technologies. These can be found at the end of this document.

Although they may not know it, many people are already familiar with text-to-speech (TTS) and speech-to-text (STT) technologies. The former (sometimes referred to as speech synthesis) is computer technology that converts text input (for example, via a word processor or text on a web page) into computer-generated speech. Meanwhile, speech-to-text (sometimes called speech recognition) is a complementary process that transforms spoken language into text on a screen, or interprets it as a computer command.

Unfortunately, for many people, interactions with these technologies have been less than inspiring. Take, for example, the frustrating experience of reaching a telephone voice recognition system that continually misinterprets your commands before eventually transferring you to a live operator, or of having to listen intently to decipher unnatural, robotic computer voices with awkward prosody. However, these problems are largely teething issues, as both technologies are still relatively in their infancy. Although engineers and scientists have been toying with the notion of mimicking human speech since as early as the 1930s (Juang & Rabiner, 2004), both technologies have only gained momentum in the last twenty to thirty years (Lemmetty, 1999) with the advent of more powerful computer technology.

Subsequently, there has been a small, though growing, body of research into the benefits of these technologies for students with special needs (De La Paz, 1999, p. 174). Most research relating to educational technology in the 1970s and 1980s focussed heavily on the general use of microcomputers by all students (Woodward & Rieth, 1997).
However, a funding injection into special education and technology research in the United States (to the tune of US$35 million) between 1986 and 1997 allowed for more specialised research into areas such as speech technology (Woodward & Rieth, 1997).

Although the two speech technologies are complementary, they assist different students in different ways. As such, they will be dealt with separately in this paper.

Speech-to-Text

There are a number of ways in which STT technology can be utilised to support students with special education needs. Most obviously, it allows students with physical impairments, fine motor difficulties or visual impairments to generate text without the need for handwriting. Additionally, it can be used to facilitate communication between non-signing teachers and students with hearing impairments. Although this latter use is technically possible in today's classrooms through wireless microphones and individual student computers, the technology currently available at the consumer level would produce numerous glitches and would therefore be more of a hindrance than a help.

However, recent work from the Liberated Learning Consortium (a joint project between Saint Mary's University in Canada and technology giant IBM, along with numerous other tertiary institutions worldwide) hints at the powerful way that speech-to-text technology could be utilised in classrooms in the future. This research revolves around second-generation STT technology developed by IBM called ViaScribe (Liberated Learning Consortium, 2007). Essentially, this technology transcribes live speech in real time. So, for example, a lecturer or teacher could give a presentation to a group whilst wearing a wireless headset or microphone, and everything they say would be transcribed, as they say it, onto a monitor.
Although this is an exciting prospect in the world of assistive technology, there is still some way to go before this technology becomes a reality at the classroom level. As with all STT technologies, there are still issues regarding accuracy, readability (of transcribed text), user friendliness and ease of training and editing (Liberated Learning Consortium, 2007, Technology/Challenges section, para. 1).

Despite these glitches, the use of speech-to-text technology is not merely a pipedream. In fact, a reasonable amount of research has demonstrated the benefits that STT can bring to students with learning disabilities. Much of this research revolves around the notion of freeing up cognitive resources normally used for handwriting, spelling and punctuation, and allowing these freed resources to be used in the spoken generation of text. A number of studies have revealed that students who dictate produce longer texts with fewer grammatical errors (Macarthur & Graham, 1987; Graham, 1990; Reece, 1992, Experiment 3). Additionally, dictated text was generated faster than handwritten text (Graham, 1990), allowing students to retain pre-planned ideas that may be forgotten by students attending to the mechanics of writing and spelling. De La Paz (1999) indicates that the low error rate and increased speed of content production may help improve student motivation and attitude towards writing. Whilst these aspects are positive, it needs to be remembered that written language differs from spoken language (O'Donnell, 1974), and as such it is important to examine the quality of the texts produced to ensure that they have been delivered in the mode typical of written work.

Text-to-Speech

TTS, like STT, has the potential to assist students with a wide array of special needs. Primarily, it serves as a personal reader for visually impaired students or students with reading difficulties, allowing them greater reading independence. However, it has also been shown to improve attention to text for students with ADD. Additionally, it can be used simply as a proofreading tool for students both with and without special needs.

The very nature of text-to-speech software allows it to achieve similar results with student reading to those achieved via modelled, shared and guided reading experiences. For example, it can effectively work on the same principle as the neurological impress method (Heckelman, 1969) and echo reading (Anderson, 1981). Of course, as it lacks the human tutor component, student progress is more difficult to monitor; however, it does provide an opportunity for students to engage in additional independent reading above and beyond their time with the teacher.

The added independence that TTS gives to struggling readers is one of its primary advantages. Not only does it give students the freedom to read, and particularly research, information that may be beyond their independent reading level, but it also allows teachers to structure specific electronic texts to meet the individual needs of the student and allows students to self-regulate their learning. This is encapsulated in the words of Seegers (2001, p. 39): "I like [TTS] because low-level readers can access important information [on the Internet] for research as well as read and listen to classics and other literature that they could not otherwise read." This is an important consideration as it allows these students to attain the requisite field knowledge in subjects such as HSIE and Science, thus removing barriers that their literacy difficulties may have previously put up.
Additionally, research strongly underpins the notion that TTS, when delivered as a combination of auditory and visual components, enhances the comprehension of texts by struggling readers (Leong, 1995; Wise & Olsen, 1994; Disseldorp & Chambers, 2002; Montali & Lewandowski, 1996; Shany & Biemiller, 1995). Some researchers disagree, however: Farmer, Klein and Bryson (1992) found that struggling readers did not improve their comprehension or word recognition when given the opportunity to highlight unknown words and have them read out by the computer. They suggest that this may occur because poor readers tend to skip over unknown words despite having the option to gain additional information about them. If this is correct, then these students could be explicitly taught to utilise this strategy whilst using a TTS system.

Another study that found very little difference between reading performance using print and TTS technology was that of Hecker, Burns, Elkind, Elkind and Katz (2002). However, this research did provide an additional finding of interest. With a research sample of older students with attention deficit disorders, it was found that TTS improved the students' attention to the text by 54% and reduced reading time by 29%. These findings suggest a reduced level of frustration for these students.

In terms of writing, research has shown that TTS can be an effective tool for students both with and without special needs. In a study by Raskind and Higgins (1995), it was shown that postsecondary students with learning difficulties were able to detect a higher number of total errors (mainly relating to capitalisation, spelling, usage and typography) while using TTS. However, when TTS was compared with a human reading the text aloud to the student, the human reader allowed students to discover more grammar-mechanical errors.
However, as this research was conducted in 1995, it may be that the voices available for TTS software at the time were limited and very unnatural sounding. Since then, there has been significant improvement in the natural prosody of TTS voices, and it would be interesting to see whether these results still hold true with today's technology.

Implications and Summation

As can be seen, speech-to-text and text-to-speech technologies have the potential to be utilised effectively to help students with a wide range of special needs. However, it has to be remembered that the current standard of these technologies may impede success rather than promote it, and special educators should therefore consider carefully whether the technologies are suitable for their particular students (e.g. can they tolerate the length of time needed to train STT software, or deal with the occasional mispronunciation or misinterpretation?).

Additionally, there are other factors to consider before implementing these technologies in the classroom. Firstly, cost will be a significant factor in today's financially tight education system. Although some software applications are free (and provide an excellent opportunity to test run the technologies in the classroom), many of the higher quality applications may cost well beyond $1000. It can be argued that this is cheaper than hiring a person to be a personal scribe or reader, but at this point in time the technology doesn't provide the accuracy that a human does.

Teachers also need to consider how these technologies are implemented in their classrooms. Will the technology cause an excessive disruption or distraction for other students? How will other students perceive students who are utilising these technologies? The former can be somewhat addressed through the use of headphones, but there will no doubt still be some potential for distraction.
As for the latter, this can be countered by allowing all students to access the technology and by engaging in social role valorisation for the student utilising the technology.

Overall, speech-to-text and text-to-speech technologies provide educators with a currently limited, but potentially powerful, way to address the needs of all students in their classes. Future research, improvements in technology and subsequent reductions in cost are likely to see this software become a standard part of classroom technology in years to come.

REFERENCES

Anderson, B. (1981). The missing ingredient: Fluent oral reading. Elementary School Journal, 18, 173-177.
De La Paz, S. (1999). Composing via dictation and speech recognition systems: Compensatory technology for students with learning disabilities. Learning Disability Quarterly, 22(3), 173-182.
De La Paz, S., & Graham, S. (1995). Dictation: Applications to writing for students with learning disabilities. In T. E. Scruggs & M. A. Mastropieri (Eds.), Advances in learning and behavioural disabilities (Vol. 9, pp. 227-247). Greenwich, CT: JAI Press.
Disseldorp, B., & Chambers, D. (2002, July). Selecting the right environment for students in a changing teaching environment: A case study. Paper presented at the meeting of the Australian Society for Educational Technology International, Melbourne, Australia.
Farmer, M. E., Klein, R., & Bryson, S. E. (1992). Computer-assisted reading: Effects of whole-word feedback on fluency and comprehension in readers with severe disabilities. Remedial and Special Education, 13(2), 50-60.
Graham, S. (1990). The role of production factors in learning disabled students' compositions. Journal of Educational Psychology, 82, 781-791.
Heckelman, R. G. (1969). A neurological impress method of remedial reading instruction. Academic Therapy, 4, 277-282.
Hecker, L., Burns, L., Elkind, J., Elkind, K., & Katz, L. (2002). Benefits of assistive reading software for students with attention disorders. Annals of Dyslexia, 52, 244-272.
Juang, B. H., & Rabiner, L. R. (2004). Automatic speech recognition: A brief history of the technology and development. Retrieved March 27, 2009, from http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/354_LALI-ASRHistory-final-10-8.pdf
Lemmetty, S. (1999). Review of speech synthesis technology. Retrieved March 27, 2009, from http://www.acoustics.hut.fi/publications/files/theses/lemmetty_mst/chap2.html
Leong, C. K. (1995). Effects of on-line reading and simultaneous DECtalk auding in helping below-average and poor readers comprehend and summarize text. Learning Disability Quarterly, 18, 101-116.
Liberated Learning Consortium. (2007). Liberated Learning. Retrieved March 20, 2009, from http://www.liberatedlearning.com/
Macarthur, C., & Graham, S. (1987). Learning disabled students' composing under three methods of text production: Handwriting, word processing and dictation. Journal of Special Education, 21, 22-42.
Montali, J., & Lewandowski, L. (1996). Bimodal reading: Benefits of a talking computer for average and less skilled readers. Journal of Learning Disabilities, 29, 271-279.
O'Donnell, R. C. (1974). Syntactic differences between speech and writing. American Speech, 49(1/2), 102-110.
Raskind, M. H., & Higgins, E. (1995). Effects of speech synthesis on the proofreading efficiency of postsecondary students with learning disabilities. Learning Disability Quarterly, 18(2), 141-158.
Reece, J. E. (1992). Cognitive processes in the development of written composition skills: The role of planning, dictation and computer tools. Unpublished doctoral dissertation, La Trobe University, Victoria.
Seegers, M. (2001). Special technological possibilities for students with special needs. Learning and Leading With Technology, 29, 32-39.
Shany, M. T., & Biemiller, A. (1995). Assisted reading practice: Effects on performance for poor readers in grades 3 and 4. Reading Research Quarterly, 30(3), 382-395.
Woodward, J., & Rieth, H. (1997). A historical view of technology research in education. Review of Educational Research, 67(4), 503-536.

SPEECH-TO-TEXT

Sometimes called speech/voice recognition

What is speech-to-text?
Speech-to-text (STT) software is computer technology that allows a user to speak into a microphone and have their speech converted to text on a computer screen.

How can it be used?

Allows students with physical impairments, fine motor difficulties or visual impairments to generate text without the use of pen/pencil or keyboard.
Students with articulation issues can utilise the software to encourage correct articulation.
Allows students with writing difficulties to focus on content generation rather than the mechanics of writing.
Future potential for real-time transcription of lecturer/teacher talk to assist hearing-impaired students.

What some of the research says...

Students with learning difficulties who dictate texts using STT technology produce texts that are longer and more grammatically correct (Macarthur & Graham, 1987; Graham, 1990).
May improve use of vocabulary or syntax that may not be used when concerned about spelling etc. in handwritten production (De La Paz & Graham, 1995).
Can improve writing motivation and persistence by the removal of barriers (De La Paz, 1999).

Who will it benefit?

Students:
with fine motor difficulties
with writing difficulties
who are visually impaired
who are hearing impaired
who are physically impaired

TEXT-TO-SPEECH

Sometimes called speech synthesis

What is text-to-speech?
Text-to-speech (TTS) software is computer technology that converts text input (for example, via a word processor or text on a web page) into computer-generated speech.

How can it be used?

Allows students with speech motor impairments to communicate orally.
Students with visual impairments or reading difficulties can have text from websites and other text documents read out to them by the computer, creating a sense of independence in their learning.
Can be used as a self-editing tool for students with writing difficulties: the student can have a go, then have their text read back to them to see if it makes sense.

What some of the research says...

TTS can provide similar benefits to those found from reading aloud to students.
Use of TTS has been shown to improve attention to text in students with ADD by 54% and to reduce time spent reading by 29% (Hecker, Burns, Elkind, Elkind, & Katz, 2002).
Combined visual and auditory presentation of text improves comprehension for struggling readers (Leong, 1995; Wise & Olsen, 1994).

Who will it benefit?

Students:
with speech impairments
with reading difficulties
with writing difficulties
who are visually impaired

Software Spotlight
Dragon NaturallySpeaking
Requires an initial training period to allow the software to understand the user's voice.
Has customisable, active vocabularies (e.g. can be customised to suit the user's speech nuances).
Types at your talking speed.
Allows you to execute Windows commands.
Costs between $299 and $1399.

WANT MORE INFORMATION ON SPEECH-TO-TEXT OR TEXT-TO-SPEECH?

Visit the following websites for more information and links to software and resources:
SPEECH-TO-TEXT: http://specialed.sqweebs.com/stt.htm
TEXT-TO-SPEECH: http://specialed.sqweebs.com/tts.htm

SPEECH-TO-TEXT

REFERENCES
De La Paz, S. (1999). Composing via dictation and speech recognition systems: Compensatory technology for students with learning disabilities. Learning Disability Quarterly, 22(3), 173-182.
De La Paz, S., & Graham, S. (1995). Dictation: Applications to writing for students with learning disabilities. In T. E. Scruggs & M. A. Mastropieri (Eds.), Advances in learning and behavioural disabilities (Vol. 9, pp. 227-247). Greenwich, CT: JAI Press.
Graham, S. (1990). The role of production factors in learning disabled students' compositions. Journal of Educational Psychology, 82, 781-791.
Hecker, L., Burns, L., Elkind, J., Elkind, K., & Katz, L. (2002). Benefits of assistive reading software for students with attention disorders. Annals of Dyslexia, 52, 244-272.
Leong, C. K. (1995). Effects of on-line reading and simultaneous DECtalk auding in helping below-average and poor readers comprehend and summarize text. Learning Disability Quarterly, 18, 101-116.
Macarthur, C., & Graham, S. (1987). Learning disabled students' composing under three methods of text production: Handwriting, word processing and dictation. Journal of Special Education, 21, 22-42.

Software Spotlight
WordTalk
This is a FREE text-to-speech plug-in that can be used with Microsoft Word. Users can write text, or copy and paste it from a browser, into MS Word, and can then use WordTalk to read the text back by word, sentence, paragraph or whole text. As words are read back, they are highlighted on the screen (you can customise the colour). Additional voice types can be downloaded.

TEXT-TO-SPEECH

TEXT-TO-SPEECH
What is text-to-speech?
Text-to-speech software is computer technology that converts text input (for example, via a word processor or text on a web page) into computer-generated speech.

SPEECH-TO-TEXT
What is speech-to-text?
Speech-to-text software is computer technology that allows a user to speak into a microphone and have their speech converted to text on a computer screen.

Who will it benefit?


Students:
with speech impairments
with reading difficulties
with writing difficulties
who are visually impaired

Who will it benefit?


Students:
with fine motor difficulties
with writing difficulties
who are visually impaired
who are hearing impaired
who are physically impaired

Popular software
WordTalk
CAST eReader
Natural Reader

Popular software
Dragon NaturallySpeaking
WordQ/SpeakQ
MacSpeech iListen

For more information


http://specialed.sqweebs.com/tts.htm

For more information


http://specialed.sqweebs.com/stt.htm

How can it be used?


Allows students with moderate to severe speech motor impairments to communicate orally.
Students with visual impairments or reading difficulties can have text from websites and other text documents read out to them by the computer, creating a sense of independence in their learning.
Can be used as a self-editing tool for students with writing difficulties: the student can have a go, then have their text read back to them to see if it makes sense.

How can it be used?


Allows students with physical impairments, fine motor difficulties or visual impairments to generate text without the use of pen/pencil or keyboard.
Students with articulation issues can utilise the software to encourage correct articulation.
Allows students with writing difficulties to focus on content generation rather than the mechanics of writing.
Future potential for real-time transcribing of lecturer/teacher talk to assist hearing-impaired students.
