Durga Prasad
ASR
What is the task?
What are the main difficulties?
How is it approached?
How good is it?
How much better could it be?
React appropriately
Convert the input speech into another medium, e.g. text
Articulation produces sound waves which the ear conveys to the brain for processing
Digitization
Converting analogue signal into digital representation
Separating speech from background noise
Variability in human speech
Recognizing individual sound distinctions (similar phonemes)
Pragmatics
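As a rough illustration of the digitization item in the list above, the sketch below samples a continuous signal at fixed intervals and quantizes each sample to a signed integer. The function name `digitize`, the 8000 Hz sample rate, the 16-bit depth, and the 440 Hz test tone are all assumptions chosen for the example; they do not come from the slides.

```python
import math


def digitize(signal, duration_s, sample_rate_hz=8000, bit_depth=16):
    """Sample a continuous-time signal and quantize it to signed integers.

    `signal` is any function of time (in seconds) returning an amplitude
    in the range [-1.0, 1.0]. All parameter defaults are illustrative.
    """
    max_level = 2 ** (bit_depth - 1) - 1          # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # sampling: discrete time steps
        amplitude = max(-1.0, min(1.0, signal(t)))  # clip out-of-range values
        samples.append(round(amplitude * max_level))  # quantization
    return samples


def tone(t):
    """Hypothetical stand-in for an analogue input: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)


pcm = digitize(tone, duration_s=0.01)
print(len(pcm), pcm[:5])
```

A higher sample rate or bit depth preserves more of the original signal at the cost of more data per second of speech.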
[Figure: hidden Markov model for the words "go home". The Markov model backbone is composed of phones (hidden because we don't know the correspondences). The acoustic observations are x0 to x9.]
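One common way to recover the hidden phone sequence from the acoustic observations in such a model is the Viterbi algorithm. The slides do not say which decoding method is used, so this is only a minimal sketch under assumed values: the two phone states "g" and "ow", the discrete observation symbols "x" and "y", and every probability below are hypothetical.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state path and its probability."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               the state that preceded s on that path)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s][first], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, prob


# Hypothetical toy model: two phones of "go" and two acoustic symbols.
states = ["g", "ow"]
start_p = {"g": 1.0, "ow": 0.0}
trans_p = {"g": {"g": 0.5, "ow": 0.5}, "ow": {"g": 0.0, "ow": 1.0}}
emit_p = {"g": {"x": 0.9, "y": 0.1}, "ow": {"x": 0.2, "y": 0.8}}

path, prob = viterbi(["x", "x", "y", "y"], states, start_p, trans_p, emit_p)
print(path, prob)
```

With these made-up numbers the decoded path is g, g, ow, ow: the first two observations are aligned to the phone "g" and the last two to "ow". Real recognizers work with continuous acoustic feature vectors and log probabilities rather than discrete symbols and raw products.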
Different
Health care
Military
Air traffic controller
Mobile
Voice telephony
User interface
Speech to text
Speech recognition works best if the microphone is close to the user (e.g. in a phone, or if the user is wearing a microphone). More distant microphones (e.g. on a table or wall) tend to increase the number of errors.
Users may speak different languages.
Local accents may not be recognized.
Encouraged by some innovative models, developments in ASR appear to be accelerating. The outlook is optimistic: future applications of automatic speech recognition are expected to contribute substantially to the quality of life of deaf children and adults and of those who share their lives, and to benefit the public and private sectors of the business community.