
This guide outlines the steps required to add or modify phonemes in SpeechPlayer.
It is not detailed, and it assumes the reader is an advanced computer user.
First, let's clarify some necessary terms. I'll let Wikipedia explain them for you:
http://en.wikipedia.org/wiki/Phoneme
http://en.wikipedia.org/wiki/International_Phonetic_Alphabet
So, basically, each distinct sound in a language is a phoneme. SpeechPlayer uses IPA notation to represent phonemes.
eSpeak recognises IPA symbols, so you can use it.
The first step in using SpeechPlayer with another language is changing the language code in the program. SpeechPlayer uses eSpeak to convert text to IPA phoneme data. The language is set to "en", and you have to set it to the code of whatever language you want.
Then, you have to identify the phonemes that cause trouble.
To find the IPA representation of a phoneme, use command-line eSpeak. Pass the --ipa flag to make it output IPA, then type some letters that produce the phoneme in the output.
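The lookup described above can also be scripted. The following is only a minimal sketch, assuming the eSpeak executable is installed under the name "espeak" and is on your PATH. The function names are made up for this example and are not part of SpeechPlayer. The -q option tells eSpeak not to speak, --ipa makes it print IPA, and -v selects the voice.

```python
import shutil
import subprocess


def build_espeak_ipa_command(text, voice="en"):
    """Return the argument list for a silent eSpeak run that prints IPA."""
    # -q: no audio, --ipa: print IPA symbols, -v: voice (language code).
    return ["espeak", "-q", "--ipa", "-v", voice, text]


def text_to_ipa(text, voice="en"):
    """Run eSpeak and return its IPA output, or None if eSpeak is not found.

    Assumption: the executable is called "espeak" and is on PATH.
    """
    if shutil.which("espeak") is None:
        return None
    result = subprocess.run(
        build_espeak_ipa_command(text, voice),
        capture_output=True,
        encoding="utf-8",  # IPA symbols are not ASCII
        check=True,
    )
    return result.stdout.strip()
```

Calling text_to_ipa with a short word and your language code as the voice should return the IPA string that eSpeak prints for it, which is the symbol you later look for in data.py.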
Now, to actually implement the phoneme. At least for vowels, the essential parameters needed to construct a phoneme are the formants and their bandwidth values.
I won't explain those here, as this is quite complicated stuff, and I'm not much of a teacher, especially when writing in a non-native language. If you want to understand more, start here:
http://person2.sol.lu.se/SidneyWood/praate/whatform.html
Ok, so how to get the formants and bandwidths for a vowel?
Download WaveSurfer from:
http://sourceforge.net/projects/wavesurfer/
Record the wanted phoneme in a wave file without breaks. If you're targeting a vowel, I recommend pronouncing it continuously, and starting and stopping the recording while the sound can still be heard. Load the file into WaveSurfer. You will be asked for a configuration. I wasn't able to use NVDA with this program, so I'll describe the steps using JAWS. Click the speech analysis configuration with the JAWS cursor and then click OK.
Test whether you can play the file with space. If so, right-click anywhere in the program window and choose formant plot. Route JAWS to PC, right-click again, and choose properties. If everything is OK, you should see a window like this:
--Properties: y-old1.wav (pane:1)
Pane Data Plot Spectrogram Formants Sound Playback
Data filename extension: .frm
Data file path: Choose...
Number of header lines to skip: 0
Column delimiter: space tab Comma
Backdrop type: None
Plot column: 0 using red
Choose... Line
Plot column: 1 using green
Choose... Line
Plot column: 2 using blue
Choose... Line
Plot column: 3 using yellow
Choose... Line
Lock data plot
Save only data values for current selection
Plot value bounds: min 0.0 max graphic 475
Start time offset: 0.0
Only warn for unsaved manual data modifications

OK Cancel Apply
--Click on "formants" and make the following settings. They were recommended in an article I'll link to at the end.
--Analysis window length: 0.050 s
Analysis window type: Hamming
Pre-emphasis factor: 0.7
Frame interval: 0.01 s
LPC order: 12
LPC type: 0
Down-sampling frequency 10000. Hz
OK Cancel Apply
--Then press OK.
Use the JAWS cursor to locate a few lines with increasing numbers on them, like:
--1
2
3
--Position the cursor at the end of one of those lines and then right-click. Choose save data file. If you've done everything correctly, the file name should end in .frm, and you can save it. If not, click cancel and try again.
The resulting file contains 8 fields per line: 4 formants followed by their respective bandwidths. There is one such entry of 8 fields for each sound frame. Now you have to choose a frame and add its formant info to SpeechPlayer. You can open the file in Excel to make this easier; each frame will be on a different row. Currently, all the formant info resides in data.py. Take a look at the file to see how it is laid out. Basically, each IPA symbol has some data attached to it, including the formants. For vowels, the formants are called cf1-6, and the bandwidths cb1-6.
Fill in cf1-4 and cb1-4 with the values obtained from WaveSurfer. Don't copy the decimals. I recommend starting from an existing vowel that is as similar to your vowel as possible. If your vowel is already in data.py, simply edit it. If not, create a new entry with the IPA symbol you got from eSpeak. Take care not to break the formatting of the file.
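If you would rather not copy the numbers by hand, the step above can be sketched in Python. This is only an illustration under stated assumptions: it assumes each line of the .frm file holds the 8 whitespace-separated numbers described above (4 formants, then 4 bandwidths), and the function names are made up for this example. The sample numbers are invented to show the layout and are not real measurements. The exact layout of an entry in data.py is not reproduced here, so check the file itself before pasting anything in.

```python
def parse_frm_text(text):
    """Parse .frm content into a list of frames, each a list of 8 floats."""
    frames = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # skip blank or unexpected lines
        frames.append([float(field) for field in fields])
    return frames


def frame_to_params(frame):
    """Map one frame to cf1-4 and cb1-4, dropping the decimals."""
    params = {}
    for i in range(4):
        params["cf%d" % (i + 1)] = int(frame[i])      # formant frequency
        params["cb%d" % (i + 1)] = int(frame[i + 4])  # its bandwidth
    return params


# Invented sample data, only to show the assumed layout of a .frm file.
sample = """\
712.4 1201.9 2630.2 3519.7 61.3 88.0 140.6 210.5
705.1 1195.0 2644.8 3502.3 58.9 91.2 133.4 205.0
"""
frames = parse_frm_text(sample)
print(frame_to_params(frames[0]))
```

To use it on a real file, read the file contents into a string, pass it to parse_frm_text, and pick the frame you want by its index.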
After you have filled in the values, save the file and reload SpeechPlayer. If the vowel doesn't sound good, try choosing another frame from the WaveSurfer analysis file. If it still doesn't sound good, you can try adjusting the formants further. For example, if the vowel sounds too sharp, you can try lowering the frequency of the higher formants, or reducing their bandwidths.
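The adjustment for a vowel that sounds too sharp can be tried out with a small helper like the one below. It is a rough sketch, not a recipe: the function name, the scale factors, and the choice of treating formants 3 and 4 as the "higher" ones are all my assumptions for the example. It works on a plain dictionary of cf/cb values and does not touch data.py itself.

```python
def soften_vowel(params, freq_scale=0.97, bw_scale=0.9,
                 first_formant=3, last_formant=4):
    """Return a copy of params with the higher formants lowered and narrowed.

    freq_scale and bw_scale are arbitrary starting points (assumptions);
    set either one to 1.0 to leave that kind of value unchanged.
    """
    adjusted = dict(params)  # leave the caller's dictionary untouched
    for n in range(first_formant, last_formant + 1):
        cf = "cf%d" % n
        cb = "cb%d" % n
        if cf in adjusted:
            adjusted[cf] = int(round(adjusted[cf] * freq_scale))
        if cb in adjusted:
            adjusted[cb] = int(round(adjusted[cb] * bw_scale))
    return adjusted
```

Change one thing at a time, copy the new numbers into data.py, reload SpeechPlayer, and listen. Small steps make it easier to hear which change helped.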
The other parameters in data.py are of course important too, and modifying them can improve the sound. To understand what they do, open the NVDA voice settings, where you can change many of them. You can also read:
http://www.asel.udel.edu/speech/tutorials/synthesis/gensyn.htm
You can also try changing the settings in WaveSurfer.
For consonants, I've had less success. The same basic principles should apply in theory, except that you modify the parallel formants (pf1-4) and their respective bandwidths.
Much of this research is based on the following article:
http://link.springer.com/chapter/10.1007/978-1-4757-3413-3_10
It contains additional info on consonants, too.
