LPC (Linear Predictive Coding) is a method for representing and analyzing human speech. The idea of coding human speech is to change the representation of the speech signal: with LPC, the signal is represented by a set of LPC coefficients and an error signal instead of the original speech samples. The LPC coefficients are found by LPC estimation, which describes the inverse transfer function of the human vocal tract.
[Block diagram: impulse generator / white noise source v(n), gain g, and vocal tract synthesis filter 1/A(z) produce the speech signal u(n) (speech production model); u(n) is fed to the LPC prediction-error filter (FIR predictor W with delay z^{-1}, subtracted from u(n)) producing the error signal e(n).]
Figure 1.1: Relationship between vocal tract model and LPC model.
Figure 1.1 shows the relationship between the vocal tract transfer function and the LPC transfer function. The left part of the figure shows the speech production model, while the right-hand side shows the LPC prediction-error filter (LPC analysis filter) applied to the output of the vocal tract model. The vocal tract transfer function and the LPC transfer function are defined as follows:
\[
H(z) = \frac{g}{A(z)} = \frac{g}{1 + \sum_{k=1}^{n} a_k z^{-k}} \tag{1.1}
\]
\[
A(z) = 1 + \sum_{k=1}^{n} a_k z^{-k}, \qquad a_k =
\begin{cases}
1 & k = 0 \\
-w_k & k = 1, 2, \ldots, M
\end{cases} \tag{1.2}
\]
The LPC coefficients in equation 1.2 are obtained by LPC estimation, which is described in the next section. LPC analysis and LPC synthesis, which have applications in bandwidth expansion, are described in the later sections.
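The relationship between the two transfer functions can be checked numerically. The sketch below uses Python with NumPy/SciPy rather than Matlab; the gain g and predictor weights w_k are made-up example values. It builds A(z) from the predictor weights as in equation 1.2 and verifies that filtering with H(z) = g/A(z) and then with its inverse A(z)/g returns the original excitation:

```python
import numpy as np
from scipy.signal import freqz, lfilter

# Hypothetical example values: gain g and predictor weights w_1, w_2
g = 0.5
w = np.array([1.2, -0.6])

# Eq. (1.2): a_0 = 1, a_k = -w_k  ->  A(z) = 1 + sum_k a_k z^{-k}
a = np.concatenate(([1.0], -w))     # [1, -1.2, 0.6]

# Eq. (1.1): H(z) = g / A(z); evaluate its frequency response
freq, H = freqz([g], a, worN=512)

# Filtering e(n) through H(z) (the vocal tract model) and then
# through its inverse A(z)/g must return the original excitation
e = np.random.default_rng(0).standard_normal(100)
u = lfilter([g], a, e)              # vocal tract model output
e_back = lfilter(a, [g], u)         # prediction-error filter
print(np.allclose(e, e_back))       # True
```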
CHAPTER 1. LPC MODELING OF VOCAL TRACT
1.1 LPC-estimation
LPC estimation is used to construct the LPC coefficients of the inverse transfer function of the vocal tract. The standard methods for estimating the LPC coefficients assume that the input signal is stationary. A quasi-stationary signal is obtained by framing the input signal, often into frames of 20 ms. A more stationary signal gives a better LPC estimation, because the signal is then better described by the LPC coefficients, which minimizes the residual signal. The residual signal, also called the error signal, is described in the next section.
[Figure 1.2: Block diagram of LPC estimation: input signal S → LPC estimation → gain g and coefficient vector a.]
Figure 1.2 shows a block diagram of LPC estimation, where S is the input signal, g is the gain of the residual signal (prediction-error signal), and a is a vector containing the LPC coefficients up to a given order. The size of the vector depends on the order of the LPC estimation: a higher order means more LPC coefficients and therefore a better estimate of the vocal tract. The Matlab code in the appendix calculates a and g from a given input signal S.
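As an illustration, the estimation step can be sketched in Python (NumPy/SciPy) instead of Matlab. The autocorrelation method below mirrors what the xcorr + levinson pipeline in the appendix code computes; the AR(2) test signal and the order are made up for the example:

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc_estimate(frame, order):
    """Autocorrelation-method LPC for one pre-windowed frame.

    Returns a = [1, a_1..a_M] and the prediction-error power g."""
    n = len(frame)
    # Biased autocorrelation r(0..order)
    r = np.array([frame[:n - k] @ frame[k:] for k in range(order + 1)]) / n
    # Solve the Toeplitz normal equations for the predictor weights w
    # (the Levinson-Durbin recursion solves this same system)
    w = solve_toeplitz(r[:order], r[1:order + 1])
    a = np.concatenate(([1.0], -w))           # a_k = -w_k for k >= 1
    g = r[0] + r[1:order + 1] @ a[1:]         # prediction-error power
    return a, g

# Hypothetical test frame: an AR(2) process, so order-2 LPC fits it
rng = np.random.default_rng(1)
x = np.zeros(2000)
for i in range(2, 2000):
    x[i] = 1.2 * x[i - 1] - 0.6 * x[i - 2] + rng.standard_normal()
a, g = lpc_estimate(x, 2)
print(np.round(a, 2))   # close to the true polynomial [1, -1.2, 0.6]
```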
[Figure 1.3: (a) Input signal (frame 38 of man_nb.wav), amplitude vs. time [s]; (b) LPC coefficients (LPC order 12, frame 38 of man_nb.wav) and their frequency response, amplitude vs. frequency [Hz].]
Figure 1.3(a) shows the input signal, and figure 1.3(b) shows the LPC coefficients and their frequency response, found using the Matlab code in the appendix.
1.2 LPC-analysis
LPC analysis calculates an error signal using the LPC coefficients from LPC estimation. This error signal is called the residual signal; it is the part of the signal that could not be modeled by the LPC estimator. The residual signal is calculated by filtering the original signal with the inverse transfer function from LPC estimation. If the inverse transfer function from LPC estimation equals the vocal tract transfer function, then the residual signal from the LPC analysis equals the excitation signal fed into the vocal tract. In that case the residual signal consists of the impulses or noise from the human speech production model (see figure 1.1).
[Figure 1.4: Block diagram of LPC analysis: input signal s and parameters g, a → LPC analysis → residual signal e.]
Figure 1.4 shows a block diagram of LPC analysis, where S is the input signal, g and a are calculated by LPC estimation, and e is the residual signal from LPC analysis. The Matlab code in the appendix calculates e by filtering the input signal S with the inverse transfer function found by LPC estimation.
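In Python (NumPy/SciPy) the analysis step corresponds to a single lfilter call with the prediction-error filter A(z) as numerator, mirroring the filter(a, g^0.5, signalNB) line in the appendix code. The coefficients, gain, and excitation below are made up; for this ideal model the analysis filter recovers the excitation exactly:

```python
import numpy as np
from scipy.signal import lfilter

# Hypothetical LPC model: prediction-error filter A(z) and gain g
a = np.array([1.0, -1.2, 0.6])
g = 0.25

# Synthetic "speech": a known excitation passed through sqrt(g)/A(z)
rng = np.random.default_rng(2)
exc = rng.standard_normal(500)
s = lfilter([np.sqrt(g)], a, exc)

# LPC analysis: residual e(n) from filtering s(n) with A(z)/sqrt(g)
e = lfilter(a, [np.sqrt(g)], s)
print(np.allclose(e, exc))    # True
```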
[Figure 1.5: (a) Input signal (frame 38 of man_nb.wav); (b) Error signal (frame 38 of man_nb.wav). Amplitude vs. time [s].]
Figure 1.5(a) shows the input signal, and figure 1.5(b) shows the error signal from LPC analysis, computed with the Matlab code in the appendix.
1.3 LPC-synthesis
LPC synthesis is used to reconstruct a signal from the residual signal and the transfer function of the vocal tract. Because the vocal tract transfer function is estimated by LPC estimation, it can be combined with the residual (error) signal from LPC analysis to reconstruct the original signal.
[Figure 1.6: Block diagram of LPC synthesis: error signal e and parameters g, a → LPC synthesis → reconstructed signal s.]
Figure 1.6 shows a block diagram of LPC synthesis, where e is the error signal found by LPC analysis and g and a come from LPC estimation. The original signal s is reconstructed by filtering the error signal with the vocal tract transfer function. The Matlab code in the appendix calculates the original signal S from an error signal e and the vocal tract transfer function represented by a and g.
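Conversely, the synthesis step filters the residual with the all-pole vocal tract model, as in filter(g^0.5, a, errorNB) in the appendix code. A Python sketch with made-up values shows that analysis followed by synthesis reconstructs the signal exactly:

```python
import numpy as np
from scipy.signal import lfilter

# Hypothetical LPC model and input frame
a = np.array([1.0, -1.2, 0.6])
g = 0.25
rng = np.random.default_rng(3)
s = rng.standard_normal(400)

e = lfilter(a, [np.sqrt(g)], s)        # LPC analysis: residual
s_hat = lfilter([np.sqrt(g)], a, e)    # LPC synthesis: sqrt(g)/A(z)
print(np.allclose(s, s_hat))           # True
```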
[Figure 1.7: (a) Error signal; (b) Reconstructed original signal. Amplitude vs. time [s].]
Figure 1.7(a) shows the error signal, and figure 1.7(b) shows the original signal reconstructed by LPC synthesis using the Matlab code in the appendix.
1.4 Application of LPC
Bandwidth expansion is a method to increase the frequency range of a signal. The frequency range is increased by adding information about the higher frequency components. The original frequency components (LPC coefficients) are found using LPC estimation. By then adding the higher frequency components, using a codebook for envelope extension and excitation extension, it is possible to increase the bandwidth of the signal.
[Block diagram: LPC estimation and analysis → codebook-based envelope and excitation extension with additional frequency information → LPC synthesis.]
Figure 1.8 shows the block diagram of bandwidth expansion using LPC and a codebook (envelope and excitation extension) with additional frequency information.
The Matlab code in the appendix implements all of the above block diagram except the excitation and envelope extension.
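A minimal Python sketch of the expansion skeleton is shown below, with the codebook stages replaced by crude placeholders: zero-stuffing (spectral folding) stands in for excitation extension, and the narrowband LPC coefficients are reused unchanged in place of envelope extension. All signals and coefficient values are hypothetical, and this is not the codebook method the chapter describes, only the surrounding analysis/synthesis skeleton:

```python
import numpy as np
from scipy.signal import lfilter

# Hypothetical narrowband LPC model and frame
rng = np.random.default_rng(4)
a_nb = np.array([1.0, -1.2, 0.6])        # narrowband LPC coefficients
g = 0.25
s_nb = rng.standard_normal(400)          # narrowband frame

# 1. LPC analysis at the low sample rate
e_nb = lfilter(a_nb, [np.sqrt(g)], s_nb)

# 2. "Excitation extension" placeholder: zero-stuffing doubles the
#    rate and folds the baseband spectrum into the new upper band
e_wb = np.zeros(2 * len(e_nb))
e_wb[::2] = e_nb

# 3. "Envelope extension" placeholder: reuse the narrowband envelope
a_wb = a_nb

# 4. LPC synthesis at the high sample rate
s_wb = lfilter([np.sqrt(g)], a_wb, e_wb)
print(len(s_wb) == 2 * len(s_nb))        # True
```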
1.5 Appendix
[Block diagram: FIR filter W with input u(n) produces y(n); y(n) is subtracted from the desired signal d(n), giving the estimation error e(n).]

Orthogonality

\[
y(n) = \hat{u}(n|\mathcal{U}_n) = \sum_{k=0}^{\infty} w_k^* u(n-k), \qquad n = 0, 1, 2, \ldots \tag{1.3}
\]
\[
J = E[e(n)e^*(n)] = E[|e(n)|^2] \tag{1.5}
\]
\[
e_o(n) = d(n) - \hat{d}(n|\mathcal{U}_n) \tag{1.9}
\]
\[
J_{\min} = E\left[|e_o(n)|^2\right] \tag{1.10}
\]
Wiener-Hopf

\[
E\left[ u(n-k) \left( d^*(n) - \sum_{i=0}^{\infty} w_{oi} u^*(n-i) \right) \right] = 0, \qquad k = 0, 1, 2, \ldots \tag{1.11}
\]
\[
\sum_{i=0}^{\infty} w_{oi} E\left[ u(n-k)u^*(n-i) \right] = E\left[ u(n-k)d^*(n) \right], \qquad k = 0, 1, 2, \ldots \tag{1.12}
\]
\[
\sum_{i=0}^{\infty} w_{oi}\, r(i-k) = p(-k), \qquad k = 0, 1, 2, \ldots \tag{1.15}
\]
\[
\mathbf{R}\mathbf{w}_o = \mathbf{p} \tag{1.16}
\]
\[
\mathbf{R} = E\left[\mathbf{u}(n)\mathbf{u}^H(n)\right] =
\begin{bmatrix}
r(0) & r(1) & \cdots & r(M-1) \\
r^*(1) & r(0) & \cdots & r(M-2) \\
\vdots & \vdots & \ddots & \vdots \\
r^*(M-1) & r^*(M-2) & \cdots & r(0)
\end{bmatrix} \tag{1.17}
\]
\[
\mathbf{w}_o = \mathbf{R}^{-1}\mathbf{p} \tag{1.20}
\]
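For a finite filter length M the Wiener-Hopf equations reduce to the M-by-M Toeplitz system (1.16), which the sketch below solves in Python via scipy.linalg.solve_toeplitz (the same system the Levinson-Durbin recursion solves efficiently). The input, desired signal, and "true" weights are all made up for the example:

```python
import numpy as np
from scipy.linalg import solve_toeplitz

# Made-up real-valued example: d(n) is u(n) filtered by known weights
# plus a little noise, so the Wiener solution should recover them.
rng = np.random.default_rng(5)
M = 4
w_true = np.array([0.9, -0.4, 0.2, 0.1])
u = rng.standard_normal(5000)
d = np.convolve(u, w_true)[:len(u)] + 0.1 * rng.standard_normal(len(u))

# Estimate r(k) = E[u(n)u(n-k)] and p(k) = E[u(n-k)d(n)]
N = len(u)
r = np.array([u[:N - k] @ u[k:] for k in range(M)]) / N
p = np.array([u[:N - k] @ d[k:] for k in range(M)]) / N

# Eq. (1.20): w_o = R^{-1} p, exploiting the Toeplitz structure of R
w_o = solve_toeplitz(r, p)
print(np.round(w_o, 1))          # close to w_true
```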
[Block diagram: one-step predictor: the delayed input u(n-1) feeds the FIR filter W, whose output y(n) is subtracted from u(n), giving the prediction error e(n).]

\[
y(n) = \hat{u}(n|\mathcal{U}_{n-1}) = \sum_{k=1}^{M} w_k^* u(n-k) \tag{1.21}
\]
\[
e(n) = u(n) - \sum_{k=1}^{M} w_k^* u(n-k) \tag{1.23}
\]
\[
e(n) = \sum_{k=0}^{M} a_k^* u(n-k) \tag{1.24}
\]
\[
e(n) = \sum_{k=0}^{M} a_k^* u(n-k), \qquad a_k =
\begin{cases}
1 & k = 0 \\
-w_k & k = 1, 2, \ldots, M
\end{cases} \tag{1.25}
\]
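The equivalence of (1.23) and (1.25) is easy to verify numerically. The sketch below (Python/SciPy, with made-up real-valued weights, so the conjugates drop out) computes the prediction error both ways:

```python
import numpy as np
from scipy.signal import lfilter

# Made-up predictor weights w_1..w_M and a random input
rng = np.random.default_rng(6)
w = np.array([0.8, -0.3, 0.1])
u = rng.standard_normal(200)

# Eq. (1.23): e(n) = u(n) - sum_k w_k u(n-k)
pred = lfilter(np.concatenate(([0.0], w)), [1.0], u)
e_direct = u - pred

# Eq. (1.25): one FIR filter with taps a = [1, -w_1, ..., -w_M]
a = np.concatenate(([1.0], -w))
e_fir = lfilter(a, [1.0], u)
print(np.allclose(e_direct, e_fir))    # True
```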
%%%%%%%%%%%%%%%%%

%Loading wav file - start
used_wav_file = 'man_nb.wav';
[y, fs] = wavread(used_wav_file);
y = y(:,1);

if wave_as_input == 0
    y = sin(2*pi*1000*(0:40000)*1/fs)';
end
%Loading wav file - end

%Downsample input signal - start
y = decimate(y, 2);
fs = fs/2;
%Downsample input signal - end

%%%%%%%%%%%%%%%%%%%%%%
%Pre initialize - start
framesamples = framelengthwindow/(1/fs); %length of frame from input signal [unit: samples]
framesamplesoverlap = framelengthoverlap/(1/fs); %length of overlap between two frames [unit: samples]

y = y(1:length(y) - mod(length(y), framesamples)); %fix the length of input signal for framing
minmaxy = [min(y) max(y)]; %min and max values of input signal (used for plotting)
%Pre initialize - end
%%%%%%%%%%%%%%%%%%%%

%Framing - start
dimensiony = size(y); %used for reconstruction (contains the true dimensions of the input signal)
dimensionyframe = [framesamples length(y)/framesamples]; %used for framing [samples in frame, number of frames]

%framing the data
for i = 1:dimensionyframe(2)
    yframe(:,i) = y(1 + (framesamples-framesamplesoverlap)*(i-1) : framesamples + (framesamples-framesamplesoverlap)*(i-1));
end
minmaxyframe = [min(yframe(:,offset)) max(yframe(:,offset))];
%Framing - end

%Window - start
if hammingwindowed
    yframewindow = yframe .* [hamming(dimensionyframe(1))*ones(1,dimensionyframe(2))];
else
    yframewindow = yframe;
end
%Window - end

%Model fitting - start
signalNB = yframewindow;
%Model fitting - end

%LPC estimation - start
for i = 1:dimensionyframe(2)
    [autosignalNB(:,i), lags(i,:)] = xcorr(signalNB(:,i));
end
autosignalNB = autosignalNB(find(lags(i,:)==0):end,:);
[a, g] = levinson(autosignalNB, numberofLPCcoeff);
%LPC estimation - end

%Frequency response of LPC transfer function - start
[H, F] = freqz(g(offset)^0.5, a(offset,:), fftpoints, fs);
%Frequency response of LPC transfer function - end

%LPC poly to LSF - start
lsf = poly2lsf(a(offset,:));
%LPC poly to LSF - end

%Frequency response of LSF - start
[H1, F1] = freqz(1, lsf, fftpoints, fs);
%Frequency response of LSF - end

%LPC LSF to poly - start
%a = lsf2poly(lsf)
%LPC LSF to poly - end

%LPC analysis - start
errorNB = filter(a(offset,:), g(offset).^0.5, signalNB); %Error signal
%LPC analysis - end

%LPC synthesis - start
signalNBreconstructed = filter(g(offset)^0.5, a(offset,:), errorNB);
%LPC synthesis - end

if plot_global
    figure(plotnumber)
    plotnumber = plotnumber + 1;

    subplot(2,1,1)
    hold on
    % ...
    title(texlabel(sprintf('Reconstructed signal - (frame: %d of %s)', offset, used_wav_file), 'literal'))
    xlim([0 framesamples-1]*1/fs + (framelengthwindow - framelengthoverlap)*(offset-1))
    xlabel('Time [s]'), ylabel('Amplitude'), ylim([minmaxyframe]), grid

    if epsfiles
        print -depsc -tiff -r300 eps/lpc_synthesis_signal_reconstruction_BJ
    end

end