Linear Prediction Models
Note that for a given set of samples $[\mathbf{x}, \mathbf{x}_I]$, $f_{X|X_I}(\mathbf{x}|\mathbf{x}_I)$ is a constant, and it is reasonable to assume that $f_{A|X_I}(\mathbf{a}|\mathbf{x}_I) = f_A(\mathbf{a})$.

8.4.1 Probability Density Function of Predictor Output

The pdf $f_{X|A,X_I}(\mathbf{x}|\mathbf{a},\mathbf{x}_I)$ of the signal $\mathbf{x}$, given the predictor coefficient vector $\mathbf{a}$ and the initial samples $\mathbf{x}_I$, is equal to the pdf of the input signal $\mathbf{e}$:

$$f_{X|A,X_I}(\mathbf{x}|\mathbf{a},\mathbf{x}_I) = f_E(\mathbf{x} - \mathbf{X}\mathbf{a}) \tag{8.64}$$

where the input signal vector is given by

$$\mathbf{e} = \mathbf{x} - \mathbf{X}\mathbf{a} \tag{8.65}$$

and $f_E(\mathbf{e})$ is the pdf of $\mathbf{e}$. Equation (8.64) can be expanded as

$$\begin{bmatrix} e(0) \\ e(1) \\ e(2) \\ \vdots \\ e(N-1) \end{bmatrix} = \begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ \vdots \\ x(N-1) \end{bmatrix} - \begin{bmatrix} x(-1) & x(-2) & x(-3) & \cdots & x(-P) \\ x(0) & x(-1) & x(-2) & \cdots & x(1-P) \\ x(1) & x(0) & x(-1) & \cdots & x(2-P) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x(N-2) & x(N-3) & x(N-4) & \cdots & x(N-P-1) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_P \end{bmatrix} \tag{8.66}$$

Assuming that the input excitation signal $e(m)$ is a zero-mean, uncorrelated, Gaussian process with a variance of $\sigma_e^2$, the likelihood function in Equation (8.64) becomes

$$f_{X|A,X_I}(\mathbf{x}|\mathbf{a},\mathbf{x}_I) = f_E(\mathbf{x}-\mathbf{X}\mathbf{a}) = \frac{1}{(2\pi\sigma_e^2)^{N/2}} \exp\!\left[-\frac{1}{2\sigma_e^2}(\mathbf{x}-\mathbf{X}\mathbf{a})^{\mathrm{T}}(\mathbf{x}-\mathbf{X}\mathbf{a})\right] \tag{8.67}$$

An alternative form of Equation (8.67) can be obtained by rewriting Equation (8.66) in the following form:

$$\begin{bmatrix} e_0 \\ e_1 \\ e_2 \\ \vdots \\ e_{N-1} \end{bmatrix} = \begin{bmatrix} -a_P & \cdots & -a_2 & -a_1 & 1 & 0 & \cdots & 0 & 0 \\ 0 & -a_P & \cdots & -a_2 & -a_1 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \ddots & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & -a_P & \cdots & -a_2 & -a_1 & 1 \end{bmatrix} \begin{bmatrix} x_{-P} \\ x_{-P+1} \\ x_{-P+2} \\ \vdots \\ x_{N-1} \end{bmatrix} \tag{8.68}$$

In a compact notation, Equation (8.68) can be written as

$$\mathbf{e} = \mathbf{A}\mathbf{x} \tag{8.69}$$

Using Equation (8.69), and assuming that the excitation signal $e(m)$ is a zero-mean, uncorrelated process with variance $\sigma_e^2$, the likelihood function of Equation (8.67) can be written as

$$f_{X|A,X_I}(\mathbf{x}|\mathbf{a},\mathbf{x}_I) = \frac{1}{(2\pi\sigma_e^2)^{N/2}} \exp\!\left[-\frac{1}{2\sigma_e^2}\,\mathbf{x}^{\mathrm{T}}\mathbf{A}^{\mathrm{T}}\mathbf{A}\mathbf{x}\right] \tag{8.70}$$

8.4.2 Using the Prior pdf of the Predictor Coefficients

The prior pdf of the predictor coefficient vector is assumed to have a Gaussian distribution with a mean vector $\boldsymbol{\mu}_a$ and a covariance matrix $\boldsymbol{\Sigma}_{aa}$:

$$f_A(\mathbf{a}) = \frac{1}{(2\pi)^{P/2}\,|\boldsymbol{\Sigma}_{aa}|^{1/2}} \exp\!\left[-\frac{1}{2}(\mathbf{a}-\boldsymbol{\mu}_a)^{\mathrm{T}}\boldsymbol{\Sigma}_{aa}^{-1}(\mathbf{a}-\boldsymbol{\mu}_a)\right] \tag{8.71}$$

Substituting Equations (8.67) and (8.71) in Equation (8.63), the posterior pdf of the predictor coefficient vector $f_{A|X,X_I}(\mathbf{a}|\mathbf{x},\mathbf{x}_I)$ can be expressed as

$$f_{A|X,X_I}(\mathbf{a}|\mathbf{x},\mathbf{x}_I) = \frac{1}{f_{X|X_I}(\mathbf{x}|\mathbf{x}_I)}\, \frac{1}{(2\pi)^{(N+P)/2}\,\sigma_e^N\, |\boldsymbol{\Sigma}_{aa}|^{1/2}} \exp\!\left\{-\frac{1}{2}\left[\frac{1}{\sigma_e^2}(\mathbf{x}-\mathbf{X}\mathbf{a})^{\mathrm{T}}(\mathbf{x}-\mathbf{X}\mathbf{a}) + (\mathbf{a}-\boldsymbol{\mu}_a)^{\mathrm{T}}\boldsymbol{\Sigma}_{aa}^{-1}(\mathbf{a}-\boldsymbol{\mu}_a)\right]\right\} \tag{8.72}$$

The maximum a posteriori estimate is obtained by maximising the log-likelihood function:

$$\frac{\partial}{\partial \mathbf{a}}\ln f_{A|X,X_I}(\mathbf{a}|\mathbf{x},\mathbf{x}_I) = \frac{\partial}{\partial \mathbf{a}}\left[\frac{1}{\sigma_e^2}(\mathbf{x}-\mathbf{X}\mathbf{a})^{\mathrm{T}}(\mathbf{x}-\mathbf{X}\mathbf{a}) + (\mathbf{a}-\boldsymbol{\mu}_a)^{\mathrm{T}}\boldsymbol{\Sigma}_{aa}^{-1}(\mathbf{a}-\boldsymbol{\mu}_a)\right] = 0 \tag{8.73}$$

This yields

$$\hat{\mathbf{a}}^{\mathrm{MAP}} = \left(\boldsymbol{\Sigma}_{aa}\mathbf{X}^{\mathrm{T}}\mathbf{X} + \sigma_e^2\mathbf{I}\right)^{-1}\boldsymbol{\Sigma}_{aa}\mathbf{X}^{\mathrm{T}}\mathbf{x} + \sigma_e^2\left(\boldsymbol{\Sigma}_{aa}\mathbf{X}^{\mathrm{T}}\mathbf{X} + \sigma_e^2\mathbf{I}\right)^{-1}\boldsymbol{\mu}_a \tag{8.74}$$

Note that as the Gaussian prior tends to a uniform prior, the determinant of the covariance matrix $\boldsymbol{\Sigma}_{aa}$ of the Gaussian prior increases, and the MAP solution tends to the least square error solution:

$$\hat{\mathbf{a}}^{\mathrm{LS}} = \left(\mathbf{X}^{\mathrm{T}}\mathbf{X}\right)^{-1}\mathbf{X}^{\mathrm{T}}\mathbf{x} \tag{8.75}$$

Similarly, as the observation length $N$ increases, the signal matrix $\mathbf{X}^{\mathrm{T}}\mathbf{X}$ becomes more significant than $\boldsymbol{\Sigma}_{aa}$, and again the MAP solution tends to the least square error solution.
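As a numerical check of this limiting behaviour, the sketch below computes the MAP estimate of Equation (8.74) with NumPy and shows that, as the prior covariance grows, it approaches the least square solution of Equation (8.75). The AR(2) test signal, the zero prior mean and the prior variances are arbitrary values chosen for this example.

```python
import numpy as np

def map_lp_coefficients(x_frame, x_init, mu_a, Sigma_aa, sigma_e2):
    """MAP estimate of the predictor coefficients, Equation (8.74):
    a = (Sigma_aa X'X + sigma_e2 I)^-1 (Sigma_aa X'x + sigma_e2 mu_a)."""
    P, N = len(x_init), len(x_frame)
    xs = np.concatenate([x_init, x_frame])          # x(-P) ... x(N-1)
    # Signal matrix X of Equation (8.66): row m holds x(m-1) ... x(m-P)
    X = np.array([[xs[P + m - k] for k in range(1, P + 1)] for m in range(N)])
    lhs = Sigma_aa @ X.T @ X + sigma_e2 * np.eye(P)
    rhs = Sigma_aa @ X.T @ x_frame + sigma_e2 * mu_a
    return np.linalg.solve(lhs, rhs)

# Synthetic AR(2) signal; the coefficients are arbitrary demo values
rng = np.random.default_rng(0)
a_true = np.array([1.5, -0.7])
x = np.zeros(500)
for m in range(2, 500):
    x[m] = a_true[0] * x[m-1] + a_true[1] * x[m-2] + rng.normal()

mu_a = np.zeros(2)
# As the prior variance grows (prior tends to uniform), MAP tends to LS
for prior_var in (0.01, 1.0, 100.0):
    a_hat = map_lp_coefficients(x[2:], x[:2], mu_a, prior_var * np.eye(2), 1.0)
    print(f"prior variance {prior_var:6.2f}: a =", a_hat)
```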
8.5 Sub-Band Linear Prediction Model

In a Pth order linear prediction model, the P predictor coefficients model the signal spectrum over its full spectral bandwidth. The distribution of the LP parameters (or equivalently the poles of the LP model) over the signal bandwidth depends on the signal correlation and spectral structure. Generally, the parameters redistribute themselves over the spectrum so as to minimise the mean square prediction error criterion. An alternative to a conventional LP model is to divide the input signal into a number of sub-bands and to model the signal within each sub-band with a linear prediction model, as shown in Figure 8.12. The advantages of using a sub-band LP model are as follows:

(1) Sub-band linear prediction allows the designer to allocate a specific number of model parameters to a given sub-band. Different numbers of parameters can be allocated to different bands.

(2) The solution of a full-band linear predictor equation, i.e. Equation (8.10) or (8.16), requires the inversion of a relatively large correlation matrix, whereas the solution of the sub-band LP models requires the inversion of a number of relatively small correlation matrices with better numerical stability properties.
For example, a predictor of order 18 requires the inversion of an 18×18 matrix, whereas three sub-band predictors of order 6 require the inversion of three 6×6 matrices.

(3) Sub-band linear prediction is useful for applications such as noise reduction, where a sub-band approach can offer more flexibility and better performance.

In sub-band linear prediction, the signal x(m) is passed through a bank of N band-pass filters and is split into N sub-band signals $x_k(m)$, k = 1, …, N. The kth sub-band signal is modelled using a low-order linear prediction model as

$$x_k(m) = \sum_{i=1}^{P_k} a_k(i)\, x_k(m-i) + g_k\, e_k(m) \tag{8.76}$$

where $[\mathbf{a}_k, g_k]$ are the coefficients and the gain of the predictor model for the kth sub-band. The choice of the model order $P_k$ depends on the width of the sub-band and on the signal correlation structure within each sub-band.
The power spectrum of the input excitation of an ideal LP model for the kth sub-band signal can be expressed as

$$P_{EE}(f,k) = \begin{cases} 1 & f_{k,\mathrm{start}} < f < f_{k,\mathrm{end}} \\ 0 & \text{otherwise} \end{cases} \tag{8.77}$$

where $f_{k,\mathrm{start}}$ and $f_{k,\mathrm{end}}$ are the start and end frequencies of the kth sub-band signal. The autocorrelation function of the excitation in each sub-band is a sinc function given by

$$r_{ee}(m) = B_k\, \mathrm{sinc}\!\left[m\left(B_k - f_{k0}\right)/2\right] \tag{8.78}$$

where $B_k$ and $f_{k0}$ are the bandwidth and the centre frequency of the kth sub-band respectively. To ensure that the parameters of each sub-band LP model represent only the signal within that sub-band, the sub-band signals are down-sampled as shown in Figure 8.12.

Figure 8.12 Configuration of a sub-band linear prediction model: the input signal is split into sub-bands, each of which is passed through a downsampler and an LPC model to produce the LPC parameters.
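A rough sketch of the configuration of Figure 8.12 follows; the filter design, number of bands and model orders are illustrative assumptions, not values prescribed by the text. The signal is split into equal-width bands with Butterworth band-pass filters, each band is down-sampled by the number of bands, and a low-order LP model is fitted to each band by the autocorrelation method.

```python
import numpy as np
from scipy.signal import butter, sosfilt, lfilter
from scipy.linalg import solve_toeplitz

def lp_coefficients(x, order):
    """Fit LP coefficients by the autocorrelation method: solve R a = r."""
    r = np.array([x[:len(x)-k] @ x[k:] for k in range(order + 1)]) / len(x)
    return solve_toeplitz(r[:order], r[1:order + 1])

def subband_lp(x, fs, n_bands, orders):
    """Split x into n_bands equal-width bands, down-sample each band by
    n_bands, and fit one low-order LP model per band (cf. Figure 8.12)."""
    width = fs / (2 * n_bands)
    params = []
    for k, P_k in enumerate(orders):
        lo = max(k * width, 0.001 * fs)          # keep edges inside (0, fs/2)
        hi = min((k + 1) * width, 0.499 * fs)
        sos = butter(6, [lo, hi], btype="band", fs=fs, output="sos")
        x_k = sosfilt(sos, x)
        # Integer-band down-sampling: for ideal equal-width bands, keeping
        # every n_bands-th sample aliases the band to baseband without overlap
        params.append(lp_coefficients(x_k[::n_bands], P_k))
    return params

# Example: three sub-band predictors of order 6 instead of one of order 18
rng = np.random.default_rng(0)
x = lfilter([1.0], [1.0, -1.5, 0.7], rng.normal(size=4000))
for k, a_k in enumerate(subband_lp(x, fs=8000.0, n_bands=3, orders=[6, 6, 6])):
    print(f"band {k}: a =", a_k)
```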
8.6 Signal Restoration Using Linear Prediction Models

Linear prediction models are extensively used in speech and audio signal restoration. For a noisy signal, linear prediction analysis models the combined spectra of the signal and the noise processes. For example, the frequency spectrum of a linear prediction model of speech observed in additive white noise would be flatter than the spectrum of the noise-free speech, owing to the influence of the flat spectrum of the white noise. In this section we consider the estimation of the coefficients of a predictor model from noisy observations, and the use of linear prediction models in signal restoration.
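This flattening effect can be checked numerically. The sketch below (all values illustrative) fits a second-order predictor to a clean AR(2) signal and to the same signal in 0 dB additive white noise, and compares the peakiness of the two LP model spectra; the noisy-signal model is noticeably flatter.

```python
import numpy as np
from scipy.signal import lfilter, freqz
from scipy.linalg import solve_toeplitz

def lp_fit(x, order):
    """Autocorrelation-method LP fit: solve the normal equations R a = r."""
    r = np.array([x[:len(x)-k] @ x[k:] for k in range(order + 1)]) / len(x)
    return solve_toeplitz(r[:order], r[1:order + 1])

rng = np.random.default_rng(1)
x = lfilter([1.0], [1.0, -1.5, 0.7], rng.normal(size=8000))  # clean AR(2) signal
y = x + rng.normal(0.0, x.std(), size=8000)                  # 0 dB white noise added

for label, s in (("clean", x), ("noisy", y)):
    a = lp_fit(s, 2)
    # LP model spectrum is 1/|A(f)|, with A(z) = 1 - a1 z^-1 - a2 z^-2
    _, H = freqz([1.0], np.concatenate(([1.0], -a)), worN=512)
    mag = np.abs(H)
    print(f"{label}: a = {a}, spectral peak-to-mean ratio = {mag.max()/mag.mean():.2f}")
```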
The noisy signal y(m) is modelled as

$$y(m) = x(m) + n(m) = \sum_{k=1}^{P} a_k x(m-k) + e(m) + n(m) \tag{8.79}$$

where the signal x(m) is modelled by a linear prediction model with coefficients $a_k$ and random input e(m), and it is assumed that the noise n(m) is additive. The least square error predictor model of the noisy signal y(m) is given by

$$\mathbf{R}_{yy}\hat{\mathbf{a}} = \mathbf{r}_{yy} \tag{8.80}$$

where $\mathbf{R}_{yy}$ and $\mathbf{r}_{yy}$ are the autocorrelation matrix and vector of the noisy signal y(m). For an additive noise model, Equation (8.80) can be written as

$$\left(\mathbf{R}_{xx} + \mathbf{R}_{nn}\right)\left(\mathbf{a} + \tilde{\mathbf{a}}\right) = \left(\mathbf{r}_{xx} + \mathbf{r}_{nn}\right) \tag{8.81}$$

where $\tilde{\mathbf{a}}$ is the error in the predictor coefficient vector due to the noise. A simple method for removing the effects of noise is to subtract an estimate of the autocorrelation of the noise from that of the noisy signal.
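A minimal sketch of this correlation-subtraction idea, assuming the noise is stationary and a noise-only segment is available from which its autocorrelation can be estimated (the function and variable names are ours):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def autocorr(s, order):
    """Biased autocorrelation estimate r(0) ... r(order)."""
    return np.array([s[:len(s)-k] @ s[k:] for k in range(order + 1)]) / len(s)

def lp_noise_compensated(y, noise_segment, order):
    """Estimate clean-signal LP coefficients by subtracting an estimate of
    the noise autocorrelation from that of the noisy signal, cf. Equation (8.81)."""
    r_xx = autocorr(y, order) - autocorr(noise_segment, order)
    # Caution: for short records the subtracted estimate may fail to be a
    # valid (positive definite) autocorrelation, the instability noted below.
    return solve_toeplitz(r_xx[:order], r_xx[1:order + 1])
```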
The drawback of this approach is that, owing to random variations of the noise, correlation subtraction can cause numerical instability in Equation (8.80) and result in spurious solutions. In the following, we formulate the pdf of the noisy signal and describe an iterative signal-restoration/parameter-estimation procedure developed by Lim and Oppenheim.

From Bayes' rule, the MAP estimate of the predictor coefficient vector $\mathbf{a}$, given an observation signal vector $\mathbf{y} = [y(0), y(1), \ldots, y(N-1)]$ and the initial samples vector $\mathbf{x}_I$, is

$$f_{A|Y,X_I}(\mathbf{a}|\mathbf{y},\mathbf{x}_I) = \frac{f_{Y|A,X_I}(\mathbf{y}|\mathbf{a},\mathbf{x}_I)\, f_{A,X_I}(\mathbf{a},\mathbf{x}_I)}{f_{Y,X_I}(\mathbf{y},\mathbf{x}_I)} \tag{8.82}$$

Now consider the variance of the signal $\mathbf{y}$ in the argument of the term $f_{Y|A,X_I}(\mathbf{y}|\mathbf{a},\mathbf{x}_I)$ in Equation (8.82).
The innovation of y(m) can be defined as

$$\varepsilon(m) = y(m) - \sum_{k=1}^{P} a_k y(m-k) = e(m) + n(m) - \sum_{k=1}^{P} a_k n(m-k) \tag{8.83}$$

The variance of y(m), given the previous P samples and the coefficient vector $\mathbf{a}$, is the variance of the innovation signal $\varepsilon(m)$. Since the excitation e(m), the noise n(m) and the past noise samples n(m−k) are mutually uncorrelated, their variances add, giving

$$\mathrm{Var}\left[y(m)\,|\, y(m-1), \ldots, y(m-P), \mathbf{a}\right] = \sigma_\varepsilon^2 = \sigma_e^2 + \sigma_n^2 + \sigma_n^2\sum_{k=1}^{P} a_k^2 \tag{8.84}$$

where $\sigma_e^2$ and $\sigma_n^2$ are the variances of the excitation signal and the noise respectively. From Equation (8.84), the variance of y(m) is a function of the coefficient vector $\mathbf{a}$.
Consequently, maximisation of $f_{Y|A,X_I}(\mathbf{y}|\mathbf{a},\mathbf{x}_I)$ with respect to the vector $\mathbf{a}$ is a non-linear and non-trivial exercise.

Lim and Oppenheim proposed the following iterative process: an estimate $\hat{\mathbf{a}}$ of the predictor coefficient vector is used to make an estimate $\hat{\mathbf{x}}$ of the signal vector, the signal estimate $\hat{\mathbf{x}}$ is then used to improve the estimate of the parameter vector $\hat{\mathbf{a}}$, and the process is iterated until convergence. The posterior pdf of the noise-free signal $\mathbf{x}$, given the noisy signal $\mathbf{y}$ and an estimate of the parameter vector $\hat{\mathbf{a}}$, is given by

$$f_{X|A,Y}(\mathbf{x}|\hat{\mathbf{a}},\mathbf{y}) = \frac{f_{Y|A,X}(\mathbf{y}|\hat{\mathbf{a}},\mathbf{x})\, f_{X|A}(\mathbf{x}|\hat{\mathbf{a}})}{f_{Y|A}(\mathbf{y}|\hat{\mathbf{a}})} \tag{8.85}$$

Consider the likelihood term $f_{Y|A,X}(\mathbf{y}|\hat{\mathbf{a}},\mathbf{x})$.
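The two-step structure described above can be sketched as follows. This is a simplified reading, not Lim and Oppenheim's exact estimator: the parameter step fits LP coefficients to the current signal estimate, and the signal step re-estimates the signal with a frequency-domain Wiener filter built from the current all-pole model spectrum; the use of Wiener filtering for the signal step is an assumption of this sketch.

```python
import numpy as np
from scipy.signal import lfilter
from scipy.linalg import solve_toeplitz

def lp_fit(x, order):
    """Autocorrelation-method LP fit: solve R a = r."""
    r = np.array([x[:len(x)-k] @ x[k:] for k in range(order + 1)]) / len(x)
    return solve_toeplitz(r[:order], r[1:order + 1])

def iterative_restore(y, order, sigma_n2, n_iter=5):
    """Alternate LP parameter estimation and signal re-estimation."""
    x_hat = y.copy()
    N = len(y)
    for _ in range(n_iter):
        a = lp_fit(x_hat, order)                        # parameter step
        resid = lfilter(np.r_[1.0, -a], [1.0], x_hat)   # prediction error
        sigma_e2 = resid.var()
        # Current all-pole model spectrum, evaluated on the FFT grid
        A = np.fft.rfft(np.r_[1.0, -a], N)
        P_xx = sigma_e2 / np.abs(A) ** 2
        # Signal step: Wiener filter from the current model (sketch only)
        x_hat = np.fft.irfft(P_xx / (P_xx + sigma_n2) * np.fft.rfft(y), N)
    return x_hat, a
```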