Spectral Subtraction (Vaseghi - Advanced Digital Signal Processing and Noise Reduction), page 3
These methods differ in their approach to estimation of the noise spectrum, in their method of averaging the noisy signal spectrum, and in their post-processing method for the removal of processing distortions.

Non-linear spectral subtraction methods are heuristic methods that utilise estimates of the local SNR, and the observation that at a low SNR over-subtraction can produce improved results. For an explanation of the improvement that can result from over-subtraction, consider the following expression of the basic spectral subtraction equation:

$$|\hat{X}(f)| = |Y(f)| - \overline{|N(f)|} \approx |X(f)| + |N(f)| - \overline{|N(f)|} \approx |X(f)| + V_N(f) \tag{11.23}$$

where $\overline{|N(f)|}$ is the time-averaged noise spectrum and $V_N(f)$ is the zero-mean random component of the noise spectrum. If $V_N(f)$ is well above the signal $X(f)$ then the signal may be considered as lost to noise. In this case, over-subtraction, followed by non-linear processing of the negative estimates, results in a higher overall attenuation of the noise. This argument explains why subtracting more than the noise average can sometimes produce better results.
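The over-subtraction argument can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes noise-only frequency bins whose magnitudes are Rayleigh distributed (complex Gaussian noise), and it floors the estimate at a small fraction `beta` of the noisy magnitude as the non-linear processing step. The function name `residual_noise` and all parameter values are illustrative assumptions.

```python
import numpy as np

def residual_noise(alpha, beta=0.01, n_bins=100_000, seed=0):
    """Mean residual magnitude in noise-only bins after subtracting alpha * mean noise."""
    rng = np.random.default_rng(seed)
    # Assumed noise model: complex Gaussian bins, so |N(f)| is Rayleigh distributed.
    noise_mag = np.abs(rng.normal(size=n_bins) + 1j * rng.normal(size=n_bins))
    noise_mean = noise_mag.mean()
    y_mag = noise_mag  # no signal present, so Y(f) = N(f)
    x_hat = y_mag - alpha * noise_mean
    # Non-linear processing of negative (or very small) estimates: floor at beta * |Y(f)|.
    x_hat = np.maximum(x_hat, beta * y_mag)
    return float(x_hat.mean())

res_basic = residual_noise(alpha=1.0)  # subtract exactly the noise average
res_over = residual_noise(alpha=2.0)   # over-subtraction
print(f"residual, alpha=1: {res_basic:.4f}   residual, alpha=2: {res_over:.4f}")
```

Under these assumptions the residual noise left in noise-only bins is smaller with `alpha=2` than with `alpha=1`, which is the effect the argument above describes. The sketch says nothing about the signal distortion that over-subtraction causes when speech is present.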
The non-linear variants of spectral subtraction may be described by the following equation:

$$|\hat{X}(f)| = |Y(f)| - \alpha(SNR(f))\,|N(f)|_{NL} \tag{11.24}$$

where $\alpha(SNR(f))$ is an SNR-dependent subtraction factor and $|N(f)|_{NL}$ is a non-linear estimate of the noise spectrum. The spectral estimate is further processed to avoid negative estimates as

$$|\hat{X}(f)| = \begin{cases} |\hat{X}(f)| & \text{if } |\hat{X}(f)| > \beta\,|Y(f)| \\ \beta\,|Y(f)| & \text{otherwise} \end{cases} \tag{11.25}$$

One form of an SNR-dependent subtraction factor for Equation (11.24) is given by

$$\alpha(SNR(f)) = 1 + \frac{sd(|N(f)|)}{\overline{|N(f)|}} \tag{11.26}$$

where the function $sd(|N(f)|)$ is the standard deviation of the noise at frequency $f$. For white noise, $sd(|N(f)|) = \sigma_n$, where $\sigma_n^2$ is the noise variance. Substitution of Equation (11.26) in Equation (11.24) yields

$$|\hat{X}(f)| = |Y(f)| - \left[ 1 + \frac{sd(|N(f)|)}{\overline{|N(f)|}} \right] \overline{|N(f)|} \tag{11.27}$$

In Equation (11.27) the subtraction factor depends on the mean and the variance of the noise.
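Equations (11.25) to (11.27) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the noise mean and standard deviation are estimated per frequency bin from a set of noise-only magnitude spectra, and the names `snr_dependent_subtraction`, `noise_frames` and the guard constant `eps` are hypothetical, not from the text.

```python
import numpy as np

def snr_dependent_subtraction(y_mag, noise_frames, beta=0.01):
    """Apply Eqs. (11.26), (11.27) and the floor of Eq. (11.25) to one frame.

    y_mag        : (n_bins,) magnitude spectrum |Y(f)| of the noisy frame
    noise_frames : (n_frames, n_bins) magnitude spectra of noise-only frames
    """
    y_mag = np.asarray(y_mag, dtype=float)
    noise_frames = np.asarray(noise_frames, dtype=float)
    noise_mean = noise_frames.mean(axis=0)  # time-averaged |N(f)|
    noise_sd = noise_frames.std(axis=0)     # sd(|N(f)|)
    eps = 1e-12                             # assumed guard against division by zero
    alpha = 1.0 + noise_sd / (noise_mean + eps)      # Eq. (11.26)
    x_hat = y_mag - alpha * noise_mean               # Eq. (11.27)
    floor = beta * y_mag
    x_hat = np.where(x_hat > floor, x_hat, floor)    # Eq. (11.25)
    return x_hat, alpha

# Two bins: the first has constant noise (zero variance), the second fluctuates.
noise_frames = np.array([[1.0, 2.0],
                         [1.0, 4.0]])
x_hat, alpha = snr_dependent_subtraction(np.array([5.0, 4.0]), noise_frames)
print(alpha, x_hat)
```

In the first bin the noise has zero variance, so the factor is 1 and only the mean is subtracted. In the second bin the amount subtracted is the mean plus one standard deviation, and the result falls below the floor, so it is replaced by `beta * |Y(f)|`.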
Note that the amount over-subtracted is the standard deviation of the noise. This heuristic formula is appealing because at one extreme, for deterministic noise with a zero variance, such as a sine wave, $\alpha(SNR(f)) = 1$, and at the other extreme, for white noise, $\alpha(SNR(f)) = 2$. In the application of spectral subtraction to speech recognition, it is found that the best subtraction factor is usually between 1 and 2.

In the non-linear spectral subtraction method of Lockwood and Boudy, the spectral subtraction filter is obtained from

$$H(f) = \frac{|Y(f)|^2 - |N(f)|^2_{NL}}{|Y(f)|^2} \tag{11.28}$$

Lockwood and Boudy suggested the following function as a non-linear estimator of the noise spectrum:

$$|N(f)|^2_{NL} = \Phi\!\left( \max_{\text{over } M \text{ frames}} |N(f)|^2,\; SNR(f),\; \overline{|N(f)|^2} \right) \tag{11.29}$$

[Figure 11.6 Illustration of the effects of non-linear spectral subtraction. Panels: (a) original clean speech; (b) noisy speech at 12 dB; (c) non-linear spectral subtraction; (d) non-linear spectral subtraction with smoothing. Axes: time frame, filter bank channel.]

The estimate of the noise spectrum is a function of the maximum value of the noise spectrum over $M$ frames, and the signal-to-noise ratio.
One form for the non-linear function $\Phi(\cdot)$ is given by the following equation:

$$\Phi\!\left( \max_{\text{over } M \text{ frames}} |N(f)|^2,\; SNR(f) \right) = \frac{\max\limits_{\text{over } M \text{ frames}} \left( |N(f)|^2 \right)}{1 + \gamma\, SNR(f)} \tag{11.30}$$

where $\gamma$ is a design parameter. From Equation (11.30), as the SNR decreases the output of the non-linear estimator $\Phi(\cdot)$ approaches $\max(|N(f)|^2)$, and as the SNR increases it approaches zero. For over-subtraction, the noise estimate is forced to be an over-estimation by using the following limiting function:

$$\overline{|N(f)|^2} \;\le\; \Phi\!\left( \max_{\text{over } M \text{ frames}} |N(f)|^2,\; SNR(f),\; \overline{|N(f)|^2} \right) \;\le\; 3\,\overline{|N(f)|^2} \tag{11.31}$$

[Figure 11.7 Block diagram configuration of a spectral subtraction system. PSP = post spectral subtraction processing.]

The maximum attenuation of the spectral subtraction filter is limited to $H(f) \ge \beta$, where usually the lower bound $\beta \ge 0.01$.
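Equations (11.28), (11.30) and (11.31), together with the lower bound on $H(f)$, can be combined into a short sketch. This is an illustration under assumptions, not Lockwood and Boudy's implementation: the SNR is taken as a linear (not dB) per-bin estimate supplied by the caller, and the function names, the default `gamma=1.0` and the small denominator guard are hypothetical choices.

```python
import numpy as np

def nonlinear_noise_estimate(noise_power_frames, snr, gamma=1.0):
    """Non-linear noise power estimate |N(f)|^2_NL per frequency bin.

    noise_power_frames : (M, n_bins) noise power spectra |N(f)|^2 over M frames
    snr                : (n_bins,) assumed linear-scale SNR estimate per bin
    """
    noise_power_frames = np.asarray(noise_power_frames, dtype=float)
    snr = np.asarray(snr, dtype=float)
    noise_max = noise_power_frames.max(axis=0)   # maximum over M frames
    noise_avg = noise_power_frames.mean(axis=0)  # time-averaged |N(f)|^2
    phi = noise_max / (1.0 + gamma * snr)        # Eq. (11.30)
    # Eq. (11.31): keep the estimate between the average and three times the average.
    return np.clip(phi, noise_avg, 3.0 * noise_avg)

def subtraction_filter(y_power, noise_power_nl, beta=0.01):
    """Spectral subtraction filter of Eq. (11.28), limited to H(f) >= beta."""
    y_power = np.asarray(y_power, dtype=float)
    h = (y_power - noise_power_nl) / np.maximum(y_power, 1e-12)
    return np.maximum(h, beta)

# Two bins over M = 2 noise frames; bin 0 has low SNR, bin 1 has high SNR.
noise_power_frames = np.array([[1.0, 1.0],
                               [3.0, 1.0]])
n_nl = nonlinear_noise_estimate(noise_power_frames, snr=np.array([0.0, 9.0]))
h = subtraction_filter(np.array([12.0, 1.0]), n_nl)
print(n_nl, h)
```

In the low-SNR bin the estimate equals the maximum over the frames. In the high-SNR bin the value from Equation (11.30) would fall below the average noise power, so the limiting function raises it back to the average.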
Figure 11.6 illustrates the effects of non-linear spectral subtraction and smoothing in restoration of the spectrum of a speech signal.

11.4 Implementation of Spectral Subtraction

Figure 11.7 is a block diagram illustration of a spectral subtraction system. It includes the following subsystems:

(a) a silence detector for detection of the periods of signal inactivity; the noise spectrum is updated during these periods;
(b) a discrete Fourier transformer (DFT) for transforming the time domain signal to the frequency domain; the DFT is followed by a magnitude operator;
(c) a lowpass filter (LPF) for reducing the noise variance; the purpose of the LPF is to reduce the processing distortions due to noise variations;
(d) a post-processor for removing the processing distortions introduced by spectral subtraction;
(e) an inverse discrete Fourier transform (IDFT) for transforming the processed signal to the time domain;
(f) an attenuator γ for attenuation of the noise during silent periods.

The DFT-based spectral subtraction is a block processing algorithm.
The incoming audio signal is buffered and divided into overlapping blocks of N samples as shown in Figure 11.7. Each block is Hanning (or Hamming) windowed, and then transformed via a DFT to the frequency domain. After spectral subtraction, the magnitude spectrum is combined with the phase of the noisy signal, and transformed back to the time domain. Each signal block is then overlapped and added to the preceding and succeeding blocks to form the final output.

The choice of the block length for spectral analysis is a compromise between the conflicting requirements of the time resolution and the spectral resolution.
Typically a block length of 5–50 milliseconds is used. At a sampling rate of, say, 20 kHz, this translates to a value for N in the range of 100–1000 samples. The frequency resolution of the spectrum is directly proportional to the number of samples, N. A larger value of N produces a better estimate of the spectrum. This is particularly true for the lower part of the frequency spectrum, since low-frequency components vary slowly with time, and require a larger window for a stable estimate.
The conflicting requirement is that, owing to the non-stationary nature of audio signals, the window length should not be too large, so that short-duration events are not obscured.

The main function of the window and the overlap operations (Figure 11.8) is to alleviate discontinuities at the endpoints of each output block. Although there are a number of useful windows with different frequency/time characteristics, in most implementations of spectral subtraction a Hanning window is used. In removing distortions introduced by spectral subtraction, the post-processor algorithm makes use of such information as the correlation of each frequency channel from one block to the next, and the durations of the signal events and the distortions.
The correlation of the signal spectral components, along the time dimension, can be partially controlled by the choice of the window length and the overlap. The correlation of spectral components along the time dimension increases with decreasing window length and increasing overlap.

[Figure 11.8 Illustration of the window and overlap process in spectral subtraction. Horizontal axis: time.]

[Figure 11.9 (a) A noisy signal. (b) Restored signal after spectral subtraction. (c) Noise estimate obtained by subtracting (b) from (a). Axes: amplitude versus time.]
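The block-processing procedure of Section 11.4 (windowing, DFT, magnitude subtraction, recombination with the noisy phase, IDFT, overlap-add) can be sketched as follows. This is a minimal sketch under stated assumptions rather than a complete system: it uses a fixed 50% overlap with a periodic Hanning window, takes a precomputed noise magnitude spectrum `noise_mag` instead of a silence detector, and omits the lowpass filter, the post-processor and the attenuator γ. The function name and parameter defaults are hypothetical.

```python
import numpy as np

def spectral_subtract_ola(y, noise_mag, block_len=256, alpha=1.0, beta=0.01):
    """Basic magnitude spectral subtraction with 50% overlap-add.

    y         : 1-D noisy time-domain signal
    noise_mag : scalar or (block_len // 2 + 1,) average noise magnitude spectrum,
                assumed to have been estimated from windowed noise-only blocks
    """
    y = np.asarray(y, dtype=float)
    hop = block_len // 2
    # Periodic Hanning window: copies shifted by half a block sum to exactly 1,
    # so overlap-add reconstructs an unmodified signal without scaling.
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(block_len) / block_len)
    out = np.zeros(len(y))
    for start in range(0, len(y) - block_len + 1, hop):
        block = y[start:start + block_len] * window
        spec = np.fft.rfft(block)
        mag, phase = np.abs(spec), np.angle(spec)
        # Subtract the scaled noise magnitude and floor the result at beta * |Y(f)|.
        x_mag = np.maximum(mag - alpha * noise_mag, beta * mag)
        # Recombine with the phase of the noisy signal and return to the time domain.
        restored = np.fft.irfft(x_mag * np.exp(1j * phase), n=block_len)
        out[start:start + block_len] += restored
    return out

rng = np.random.default_rng(0)
y = rng.normal(size=1024)
passthrough = spectral_subtract_ola(y, noise_mag=0.0)  # nothing subtracted
floored = spectral_subtract_ola(y, noise_mag=1e6)      # every bin hits the floor
```

With a zero noise estimate the output reproduces the input wherever two blocks overlap, which is a convenient check that the window and overlap are consistent. At a 20 kHz sampling rate the assumed 256-sample block corresponds to 12.8 ms, inside the 5–50 ms range quoted above. The first and last half-blocks are covered by only one window and are therefore tapered.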