
E-book: Speech Enhancement

Edited by Jacob Benesty, Shoji Makino, Jingdong Chen
  • Format: PDF+DRM
  • Series: Signals and Communication Technology
  • Publication date: 30-Mar-2006
  • Publisher: Springer-Verlag Berlin and Heidelberg GmbH & Co. K
  • Language: English
  • ISBN-13: 9783540274896
  • Price: 106,47 €*
  • * This is the final price, i.e., no additional discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned, and payments for purchased e-books are not refunded.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has supplied this book in encrypted form, which means you need to install free software to unlock and read it. To read this e-book, you must create an Adobe ID. More information here. The e-book can be read and downloaded on up to 6 devices (by a single user with the same Adobe ID).

    Required software
    To read this e-book on a mobile device (phone or tablet), you will need to install this free app: PocketBook Reader (iOS / Android)

    To download and read this e-book on a PC or Mac, you will need Adobe Digital Editions (a free app designed specifically for e-books; it is not the same as Adobe Reader, which you may already have on your computer).

    You cannot read this e-book on an Amazon Kindle.

A strong reference on the problem of signal and speech enhancement, describing the newest developments in this exciting field. The general emphasis is on noise reduction, because of the large number of applications that can benefit from this technology.

We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be "cleaned" with digital signal processing tools before it is played out, transmitted, or stored.

This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise reduction but also dereverberation and separation of independent signals. These topics are also covered in this book. However, the general emphasis is on noise reduction because of the large number of applications that can benefit from this technology.

The goal of this book is to provide a strong reference for researchers, engineers, and graduate students who are interested in the problem of signal and speech enhancement. To do so, we invited well-known experts to contribute chapters covering the state of the art in this focused field.
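For orientation, the sketch below is a generic, minimal example of the kind of single-channel, frequency-domain noise reduction studied in Chapters 2 and 3: estimate the noise power spectrum, derive a Wiener-style spectral gain, and apply it to the short-time spectrum of the noisy signal. It is not taken from the book; the function name, the parameters, and the assumption that the opening half-second of the recording is noise-only are illustrative choices.

```python
# Illustrative only: a generic Wiener-style spectral-gain noise suppressor,
# assuming roughly stationary noise and that the first `noise_seconds` of the
# recording contain noise only. Not the book's algorithm.
import numpy as np
from scipy.signal import stft, istft


def wiener_denoise(noisy, fs, noise_seconds=0.5, nperseg=512, gain_floor=0.1):
    """Suppress stationary noise in a single-channel recording."""
    # Short-time Fourier transform of the noisy signal.
    _, _, Y = stft(noisy, fs=fs, nperseg=nperseg)

    # Estimate the noise power spectrum from the leading noise-only frames.
    hop = nperseg // 2
    n_noise_frames = max(1, int(noise_seconds * fs / hop))
    noise_psd = np.mean(np.abs(Y[:, :n_noise_frames]) ** 2, axis=1, keepdims=True)

    # Wiener-style gain G = max(1 - noise_PSD / noisy_PSD, floor); the floor
    # limits speech distortion and musical noise at the price of less reduction.
    noisy_psd = np.maximum(np.abs(Y) ** 2, 1e-12)
    gain = np.maximum(1.0 - noise_psd / noisy_psd, gain_floor)

    # Apply the gain and return to the time domain.
    _, enhanced = istft(gain * Y, fs=fs, nperseg=nperseg)
    return enhanced[: len(noisy)]


if __name__ == "__main__":
    # Tiny synthetic check: a tone buried in white noise, preceded by noise only.
    fs = 16000
    rng = np.random.default_rng(0)
    clean = np.concatenate([np.zeros(fs // 2), np.sin(2 * np.pi * 440 * np.arange(fs) / fs)])
    noisy = clean + 0.1 * rng.standard_normal(clean.size)
    enhanced = wiener_denoise(noisy, fs)
```

The gain floor is a common practical compromise between noise reduction and speech distortion, the trade-off analyzed in detail in Chapter 2.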
1 Introduction
1(8)
Jacob Benesty, Shoji Makino, Jingdong Chen
1.1 Speech Enhancement
1(2)
1.2 Challenges and Opportunities
3(1)
1.3 Organization of the Book
4(3)
1.4 Further Reading
7(1)
References
8(1)
2 Study of the Wiener Filter for Noise Reduction
9(34)
Jacob Benesty, Jingdong Chen, Yiteng (Arden) Huang, Simon Doclo
2.1 Introduction
9(2)
2.2 Estimation of the Clean Speech Samples
11(2)
2.3 Estimation of the Noise Samples
13(1)
2.4 Important Relationships Between Noise Reduction and Speech Distortion
14(7)
2.5 Particular Case: White Gaussian Noise
21(2)
2.6 Better Ways to Manage Noise Reduction and Speech Distortion
23(6)
2.6.1 A Suboptimal Filter
23(3)
2.6.2 Noise Reduction Exploiting the Speech Model
26(1)
2.6.3 Noise Reduction with Multiple Microphones
27(2)
2.7 Simulation Experiments
29(8)
2.8 Conclusions
37(1)
References
38(5)
3 Statistical Methods for the Enhancement of Noisy Speech
43(24)
Rainer Martin
3.1 Introduction
43(1)
3.2 Spectral Analysis
44(1)
3.3 The Wiener Filter and its Implementation
45(6)
3.4 Estimation of Spectral Amplitudes
51(3)
3.4.1 MMSE Estimation
51(2)
3.4.2 Maximum Likelihood and MAP Estimation
53(1)
3.5 MMSE Estimation Using Super-Gaussian Speech Models
54(4)
3.6 Background Noise Power Estimation
58(2)
3.6.1 Minimum Statistics Noise Power Estimation
58(2)
3.7 The MELPe Speech Coder
60(2)
3.8 Conclusions
62(1)
References
63(4)
4 Single- and Multi-Microphone Spectral Amplitude Estimation Using a Super-Gaussian Speech Model
67(30)
Thomas Lotter
4.1 Introduction
67(1)
4.2 Single-Channel Statistical Filter
68(15)
4.2.1 Statistical Model
70(8)
4.2.2 Speech Estimators
78(5)
4.3 Multichannel Statistical Filter
83(5)
4.3.1 Joint Statistical Model
84(2)
4.3.2 Multichannel MAP Spectral Amplitude Estimation
86(2)
4.4 Experimental Results
88(5)
4.5 Conclusions
93(1)
References
93(4)
5 From Volatility Modeling of Financial Time-Series to Stochastic Modeling and Enhancement of Speech Signals
97(18)
Israel Cohen
5.1 Introduction
97(2)
5.2 Problem Formulation
99(2)
5.3 Spectral Analysis
101(3)
5.4 Statistical Model for Speech Signals
104(1)
5.5 Model Estimation
105(1)
5.6 Experimental Results
106(2)
5.7 Conclusions
108(3)
References
111(4)
6 Single-Microphone Noise Suppression for 3G Handsets Based on Weighted Noise Estimation
115(20)
Akihiko Sugiyama, Masanori Kato, Masahiro Serizawa
6.1 Introduction
115(2)
6.2 Conventional Noise Suppression Algorithm
117(3)
6.2.1 MMSE-STSA
117(2)
6.2.2 Problem in Noise Estimation
119(1)
6.3 New Noise Suppression Algorithm
120(4)
6.3.1 Weighted Noise Estimation
121(2)
6.3.2 Spectral Gain Modification
123(1)
6.3.3 Computational Requirements
123(1)
6.4 Evaluation
124(7)
6.4.1 Objective Evaluation for Noise Estimation
125(2)
6.4.2 Subjective Evaluation
127(4)
6.5 Conclusions
131(1)
References
131(4)
7 Signal Subspace Techniques for Speech Enhancement
135(26)
Firas Jabloun, Benoit Champagne
7.1 Introduction
135(2)
7.2 Signal and Noise Models
137(1)
7.3 Linear Signal Estimation
138(5)
7.3.1 Least-Squares Estimator
139(1)
7.3.2 The Linear Minimum Mean Squared Error Estimator
139(1)
7.3.3 The Time-Domain Constrained Estimator
140(1)
7.3.4 The Spectral-Domain Constrained Estimator
141(2)
7.4 Handling Colored Noise
143(3)
7.4.1 Prewhitening
143(1)
7.4.2 The Generalized Eigenvalue Decomposition Method
144(1)
7.4.3 The Rayleigh Quotient Method
145(1)
7.5 A Filterbank Interpretation
146(2)
7.5.1 The Frequency to Eigendomain Transformation
146(1)
7.5.2 The Eigen Filterbank
146(2)
7.6 Implementation Issues
148(4)
7.6.1 Estimating the Covariance Matrix
149(1)
7.6.2 Parameter Analysis
150(2)
7.7 Fast Subspace Estimation Techniques
152(3)
7.7.1 Fast Eigenvalue Decomposition Methods
153(1)
7.7.2 Subspace Tracking Methods
153(1)
7.7.3 The Frame Based EVD (FBEVD) Method
154(1)
7.8 Some Recent Developments
155(2)
7.8.1 Auditory Masking
155(1)
7.8.2 Multi-Microphone Systems
156(1)
7.8.3 Subband Processing
156(1)
7.9 Conclusions
157(1)
References
157(4)
8 Speech Enhancement: Application of the Kalman Filter in the Estimate-Maximize (EM) Framework
161(38)
Sharon Gannot
8.1 Introduction
161(5)
8.2 Signal Model
166(2)
8.3 EM-Based Algorithm
168(4)
8.3.1 State Estimation (E-Step)
169(1)
8.3.2 Parameter Estimation (M-Step)
170(1)
8.3.3 Reduced Complexity
171(1)
8.3.4 Discussion
171(1)
8.4 Parameter Estimation Using Higher-Order Statistics
172(2)
8.5 Gradient-Based Sequential Algorithm
174(1)
8.6 All-Kalman Speech and Parameter Estimation
175(6)
8.6.1 Dual Scheme
176(2)
8.6.2 Joint Scheme
178(3)
8.7 Experimental Study
181(8)
8.7.1 Experimental Setup
181(1)
8.7.2 Verifying the Gaussian Assumption
182(1)
8.7.3 Objective Evaluation
183(3)
8.7.4 Subjective Evaluation
186(2)
8.7.5 Comparison Between EM-Based Algorithms
188(1)
8.7.6 Evaluation of the UKF
188(1)
8.8 Conclusions
189(6)
References
195(4)
9 Speech Distortion Weighted Multichannel Wiener Filtering Techniques for Noise Reduction
199(30)
Simon Doclo, Ann Spriet, Jan Wouters, Marc Moonen
9.1 Introduction
199(2)
9.2 GSC and Spatially Pre-Processed SDW-MWF
201(6)
9.2.1 Notation and General Structure
201(3)
9.2.2 Generalized Sidelobe Canceller
204(1)
9.2.3 Speech Distortion Weighted Multichannel Wiener Filter
205(2)
9.3 Frequency-Domain Criterion for SDW-MWF
207(6)
9.3.1 Frequency-Domain Notation
207(1)
9.3.2 Normal Equations
208(2)
9.3.3 Adaptive Algorithm
210(2)
9.3.4 Practical Implementation
212(1)
9.4 Approximations for Reducing the Complexity
213(6)
9.4.1 Block-Diagonal Correlation Matrices
213(3)
9.4.2 Diagonal Correlation Matrices
216(1)
9.4.3 Unconstrained Algorithms
217(1)
9.4.4 Summary
218(1)
9.5 Experimental Results
219(5)
9.5.1 Setup and Performance Measures
219(1)
9.5.2 SNR Improvement and Robustness Against Microphone Mismatch
220(3)
9.5.3 Tracking Performance
223(1)
9.6 Conclusions
224(1)
References
225(4)
10 Adaptive Microphone Arrays Employing Spatial Quadratic Soft Constraints and Spectral Shaping
229(18)
Sven Nordholm, Hai Quang Dam, Nedelko Grbic, Siow Yong Low
10.1 Introduction
229(2)
10.2 Signal Modelling and Problem Formulation
231(4)
10.2.1 Analysis and Synthesis Filterbanks
232(1)
10.2.2 The Wiener Solution
233(1)
10.2.3 The Space Constrained Source Covariance Information
234(1)
10.3 Robust Soft Constrained Adaptive Microphone Array (RSCAMA)
235(3)
10.3.1 Problem Formulation
235(2)
10.3.2 A Recursive Algorithm for the RSCAMA
237(1)
10.4 Noise Statistics Updated Adaptive Microphone Array (NSUAMA)
238(4)
10.4.1 Problem Formulation
238(1)
10.4.2 The Noise Covariance Detector
238(2)
10.4.3 Estimation of Power Spectrum of SOI
240(1)
10.4.4 The NSUAMA Algorithm
241(1)
10.5 Evaluations
242(3)
10.5.1 The Simulation Scenario
242(1)
10.5.2 Results for RSCAMA and NSUAMA Beamformers
242(3)
10.6 Conclusions
245(1)
References
245(2)
11 Single-Microphone Blind Dereverberation
247(24)
Tomohiro Nakatani, Masato Miyoshi, Keisuke Kinoshita
11.1 Introduction
247(2)
11.2 Overview of Existing Approaches
249(2)
11.2.1 Blind Inverse Filtering
249(1)
11.2.2 Dereverberation Based on Speech Signal Features
250(1)
11.3 Harmonicity of Speech Signals and Its Robust Estimation
251(4)
11.3.1 Model of Speech Harmonicity
251(1)
11.3.2 Adaptive Harmonic Filtering
252(1)
11.3.3 Robust F0 Estimation and Voicing Detection
253(2)
11.4 Harmonicity Based Dereverberation - HERB
255(5)
11.4.1 Basic Idea
255(1)
11.4.2 Model of Reverberant Speech Signal
256(2)
11.4.3 Dereverberation Filter
258(1)
11.4.4 Interpretation of the Dereverberation Filter
258(2)
11.5 Implementation of a Prototype System
260(2)
11.5.1 Dereverberation Filter Calculation
261(1)
11.5.2 Heuristics Improving Accuracy of F0 Estimation and Voicing Decisions with Reverberation
261(1)
11.6 Simulation Experiments
262(3)
11.6.1 Task: Dereverberation of Word Utterances
262(1)
11.6.2 Energy Decay Curves of Impulse Responses
262(1)
11.6.3 Speaker Dependent Word Recognition Rate
263(2)
11.7 Future Directions
265(3)
11.7.1 Theoretical Extension of HERB
266(1)
11.7.2 Accuracy Improvement of Speech Model
266(2)
11.7.3 Reduction of Training Data Size
268(1)
11.8 Conclusions
268(1)
References
269(2)
12 Separation and Dereverberation of Speech Signals with Multiple Microphones
271(28)
Yiteng (Arden) Huang, Jacob Benesty, Jingdong Chen
12.1 Introduction
271(3)
12.2 Signal Model and Problem Formulation
274(2)
12.3 Blind Identification of a SIMO System
276(3)
12.4 Separating Reverberant Speech and Concurrent Interference
279(5)
12.4.1 Example: Removing Interference Signals in a 2 x 3 MIMO Acoustic System
279(2)
12.4.2 Generalization
281(3)
12.5 Speech Dereverberation
284(3)
12.5.1 Principle
284(2)
12.5.2 The Least-Squares Implementation
286(1)
12.6 Simulations
287(9)
12.6.1 Performance Measures
287(1)
12.6.2 Experimental Setup
288(2)
12.6.3 Experimental Results
290(6)
12.7 Conclusions
296(1)
References
297(2)
13 Frequency-Domain Blind Source Separation
299(30)
Hiroshi Sawada, Ryo Mukai, Shoko Araki, Shoji Makino
13.1 Introduction
299(2)
13.2 BSS for Convolutive Mixtures
301(1)
13.3 Overview of Frequency-Domain Approach
302(2)
13.4 Complex-Valued ICA
304(2)
13.5 Source Localization
306(5)
13.5.1 Basic Theory for Nearfield Model
307(1)
13.5.2 DOA Estimation with Farfield Model
308(3)
13.6 Permutation Alignment
311(4)
13.6.1 Localization Approach
312(1)
13.6.2 Correlation Approach
312(2)
13.6.3 Integrated Method
314(1)
13.7 Scaling Alignment
315(2)
13.8 Spectral Smoothing
317(3)
13.8.1 Windowing
318(1)
13.8.2 Minimizing Error by Adjusting Scaling Ambiguity
319(1)
13.9 Experimental Results
320(4)
13.9.1 Linear Array
320(2)
13.9.2 Planar Array
322(2)
13.10 Conclusions
324(1)
References
324(5)
14 Subband Based Blind Source Separation
329(24)
Shoko Araki, Shoji Makino
14.1 Introduction
329(2)
14.2 BSS of Convolutive Mixtures
331(2)
14.2.1 Model Description
331(1)
14.2.2 Frequency-Domain BSS and Related Issue
332(1)
14.3 Subband Based BSS
333(6)
14.3.1 Configuration of Subband BSS
333(3)
14.3.2 Time-Domain BSS Implementation for a Separation Stage
336(1)
14.3.3 Solving the Permutation and Scaling Problems
337(2)
14.4 Basic Experiments for Subband BSS
339(5)
14.4.1 Experimental Setup
339(1)
14.4.2 Subband System
340(1)
14.4.3 Conventional Frequency-Domain BSS
340(1)
14.4.4 Conventional Fullband Time-Domain BSS
341(1)
14.4.5 Results
341(2)
14.4.6 Discussion
343(1)
14.5 Frequency-Appropriate Processing for Further Improvement
344(5)
14.5.1 Longer Separation Filters in Low Frequency Bands
345(1)
14.5.2 Overlap-Blockshift in Low Frequency Bands
346(1)
14.5.3 Discussion
347(2)
14.6 Conclusions
349(1)
References
350(3)
15 Real-Time Blind Source Separation for Moving Speech Signals
353(18)
Ryo Mukai, Hiroshi Sawada, Shoko Araki, Shoji Makino
15.1 Introduction
353(2)
15.2 ICA Based BSS of Convolutive Mixtures
355(3)
15.2.1 Frequency-Domain ICA
355(1)
15.2.2 Permutation and Scaling Problems
356(1)
15.2.3 Low Delay Blockwise Batch Algorithm
357(1)
15.3 Residual Crosstalk Cancellation
358(4)
15.3.1 Straight and Crosstalk Components of BSS
358(1)
15.3.2 Model of Residual Crosstalk Component Estimation
359(1)
15.3.3 Adaptive Algorithm and Spectrum Estimation
360(2)
15.4 Experiments and Discussions
362(5)
15.4.1 Experimental Conditions
362(2)
15.4.2 Performance for Fixed Sources
364(1)
15.4.3 Moving Target and Moving Interference
365(1)
15.4.4 Performance of Blockwise Batch Algorithm with Postprocessing
366(1)
15.4.5 Performance of Online Algorithm
367(1)
15.5 Conclusions
367(1)
References
368(3)
16 Separation of Speech by Computational Auditory Scene Analysis
371(32)
Guy J. Brown, DeLiang Wang
16.1 Introduction
371(1)
16.2 Auditory Scene Analysis
372(1)
16.3 Computational Auditory Scene Analysis
373(18)
16.3.1 Peripheral Auditory Processing and Feature Extraction
375(1)
16.3.2 Monaural Approaches
376(6)
16.3.3 Binaural Approaches
382(5)
16.3.4 Frameworks for Cue Integration
387(4)
16.4 Integrating CASA with Speech Recognition
391(3)
16.5 CASA Compared to ICA
394(1)
16.6 Challenges for CASA
395(3)
16.7 Conclusions
398(1)
References
398(5)
Index
403