Automatic speech recognition (ASR) systems are finding increasing use in everyday life. Many of the commonplace environments in which they are used are noisy, for example when users call a voice search system from a busy cafeteria or a street. The resulting degraded speech recordings adversely affect the performance of speech recognition systems. As the use of ASR systems grows, knowledge of the state-of-the-art techniques for dealing with such problems becomes critical for system and application engineers and for researchers who work with or on ASR technologies. This book presents a comprehensive survey of the state of the art in techniques used to improve the robustness of speech recognition systems to these degrading external influences.
Key features:
- Reviews all the main noise robust ASR approaches, including signal separation, voice activity detection, robust feature extraction, model compensation and adaptation, missing data techniques and recognition of reverberant speech.
- Acts as a timely exposition of the topic in light of the increasingly widespread use of ASR technology in challenging environments.
- Addresses robustness to noise and signal degradation, both key concerns for practitioners of ASR.
- Includes contributions from top ASR researchers from leading research units in the field.
Content:
Chapter 1 Introduction (pages 1–5): Tuomas Virtanen, Rita Singh and Bhiksha Raj
Chapter 2 The Basics of Automatic Speech Recognition (pages 7–30): Rita Singh, Bhiksha Raj and Tuomas Virtanen
Chapter 3 The Problem of Robustness in Automatic Speech Recognition (pages 31–50): Bhiksha Raj, Tuomas Virtanen and Rita Singh
Chapter 4 Voice Activity Detection, Noise Estimation, and Adaptive Filters for Acoustic Signal Enhancement (pages 51–85): Rainer Martin and Dorothea Kolossa
Chapter 5 Extraction of Speech from Mixture Signals (pages 87–108): Paris Smaragdis
Chapter 6 Microphone Arrays (pages 109–157): John McDonough and Kenichi Kumatani
Chapter 7 From Signals to Speech Features by Digital Signal Processing (pages 159–192): Matthias Wölfel
Chapter 8 Features Based on Auditory Physiology and Perception (pages 193–227): Richard M. Stern and Nelson Morgan
Chapter 9 Feature Compensation (pages 229–250): Jasha Droppo
Chapter 10 Reverberant Speech Recognition (pages 251–281): Reinhold Haeb-Umbach and Alexander Krueger
Chapter 11 Adaptation and Discriminative Training of Acoustic Models (pages 283–310): Yannick Estève and Paul Deléglise
Chapter 12 Factorial Models for Noise Robust Speech Recognition (pages 311–345): John R. Hershey, Steven J. Rennie and Jonathan Le Roux
Chapter 13 Acoustic Model Training for Robust Speech Recognition (pages 347–368): Michael L. Seltzer
Chapter 14 Missing-Data Techniques: Recognition with Incomplete Spectrograms (pages 369–398): Jon Barker
Chapter 15 Missing-Data Techniques: Feature Reconstruction (pages 399–432): Jort Florent Gemmeke and Ulpu Remes
Chapter 16 Computational Auditory Scene Analysis and Automatic Speech Recognition (pages 433–462): Arun Narayanan and Deliang Wang
Chapter 17 Uncertainty Decoding (pages 463–486): Hank Liao