ICASSP 2008 - 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing - March 30 - April 4, 2008 - Las Vegas, Nevada, U.S.A.

T-7: Modulation Frequency Analysis and Modification of Signals

Monday Morning, March 31
09:00 - 12:00

Presented by

Les Atlas, University of Washington, USA; Somsak Sukittanon, University of Tennessee, USA; and Steven M. Schimmel, University of Zurich, Switzerland

Abstract

If regular Fourier frequency analysis models steady-state signals and systems, then modulation frequency processing represents the analysis and filtering of patterns in a signal’s deterministic variability from steady state. In digital signal processing terms, modulation frequency analysis performs spectral analysis of the amplitude envelopes of a signal’s time-dependent transform (or, equivalently, of its subbands), and organizes the subband envelope spectra in a convenient and often compact representation called the modulation spectrogram. As a psychoacoustically motivated yet realizable representation of signals, the modulation spectrogram has proven to be a useful analysis tool in areas such as underwater acoustics [11], audio compression (e.g., [3,19,21]), signal classification (e.g., [10,18]), and speech recognition, and a powerful means of signal modification for applications such as speech and music enhancement (e.g., [2,16]), single-channel source separation, and noise suppression.
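
As a rough illustration of the three steps just described (time-dependent transform, subband amplitude envelopes, envelope spectra), the following Python sketch computes a small modulation spectrogram from scratch. It is not the Modulation Toolbox: the function names, the Hann window, the frame length and hop, the assumed 1024 Hz sampling rate, and the test signal are all choices made for this example, and the simple magnitude envelope is used.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for a small demo)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def modulation_spectrogram(signal, frame_len=32, hop=8):
    """Return M[k][m]: magnitude at acoustic bin k and modulation bin m."""
    # Periodic Hann window for the short-time Fourier transform.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1

    # Step 1: time-dependent transform (one spectrum per frame).
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append(dft(frame)[:n_bins])

    # Steps 2 and 3: per subband, take the magnitude envelope across frames,
    # remove its mean, and compute the spectrum of that envelope.
    result = []
    for k in range(n_bins):
        envelope = [abs(spectrum[k]) for spectrum in frames]
        mean = sum(envelope) / len(envelope)
        env_spectrum = dft([e - mean for e in envelope])
        result.append([abs(c) for c in env_spectrum[: len(envelope) // 2 + 1]])
    return result


# Hypothetical test signal: a 256 Hz carrier, amplitude-modulated at 8 Hz,
# sampled at an assumed rate of 1024 Hz (536 samples give 64 frames).
FS = 1024
x = [(1 + 0.5 * math.cos(2 * math.pi * 8 * n / FS))
     * math.cos(2 * math.pi * 256 * n / FS) for n in range(536)]
M = modulation_spectrogram(x)

# Locate the strongest (acoustic bin, modulation bin) pair.
peak_k, peak_m = max(
    ((k, m) for k in range(len(M)) for m in range(len(M[0]))),
    key=lambda km: M[km[0]][km[1]],
)
```

With these assumed settings the acoustic bins are 32 Hz apart and the 64 envelope samples resolve modulation frequency in 2 Hz steps, so the energy concentrates at acoustic bin 8 (256 Hz) and modulation bin 4 (8 Hz).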

Results in speech processing, psychophysics, and auditory sciences consistently demonstrate the qualitative importance of modulation frequency (e.g., [4,5,7]). The modulation frequency concept, however, has lacked a consistently followed quantitative foundation. Quite recently, building upon decades of work by others, we have summarized signal processing principles for the well-founded analysis and design of modulation frequency analysis and filtering systems. Moreover, recent breakthroughs, such as the discovery that modulation envelopes of natural signals are usually complex, allow modulation to be modified without added artifacts while satisfying general superposition constraints.

In this tutorial we present a quantitative foundation of modulation frequency that bridges the gap with previous qualitative results. The objectives of this tutorial are to bring the audience up to date with the latest theoretical insights into modulation frequency analysis, to demonstrate the relevance of the theory to a selection of applications, to discuss remaining open research questions, and to introduce the Modulation Toolbox for MATLAB™. The tutorial provides attendees with the theoretical knowledge, practical examples, and a CD-ROM of MATLAB™ tools needed to apply modulation frequency analysis and processing to their own areas of research.

The tutorial is intended for research professionals in academia and industry who are interested in a novel approach to traditional signal processing problems which goes beyond Fourier and wavelet techniques. A basic signal processing background is expected, but no prior knowledge of modulation theory is assumed.

Outline

Part 1: Theory of modulation frequency analysis
Presented by Les Atlas

  • History
    1. Rice's representation, Zadeh's and Kailath's early contributions
    2. Oppenheim's and Stockham's homomorphic demultiplication
    3. Dugundji's pre-envelope (analytic signal) and the Hilbert envelope
    4. Modern coherent signal models
  • What's wrong with Hilbert envelopes?
    1. Real vs. complex envelope
    2. Effect of the envelope detection operation on speech modulation spectrograms
      • Incoherent envelope (Hilbert or magnitude envelope)
      • Coherent envelope (instantaneous frequency estimation, frequency reassignment)
  • Analysis (energetic representation) versus analysis/synthesis (modulation filtering)
  • Desirable properties of a modulation filter
    1. Superposition
    2. Frequency-shift invariance
  • Carrier estimates
    1. Instantaneous frequency from analytic signals
    2. Discriminant and bandlimited instantaneous frequency estimators
  • Foundations of bandlimited carrier estimates
  • Remaining theoretical research questions and opportunities
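
One way to read the incoherent versus coherent distinction and the superposition property listed above is sketched below in Python. It assumes a complex subband signal whose carrier frequency is already known, which sidesteps the carrier estimation problem entirely; the function names and the test components are hypothetical and are not taken from the tutorial material.

```python
import cmath
import math


def incoherent_envelope(subband):
    """Magnitude envelope: discards the phase of the subband signal."""
    return [abs(v) for v in subband]


def coherent_envelope(subband, carrier_freq):
    """Complex envelope: demodulate by a carrier of known normalized frequency
    (cycles per sample). Knowing the carrier is an assumption of this sketch."""
    return [v * cmath.exp(-2j * math.pi * carrier_freq * n)
            for n, v in enumerate(subband)]


def max_superposition_error(detector, a, b):
    """Largest deviation of detector(a + b) from detector(a) + detector(b)."""
    combined = detector([p + q for p, q in zip(a, b)])
    separate = [p + q for p, q in zip(detector(a), detector(b))]
    return max(abs(c - s) for c, s in zip(combined, separate))


F0 = 0.125  # hypothetical carrier frequency, in cycles per sample
N = 64
carrier = [cmath.exp(2j * math.pi * F0 * n) for n in range(N)]

# Two components sharing one carrier; the second is in antiphase.
a = [(1 + 0.5 * math.cos(2 * math.pi * n / 32)) * c
     for n, c in enumerate(carrier)]
b = [-0.4 * c for c in carrier]

incoherent_error = max_superposition_error(incoherent_envelope, a, b)
coherent_error = max_superposition_error(
    lambda s: coherent_envelope(s, F0), a, b)
```

Here the magnitude envelope of the sum differs from the sum of the magnitude envelopes by 0.8 at every sample, because the antiphase component cancels part of the first, whereas the coherent envelope is a linear operation and its error stays at floating-point rounding level.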

Part 2: Modulation analysis and filtering for speech signals
Presented by Steven Schimmel

  • Introduction to the Modulation Toolbox (MATLAB™) and software demo
  • Analysis: features of speech modulation spectrograms
    1. Pitch, syllabic rate, phonemic rate, harmonics, and formants
    2. Importance of modulation phase
  • Application to audio coding
    1. Low-bitrate modulation representation of audio
    2. Demonstration of a low-bitrate audio coder
  • Filtering
    1. Modulation filters and effective modulation frequency response
    2. Modulation filtering for improved speech intelligibility
    3. Talker separation based on modulation analysis
  • Challenges
    • Speech carrier estimation, in relation to harmonics and formants

Part 3: Applications of modulation spectral analysis
Presented by Somsak Sukittanon

  • Comparison and connection of different modulation frequency methods
    1. Non-parametric approaches, e.g. 1) time-frequency, 2) synchronized block averaging (cyclostationary assumption), and 3) modulation spectrogram
    2. Parametric approaches, e.g. LPSD (Linear Prediction Spectral Domain)
  • Auditory inspired modulation frequency using non-uniform modulation frequency decomposition
    1. Modulation scale
    2. Multi-scale spectro-temporal modulations
    3. Fepstrum analysis
  • Applications to pattern classification
    1. Speech vs. non-speech classification
    2. Speech and speaker recognition
    3. Content identification
    4. In-building machine classification

Speaker Biography

Les E. Atlas received his M.S. and Ph.D. degrees in Electrical Engineering from Stanford University in 1979 and 1984, respectively. He joined the University of Washington in 1984, where he is a Professor of Electrical Engineering. His research is in digital signal processing, with specializations in acoustic analysis, time-frequency representations, and signal recognition and coding. His research is supported by DARPA, the Office of Naval Research, the Army Research Lab, and the Washington Research Foundation. Dr. Atlas received a National Science Foundation Presidential Young Investigator Award and a 2004 Fulbright Senior Research Scholar Award. He was General Chair of the 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing, Chair of the IEEE Signal Processing Society Technical Committee on Theory and Methods, and a member-at-large of the Signal Processing Society’s Board of Governors. He is an IEEE Fellow. His webpage is at http://www.ee.washington.edu/people/faculty/atlas/.

Somsak Sukittanon received the M.S. and Ph.D. degrees in Electrical Engineering from the University of Washington, Seattle, in 1999 and 2004, respectively. He joined the University of Tennessee, Martin in 2007. During 1999 and 2000, he worked at Cantametrix Inc., Bellevue, WA (now acquired by Gracenote Inc., CA). He created several parametric music content extractors that contributed to a novel and powerful music content search capability. In summer 2003, he was an intern at Microsoft Research, Redmond, WA, in the Communication, Collaboration, and Signal Processing Group. Between 2004 and 2006, he was an R&D engineer at Virtual DSP Corporation and ShotSpotter Inc., where he worked on acoustic classification over sensor networks. His research interests include audio processing and classification, embedded systems, signal enhancement, and machine learning. He received the annual Outstanding Faculty Teaching Award for the department of electrical engineering at the University of Washington in 2005. His webpage is at http://www.ee.washington.edu/research/isdl/people/somsaksukittanon/.

Steven M. Schimmel received the M.S. degree in Computer Science from the Delft University of Technology, The Netherlands, in 2001, and the Ph.D. degree in Electrical Engineering from the University of Washington, Seattle, in 2007. He currently holds a post-doc position at the University of Zurich, Switzerland. At the University of Washington, his doctoral research focused on the theory of modulation transforms, and on applications of modulation analysis and filtering of speech to hearing aids and cochlear implants. Under the supervision of Dr. Atlas, he worked on target talker enhancement in cocktail party conditions using modulation analysis and filtering. His current research focuses on signal processing algorithms for binaural speech and music enhancement for hearing devices. His webpage can be found at http://isdl.ee.washington.edu/people/stevenschimmel/.

