Research Seminars
Monaural Acoustical Scene Analysis through Harmonic-Temporal Clustering of the
Power Spectrum
Jonathan Le Roux
"Audition" team,
Department of Cognitive Sciences, Ecole Normale Superieure, Paris,
France,
and
Sagayama/Ono Laboratory, Graduate School of Information Science and Technology,
The University of Tokyo, Japan
13 November 2007
Abstract
Designing effective algorithms for single-channel analysis of complex and
varied acoustical scenes is an important and challenging
problem. We present a framework called Harmonic-Temporal
Clustering (HTC), which describes the power
spectrum as a combination of constrained Gaussian Mixture Models.
The parameters of the models are simultaneously estimated through
a global fitting to the observed power spectrum in the time-frequency
domain. The optimal solution can be used for F0 estimation,
both in various noisy environments and in multiple-speaker
situations, and for single-channel speech enhancement,
background retrieval and speaker separation.
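
As a rough illustration of what a "constrained" Gaussian mixture over a spectrum can look like, the sketch below models a single frame as a sum of Gaussians whose centres are tied to one fundamental frequency, then recovers F0 by fitting the model to a synthetic spectrum. It is not the HTC algorithm from the talk. The log-frequency axis, the shared width `sigma`, the fixed harmonic weights, the squared-error criterion, the grid search and the function names `harmonic_model` and `estimate_f0` are all assumptions made for this example, and the temporal part of the model is left out entirely.

```python
import numpy as np

def harmonic_model(log_freqs, f0, weights, sigma=0.02):
    """Gaussian mixture over log-frequency with harmonically tied means.

    Component n is centred at log(n * f0), so one parameter (f0) fixes all
    the means. `sigma` and `weights` are illustrative assumptions.
    """
    n = np.arange(1, len(weights) + 1)
    centres = np.log(f0) + np.log(n)              # harmonic constraint
    diff = log_freqs[:, None] - centres[None, :]
    gauss = np.exp(-0.5 * (diff / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return gauss @ weights                        # weighted sum of components

def estimate_f0(log_freqs, spectrum, candidates, weights, sigma=0.02):
    """Pick the candidate F0 whose model has the smallest squared error.

    A plain grid search stands in for the global fitting procedure, which
    this sketch does not attempt to reproduce.
    """
    errors = [
        np.sum((spectrum - harmonic_model(log_freqs, f0, weights, sigma)) ** 2)
        for f0 in candidates
    ]
    return candidates[int(np.argmin(errors))]

# Synthetic one-frame "power spectrum" generated from the model itself.
log_freqs = np.log(np.linspace(50.0, 2000.0, 4000))
weights = np.array([1.0, 0.6, 0.4, 0.2])          # hypothetical harmonic weights
observed = harmonic_model(log_freqs, 220.0, weights)

candidates = np.arange(100.0, 401.0, 1.0)         # candidate F0 values in Hz
f0_hat = estimate_f0(log_freqs, observed, candidates, weights)
print(f0_hat)
```

Tying every component mean to a single F0 is what makes the mixture constrained in this toy version: the fit has far fewer free parameters than an unconstrained mixture, which is why a simple search over F0 is enough here.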
(Jonathan will also talk about probabilistic methods for musical, speech and
acoustical signal processing at the Sagayama/Ono laboratory.)