Multiresolution STFT Phase Estimation with Frame-Wise Posterior Window Length Decision

Presented at the 14th International Conference on Digital Audio Effects (DAFX'11), September 19-23, 2011, Paris, France.

Conference homepage

Abstract

This paper presents an extension to the dual-window-length Real-Time Iterative Spectrogram Inversion (RTISI) phase estimation algorithm. Instead of detecting transients in advance, the phase estimator itself determines the correct window length after the phase information for all window lengths has been estimated. This yields significant improvements over the previous method. Additionally, we extend this estimator to configurations with three or more window lengths.

Paper

PDF

Slides

PDF (4 MB)

Note that the sound examples in the slides are deactivated.

Sound Examples

Source: Eminem "Cleanin' Out My Closet" (beginning). See also this page from N. Juillerat et al. for another multiresolution STFT approach applied to the same example. All sound examples are stored in the FLAC format (Free Lossless Audio Codec). Details and decoding software can be found here.

Single-resolution Phase Estimation

Algorithm     Original   512 Samples   1024 Samples   4096 Samples
Griffin/Lim   FLAC       FLAC          FLAC           FLAC
RTISI         FLAC       FLAC          FLAC           FLAC

Multiresolution Phase Estimation

Original   512/4096 Dual Resolution,   512/4096 Dual Resolution,   512/.../4096 Quad Resolution,
           Transient Detection         Posterior                   Minimax
FLAC       FLAC                        FLAC                        FLAC