Artificial Bandwidth Extension using Sum-Product Networks

Result of the Month

ROTM1406.png

Sum-product networks (SPNs) are a recently proposed deep network architecture for representing probability distributions. They allow a high degree of dependency among the random variables while still allowing efficient inference. In particular, SPNs showed convincing results on the ill-posed problem of image completion, i.e. predicting missing parts of an image given the observed part. We applied SPNs to the related task of artificial bandwidth extension, i.e. recovering the lost high frequencies in telephone speech from the observed telephone low-band. To this end, we incorporated SPNs as observation models in hidden Markov models (HMMs) and used most probable explanation (MPE) inference to reconstruct the lost frequency bins. The extended signals have a natural high-frequency structure in the spectrogram and improve on the state of the art in terms of log-spectral distortion and in informal listening tests.
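To make the idea of MPE-based completion concrete, here is a minimal sketch on a toy SPN. It is not the model from the paper: the two binary variables "low" and "high", the weights, and the leaf probabilities are all made-up assumptions for illustration, whereas the actual system works on spectral frames inside an HMM. The sketch uses the usual max-product scheme, in which the sum node is replaced by a max over weighted children and each unobserved leaf contributes its most likely state.

```python
# Toy SPN (hypothetical numbers): one root sum node over two product nodes.
# Each product node multiplies independent Bernoulli leaves for the
# observed variable "low" and the missing variable "high".
WEIGHTS = [0.6, 0.4]
LEAVES = [
    {"low": 0.9, "high": 0.8},  # component 0: P(variable = 1)
    {"low": 0.2, "high": 0.1},  # component 1: P(variable = 1)
]


def leaf_value(p_one, observed):
    """Return (value, state) for a Bernoulli leaf.

    If the variable is observed, evaluate the leaf at that state.
    If it is missing (None), return the maximum over both states.
    """
    if observed is None:
        if p_one >= 1.0 - p_one:
            return p_one, 1
        return 1.0 - p_one, 0
    value = p_one if observed == 1 else 1.0 - p_one
    return value, observed


def mpe(evidence):
    """Max-product pass: the root sum node acts as a max over its children."""
    best_score, best_assignment = -1.0, None
    for weight, leaves in zip(WEIGHTS, LEAVES):
        score = weight
        assignment = {}
        for var, p_one in leaves.items():
            value, state = leaf_value(p_one, evidence.get(var))
            score *= value
            assignment[var] = state
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score


if __name__ == "__main__":
    # Complete the missing "high" variable given an observed "low" value.
    print(mpe({"low": 1}))
    print(mpe({"low": 0}))
```

Calling mpe with only "low" observed returns a full assignment in which "high" has been filled in, which mirrors, in a very reduced form, how missing frequency bins are reconstructed from the observed low-band.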

Contact: Robert Peharz

The upper left figure shows the original spectrogram of the example utterance 'Bin green at zed 5 now'. The upper right and lower left figures show two baseline methods for bandwidth extension (HMM using linear prediction coefficients and HMM using GMMs, respectively). The lower right figure shows the proposed method (HMM using SPNs).

The results were presented at this year's ICASSP. Take a look at the paper or listen to the WAV files!

1. June 2014 - 1. July 2014