Artificial Bandwidth Extension using Sum-Product Networks
- Sun, Jun 01, 2014
Sum-product networks (SPNs) are a recently proposed deep network architecture for representing probability distributions. They allow a high degree of dependency among the random variables while still permitting efficient inference. In particular, SPNs showed convincing results on the ill-posed problem of image completion, i.e., predicting the missing parts of an image given the observed part. We applied SPNs to the related task of artificial bandwidth extension, i.e., recovering the lost high frequencies in telephone speech from the observed telephone low-band. To this end, we incorporated SPNs as observation models in hidden Markov models (HMMs) and used most-probable-explanation (MPE) inference to reconstruct the lost frequency bins. The extended signals have a natural high-frequency structure in the spectrogram, and they improve on the state of the art in terms of log-spectral distortion and in informal listening tests.
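To make the completion idea concrete, here is a minimal sketch of MPE-style completion in a toy SPN. It is not the model from this work: the structure (one sum node over two product nodes with Bernoulli leaves), the weights, and the names `WEIGHTS`, `LEAVES`, and `mpe_complete` are all assumptions chosen for illustration, and the two binary variables merely stand in for an observed "low-band" bin and a missing "high-band" bin. The sketch uses the usual max-product approximation: the sum node is replaced by a max, missing leaves contribute their largest value, and the missing variable is then read off from the winning branch.

```python
# Toy SPN: a sum node over two product nodes, each a product of Bernoulli leaves.
# All numbers are made up for illustration.

WEIGHTS = [0.6, 0.4]  # sum-node weights (hypothetical)

# LEAVES[k][v] = P(X_v = 1) under product node k.
# v = 0 plays the role of an observed low-band bin, v = 1 a missing high-band bin.
LEAVES = [
    [0.9, 0.8],
    [0.2, 0.1],
]


def leaf_prob(p_one, value):
    """Probability of a binary value under a Bernoulli leaf with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def mpe_complete(evidence):
    """Fill in missing entries (None) of `evidence` by max-product inference.

    Returns the completed assignment and the max-product score of the
    selected branch.
    """
    best_score, best_k = -1.0, None
    for k, weight in enumerate(WEIGHTS):
        score = weight
        for v, value in enumerate(evidence):
            p_one = LEAVES[k][v]
            if value is None:
                # Missing variable: the leaf contributes its largest value.
                score *= max(p_one, 1.0 - p_one)
            else:
                score *= leaf_prob(p_one, value)
        if score > best_score:
            best_score, best_k = score, k

    # Downward pass: read missing values off the winning product node.
    completed = [
        value if value is not None else int(LEAVES[best_k][v] >= 0.5)
        for v, value in enumerate(evidence)
    ]
    return completed, best_score


if __name__ == "__main__":
    print(mpe_complete([1, None]))
    print(mpe_complete([0, None]))
```

In the setting described above, the leaves would be distributions over spectral bins rather than Bernoulli variables, and one such network would serve as the observation model of each HMM state; those details are not reproduced here.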
The upper left figure shows the original spectrogram of the example utterance ‘Bin green at zed 5 now’. The upper right and lower left figures show two baseline methods for bandwidth extension (an HMM using linear prediction coefficients and an HMM using GMMs, respectively). The lower right figure shows the proposed method (an HMM using SPNs).
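For readers unfamiliar with the log-spectral distortion (LSD) measure mentioned in the abstract, the sketch below shows one common way to compute it: the root-mean-square difference between the reference and estimated log power spectra (in dB) within each frame, averaged over frames. The exact variant used in this work (for example, whether only the reconstructed high band is scored) is not stated here, so the function name `log_spectral_distortion`, its arguments, and the `eps` floor are assumptions.

```python
import math


def log_spectral_distortion(ref_power, est_power, eps=1e-12):
    """Mean per-frame RMS difference of log power spectra, in dB.

    ref_power, est_power: lists of frames, each a list of non-negative
    power-spectrum values with matching shapes. `eps` guards against log(0).
    """
    per_frame = []
    for ref_frame, est_frame in zip(ref_power, est_power):
        squared_diffs = [
            (10.0 * math.log10(r + eps) - 10.0 * math.log10(e + eps)) ** 2
            for r, e in zip(ref_frame, est_frame)
        ]
        per_frame.append(math.sqrt(sum(squared_diffs) / len(squared_diffs)))
    return sum(per_frame) / len(per_frame)


if __name__ == "__main__":
    reference = [[1.0, 1.0], [2.0, 4.0]]
    estimate = [[10.0, 10.0], [2.0, 4.0]]
    print(log_spectral_distortion(reference, estimate))
```

Lower values indicate that the extended spectrum is closer to the original wideband spectrum; identical spectra give a distortion of zero.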