Signal Processing and Speech Communication Laboratory

Deep clustering of pathological speakers

Status
Open
Type
Master Thesis
Announcement date
30 Oct 2024

Abstract

Speech pathologies often reduce quality of life, since they may impede one’s ability to communicate verbally and engage socially. Assessing speech sound is pivotal to the medical care of people with speech pathologies. It is often done using predefined categories, with expert perceptual ratings serving as the ground truth. However, such ratings are known to be noisy to some extent and to potentially contain cognitive biases. In this thesis, we plan to cluster speakers in an unsupervised way using current deep learning approaches. Only afterwards do we plan to conduct perceptual tests to explore within-cluster similarities and between-cluster differences, which has the potential to reveal categories of speakers that were previously unknown.

Your Tasks

  • literature review,
  • explore different clustering approaches (e.g., VAE-, DNN-, DAE-, GAN-, or GNN-based) with available speech data,
  • search for subclusters within clusters,
  • explore within-cluster similarities and between-cluster differences (first informally, then in a perceptual test),
  • documentation of the work (thesis writing; optional: paper writing).

Your Profile

  • interest in speech science and technology, in particular deep clustering and representation learning,
  • interest in health-related applications,
  • good knowledge of Python and relevant packages,
  • good communication skills.

Contact

Philipp Aichinger (philipp.aichinger@meduniwien.ac.at)
Barbara Schuppler (b.schuppler@tugraz.at)