Deep clustering of pathological speakers
- Status
- Open
- Type
- Master Thesis
- Announcement date
- 30 Oct 2024
- Mentors
- Research Areas
Abstract
Speech pathologies often reduce quality of life, since they may impede one’s ability to communicate verbally and engage socially. Assessment of speech sound is pivotal to the medical care of people with speech pathologies. It is often based on predefined categories, with expert perceptual ratings serving as the ground truth. However, such ratings are known to be noisy to some extent and may contain cognitive biases. In this thesis, we plan to cluster speakers in an unsupervised way using current deep learning approaches. Perceptual tests will be conducted only a posteriori, to explore within-cluster similarities and between-cluster differences. This approach has the potential to reveal categories of speakers that were previously unknown.
Your Tasks
- literature review
- explore different clustering approaches with available speech data (e.g., VAE-, DNN-, DAE-, GAN-, or GNN-based)
- search for subclusters within clusters
- explore within-cluster similarities and between-cluster differences (first informally, then in a perceptual test)
- documentation of the work (thesis writing; optional: paper writing)
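The comparison of within-cluster similarities and between-cluster differences named in the tasks above can be illustrated with a small sketch. It is not the method planned for the thesis. It assumes that each speaker has already been mapped to a fixed-length embedding (for example by an encoder network), uses synthetic 2-D points as a stand-in for such embeddings, and swaps the deep clustering step for plain k-means with a farthest-first initialization. All function and variable names are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def farthest_first_init(points, k):
    """Pick k initial centroids: start at points[0], then repeatedly add
    the point that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(euclidean(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, n_iter=50):
    """Plain k-means (Lloyd's algorithm); returns labels and centroids."""
    centroids = farthest_first_init(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            new_centroids.append(mean_vector(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def within_between(points, labels):
    """Mean pairwise distance within clusters and between clusters."""
    within, between = [], []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = euclidean(points[i], points[j])
            (within if labels[i] == labels[j] else between).append(d)
    return sum(within) / len(within), sum(between) / len(between)


# Synthetic stand-in for speaker embeddings (assumption): three
# well-separated groups of 15 points each, not real speech data.
rng = random.Random(42)
group_centers = [(0.0, 0.0), (20.0, 0.0), (0.0, 20.0)]
embeddings = [
    [cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)]
    for cx, cy in group_centers
    for _ in range(15)
]

labels, centroids = kmeans(embeddings, k=3)
mean_within, mean_between = within_between(embeddings, labels)
print(f"mean within-cluster distance:  {mean_within:.2f}")
print(f"mean between-cluster distance: {mean_between:.2f}")
```

A search for subclusters could, under the same assumptions, rerun `kmeans` on the members of a single cluster. With real data, the distance-based comparison would only be a first, informal check before a perceptual test.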
Your Profile
- interest in speech science and technology, in particular deep clustering and representation learning
- interest in health-related applications
- good knowledge of Python and relevant packages
- good communication skills
Contact
- Philipp Aichinger (philipp.aichinger@meduniwien.ac.at)
- Barbara Schuppler (b.schuppler@tugraz.at)