Signal Processing and Speech Communication Laboratory

Robust Test-Time Adaptation for Visual and Multimodal Learning under Distribution Shifts

Status
In progress
Student
Jixiang Lei

In real-world applications of deep learning, distribution shifts between training and test data can significantly degrade model performance, particularly in tasks such as image classification, semantic segmentation, and multimodal perception. This doctoral thesis explores robust Test-Time Adaptation (TTA) strategies designed to dynamically adapt models during inference, without access to source data or labels.

The research focuses on three main directions:

  1. model-based adaptation, involving lightweight fine-tuning or updates to normalization layers;
  2. prompt injection for transformer-based vision and multimodal models, guiding model behavior without altering core parameters;
  3. tailored objective design, leveraging self-supervised or uncertainty-aware objectives such as entropy minimization combined with confidence filtering (a minimal sketch of this combination follows after the list).
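
To make directions 1 and 3 concrete, the following is a minimal PyTorch-style sketch of entropy-minimization TTA restricted to normalization-layer parameters, in the spirit of TENT (Wang et al., ICLR 2021). The function names `collect_bn_affine_params` and `adapt_step`, the confidence threshold, and the optimizer settings are illustrative assumptions for exposition, not the thesis's actual method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def collect_bn_affine_params(model: nn.Module):
    """Collect only the affine (scale/shift) parameters of normalization layers.

    Restricting adaptation to these lightweight parameters keeps the update
    cheap and stable; all other weights stay frozen.
    """
    params = []
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.LayerNorm)):
            for p in (module.weight, module.bias):
                if p is not None:
                    p.requires_grad_(True)
                    params.append(p)
    return params


@torch.enable_grad()
def adapt_step(model, x, optimizer, conf_threshold=0.8):
    """One online adaptation step on an unlabeled test batch x.

    Minimizes the prediction entropy, but only for samples whose maximum
    softmax probability exceeds conf_threshold (confidence filtering),
    which guards against noisy gradients from uncertain predictions.
    """
    logits = model(x)                                    # forward pass on test data
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.log().clamp(min=-20)).sum(dim=1)

    keep = probs.max(dim=1).values > conf_threshold      # confidence filter
    if keep.any():
        loss = entropy[keep].mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return logits.detach()


# Usage sketch: freeze everything, then adapt only the normalization parameters
# on an unlabeled test stream.
# model = ...  # a source-pretrained classifier
# for p in model.parameters():
#     p.requires_grad_(False)
# optimizer = torch.optim.SGD(collect_bn_affine_params(model), lr=1e-3, momentum=0.9)
# for x_test in test_loader:
#     preds = adapt_step(model, x_test, optimizer)
```

The design choice illustrated here, updating only a small set of parameters with a filtered self-supervised loss, is one way to trade off adaptation capacity against the risk of error accumulation during inference.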

These methods will be evaluated under various domain shift scenarios, such as sensor noise, weather variation, and modality gaps, using synthetic and real-world benchmarks. The overarching goal is to develop principled and scalable TTA methods that enhance the robustness, reliability, and adaptability of AI systems deployed in safety-critical and dynamic environments.