Semi-Supervised Speech Embedding Fusion for Parkinson’s Detection

We developed a novel fusion architecture that combines semi-supervised speech embeddings to detect Parkinson’s Disease (PD) using natural speech recordings collected from over 1,300 participants in both home and clinical environments.

Leveraged deep speech embeddings (Wav2Vec 2.0, WavLM, ImageBind) to capture rich vocal features indicative of PD, moving beyond traditional handcrafted features.
Designed a fusion model that projects and aligns multi-model speech embeddings into a unified feature space, improving classification performance relative to baseline approaches.
Achieved strong classification performance (AUROC ≈ 88.9%, accuracy ≈ 85.7%) on internal evaluation and demonstrated generalizability to external clinical datasets.
Conducted detailed bias and robustness analyses showing equitable performance across sex, ethnicity, and disease stages, supporting broader real-world applicability.
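
The projection-and-fusion idea described above can be illustrated with a minimal sketch. It is not the project's actual architecture: the summary does not say how the embeddings are aligned or combined, so the linear projections, L2 normalization, averaging step, toy dimensions, and every name below (make_projection, fuse, TOY_DIMS, SHARED_DIM) are assumptions chosen only for illustration. The weights are random and untrained, and the input vectors are synthetic stand-ins for real embeddings.

```python
import math
import random


def make_projection(in_dim, out_dim, rng):
    """Return a random (untrained) out_dim x in_dim weight matrix."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weights, vec):
    """Apply a linear projection: one dot product per output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def fuse(embeddings, projections):
    """Project each model's embedding into the shared space, normalize, and average.

    Averaging is an assumed fusion rule; concatenation or attention-based
    weighting would be equally plausible alternatives.
    """
    projected = [
        l2_normalize(project(projections[name], vec))
        for name, vec in embeddings.items()
    ]
    dim = len(projected[0])
    return [sum(p[i] for p in projected) / len(projected) for i in range(dim)]


def predict_proba(fused, weights, bias=0.0):
    """Logistic head mapping the fused vector to a probability-like score."""
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)

# Toy dimensions for illustration only; real embedding sizes are much larger.
TOY_DIMS = {"wav2vec2": 8, "wavlm": 8, "imagebind": 12}
SHARED_DIM = 4

projections = {name: make_projection(d, SHARED_DIM, rng) for name, d in TOY_DIMS.items()}

# Synthetic stand-ins for one recording's embeddings from each model.
embeddings = {name: [rng.gauss(0.0, 1.0) for _ in range(d)] for name, d in TOY_DIMS.items()}

fused = fuse(embeddings, projections)
head_weights = [rng.gauss(0.0, 1.0) for _ in range(SHARED_DIM)]
score = predict_proba(fused, head_weights)
```

In a trained system the projection matrices and the classifier head would be learned jointly from labeled recordings; here they only show how embeddings of different sizes can end up in one shared feature space before classification.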

Tariq Adnan
