Research Projects

In Progress


Personalized LLM Assistant for PD Management

Designing a preference-aligned LLM assistant that contextualizes digital severity trends, medication states, and lifestyle factors to support continuous, individualized PD care.

  • Generates structured clinical summaries grounded in digital biomarkers and user context
  • Tracks severity in relation to medication timing, therapy engagement, and daily variability
  • Employs prompt engineering, personalization via RLHF, and safety-aligned response routing
  • Conducts simulated and real user studies to evaluate trust, clarity, grounding, and safety

Longitudinal PD Symptom Monitoring

Building a remote monitoring framework that tracks digital severity trajectories over time and aligns them with clinical scores, medication states, and self-reported symptom burden.

  • Collect monthly multimodal assessments via the ROUTE-PD dataset (12 months of repeated measurements)
  • Predict digital severity trends and correlate them with neurologist-rated UPDRS changes
  • Model intra-individual progression using ranking, linear mixed-effects, and temporal regression methods
  • Identify digital markers that reflect symptom fluctuations and medication effects between visits
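
A minimal sketch of the mixed-effects idea above, on synthetic data (the `participant`, `month`, and `severity` columns are placeholders, not the ROUTE-PD schema): fit a random-intercept model and read off the shared progression slope.

```python
# Sketch: intra-individual severity trajectories via a linear mixed-effects
# model with a random intercept per participant. All data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subjects, n_visits = 30, 12
df = pd.DataFrame({
    "participant": np.repeat(np.arange(n_subjects), n_visits),
    "month": np.tile(np.arange(n_visits), n_subjects),
})
# Each participant gets a random baseline plus a shared progression slope.
baseline = rng.normal(20, 4, n_subjects)[df["participant"]]
df["severity"] = baseline + 0.5 * df["month"] + rng.normal(0, 1, len(df))

model = smf.mixedlm("severity ~ month", df, groups=df["participant"])
result = model.fit()
print(result.params["month"])  # estimated progression slope, ≈ 0.5
```

The random intercept absorbs each participant's baseline severity, so the fixed `month` coefficient isolates the within-person trend.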

Video-Based PD Severity Prediction

Developing a model that predicts clinician-rated MDS-UPDRS Part III severity from task-based video recordings, enabling objective, at-home motor assessment.

  • Extract multimodal digital biomarkers from finger-tapping, smile/facial movement, and speech tasks
  • Predict continuous clinical severity scores using regression and multitask modeling
  • Evaluate model reliability across medication states, PD stages, and demographic strata
  • Integrate temporal consistency checks to prepare for longitudinal deployment
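
The multitask regression step can be illustrated with a toy sketch: one shared feature matrix predicting several related continuous severity targets at once. Features, weights, and the three-target setup are synthetic stand-ins, not the actual biomarkers or MDS-UPDRS items.

```python
# Sketch: multi-output severity regression from shared digital-biomarker
# features. Everything here is synthetic placeholder data.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))                 # digital biomarker features
W = rng.normal(size=(20, 3))                   # 3 related severity targets
Y = X @ W + rng.normal(scale=0.1, size=(300, 3))

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, Y_tr)       # Ridge supports multi-output Y
r2 = model.score(X_te, Y_te)                   # aggregate R^2 across targets
```

Predicting related targets through one model is the simplest form of the multitask setup; the shared structure is what lets correlated sub-scores inform each other.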

Published


PARK: Remote AI Screening for Parkinson’s Disease

We developed PARK, a multimodal AI-driven remote screening tool that identifies Parkinson’s disease from webcam-based recordings of speech, facial expression, and motor tasks. PARK was evaluated across three independent datasets spanning supervised and unsupervised settings, demonstrating strong classification performance and high usability in diverse populations.

  • Leveraged short standardized tasks (speech, facial expression, finger tapping) captured via webcam and extracted clinically relevant features for multimodal analysis.
  • Evaluated PARK on 1,865 participants (including 670 with PD) across multiple real-world cohorts, with AUROC of 0.85–0.87 and accuracy of roughly 80–83% on held-out test sets.
  • Demonstrated balanced performance across demographic subgroups (age, sex, ethnicity) and agreement with neurologist assessments on external validation subsets.
  • Designed uncertainty-aware prediction mechanisms that withhold low-confidence outputs to support safe use in unsupervised, at-home deployment.
  • Collected structured usability feedback showing high participant satisfaction and perceived utility in both supervised and home settings.
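
A minimal sketch of the withholding mechanism, assuming a simple confidence-threshold rule (the threshold and probabilities below are illustrative; PARK's actual policy may differ):

```python
# Sketch: withhold predictions whose confidence falls below a threshold,
# so only high-confidence outputs reach the user in at-home screening.
import numpy as np

def predict_with_abstention(probs, threshold=0.8):
    """Return 1/0 labels where confident, None where the model abstains."""
    confidence = np.maximum(probs, 1.0 - probs)  # binary-case confidence
    labels = (probs >= 0.5).astype(int)
    return [int(l) if c >= threshold else None
            for l, c in zip(labels, confidence)]

probs = np.array([0.95, 0.55, 0.10, 0.65])
print(predict_with_abstention(probs))  # [1, None, 0, None]
```

Abstaining on the uncertain middle of the probability range trades coverage for reliability, which is the right trade in unsupervised deployment.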

Download Paper

Accessible, At-Home PD Detection via Multi-Task Video Analysis (UFNet)

We developed UFNet, an uncertainty-calibrated multimodal fusion network that detects Parkinson’s disease from at-home webcam tasks (finger tapping, smiling, and speech), using only a computer with a camera, microphone, and internet connection. The work introduces the first large-scale, multi-task PD video dataset and shows that fusing task-specific models with uncertainty-aware attention improves both performance and safety for remote screening.

  • Collected a multi-task video dataset of 845 participants (272 with PD) performing three standardized tasks (finger-tapping, smile, and pangram speech), yielding 1,102 complete multi-task sessions and 3,306 total videos for model development and evaluation.
  • Built task-specific shallow neural networks with Monte Carlo (MC) dropout for each modality, extracting clinically motivated features (finger-tapping kinematics, facial action/motion features, and WavLM speech embeddings) and estimating prediction uncertainty.
  • Proposed UFNet, which projects task-specific features into a shared space and uses uncertainty-calibrated self-attention to down-weight noisier or less reliable tasks while leveraging complementary information across modalities.
  • Demonstrated that UFNet outperforms single-task models and standard multimodal fusion baselines, achieving ~87–88% accuracy and ~93% AUROC on a subject-separated test set, with further gains when withholding low-confidence predictions.
  • Incorporated confidence-aware decision policies (MC dropout, conformal prediction, and calibration techniques) to selectively abstain on uncertain cases, improving calibration and reducing harmful mispredictions in a screening context.
  • Showed no significant performance bias across sex and race, and best performance for individuals aged 50–80, while also validating the approach on the external YouTubePD dataset to test robustness across different video sources.
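
The MC-dropout uncertainty estimate behind the abstention policy can be sketched in plain NumPy: keep dropout active at inference, run many stochastic forward passes, and treat the spread of the outputs as an uncertainty signal. The tiny network and weights here are random placeholders, not UFNet's.

```python
# Sketch: Monte Carlo dropout for uncertainty estimation. Dropout stays on
# at inference; output variance across passes flags low-confidence cases.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 8)), rng.normal(size=(8, 1))  # toy weights

def forward(x, p_drop=0.5):
    h = np.maximum(x @ W1, 0.0)                 # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop        # stochastic dropout mask
    h = h * mask / (1.0 - p_drop)               # inverted dropout scaling
    return 1.0 / (1.0 + np.exp(-(h @ W2)))      # sigmoid PD probability

x = rng.normal(size=(1, 16))
samples = np.array([forward(x) for _ in range(100)])  # T = 100 passes
mean, std = samples.mean(), samples.std()
# `mean` is the prediction; a large `std` marks a case the screening
# pipeline could withhold rather than report.
```
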

Download Paper

Semi-Supervised Speech Embedding Fusion for Parkinson’s Detection

We developed a novel fusion architecture that combines semi-supervised speech embeddings to detect Parkinson’s Disease (PD) using natural speech recordings collected from over 1,300 participants in both home and clinical environments.

  • Leveraged deep speech embeddings (Wav2Vec 2.0, WavLM, ImageBind) to capture rich vocal features indicative of PD, moving beyond traditional handcrafted features.
  • Designed a fusion model that projects and aligns speech embeddings from multiple pretrained models into a unified feature space, improving classification performance relative to baseline approaches.
  • Achieved high classification accuracy (AUROC ≈ 88.9%, accuracy ≈ 85.7%) on internal evaluation and demonstrated generalizability on external clinical datasets.
  • Conducted detailed bias and robustness analyses showing equitable performance across sex, ethnicity, and disease stages, supporting broader real-world applicability.
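
The projection-and-alignment step can be illustrated with a toy sketch: embeddings of different sizes are mapped into one shared space before fusion. All dimensions are illustrative, and the random projection matrices stand in for the learned ones.

```python
# Sketch: project embeddings from several speech models into a shared space,
# then fuse by concatenation. Dimensions and matrices are placeholders.
import numpy as np

rng = np.random.default_rng(0)
dims = {"wav2vec2": 768, "wavlm": 768, "imagebind": 1024}  # illustrative
shared_dim = 256

proj = {name: rng.normal(size=(d, shared_dim)) / np.sqrt(d)
        for name, d in dims.items()}
embeddings = {name: rng.normal(size=d) for name, d in dims.items()}

# Align each embedding into the shared space, then fuse by concatenation.
aligned = [embeddings[n] @ proj[n] for n in dims]
fused = np.concatenate(aligned)      # input to the downstream classifier
print(fused.shape)  # (768,)
```
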

Download Paper

AI-Enabled Parkinson’s Disease Screening Using Smile Videos

We developed an AI-based screening framework that detects Parkinson’s disease from brief smile videos captured on a smartphone or webcam, demonstrating high accuracy and broad applicability, including in diverse population samples.

  • Collected one of the largest annotated smile-video datasets to date, involving participants with and without PD from multiple settings.
  • Extracted facial landmarks and motion features indicative of hypomimia (reduced facial expressivity), a hallmark PD motor symptom.
  • Trained and validated machine learning models achieving ≈88% accuracy for distinguishing PD from non-PD using only smile videos.
  • Demonstrated generalizability in external validation cohorts, including clinical and international datasets.
  • Showed the approach is scalable and accessible, enabling remote, low-cost initial screening that could complement traditional clinical evaluations and improve early detection access.

Download Paper

Hi5: 2D Hand Pose Estimation with Zero Human Annotation

We introduce Hi5, a large synthetic dataset and data synthesis pipeline for 2D hand pose estimation that requires no human annotation, enabling diverse and accurate model training with only consumer-grade hardware.

  • Developed a data synthesis pipeline using high-fidelity 3D hand models, diverse genders and skin tones, and dynamic environments to generate realistic 2D hand images with automatic keypoint annotation.
  • Constructed the Hi5 dataset of ~583,000 labeled images with full pose annotations, generated in 48 hours on a single consumer-grade computer.
  • Demonstrated that models trained on Hi5 perform competitively on real hand pose benchmarks and show robustness under occlusions and varied conditions.
  • Illustrated cost-effective synthetic data generation as a viable alternative to expensive manual annotation, expanding accessibility for pose estimation research.

Download Paper

User-Centered Framework for Empowering People with Parkinson’s Disease

We developed and evaluated a user-centered teleneurology platform designed to empower individuals with Parkinson’s Disease (PD) by offering remote access to screening tasks, educational resources, and responsive interfaces.

  • Designed a web-accessible PD support system integrating speech, motor, and facial mimicry tasks for remote interaction and self-exploration.
  • Incorporated interactive components such as a GPT-driven chatbot, local neurologist finder, and actionable PD prevention/management resources.
  • Conducted a user validation study with 91 participants (including people with and without PD) to assess usability and experience.
  • Achieved above-average usability outcomes and positive feedback, informing iterative improvements and future remote PD care research.

Download Paper

UACD: User Attributed Core Decomposition for Influential Spreaders

We propose User Attributed Core Decomposition (UACD), a novel graph analytic method that identifies the most influential spreaders in large social networks by combining network topology with rich user-specific information (followers, friends, tweet counts, verified status) in a distributed setting. UACD significantly improves accuracy and scalability over existing local and global centrality measures by incorporating both structural and user attributes.

  • Introduced UACN (User Attributed Core Number), a measure that augments traditional k-core decomposition with user features to better capture real influence potential.
  • Designed UACD to use only local neighborhood information, avoiding the prohibitive memory/runtime costs of global centrality measures on massive graphs.
  • Provided a distributed implementation of the algorithm on AWS EC2, demonstrating scalability to networks with tens of millions of nodes.
  • Empirically evaluated UACD against state-of-the-art spreader identification methods using standard metrics (e.g., Kendall τ, Spearman ρ, modified Jaccard similarity), showing ~12.5% higher accuracy on average.
  • Achieved up to 175× faster runtime than global centrality approaches while maintaining high ranking quality across real Twitter datasets.
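
The core idea of augmenting k-core decomposition with user attributes can be sketched with networkx on a toy graph; the multiplicative weighting below is illustrative, not the UACN formula from the paper.

```python
# Sketch: combine each node's structural k-core number with a normalized
# user attribute (a synthetic "followers" count) to rank spreaders.
import networkx as nx

G = nx.karate_club_graph()
core = nx.core_number(G)                      # structural k-core number
followers = {v: (v * 37) % 500 for v in G}    # placeholder user attribute
max_f = max(followers.values())

# Toy attribute-augmented score: higher core number and more followers
# both increase a node's estimated influence.
score = {v: core[v] * (1.0 + followers[v] / max_f) for v in G}
top = sorted(score, key=score.get, reverse=True)[:5]
print(top)  # candidate influential spreaders under this toy scoring
```

Because both the core number and the attribute are local quantities, a scheme like this avoids global graph traversals, which is what makes the distributed setting tractable.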

Download Paper

TallnWide: Fast, Scalable and Geo-Distributed PCA for Big Data Analytics

We introduce TallnWide, a novel algorithm for performing Principal Component Analysis (PCA) on extremely large, high-dimensional datasets that are distributed across geographic locations, addressing both scalability and communication overhead in big data analytics.

  • PCA is widely used for dimensionality reduction but faces scalability issues when both rows and columns grow large; existing methods can suffer from memory overflow and high communication cost in distributed environments.
  • TallnWide leverages a zero-noise-limit Probabilistic PCA model with a block-division strategy to manage tall and wide data efficiently, suppressing intermediate data explosion.
  • The algorithm introduces communication-efficient coordination among geographically distributed datacenters by transmitting only necessary parameters and minimizing idle computation time.
  • Empirical evaluation on real large-scale datasets shows TallnWide handles significantly higher dimensions (10× larger) and achieves up to 2.9× faster runtime compared to conventional approaches in geo-distributed settings.
  • For reproducibility and future extension, TallnWide’s implementation is released as open source, enabling further research in scalable dimensionality reduction on distributed data.
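
The communication pattern (sites share only small sufficient statistics, never raw rows) can be sketched as follows; this is a plain covariance-based sketch, not TallnWide’s probabilistic PCA algorithm.

```python
# Sketch: distributed PCA over horizontally partitioned data. Each site
# transmits only (n_i, column sums, Gram matrix) — sizes independent of n_i —
# and the coordinator recovers the global principal components.
import numpy as np

rng = np.random.default_rng(0)
blocks = [rng.normal(size=(200, 10)) for _ in range(3)]  # 3 "datacenters"

n = sum(b.shape[0] for b in blocks)
total = sum(b.sum(axis=0) for b in blocks)
gram = sum(b.T @ b for b in blocks)

mean = total / n
cov = gram / n - np.outer(mean, mean)          # global covariance from parts
eigvals, eigvecs = np.linalg.eigh(cov)
components = eigvecs[:, ::-1][:, :2]           # top-2 principal directions

# Sanity check against centralized PCA on the concatenated data.
X = np.vstack(blocks)
ref = np.linalg.eigh(np.cov(X, rowvar=False, bias=True))[1][:, ::-1][:, :2]
```

Each site's message is O(d²) regardless of its row count, which is the essential property for keeping geo-distributed communication cheap.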

Download Paper