Machine-Learning-Based Targeted Plasma Proteomic Analysis for Predicting Motor Progression in Parkinson’s Disease: An Interpretable Approach to Personalized Disease Management
Bioengineering, 2026
Lin W., Grewal S.
| Disease area | Application area | Sample type | Products |
|---|---|---|---|
| Neurology | Patient Stratification | Plasma | Olink Target 96 |
Abstract
The accurate prediction of motor progression in Parkinson’s disease (PD) remains a major clinical challenge that limits personalized treatment planning and efficient clinical trial design. In this study, we developed and validated a machine-learning framework integrating a targeted panel of plasma proteins measured by Olink proximity extension assays with clinical variables to stratify patients according to their progression risk. We analyzed baseline plasma samples from 211 early-stage PD patients enrolled in the Parkinson’s Progression Markers Initiative (PPMI) cohort using four targeted Olink panels, from which 28 circulating proteins were retained after quality-control filtering. Patients were classified as rapid or slow progressors based on their annualized change in MDS-UPDRS Part III scores. Among the algorithms tested, Random Forest achieved the highest discriminative performance with an area under the receiver operating characteristic curve (AUC) of 0.751 (95% CI: 0.684–0.811), which exceeded that of clinical predictors alone (AUC 0.666). The integration of targeted proteomic and clinical features further improved model performance (AUC 0.773; p = 0.009). Nested cross-validation confirmed minimal optimistic bias (AUC 0.743). To enhance clinical interpretability, we applied SHapley Additive exPlanations (SHAP) analysis, which identified interleukin-6 (IL-6), brain-derived neurotrophic factor (BDNF), and vascular endothelial growth factor A (VEGF-A) as the most influential predictors. SHAP feature rankings were highly stable across cross-validation folds (mean Spearman ρ = 0.91). The robustness of these findings was confirmed through sensitivity analyses using extreme quartile comparisons (AUC 0.823), treatment-naïve subgroup analysis (AUC 0.738), and a clinically anchored outcome definition based on the minimal clinically important difference (AUC 0.739). 
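The outcome definition and the AUC metric described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy visit data, the cutoff of 2.0 points per year, and the model scores are all hypothetical, since the abstract does not state the threshold used to separate rapid from slow progressors.

```python
from typing import List, Sequence


def annualized_change(baseline: float, follow_up: float, years: float) -> float:
    """Change in a score (e.g. MDS-UPDRS Part III) per year between two visits."""
    if years <= 0:
        raise ValueError("years must be positive")
    return (follow_up - baseline) / years


def label_rapid(changes: Sequence[float], cutoff: float) -> List[int]:
    """Return 1 for a rapid progressor (change above cutoff), 0 for a slow one."""
    return [1 if c > cutoff else 0 for c in changes]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (rapid, slow) pairs in which the rapid
    progressor receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): (baseline score, follow-up score, years between visits)
visits = [(20, 26, 2), (18, 19, 2), (25, 33, 2), (22, 22, 1)]
changes = [annualized_change(b, f, y) for b, f, y in visits]

# Hypothetical cutoff; the abstract does not report the actual value.
labels = label_rapid(changes, cutoff=2.0)

# Hypothetical predicted risks from some classifier.
scores = [0.8, 0.3, 0.4, 0.6]
auc = roc_auc(labels, scores)
```

Read this way, an AUC of 0.751 means that in roughly three out of four rapid/slow patient pairs the model assigns the higher risk to the rapid progressor.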
A decision curve analysis demonstrated a net clinical benefit across threshold probabilities of 0.25–0.70. Our results establish targeted plasma protein profiling combined with interpretable machine learning as a promising tool for PD motor progression risk stratification, with potential applications in individualized patient counseling regarding motor prognosis and the selection of candidates for disease-modifying trials.
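For readers unfamiliar with decision curve analysis, the sketch below computes net benefit at one threshold probability using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold. The function names, labels, and predicted risks are hypothetical toy values, not data from the study.

```python
from typing import Sequence


def net_benefit(labels: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """Net benefit of acting on patients whose predicted risk >= threshold.

    NB = TP/N - FP/N * (pt / (1 - pt)), where pt is the threshold probability.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels: Sequence[int], threshold: float) -> float:
    """Net benefit of the reference strategy that treats every patient as high risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical): 1 = rapid progressor, 0 = slow progressor.
labels = [1, 1, 0, 0, 1, 0, 0, 1]
probs = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.8]

nb_model = net_benefit(labels, probs, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
```

A decision curve repeats this calculation over a range of thresholds (here, the reported 0.25 to 0.70) and compares the model against the treat-all and treat-none strategies, the latter having a net benefit of zero by definition.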