Large language model-based biological age prediction in large-scale populations
Nature Medicine, 2025
Li Y., Huang Q., Jiang J., Du X., Xiang W., Zhang S., Pan Z., Zhao L., Cui Y., Ke L., Yin B., Liu L., Feng G., Yan S., Gao L., Liu Y., Yuan Y., Guo Y., Yang Y., Ma W., Yang Y., Di Q.
Disease area | Application area | Sample type | Products |
---|---|---|---|
Aging | Patient Stratification | Plasma | Olink Explore 3072/384 |
Abstract
Accurate and convenient assessment of individual aging is crucial for identifying health risks and preventing aging-related diseases. Nonetheless, current aging proxies often face challenges such as methodological limitations, weak associations with adverse outcomes and limited generalizability. Here we propose a framework that leverages large language models (LLMs) to estimate individual overall and organ-specific aging using only health examination reports. We validated this approach across six population-based cohorts encompassing over 10 million participants and demonstrated its effectiveness and reliability. Our results showed that the LLM-predicted overall age achieved a concordance index (C-index) of 0.757 (95% CI 0.752-0.761) for all-cause mortality, significantly outperforming other aging proxies such as telomere length, frailty index, eight epigenetic ages and the predictions of four machine-learning models. The overall age gap was strongly associated with multiple aging-related phenotypes and health outcomes, showing a hazard ratio of 1.055 (95% CI 1.050-1.060) for all-cause mortality. For organ-specific aging, LLM-predicted ages and age gaps also outperformed machine-learning models in predicting the corresponding organ-specific diseases. Additionally, we examined the dynamic aging assessment capability of LLMs and applied age gaps to identify proteomic biomarkers associated with accelerated aging and to develop risk prediction models for 270 diseases. Interpretability analyses were also conducted to explore the decision-making process of LLMs. In conclusion, our LLM-based aging assessment framework offers a precise, reliable and cost-effective approach for estimating overall and organ-specific aging. It has potential for personalized aging assessment and health management in large-scale general populations.
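For readers unfamiliar with the concordance index quoted in the abstract, the following is a minimal sketch of how a C-index can be computed for a risk score against censored survival data. It is not the authors' code. The function name `concordance_index`, the toy cohort, and the definition of the age gap as predicted age minus chronological age are all assumptions made for illustration, and pairs with tied follow-up times are simply skipped, which is a simplification of Harrell's estimator.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index (hypothetical helper, not from the paper).

    Among comparable pairs, counts how often the participant with the
    shorter observed survival time also has the higher risk score.
    Ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times. A pair is also not comparable
        # when the earlier participant was censored (no observed death).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort, invented purely for illustration.
follow_up_years = [2.0, 5.0, 6.0, 8.0, 10.0]
died = [1, 1, 0, 1, 0]  # 1 = death observed, 0 = censored
chronological_age = [70, 65, 60, 62, 55]
predicted_age = [78, 66, 63, 61, 52]

# Assumed definition: age gap = predicted age - chronological age.
age_gap = [p - c for p, c in zip(predicted_age, chronological_age)]

print(concordance_index(follow_up_years, died, age_gap))  # 0.875
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk, so the reported 0.757 means that in roughly three out of four comparable pairs the participant who died earlier had the higher predicted age.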