Aims: Childhood obesity is a complex condition influenced by genetic, maternal, and early-life factors. We aimed to develop an interpretable machine learning model to predict body mass index (BMI) from childhood to early adulthood and identify critical risk contributors.
Methods: We used data from 2,868 participants in the Raine Study Gen2 cohort, with BMI assessed longitudinally between ages 8 and 27 years. We integrated over 200 epidemiological features with seven BMI-related polygenic scores (PGS). Models included traditional machine learning methods and Kolmogorov–Arnold Networks (KAN), an explainable deep learning approach capable of producing mathematical formulae for prediction.
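For orientation, the general form that underlies a KAN can be written as follows. This is the standard Kolmogorov–Arnold representation, not a formula fitted in this study; the symbols n (number of input features), x_p (the p-th feature), and the univariate functions \phi_{q,p} and \Phi_q are generic notation assumed here for illustration.

```latex
% Kolmogorov–Arnold representation of a continuous function of n inputs:
% inner univariate functions \phi_{q,p} are summed, then passed through
% outer univariate functions \Phi_q.
f(x_1, \dots, x_n) \;=\; \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

A KAN stacks layers of this kind with learnable univariate functions on its edges, which is why a trained network can be read off as an explicit expression in the input features.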
Results: The KAN model achieved the highest R² (0.81 at age 8, declining to 0.34 at age 27) when using both genetic and epidemiological data. BMI z-score at age 5 years was the strongest predictor at every age, with the greatest influence at younger ages. PGS became increasingly influential in adolescence and early adulthood. Other contributors included maternal and paternal anthropometrics, skinfold measures, and parental education. The model’s transparent structure allowed derivation of explicit formulae and visual interpretation of feature influence over time.
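As a reminder of what the reported R² measures, the sketch below computes the coefficient of determination from observed and predicted values. The function name `r_squared` and the BMI values are invented for illustration only; they are not Raine Study data and this is not the authors' evaluation code.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variation of observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values (kg/m^2), assumed purely for illustration.
observed_bmi = [16.0, 18.0, 20.0, 22.0]
predicted_bmi = [16.5, 17.5, 20.5, 21.5]
print(round(r_squared(observed_bmi, predicted_bmi), 3))
```

An R² of 0.81 therefore means the model's predictions account for about 81% of the variance in observed BMI at that age, and 0.34 about 34%.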
Conclusion: This study presents an explainable AI approach for predicting BMI development from childhood to early adulthood. Our findings emphasize the predictive power of early-life BMI and support integrating genetic and epidemiological data for personalized obesity risk assessment. These insights may guide early intervention strategies and clinical decision-making.