Front Mol Biosci. 2026 Feb 9;13:1730023. doi: 10.3389/fmolb.2026.1730023. eCollection 2026.
ABSTRACT
BACKGROUND AND OBJECTIVE: Non-alcoholic fatty liver disease (NAFLD) represents the most prevalent chronic hepatic metabolic disorder globally. Without timely intervention, it can progress to non-alcoholic steatohepatitis (NASH), liver fibrosis, and even hepatocellular carcinoma. Early detection and diagnosis are critical for disease management. Metabolomics, a powerful tool for identifying diagnostic metabolic biomarkers of diseases, is frequently integrated with machine learning (ML) algorithms to improve analytical efficiency. This study aims to compare serum metabolomic profiles between NAFLD patients and healthy controls, identify differential metabolites, and employ machine learning algorithms to discover biomarkers with diagnostic value.
METHODS: This study enrolled 26 healthy controls and 165 patients diagnosed with NAFLD via ultrasound, and performed untargeted serum metabolomics analysis. Orthogonal partial least squares-discriminant analysis (OPLS-DA) was applied to screen for significantly differential metabolites between groups, and pathway enrichment analysis was then conducted. In the ML phase, the dataset was split at an 8:2 ratio: 80% of the data (131 NAFLD cases and 21 healthy controls) was used for model training, and 20% (34 NAFLD cases and 5 healthy controls) served as an independent test set to validate model performance.
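The 8:2 split described above can be illustrated with a minimal sketch of a class-stratified split. The abstract does not state how the split was performed, so the stratification, the rounding rule (test-set size rounded up, then largest-remainder allocation per class), the seed, and the function name stratified_split are all assumptions made for illustration. Under these assumptions the per-class counts come out the same as those reported (131/21 for training, 34/5 for testing).

```python
import math
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping class proportions roughly equal.

    Hypothetical helper for illustration only; not the authors' code.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    n_total = len(labels)
    n_test = math.ceil(test_fraction * n_total)  # assumed: round test size up

    # Fractional test quota per class, rounded by the largest-remainder rule
    # so that the per-class counts sum exactly to n_test.
    ideal = {c: n_test * len(ix) / n_total for c, ix in by_class.items()}
    counts = {c: math.floor(v) for c, v in ideal.items()}
    leftover = n_test - sum(counts.values())
    by_remainder = sorted(ideal, key=lambda c: ideal[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1

    # Shuffle within each class, then cut off the test portion.
    train_idx, test_idx = [], []
    for c, ix in by_class.items():
        shuffled = ix[:]
        rng.shuffle(shuffled)
        test_idx.extend(shuffled[:counts[c]])
        train_idx.extend(shuffled[counts[c]:])
    return sorted(train_idx), sorted(test_idx)


# Cohort sizes taken from the abstract: 165 NAFLD patients, 26 healthy controls.
labels = ["NAFLD"] * 165 + ["control"] * 26
train_idx, test_idx = stratified_split(labels)

train_counts = {c: sum(labels[i] == c for i in train_idx) for c in set(labels)}
test_counts = {c: sum(labels[i] == c for i in test_idx) for c in set(labels)}
print(train_counts, test_counts)
```

With only 5 controls in the test set, a single misclassified control shifts specificity by 20 percentage points, so performance estimates from a split of this size carry wide uncertainty.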
RESULTS: Metabolomic differential analysis identified 942 significantly differential metabolites (656 upregulated and 286 downregulated) between the NAFLD and healthy control groups, which were primarily enriched in caffeine metabolism, cholesterol metabolism, and the FoxO and AMPK signaling pathways. After training and validating machine learning models, serum metabolites maresin 1, canavaninosuccinate, paraxanthine, and 1-methyluric acid demonstrated robust diagnostic performance for NAFLD and can serve as independent predictive biomarkers, with 1-methyluric acid exhibiting the highest diagnostic contribution.
CONCLUSION: Integration of untargeted metabolomics and machine learning effectively distinguishes NAFLD patients from healthy controls. Cholesterol metabolism, caffeine metabolism, and the FoxO and AMPK signaling pathways may participate in NAFLD pathogenesis. ML-validated metabolites 1-methyluric acid, paraxanthine, canavaninosuccinate, and maresin 1 hold potential as diagnostic biomarkers and therapeutic targets for NAFLD, with 1-methyluric acid exhibiting the highest diagnostic relevance. In summary, serum metabolomics provides stable, accurate biomarkers for NAFLD early warning and diagnosis, and this study offers data and resource support for optimizing its clinical.
PMID:41737828 | PMC:PMC12926098 | DOI:10.3389/fmolb.2026.1730023

