JCO Clin Cancer Inform. 2026 Mar;10:e2500159. doi: 10.1200/CCI-25-00159. Epub 2026 Mar 18.
ABSTRACT
PURPOSE: Chronic lymphocytic leukemia (CLL) treatment paradigms have evolved significantly, yet real-world evidence (RWE) on guideline implementation and patient characteristics remains limited.
MATERIALS AND METHODS: This multicenter retrospective study leveraged artificial intelligence (AI) to analyze structured and unstructured data from four Belgian hospitals (January 1, 2018, to October 31, 2021). Structured data, including diagnosis codes, laboratory results, treatment records, and national registries, were standardized using the Observational Medical Outcomes Partnership (OMOP) Common Data Model. Unstructured clinical notes and reports were processed using a transformer-based natural language processing (NLP) pipeline. We examined clinical characteristics, diagnostic testing, and treatment patterns among patients with newly diagnosed CLL.
RESULTS: Of 22 variable groups analyzed, 50.0% were derived from structured data only, 36.4% from unstructured data only (NLP-extracted), and 13.6% from mixed sources. Five hundred eighty-six patients with CLL were identified, with a median age of 74 years. One hundred seventy-four patients (29.7%) initiated first-line (1L) treatment, and 41 progressed to second-line treatment. Of 1L-treated patients, 68.4% had at least one prespecified comorbidity, including 12.1% with significant cardiovascular disease. TP53/del17p testing was documented in 34.3% of patients before 1L treatment, with aberrations detected in 42.8%. Bruton's tyrosine kinase inhibitors (BTKi; 35.6%) were the most common 1L treatment, followed by chemoimmunotherapy (CIT; 25.9%). Between 2018 and 2021, CIT use declined (30.6% to 17.5%), whereas BTKi use remained stable (34.2% to 38.1%).
CONCLUSION: This AI-augmented study demonstrates the feasibility and scalability of combining NLP-derived insights with OMOP-standardized structured data to generate reproducible RWE in hematology. Our results highlight an elderly CLL population with significant comorbidities and a shift toward targeted therapies. Although treatment patterns aligned with guidelines, data quality depended on the accessibility of source documentation. Improved integration of molecular testing into electronic health records is essential for enhancing clinical decision making, patient outcomes, and future research.
PMID:41849725 | DOI:10.1200/CCI-25-00159

