Deep learning to identify a gene signature associated with molecular subtypes that predicts prognosis in colorectal cancer.

May 20, 2021
Du CAI, Wei Wang, Min-Er ZHONG, De-Jun FAN, Xuanhui Liu, Cheng-Hang LI, Ze-Ping HUANG, Qi-Qi ZHU, Min-Yi LV, Chu-Ling HU, Xin DUAN, Xiaojian Wu, Feng GAO
Abstract
Background: Identifying robust prognostic risk groups in colorectal cancer (CRC) could significantly improve patient outcomes. However, CRC is molecularly heterogeneous, which complicates clinical decision-making. A recent comprehensive study proposed four consensus molecular subtypes (CMSs) of CRC with detailed biological and clinical characterization, but a cost-effective clinical assay for prognosis is still lacking. To fill this gap, we present a supervised deep learning framework to identify a CMS-associated gene signature for prognosis.
Methods: A total of 1,729 CRC patients with complete follow-up data were included in this study. We first applied a supervised deep learning-based framework to the training cohort (n = 624) to extract CMS-associated deep features, and then identified a gene panel highly correlated with these deep features. The prognostic power of this gene signature was subsequently evaluated on six independent CRC datasets.
Results: We identified a 21-gene signature associated with the CMS subtypes, and a trained risk model significantly predicted patients' disease-free survival (DFS) across six independent CRC datasets (n = 1,729): the training cohort (n = 624, HR = 2.53, 95% CI: 1.53–4.18, P < 0.001), the Validation 1 cohort (n = 557, HR = 1.77, 95% CI: 1.27–2.47, P < 0.001), and the Validation 2 cohort, merged from four other datasets (n = 548, HR = 2.10, 95% CI: 1.50–2.93, P < 0.001). Notably, the 21-gene signature also stratified stage II and III patients into distinct survival groups: the training cohort (n = 338, HR = 2.14, 95% CI: 1.18–3.85, P < 0.01), the Validation 1 cohort (n = 457, HR = 1.63, 95% CI: 1.12–2.37, P < 0.01), and the Validation 2 cohort (n = 437, HR = 1.73, 95% CI: 1.37–2.82, P < 0.001), outperforming Oncotype DX on the same cohorts.
Conclusions: Using our DL-based framework, we developed a CMS-associated gene signature for robust prognostic prediction in CRC. Compared with the genome-wide expression profile-based CMS classification system, the 21-gene panel can be easily deployed in clinical practice to facilitate decision-making.
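The abstract does not specify how the gene panel was matched to the deep features, so the following is only an illustrative sketch of one plausible approach: ranking genes by their strongest absolute Pearson correlation with any deep feature. The function name `top_correlated_genes`, the synthetic data, and the choice of Pearson correlation are all assumptions made for illustration, not the authors' published method.

```python
import numpy as np

def top_correlated_genes(expr, deep_feats, gene_names, k):
    """Return the k genes with the highest |Pearson r| against any deep feature.

    expr:       (n_samples, n_genes) expression matrix
    deep_feats: (n_samples, n_features) deep feature matrix
    Hypothetical helper; not taken from the paper.
    """
    # Standardize each column so a scaled dot product equals Pearson r.
    ez = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    fz = (deep_feats - deep_feats.mean(axis=0)) / deep_feats.std(axis=0)
    # Correlation matrix of shape (n_genes, n_features).
    corr = ez.T @ fz / expr.shape[0]
    # Score each gene by its best-matching deep feature.
    score = np.abs(corr).max(axis=1)
    order = np.argsort(score)[::-1][:k]
    return [gene_names[i] for i in order], score[order]

# Synthetic demonstration data (assumption: no real cohort data is used here).
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
expr = rng.normal(size=(n_samples, n_genes))
# Two synthetic "deep features" driven by genes 3 and 7 plus small noise.
deep_feats = np.column_stack([
    expr[:, 3] + 0.1 * rng.normal(size=n_samples),
    expr[:, 7] + 0.1 * rng.normal(size=n_samples),
])
gene_names = [f"gene_{i}" for i in range(n_genes)]

panel, scores = top_correlated_genes(expr, deep_feats, gene_names, k=2)
print(panel, scores)
```

In a real analysis `k` would be a tuning choice (the paper reports a 21-gene panel), and the selected genes would then feed a survival model rather than being used directly.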
Publication
Journal of Clinical Oncology
Authors
Du CAI
Postdoc
I focus on leveraging explainable AI and large foundation models to advance medical imaging and digital pathology in colorectal cancer research.
Min-Er ZHONG
Postdoc
I am a surgeon and clinical researcher focused on deep learning and translational studies in colorectal cancer.
De-Jun FAN
Associate Professor
My research explores the intersection of gastrointestinal endoscopy (GIE) and artificial intelligence (AI), along with the biological mechanisms of colorectal cancer development.
Cheng-Hang LI
Research Assistant
I am a research assistant focusing on deep learning, multimodal feature fusion, and medical AI system development.
Ze-Ping HUANG
Medical Student
I am a medical trainee in colorectal surgery, focusing on bioinformatics and translational research in colorectal cancer.
Qi-Qi ZHU
Surgeon
I am a surgeon focusing on colorectal cancer and translational bioinformatics in clinical practice.
Min-Yi LV
PhD Student
I am a PhD student at Guangzhou National Laboratory, focusing on colorectal cancer research, biostatistics, and evidence-driven clinical modeling.
Chu-Ling HU
PhD Student
I am a PhD student focusing on AI-driven colorectal cancer research and clinically useful model development.
Xin DUAN
Postdoc
I focus on medical image analysis and artificial intelligence for cancer research, including molecular subtyping and predictive modeling.
Feng GAO
Professor
My research leverages AI and big data to improve diagnostics, prognostics, and ultimately, outcomes in cancer and other biomedical fields.