Minimizing genes for cancer detection using a genetic algorithm
Journal of Decision Making and Healthcare, Volume 3, Issue 1, April 2026, Pages: 39–45
YEN-JU TSUI
Master Program for Biomedical Engineering, China Medical University, Taichung 40402, Taiwan
FENG-SHENG TSAI
Department of Biomedical Informatics and Research Center for Interneural Computing, China Medical University, Taichung 40402, Taiwan
HAO-REN YAO
Information Retrieval Lab, Georgetown University, Washington, DC 20057, USA
Abstract
The Cancer Genome Atlas (TCGA) database contains extensive genomic data on various cancers, while the Catalogue Of Somatic Mutations In Cancer (COSMIC) provides a curated list of oncogenes. Although oncogenic data from TCGA can be extracted to analyze cancer types, using the entire genomic dataset for screening is computationally expensive. This study therefore employs a genetic algorithm (GA) to identify the minimum number of genes required for accurate cancer detection. Gene expression data from six types of cancer and their corresponding normal tissues were extracted. A subset of 716 genes and their expression values was randomly selected based on a gene activation probability \(p\). The GA process, consisting of data input, fitness evaluation, selection, crossover, mutation, and iteration, was used to determine the classification accuracy for various values of \(p\). The fitness score of a gene subset was the accuracy of a classification neural network trained and tested on the expression values of the activated genes. The GA lets the randomly activated genes evolve over generations, progressively refining the set of genes needed to classify the six cancer types and normal tissues. The experimental parameters tuned were the gene activation probability, the crossover probability, and the mutation probability. The results indicate that, for any crossover probability and with mutation active, a stable accuracy exceeding \(93.5\%\) is achieved when \(p \geq 0.1\).
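The GA loop described above (random activation with probability \(p\), fitness evaluation, selection, crossover, mutation, iteration) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, a nearest-centroid classifier stands in for the classification neural network, and tournament selection and single-point crossover are assumed because the abstract does not name the operators. All function and parameter names (`make_toy_data`, `fitness`, `run_ga`, `p_cross`, `p_mut`) are hypothetical.

```python
import random

def make_toy_data(n_samples=120, n_genes=30, n_classes=3, seed=0):
    """Synthetic stand-in for expression data (assumption, not TCGA data).
    Gene k carries the signal for class k; all other genes are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % n_classes
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        row[label] += 3.0
        X.append(row)
        y.append(label)
    return X, y

def fitness(mask, X_train, y_train, X_test, y_test):
    """Accuracy of a nearest-centroid classifier restricted to activated genes.
    This replaces the neural network of the paper to keep the sketch small."""
    active = [j for j, bit in enumerate(mask) if bit]
    if not active:
        return 0.0  # no activated gene: nothing to classify with
    classes = sorted(set(y_train))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X_train, y_train) if label == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in active]
    correct = 0
    for x, label in zip(X_test, y_test):
        pred = min(classes, key=lambda c: sum(
            (x[j] - m) ** 2 for j, m in zip(active, centroids[c])))
        correct += pred == label
    return correct / len(y_test)

def run_ga(X_train, y_train, X_test, y_test, p=0.1, p_cross=0.8,
           p_mut=0.01, pop_size=20, generations=15, seed=1):
    """Evolve binary gene masks; return the best evaluated mask and its score."""
    rng = random.Random(seed)
    n_genes = len(X_train[0])
    # Initial population: each gene is activated independently with probability p.
    pop = [[1 if rng.random() < p else 0 for _ in range(n_genes)]
           for _ in range(pop_size)]
    best_mask, best_fit = None, -1.0
    for _ in range(generations):
        scores = [fitness(m, X_train, y_train, X_test, y_test) for m in pop]
        for m, s in zip(pop, scores):
            if s > best_fit:
                best_mask, best_fit = m[:], s

        def tournament():
            # Assumed selection scheme: the fitter of two random individuals wins.
            a, b = rng.sample(range(pop_size), 2)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament()[:], tournament()[:]
            if rng.random() < p_cross:  # assumed single-point crossover
                cut = rng.randrange(1, n_genes)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                # Bit-flip mutation: each gene toggles with probability p_mut.
                nxt.append([1 - g if rng.random() < p_mut else g for g in child])
        pop = nxt[:pop_size]
    return best_mask, best_fit

# Usage on the toy data: first 90 samples for training, last 30 for scoring.
X, y = make_toy_data()
X_train, y_train, X_test, y_test = X[:90], y[:90], X[90:], y[90:]
mask, acc = run_ga(X_train, y_train, X_test, y_test, p=0.1)
print(sum(mask), "genes activated, accuracy", acc)
```

Encoding each candidate as a binary mask over the gene list keeps crossover and mutation simple, and the number of ones in the best mask gives the size of the selected gene subset. Setting `p_mut=0` in this sketch would correspond to running without mutation.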
Cite this Article as
Yen-Ju Tsui, Feng-Sheng Tsai and Hao-Ren Yao, Minimizing genes for cancer detection using a genetic algorithm, Journal of Decision Making and Healthcare, 3(1), 39–45, 2026