Journal of Decision Making and Healthcare

Electronic ISSN: 3008-1572

DOI: 10.69829/jdmh

Minimizing genes for cancer detection using a genetic algorithm

Journal of Decision Making and Healthcare, Volume 3, Issue 1, April 2026, Pages: 39–45

YEN-JU TSUI

Master Program for Biomedical Engineering, China Medical University, Taichung 40402, Taiwan

FENG-SHENG TSAI

Department of Biomedical Informatics and Research Center for Interneural Computing, China Medical University, Taichung 40402, Taiwan

HAO-REN YAO

Information Retrieval Lab, Georgetown University, Washington, DC 20057, USA

Abstract

The Cancer Genome Atlas (TCGA) database contains extensive genomic data on various cancers, while the Catalogue of Somatic Mutations in Cancer (COSMIC) provides a curated list of oncogenes. Although oncogenic data from TCGA can be extracted to analyze cancer types, screening with the entire genomic dataset is computationally intensive. This study therefore employs a genetic algorithm (GA) to identify the minimum number of genes required for accurate cancer detection. Gene expression data from six types of cancer and their corresponding normal tissues were extracted. A subset of 716 genes, together with their expression values, was randomly selected according to a gene activation probability \(p\). The GA process, comprising data input, fitness evaluation, selection, crossover, mutation, and iteration, was used to determine the classification accuracy for various values of \(p\). Fitness was evaluated with a classification neural network: the accuracy of the network, trained and tested on the activated gene expression values, served as the fitness score. The GA allows the randomly activated genes to evolve over generations, progressively refining the set of genes needed to classify the six cancer types and normal tissues. The parameters tuned experimentally were the gene activation probability, the crossover probability, and the mutation probability. The results indicate that, for any crossover probability with mutation active, a stable accuracy exceeding \(93.5\%\) is achieved when \(p \geq 0.1\).
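The GA loop described in the abstract (random gene activation with probability \(p\), fitness evaluation, selection, crossover, mutation, iteration) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not specify the selection scheme, crossover type, population size, or number of generations, so tournament selection, single-point crossover, bit-flip mutation, and all default values here are assumptions. The function `toy_fitness` is a hypothetical stand-in for the accuracy of the classification neural network, which is not reproduced here.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # 1 = gene activated, 0 = gene ignored

def random_mask(n_genes: int, p: float, rng: random.Random) -> Mask:
    """Activate each gene independently with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n_genes)]

def tournament(pop: List[Mask], scores: List[float],
               rng: random.Random, k: int = 3) -> Mask:
    """Assumed selection scheme: best of k randomly drawn individuals."""
    contenders = rng.sample(range(len(pop)), min(k, len(pop)))
    return pop[max(contenders, key=lambda i: scores[i])]

def crossover(a: Mask, b: Mask, p_cross: float,
              rng: random.Random) -> Tuple[Mask, Mask]:
    """Assumed single-point crossover, applied with probability p_cross."""
    if len(a) > 1 and rng.random() < p_cross:
        cut = rng.randrange(1, len(a))
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    return a[:], b[:]

def mutate(mask: Mask, p_mut: float, rng: random.Random) -> Mask:
    """Assumed bit-flip mutation: toggle each gene with probability p_mut."""
    return [1 - bit if rng.random() < p_mut else bit for bit in mask]

def run_ga(fitness: Callable[[Mask], float], n_genes: int, p: float,
           p_cross: float, p_mut: float, pop_size: int = 20,
           generations: int = 30, seed: int = 0) -> Tuple[Mask, float]:
    """Evolve activation masks and return the best mask seen and its score."""
    rng = random.Random(seed)
    pop = [random_mask(n_genes, p, rng) for _ in range(pop_size)]
    best_mask: Mask = pop[0][:]
    best_score = float("-inf")
    for gen in range(generations + 1):
        scores = [fitness(m) for m in pop]
        for m, s in zip(pop, scores):
            if s > best_score:
                best_mask, best_score = m[:], s
        if gen == generations:
            break  # final population has been scored; stop breeding
        next_pop: List[Mask] = []
        while len(next_pop) < pop_size:
            parent_a = tournament(pop, scores, rng)
            parent_b = tournament(pop, scores, rng)
            child_a, child_b = crossover(parent_a, parent_b, p_cross, rng)
            next_pop.append(mutate(child_a, p_mut, rng))
            next_pop.append(mutate(child_b, p_mut, rng))
        pop = next_pop[:pop_size]
    return best_mask, best_score

# Hypothetical stand-in for the neural-network accuracy used in the paper:
# it pretends the first 10 genes are the informative ones.
INFORMATIVE = set(range(10))

def toy_fitness(mask: Mask) -> float:
    """Fraction of the (assumed) informative genes that are activated."""
    return sum(mask[i] for i in INFORMATIVE) / len(INFORMATIVE)

if __name__ == "__main__":
    mask, score = run_ga(toy_fitness, n_genes=50, p=0.1,
                         p_cross=0.8, p_mut=0.02)
    print(f"activated genes: {sum(mask)}, toy fitness: {score:.2f}")
```

In a setting closer to the study, `fitness` would train and test a classifier on only the expression columns whose mask bit is 1 and return its test accuracy. The GA code itself would not need to change, which is the reason the fitness function is passed in as a parameter.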


Cite this Article as

Yen-Ju Tsui, Feng-Sheng Tsai and Hao-Ren Yao, Minimizing genes for cancer detection using a genetic algorithm, Journal of Decision Making and Healthcare, 3(1), 39–45, 2026