This paper proposes explainable machine learning models for gene expression classification. The proposed multi-class 1-norm support vector machine (MC-SVM-1) algorithm follows the One-Versus-All multi-class strategy, training one binary 1-norm SVM model per class. Because the 1-norm SVM solution is inherently sparse, many weights are exactly zero, and the corresponding dimensions are automatically suppressed. This feature elimination substantially improves classification results on high-dimensional gene expression datasets. Empirical results on 25 gene expression datasets show that MC-SVM-1 eliminates 99% of the original dimensions and improves accuracy by 7.1% and 4.03% over SVM and random forest models, respectively, trained on the full set of gene expression dimensions. Principal component analysis and local interpretable model-agnostic explanations (LIME) are then used to gain insight into how the classification model uses the features selected from the gene expression datasets.
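The One-Versus-All scheme and the sparsity-driven feature elimination described in the abstract can be illustrated with a minimal sketch. It is not the authors' MC-SVM-1 implementation: it assumes scikit-learn, swaps in `LinearSVC` with an L1 penalty (which uses the squared hinge loss rather than a 1-norm SVM linear program), and runs on synthetic data from `make_classification` as a stand-in for a gene expression matrix. The value `C=0.1` and all variable names are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for gene expression data: few samples, many features.
X, y = make_classification(
    n_samples=120, n_features=2000, n_informative=20, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize using training statistics only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# One binary L1-penalized linear SVM per class (One-Versus-All).
# Assumption: squared hinge loss, as required by LinearSVC with penalty="l1".
base = LinearSVC(penalty="l1", loss="squared_hinge", dual=False,
                 C=0.1, max_iter=5000, random_state=0)
ova = OneVsRestClassifier(base).fit(X_train_s, y_train)

# Stack the per-class weight vectors: shape (n_classes, n_features).
W = np.vstack([est.coef_ for est in ova.estimators_])

# A feature is kept if at least one binary model gives it a non-zero weight.
selected = np.flatnonzero(np.any(W != 0, axis=0))
removed_fraction = 1.0 - selected.size / X.shape[1]
accuracy = ova.score(X_test_s, y_test)

# Project the selected features onto at most two principal components,
# e.g. for a scatter plot of class structure.
pca = PCA(n_components=min(2, selected.size))
Z_train = pca.fit_transform(X_train_s[:, selected])

print(f"kept {selected.size} of {X.shape[1]} features "
      f"({removed_fraction:.1%} removed), test accuracy {accuracy:.3f}")
```

The fraction of features removed depends strongly on `C`: smaller values give a stronger L1 penalty and a sparser model, so in practice it would be tuned by cross-validation rather than fixed as here.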
Can Tho University Journal of Science