Gene expression classification is a key task in cancer diagnosis and drug discovery. Accurate results are nevertheless hard to achieve, because the datasets have a very large number of dimensions and very few datapoints. In this paper, we propose new ensemble learning algorithms based on support vector machines (SVM) for efficiently handling the gene expression classification task. We apply the Sherman-Morrison-Woodbury formula to the Newton SVM (NSVM) algorithm proposed by Mangasarian, extending it to datasets with a very large number of dimensions. The ensemble learning then trains the extended Newton SVM to classify gene expression datasets that have a large number of both datapoints and dimensions. Numerical test results on high-dimensional gene expression datasets show that our ensemble learning algorithms of Newton SVM are significantly faster and/or more accurate than LibSVM, a highly efficient standard SVM implementation.
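To illustrate why the Sherman-Morrison-Woodbury formula helps when dimensions far outnumber datapoints, the following is a minimal sketch, not the paper's implementation. It assumes a Newton-type linear system of the form (I/nu + E^T diag(d) E) x = g, where E is an m x n data matrix with m much smaller than n and every weight d[i] is strictly positive. The function name `smw_solve` and the symbols `E`, `d`, `nu`, and `g` are hypothetical and chosen only for this example.

```python
import numpy as np

def smw_solve(E, d, nu, g):
    """Solve (I/nu + E.T @ diag(d) @ E) x = g using only an m x m system.

    Illustrative assumptions: E has shape (m, n) with m << n,
    d has shape (m,) with strictly positive entries, nu > 0.
    """
    # Woodbury identity with A = I/nu, U = E.T, C = diag(d), V = E:
    # (A + U C V)^-1 = A^-1 - A^-1 U (C^-1 + V A^-1 U)^-1 V A^-1
    small = np.diag(1.0 / d) + nu * (E @ E.T)  # m x m instead of n x n
    t = np.linalg.solve(small, E @ g)
    return nu * g - (nu ** 2) * (E.T @ t)

# Example: 5 datapoints, 200 dimensions.
rng = np.random.default_rng(0)
E = rng.standard_normal((5, 200))
d = rng.uniform(0.5, 2.0, size=5)
g = rng.standard_normal(200)
x = smw_solve(E, d, nu=10.0, g=g)
```

The direct approach factors an n x n matrix, whereas this route factors only an m x m one, which is the cheaper option when there are few datapoints and many dimensions. If the weights are 0/1 indicators rather than strictly positive values, the rows with zero weight can be dropped from `E` first, since they contribute nothing to E^T diag(d) E.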
Issue: In: Thai-Nghe, N., Do, TN., Haddawy, P. (eds) Intelligent Systems and Data Science. ISDS 2023. Communications in Computer and Information Science, vol 1950. Springer, Singapore (2023). Pages: 304-312