In recent years, numerous studies have focused on metagenomic data to improve human disease prediction. Despite the complexity of disease, several proposed frameworks have shown promising performance in predicting disease from metagenomic data. Diagnosing Type 2 diabetes (T2D) from metagenomic data is one of the more challenging tasks compared with other diseases: state-of-the-art methods usually reach an accuracy of only about 65%. In this study, we propose a method that combines the K-means clustering algorithm with unsupervised binning approaches to improve performance in metagenome-based disease prediction. Experiments on metagenomic datasets related to T2D show that embedding the clusters generated by K-means into the proposed method increases prediction accuracy to approximately 70% or more.
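The combination of unsupervised binning and K-means described above can be illustrated with a minimal sketch. The abstract does not say which binning scheme is used or how the clusters are embedded, so this sketch assumes equal-width binning and assumes the cluster index is appended to each sample as one extra feature. All function names and parameters (`equal_width_bin`, `bin_features`, `kmeans`, `add_cluster_feature`, `n_bins`, `k`) are hypothetical and are not taken from the paper.

```python
import random


def equal_width_bin(values, n_bins=10):
    """Map each value to a bin index in [0, n_bins - 1] using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: everything falls into bin 0
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def bin_features(samples, n_bins=10):
    """Bin every feature column independently (unsupervised: no labels used)."""
    columns = list(zip(*samples))
    binned_columns = [equal_width_bin(col, n_bins) for col in columns]
    return [list(row) for row in zip(*binned_columns)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(samples, k, n_iter=50, seed=0):
    """Plain Lloyd-style K-means; returns one cluster index per sample."""
    rng = random.Random(seed)
    centroids = [list(s) for s in rng.sample(samples, k)]
    assignments = [0] * len(samples)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assignments = [
            min(range(k), key=lambda c: squared_distance(s, centroids[c]))
            for s in samples
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [s for s, a in zip(samples, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


def add_cluster_feature(samples, k=2, n_bins=10):
    """Assumed embedding: append the K-means cluster index to the binned features."""
    binned = bin_features(samples, n_bins)
    clusters = kmeans(binned, k)
    return [row + [cluster] for row, cluster in zip(binned, clusters)]


# Toy abundance-like data (made up for illustration, not from the study).
toy_samples = [[0.0, 0.1], [0.05, 0.0], [0.9, 1.0], [1.0, 0.95]]
augmented = add_cluster_feature(toy_samples, k=2, n_bins=4)
```

In this sketch the augmented rows would then be passed to whatever classifier is used for T2D prediction. That step is omitted here because the abstract does not name the classifier.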
Can Tho University Journal of Science (Tạp chí Khoa học Trường Đại học Cần Thơ)