In this paper, we propose autokSVM, an algorithm that automatically tunes the hyper-parameters of k local SVMs for classifying large datasets. autokSVM determines the number of clusters k used to partition the large training set, then learns a non-linear SVM model in each cluster to classify the data locally, training the models in parallel on multi-core computers. It combines grid search, the .632 bootstrap estimator, and a hill-climbing heuristic to optimize the hyper-parameters of the local non-linear SVMs. Numerical test results on 4 datasets from the UCI repository and 3 handwritten letter recognition benchmarks show that our proposal is efficient compared to standard LibSVM and the original kSVM. For example, autokSVM achieves an accuracy of 96.74% on the Forest Covertype dataset (581,012 datapoints, 54-dimensional input space, 7 classes) in 334.45 s on a PC with an Intel(R) Core i7-4790 CPU (3.6 GHz, 4 cores).
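To make the role of the .632 bootstrap estimator more concrete, the following is a minimal sketch of how such an estimate can be computed for any classifier. It is not the authors' implementation: the function names (`err632`, `fit_centroid`, `predict_centroid`) are hypothetical, a nearest-centroid classifier stands in for the local non-linear SVMs, and the out-of-bag error is averaged per bootstrap replicate, which is a common simplification. In a tuning loop of the kind described above, a score like this would presumably be evaluated for each candidate hyper-parameter setting.

```python
import random


def err632(fit, predict, X, y, n_boot=20, seed=0):
    """Sketch of the .632 bootstrap estimate of the misclassification rate."""
    rng = random.Random(seed)
    n = len(X)

    # Resubstitution error: train and evaluate on the full sample.
    model = fit(X, y)
    err_train = sum(predict(model, x) != t for x, t in zip(X, y)) / n

    # Out-of-bag error: train on a bootstrap sample, test on the points left out.
    oob_errs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]
        if not oob:
            continue  # no held-out points in this replicate
        m = fit([X[i] for i in idx], [y[i] for i in idx])
        wrong = sum(predict(m, X[i]) != y[i] for i in oob)
        oob_errs.append(wrong / len(oob))
    err_oob = sum(oob_errs) / len(oob_errs) if oob_errs else err_train

    # Weighted combination that gives the estimator its name.
    return 0.368 * err_train + 0.632 * err_oob


def fit_centroid(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, t in zip(X, y):
        s = sums.setdefault(t, [0.0] * len(x))
        for j, v in enumerate(x):
            s[j] += v
        counts[t] = counts.get(t, 0) + 1
    return {t: [v / counts[t] for v in s] for t, s in sums.items()}


def predict_centroid(model, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    return min(model, key=lambda t: sum((a - b) ** 2 for a, b in zip(model[t], x)))


# Toy data (illustrative only): two well-separated 5x4 grids of points.
X_demo = [(i % 5, i // 5) for i in range(20)]
X_demo += [(10 + i % 5, 10 + i // 5) for i in range(20)]
y_demo = [0] * 20 + [1] * 20
err_demo = err632(fit_centroid, predict_centroid, X_demo, y_demo)
```

Passing `fit` and `predict` as arguments keeps the estimator independent of the model, so the same routine could score an SVM trained with different kernel or cost settings.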
Journal: The 5th Academic Conference on Natural Science for Young Scientists, Master and PhD Students from ASEAN Countries, 4-7 October 2018, Da Lat, Vietnam