We propose new parallel learning algorithms for local support vector machines (local SVMs) that perform effective non-linear classification of large datasets. These algorithms train on a large dataset in two main steps: the first partitions the full dataset into $k$ subsets, and the second learns a non-linear SVM from each of the $k$ subsets in parallel on multi-core computers, so that each subset is classified locally. The $k$ local SVMs algorithm ($k$SVM) uses the $k$-means clustering algorithm to partition the data into $k$ clusters and then constructs, in parallel, non-linear SVM models that classify the data clusters locally. The decision tree with labeling support vector machines algorithm ($t$SVM) uses the C4.5 decision tree algorithm to split the full dataset into terminal nodes and then learns, in parallel, local SVM models for classifying the impure terminal nodes, that is, those containing a mixture of labels. The $kr$SVM algorithm trains a random ensemble of $k$SVM models. Numerical test results on 4 datasets from the UCI repository, 3 benchmarks of handwritten letter recognition, and a color image collection of one thousand small objects show that the proposed local SVM algorithms ($k$SVM, $t$SVM, $kr$SVM) are efficient compared to the standard SVM (LibSVM) in terms of training time and accuracy when dealing with large datasets.
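The two-step scheme described for $k$SVM (partition with $k$-means, then train one non-linear SVM per cluster in parallel) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn's `KMeans` and `SVC` (the latter wraps LibSVM), an RBF kernel with default-like hyperparameters, and a thread pool for the parallel step. The function names `fit_local`, `ksvm_fit`, and `ksvm_predict` are hypothetical, and both the shortcut for single-label clusters and the routing of a new point to the model of its nearest cluster center are assumptions made for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC


def fit_local(X, y):
    """Train one local model on a single cluster (hypothetical helper)."""
    labels = np.unique(y)
    if len(labels) == 1:
        # Assumption: a cluster with a single label needs no SVM,
        # so we simply remember that label.
        return labels[0]
    # Assumption: RBF kernel with illustrative hyperparameters.
    return SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)


def ksvm_fit(X, y, k=4, seed=0):
    """Step 1: partition with k-means. Step 2: fit local SVMs in parallel."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    parts = [(X[km.labels_ == c], y[km.labels_ == c]) for c in range(k)]
    # Threads are used here for simplicity; a process pool is another option.
    with ThreadPoolExecutor() as pool:
        local = list(pool.map(lambda part: fit_local(*part), parts))
    return {"kmeans": km, "local": local, "dtype": y.dtype}


def ksvm_predict(model, X):
    """Route each point to its nearest cluster and apply that local model."""
    nearest = model["kmeans"].predict(X)
    pred = np.empty(len(X), dtype=model["dtype"])
    for c, m in enumerate(model["local"]):
        mask = nearest == c
        if not mask.any():
            continue
        # A local model is either a fitted SVC or a stored constant label.
        pred[mask] = m.predict(X[mask]) if isinstance(m, SVC) else m
    return pred
```

Calling `ksvm_fit(X, y, k=4)` on a NumPy feature matrix `X` and label vector `y` returns a dictionary that `ksvm_predict` can apply to new points. Because each local SVM sees only its own cluster, the per-model training cost depends on the cluster size rather than on the full dataset, which is the source of the speed-up that the abstract reports.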
Can Tho University Journal of Science (Tạp chí khoa học Trường Đại học Cần Thơ)