The volume of image data keeps growing, both in image size (due to higher resolution) and in the number of images. It is estimated that, for personal use alone, an average user will take 100,000 images during their lifetime. This growth is reflected in dataset sizes: for example, the ImageNet benchmark dataset contains more than 14 million images and more than 21,000 classes. Such data are very challenging for classification algorithms: SVM algorithms have to cope with time and space complexity as well as highly imbalanced data. We present three extensions of Power Mean SVM (PmSVM) to deal with such data. The first is an incremental version that addresses space complexity; the second is a parallel version of the incremental algorithm that addresses time complexity; and the third uses a balanced bagging algorithm to train the binary classifiers, which addresses class imbalance. We evaluate our parallel incremental balanced bagging PmSVM on the 1,000 classes of ImageNet (ILSVRC 2010). The results show that our algorithm can run on a standard PC (with, e.g., 2 or 4 GB of RAM); it is 255 times faster than the original version and 1,276 times faster than the state-of-the-art linear classifier LIBLINEAR when run with 80 cores.
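To make the balanced bagging idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. In a one-vs-rest setting with 1,000 classes, each binary problem has far fewer positive than negative examples, so each bag keeps all positives and draws an equally sized random subset of negatives; the per-bag linear models are then averaged. The base learner here is a plain Pegasos-style linear SVM used as a stand-in for PmSVM, and the function names (`train_linear_svm`, `balanced_bagging`) and hyperparameters (`lam`, `epochs`, `n_bags`) are hypothetical.

```python
import numpy as np


def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Stand-in base learner: Pegasos-style SGD on the hinge loss (NOT PmSVM)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)          # decreasing step size
            margin = y[i] * (X[i] @ w)     # margin under the current weights
            w *= 1.0 - eta * lam           # shrink step from the L2 regularizer
            if margin < 1.0:               # hinge loss is active for this example
                w += eta * y[i] * X[i]
    return w


def balanced_bagging(X, y, n_bags=5, seed=0):
    """Average linear models trained on class-balanced subsamples.

    Assumes labels in {-1, +1}. Each bag keeps k examples per class, where k
    is the size of the smaller class, sampled without replacement.
    """
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == -1)
    k = min(len(pos), len(neg))
    weights = []
    for b in range(n_bags):
        pos_idx = rng.choice(pos, size=k, replace=False)
        neg_idx = rng.choice(neg, size=k, replace=False)
        idx = np.concatenate([pos_idx, neg_idx])
        weights.append(train_linear_svm(X[idx], y[idx], seed=seed + b))
    return np.mean(weights, axis=0)
```

Because every bag contains only 2k examples, each base model trains on a small, balanced subset instead of the full imbalanced set. A trained model `w` classifies a feature vector `x` by the sign of `x @ w`; a constant feature can be appended to `x` to act as a bias term.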
Can Tho University Journal of Science (Tạp chí khoa học Trường Đại học Cần Thơ)