Our investigation aims to classify Intangible Cultural Heritage (ICH) images in the Mekong Delta, Vietnam. To pursue this goal, we collect an image dataset of 17 ICH categories and manually annotate it. We start by fine-tuning recent pre-trained deep learning models, namely VGG19, ResNet50, Inception-v3, and Xception, to classify our dataset. Following this, we propose to train support vector machine (SVM) models on several popular visual features, including handcrafted features such as the scale-invariant feature transform (SIFT) with the bag-of-words (BoW) model, the histogram of oriented gradients (HOG), and GIST, as well as deep invariant features automatically extracted by VGG19, ResNet50, Inception-v3, and Xception. The comparative study on the 17-category ICH image dataset shows that both the fine-tuned pre-trained deep learning models and the SVM models trained on automatically extracted deep features outperform SVM models trained on handcrafted features. The fine-tuned Inception-v3 and Xception models and the two non-linear SVM models learned from Inception-v3 and Xception features achieve 60.46%, 61.54%, 61.54%, and 62.89% accuracy, respectively. We further propose to combine non-linear SVM models trained on different visual features to improve on the classification result of any single model. Both the triplet of SVM-Xception, SVM-Inception-v3, and SVM-VGG19 and the triplet of SVM-Xception, SVM-Inception-v3, and SVM-SIFT-BoW achieve an equal accuracy of 65.32%.
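The feature-extraction-plus-SVM step described above can be illustrated with a minimal sketch, assuming a Keras pre-trained Xception network as a fixed feature extractor and a scikit-learn RBF-kernel SVM; this is not the authors' code, and the image and label arrays below are placeholders for the 17-category ICH dataset.

# Minimal sketch (assumed setup, not the authors' implementation): deep features
# from a pre-trained Xception network are fed to a non-linear SVM classifier.
import numpy as np
from tensorflow.keras.applications import Xception
from tensorflow.keras.applications.xception import preprocess_input
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Pre-trained Xception without its classification head; global average pooling
# turns the convolutional output into a single 2048-d feature vector per image.
extractor = Xception(weights="imagenet", include_top=False, pooling="avg")

def extract_features(images):
    """images: float array of shape (n, 299, 299, 3) with pixel values in [0, 255]."""
    return extractor.predict(preprocess_input(images), verbose=0)

# Placeholder data standing in for the annotated ICH image dataset.
images = (np.random.rand(40, 299, 299, 3) * 255.0).astype("float32")
labels = np.random.randint(0, 17, size=40)

features = extract_features(images)
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0)

# Non-linear (RBF kernel) SVM trained on the extracted deep features.
svm = SVC(kernel="rbf", C=10.0, gamma="scale")
svm.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, svm.predict(X_test)))

The same pattern applies to VGG19, ResNet50, and Inception-v3 feature extractors, and the per-feature SVM outputs can then be combined, for example by voting over their predictions, as in the model combination reported above.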