With the rapid development of social media platforms and the ongoing pandemic, most activities are now performed online. User comments collected from digital channels are crucial for agencies and organizations that want to improve and develop their brands, so an automatic system for analyzing the sentiment of customer feedback is necessary. Recently, PhoBERT, a well-known pre-trained language model for Vietnamese, has achieved high performance in comparison with other approaches. However, this method may not focus on local information in the text, such as phrases or fragments, that carries sentiment. In this paper, we propose a PhoBERT-based convolutional neural network (CNN) for text classification. The contextualized embeddings output by the last four layers of PhoBERT are fed into the CNN, which enables the network to capture more local information from the text. In addition, the PhoBERT output is passed to transformer encoder layers to apply self-attention, which helps the model focus on the important information in text segments. The experimental results demonstrate that the proposed approach gives competitive performance compared to existing studies on three public datasets of Vietnamese text.
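The CNN branch described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It only shows the data flow of concatenating the per-token outputs of the last four layers and then applying one convolution filter with max-over-time pooling. The function names, the toy hidden states, the all-ones filter, and the choice of concatenation (rather than, say, summation) as the way to combine layers are all assumptions made for illustration. The transformer encoder branch and the classifier are omitted.

```python
from typing import List

Matrix = List[List[float]]  # one row per token, one column per feature


def concat_last_layers(hidden_states: List[Matrix], n_last: int = 4) -> Matrix:
    """Concatenate each token's vectors from the last n_last layers (assumed combination)."""
    layers = hidden_states[-n_last:]
    n_tokens = len(layers[0])
    return [
        [x for layer in layers for x in layer[t]]
        for t in range(n_tokens)
    ]


def conv1d_max_pool(tokens: Matrix, kernel: Matrix, bias: float = 0.0) -> float:
    """Slide one filter over consecutive tokens, apply ReLU, then max-pool over time."""
    width = len(kernel)  # number of adjacent tokens the filter covers
    activations = []
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        total = bias + sum(
            w * x
            for k_row, t_row in zip(kernel, window)
            for w, x in zip(k_row, t_row)
        )
        activations.append(max(0.0, total))  # ReLU
    return max(activations)


# Toy stand-in for encoder outputs: 5 layers, 3 tokens, hidden size 2.
# Real PhoBERT hidden states would come from the pre-trained model instead.
toy_hidden_states = [
    [[float(layer), float(token)] for token in range(3)]
    for layer in range(5)
]

features = concat_last_layers(toy_hidden_states)       # 3 tokens x 8 features
bigram_filter = [[1.0] * 8 for _ in range(2)]          # hypothetical filter of width 2
pooled = conv1d_max_pool(features, bigram_filter)      # one scalar per filter
```

A filter of width 2 responds to pairs of adjacent tokens, which is how a convolution can pick up phrase-level cues. A practical model would use many learned filters of several widths and concatenate their pooled outputs before classification.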
Can Tho University Journal of Science (Tạp chí khoa học Trường Đại học Cần Thơ)