Our investigation aims to propose a high-performance abstractive text summarization model for the Vietnamese language. We based the model on a transformer network with a full encoder-decoder architecture to learn high-quality features from the training data. Next, we scaled down the network size to increase the number of documents the model can summarize in a given time frame. We trained the model on a large-scale dataset comprising 880,895 documents in the training set and 110,103 documents in the testing set. The summarization speed for the testing set improves significantly, taking 5.93 hours on a multi-core CPU and 0.31 hours on a small GPU. The F1 test results are also close to the state of the art, with 51.03% ROUGE-1, 18.17% ROUGE-2, and 31.60% ROUGE-L.
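The abstract does not specify the reduced architecture's hyperparameters. As a rough illustration of what "scaling down" a full transformer encoder-decoder might look like, the sketch below uses PyTorch with hypothetical dimensions (d_model=256, 4 attention heads, 3 encoder/decoder layers, a 32,000-token vocabulary); these are assumptions for illustration, not the paper's actual settings.

```python
# A minimal sketch of a scaled-down transformer encoder-decoder for
# abstractive summarization, assuming PyTorch. All dimensions are
# hypothetical illustrations of "scaling down", not the paper's values.
import torch
import torch.nn as nn

class SmallSummarizer(nn.Module):
    def __init__(self, vocab_size=32000, d_model=256, nhead=4,
                 num_layers=3, dim_feedforward=1024, max_len=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)  # learned positions
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            dim_feedforward=dim_feedforward, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)  # per-token logits

    def forward(self, src_ids, tgt_ids):
        # Token embeddings plus learned positional embeddings.
        src = self.embed(src_ids) + self.pos(
            torch.arange(src_ids.size(1), device=src_ids.device))
        tgt = self.embed(tgt_ids) + self.pos(
            torch.arange(tgt_ids.size(1), device=tgt_ids.device))
        # Causal mask so the decoder cannot attend to future tokens.
        mask = self.transformer.generate_square_subsequent_mask(
            tgt_ids.size(1)).to(src_ids.device)
        hidden = self.transformer(src, tgt, tgt_mask=mask)
        return self.out(hidden)  # (batch, tgt_len, vocab_size)
```

Halving the model width and depth in this way shrinks both the parameter count and the per-document decoding cost, which is one plausible route to the throughput gains the abstract reports.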