Automatic Keywords-based Classification of Vietnamese Texts

(2020)

Authors: Huỳnh Tuấn Hảo, Trương Quốc Định, Huỳnh Xuân Hiệp, Tăng Thị Quỳnh Nga, Ha Xuan Son, Duong Trung Nghia

Published in: IEEE International Conference on Research, Innovation and Vision for the Future

Link: https://doi.org/10.1109/RIVF48685.2020.9140761

Abstract

Text classification is an area of natural language processing concerned with automatically assigning new documents to predefined classes. It is a complex procedure that involves not only selecting the right training models but also integrating numerous fine-tuned steps, e.g. pre-processing, transformation, and dimensionality reduction. Researchers either develop new classification models or improve existing approaches by investigating new techniques. An ideal text classifier would mimic how humans assign text to topics: people usually categorize documents by scanning their important words rather than reading the whole text. With this process in mind, the authors propose a keyword-based framework for categorizing documents. They collected real text data from various websites, and the framework uses the TextRank algorithm and the Jaccard similarity coefficient. A wide range of experiments was conducted to show that the proposed framework achieves good results.
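The abstract names TextRank and the Jaccard similarity coefficient but does not spell out how they are combined. The sketch below shows one plausible reading, not the authors' implementation: TextRank ranks the words of a document to pick its keywords, and the Jaccard coefficient |A ∩ B| / |A ∪ B| compares that keyword set with a keyword set per class. All function names, the window size, the damping factor of 0.85, and the toy class keywords are assumptions made for illustration. Vietnamese word segmentation and stop-word removal are assumed to happen beforehand, so the functions take already tokenized input.

```python
from collections import defaultdict


def textrank_keywords(tokens, top_k=5, window=2, damping=0.85, iterations=50):
    """Return the top_k tokens ranked by a TextRank-style PageRank score."""
    # Build an undirected co-occurrence graph: two distinct tokens are linked
    # when one appears within `window` positions after the other.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Iterate the PageRank update a fixed number of times (assumed setting).
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return set(ranked[:top_k])


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(doc_keywords, class_keywords):
    """Pick the class whose keyword set is most similar to the document's."""
    return max(
        class_keywords,
        key=lambda label: jaccard(doc_keywords, class_keywords[label]),
    )


# Hypothetical per-class keyword sets, invented for this example only.
CLASS_KEYWORDS = {
    "sports": {"goal", "team", "coach", "league"},
    "technology": {"software", "phone", "chip", "team"},
}
```

With these toy sets, `classify({"match", "goal", "team"}, CLASS_KEYWORDS)` compares a Jaccard score of 2/5 for "sports" against 1/6 for "technology". In practice the per-class keyword sets would be derived from labeled training documents rather than written by hand.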

Other articles

Vietnamese Text Classification with TextRank and Jaccard Similarity Coefficient

5 (2020) Pages: 363-369

Authors: Huỳnh Tuấn Hảo, Trương Quốc Định, Huỳnh Xuân Hiệp, Duong Trung Nghia

Journal: Advances in Science, Technology and Engineering Systems Journal
