Text Classification Models and Topic Models: An Overall Picture and a Case Study in Vietnamese

Tran Khanh Dang, Josef Küng, Tai M. Chung (2022) Pages: 377-392

Authors: Lâm Nhựt Khang, Vu-Luan Le Tran, Jugal Kalita

Journal: Communications in Computer and Information Science

Link: https://doi.org/10.1007/978-981-19-8069-5_25

Abstract

Document classifiers are supervised learning models in which documents are assigned labels based on models that are trained on labeled datasets. The accuracy of a classifier depends on the size and quality of the training datasets, which are costly and time-consuming to construct. In addition, a suitable word representation method may improve the quality of the text classifier. In this paper, we study the effect of different word representation methods on 16 classification models trained on a labeled dataset. Then, we experiment with the ability to discover latent topics using 6 topic models. Based on experimental results using a combination of classification models and topic models, we propose a method to label datasets for training classification models using topic models and classification models. Although we perform experiments on a Vietnamese document dataset, our approach may apply to any dataset and does not require any labeled datasets for bootstrapping.

Other articles

Deep Vision Transformer and T5-Based for Image Captioning

(2024) Pages: 306-311

Authors: Lâm Nhựt Khang, Huy Thanh Nguyen, Vinh Phuoc Mai, Jugal Kalita

Journal: 2023 RIVF International Conference on Computing and Communication Technologies (RIVF)

A Transformer-based Educational Virtual Assistant Using Diacriticized Latin Script

11 (2023) Pages: 90094-90104

Authors: Lâm Nhựt Khang, Loc Huu Nguy, Lê Văn Lâm, Jugal Kalita

Journal: IEEE Access

Vision Transformer and Bidirectional RoBERTa: A Hybrid Image Captioning Model Between VirTex and CPTR

1781 (2023) Pages: 124-137

Authors: Lâm Nhựt Khang, Lê Thị Diễm Kiều, Truong Dinh Ngo, Jugal Kalita

Journal: Communications in Computer and Information Science, In International Advanced Computing Conference 2022

Abstractive Text Summarization Using the BRIO Training Paradigm

(2023) Pages: 92-99

Authors: Lâm Nhựt Khang, Đoàn Gia Thiều, Phạm Thừa Khang, Jugal Kalita

Journal: Association for Computational Linguistics (ACL 2023), In Findings of the Association for Computational Linguistics: ACL 2023, Toronto, Canada, 2023

SWVie-Food: A Dataset for Recognizing Foods in Southwest Vietnam Based on Deep Learning

(2023) Pages: 488-493

Authors: Lâm Nhựt Khang, Nguyễn Thị Mỹ Khánh, Nguyễn Duy Khang, Nguyễn Hiếu Nghĩa, Nguyễn Thị Kim Yến, Andrew Ware

Journal: RIVF International Conference on Computing and Communication Technologies, Vietnam, 20-22/12, 2022

Vietnamese Text Summarization Based on Elementary Discourse Units

(2023) Pages: 1-5

Authors: Lâm Nhựt Khang, Jugal Kalita, Nguyen Ngoc Tai

Journal: 2022 6th International Conference on Natural Language Processing and Information Retrieval, Thailand

Monitoring Attendance and Checking School Uniforms Using YOLOv8

Nguyen Thai-Nghe, Thanh-Nghi Do, Peter Haddawy (2023) Pages: 200-207

Authors: Lâm Nhựt Khang, La Thanh Trọng, Nguyễn Duy Khang, Lê Minh Mẫn, Trương Triệu Vỹ, Andrew Ware

Journal: Communications in Computer and Information Science, ISDS 2023

Cooking Recipe Generation Based on Ingredients Using ViT5

Thi Dieu Linh Nguyen, Elena Verdú, Anh Ngoc Le, Maria Ganzha (2023) Pages: 34-39

Authors: Lâm Nhựt Khang, Y-Nhi Thi Pham, Jugal Kalita

Journal: Lecture Notes in Networks and Systems

Vietnamese Text Summarization Based on Neural Network Models

124 (2022) Pages: 85-96

Authors: Lâm Nhựt Khang, Tuong Thanh Do, Nguyet-Hue Thi Pham, Jugal Kalita

Journal: Lecture Notes on Data Engineering and Communications Technologies

A Drowsiness Detection System Based on Eye Landmarks Using IoT

Tran Khanh Dang, Josef Küng, Tai M. Chung (2022) Pages: 714-722

Authors: Lâm Nhựt Khang, Vinh Phuoc Mai, Gia-Binh Quach Dang, Quoc-Bao Hong Ngo, Nhat-Hao Quan Huynh, Mai Phuc Lieu, Jugal Kalita

Journal: Communications in Computer and Information Science

Constructing Vietnamese WordNet: A Case Study

26 (2022) Pages: 1313–1322

Authors: Lâm Nhựt Khang, Jugal Kalita

Journal: Computacion y Sistemas

Building a Chatbot on a Closed Domain using RASA

(2020) Pages: 144–148

Authors: Lâm Nhựt Khang, Lê Nhật Nam, Jugal Kalita

Journal: The 4th International Conference on Natural Language Processing and Information Retrieval, Online, 18-20/12/2020

Using Artificial Intelligence and IoT for Constructing a Smart Trash Bin

(2021) Pages: 427-435

Authors: Lâm Nhựt Khang, Nguyen Hoang Huynh, Nguyen Bao Ngoc, To Thi Huynh Nhu, Nguyen Thanh Thao, Pham Hoang Hao, Vo Van Kiet, Bui Xuan Huynh, Jugal Kalita

Journal: International Conference on Future Data and Security Engineering, online, 24-26/11/2021

Facial Expression Recognition and Image Description Generation in Vietnamese

(2021) Pages: 63-69

Authors: Lâm Nhựt Khang, Kim-Ngoc Thi NGUYEN, Loc Huu NGUY, Jugal Kalita

Journal: The 7th International Conference on Fuzzy Systems and Data Mining, 26-29/10/2021

THỰC NGHIỆM TÓM TẮT RÚT TRÍCH VĂN BẢN TIẾNG VIỆT [Experiments in Extractive Summarization of Vietnamese Text]

(2019) Pages: 468-475

Authors: Lâm Nhựt Khang, Tran Bao Ngoc, Phan Chi Khang

Journal: National Conference on Fundamental and Applied Information Technology Research (FAIR), Hue University, 7-8/6/2019

Using Topic Models to Label Documents for Classification

Tran Khanh Dang, Josef Küng, Makoto Takizawa, Tai M. Chung (2020) Pages: 443-451

Authors: Lâm Nhựt Khang, Truong Thanh Lam, Jugal Kalita

Journal: Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications

Improving Vietnamese WordNet using word embedding

(2019) Pages: 110-114

Authors: Lâm Nhựt Khang, Tran Tri Thong, To Huynh Tuan, Jugal Kalita

Journal: International Conference on Natural Language Processing and Information Retrieval, Tokushima, Japan, June 28-30, 2019

XÂY DỰNG TỪ ĐIỂN MỚI BẰNG PHƯƠNG PHÁP ÁNH XẠ TRUNG GIAN [Building New Dictionaries Using an Intermediate Mapping Method]

(2016) Pages: 861-870

Authors: Lâm Nhựt Khang

Journal: FAIR, Can Tho, 4-5/8/2016

Automatically Creating Multilingual Lexical Resources

(2014) Pages: 3077-3078

Authors: Lâm Nhựt Khang

Journal: AAAI, Québec City, Québec, Canada, July 27–31, 2014

Phrase translation using a bilingual dictionary and n-gram data: A case study from Vietnamese to English

(2015) Pages: 65-69

Authors: Lâm Nhựt Khang, Feras Al Tarouti, Jugal Kalita

Journal: NAACL-MWE, Denver, Colorado, USA, June 4, 2015

Automatically Creating a Large Number of New Bilingual Dictionaries

(2015) Pages: 2174-2180

Authors: Lâm Nhựt Khang, Feras Al Tarouti, Jugal Kalita

Journal: AAAI, Austin, Texas, USA, January 25–30, 2015

Automatically constructing Wordnet synsets

(2014) Pages: 106-111

Authors: Lâm Nhựt Khang, Feras Al Tarouti, Jugal Kalita

Journal: ACL, Baltimore, Maryland, USA, June 22-27, 2014

Creating Lexical Resources for Endangered Languages

(2014) Pages: 54-62

Authors: Lâm Nhựt Khang, Feras Al Tarouti, Jugal Kalita

Journal: ACL-ComputEL, Baltimore, Maryland, USA, June 22-27, 2014

Creating reverse bilingual dictionaries

(2013) Pages: 524-528

Authors: Lâm Nhựt Khang, Jugal Kalita

Journal: The Conference of the NAACL-HLT, Atlanta, Georgia, USA, 9-14 June 2013

XÂY DỰNG WORDNET CHO TIẾNG VIỆT [Building a WordNet for Vietnamese]

(2017) Pages: 1007-1014

Authors: Lâm Nhựt Khang, Võ Lê Minh Trung, Nguyễn Huỳnh Hữu Đức

Journal: FAIR, Da Nang, 17-18/8/2017
