Article - Journal
1781 (2023) Pages: 124-137
Journal: Communications in Computer and Information Science, In International Advanced Computing Conference 2022

Image captioning neural networks combine image recognition sub-models and natural language processing sub-models, trained simultaneously, to generate descriptive sentences for images. This paper presents several image captioning models based on the encoder-decoder framework. We vary the neural sub-models used for the encoder and the decoder and compare the results. First, we experiment with several ResNet architectures (viz., ResNet-50, ResNet-101, and ResNet-152) as encoders, and Transformer or bidirectional Transformer models as decoders. Second, we combine the Vision Transformer as the visual encoder with the standard Transformer or RoBERTa as the language decoder. Finally, we propose an image captioning model that uses the Vision Transformer to encode images and a bidirectional Transformer to predict image captions. The models are trained on the Flickr8k dataset in English and Vietnamese and evaluated using the BLEU metric. The model combining the Vision Transformer with the bidirectional RoBERTa model outperforms existing image captioning models, including VirTex and CPTR. The BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores of our best image captioning model are 0.870, 0.661, 0.443, and 0.331 on the English dataset, and 0.829, 0.647, 0.483, and 0.387 on the Vietnamese dataset.
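
The abstract reports BLEU-1 through BLEU-4 scores. As a rough illustration of what those numbers measure, the sketch below computes sentence-level BLEU-n in plain Python from clipped n-gram precision and a brevity penalty. It is not the authors' evaluation code: the function names (`ngrams`, `modified_precision`, `bleu`), the example sentences, and the lack of smoothing are assumptions made for illustration, and published scores are usually computed over a whole test corpus rather than per sentence.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of one candidate against its references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    # Credit each n-gram at most as often as it occurs in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n):
    """Sentence-level BLEU-max_n with uniform weights and a brevity penalty."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # assumption: no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # The brevity penalty uses the reference length closest to the candidate's.
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_mean)


# Hypothetical caption and references, not taken from Flickr8k.
candidate = "a dog runs on the grass".split()
references = [
    "a dog runs across the grass".split(),
    "the dog is running on grass".split(),
]
for n in range(1, 5):
    print(f"BLEU-{n}: {bleu(candidate, references, n):.3f}")
```

Scores fall as n grows because longer n-grams are harder to match, which is consistent with the downward trend from BLEU-1 to BLEU-4 in the reported results.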

Other articles
(2023) Pages: 92-99
Journal: Association for Computational Linguistics (ACL 2023), In Findings of the Association for Computational Linguistics: ACL 2023, Toronto, Canada, 2023
(2023) Pages: 488-493
Journal: RIVF International Conference on Computing and Communication Technologies, Vietnam, 20-22/12, 2022
(2023) Pages: 1-5
Journal: 2022 6th International Conference on Natural Language Processing and Information Retrieval, Thailand
Nguyen Thai-Nghe, Thanh-Nghi Do, Peter Haddawy (2023) Pages: 200-207
Journal: Communications in Computer and Information Science, ISDS 2023
Thi Dieu Linh Nguyen, Elena Verdú, Anh Ngoc Le, Maria Ganzha (2023) Pages: 34-39
Journal: Lecture Notes in Networks and Systems
124 (2022) Pages: 85-96
Journal: Lecture Notes on Data Engineering and Communications Technologies
Tran Khanh Dang, Josef Küng, Tai M. Chung (2022) Pages: 714-722
Journal: Communications in Computer and Information Science
Tran Khanh Dang, Josef Küng, Tai M. Chung (2022) Pages: 377-392
Journal: Communications in Computer and Information Science
26 (2022) Pages: 1313-1322
Journal: Computacion y Sistemas
(2020) Pages: 144-148
Journal: The 4th International Conference on Natural Language Processing and Information Retrieval, Online, 18-20/12/2020
(2021) Pages: 427-435
Journal: International Conference on Future Data and Security Engineering, online 24-26/11/2021
(2021) Pages: 63-69
Journal: The 7th International Conference on Fuzzy Systems and Data Mining, 26-29/10/2021
(2019) Pages: 468-475
Journal: National Scientific Conference on Fundamental and Applied Information Technology Research (FAIR), Hue University, 7-8/6/2019
Tran Khanh Dang, Josef Küng, Makoto Takizawa, Tai M. Chung (2020) Pages: 443-451
Journal: Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications
(2019) Pages: 110-114
Journal: International Conference on Natural Language Processing and Information Retrieval, Tokushima, Japan, June 28-30, 2019
(2016) Pages: 861-870
Author: Lâm Nhựt Khang
Journal: FAIR, Can Tho, 4-5/8/2016
(2014) Pages: 3077-3078
Author: Lâm Nhựt Khang
Journal: AAAI, Québec City, Québec, Canada, July 27–31, 2014
(2015) Pages: 2174-2180
Journal: AAAI, Austin, Texas, USA, January 25–30, 2015
(2014) Pages: 106-111
Journal: ACL, Baltimore, Maryland, USA, June 22-27, 2014
(2014) Pages: 54-62
Journal: ACL-ComputEL, Baltimore, Maryland, USA, June 22-27, 2014
(2013) Pages: 524-528
Journal: The Conference of the NAACL-HLT. Atlanta, Georgia, USA. 9-14 June 2013
(2017) Pages: 1007-1014
Journal: FAIR, Da Nang, 17-18/8/2017
 

