Image captioning neural networks jointly train an image recognition sub-model and a natural language processing sub-model to generate descriptive sentences for images. This paper presents several image captioning models based on the encoder-decoder framework; we vary the neural sub-models used for the encoder and the decoder and compare the results. First, we experiment with several ResNet architectures (viz., ResNet-50, ResNet-101, and ResNet-152) as encoders, paired with Transformer or bidirectional Transformer decoders. Second, we combine the Vision Transformer as the visual encoder with the standard Transformer or RoBERTa as the language decoder. Finally, we propose an image captioning model that uses the Vision Transformer to encode images and a bidirectional Transformer to predict captions. The models are trained on the Flickr8k dataset in English and Vietnamese and evaluated with the BLEU metric. The model combining the Vision Transformer with the bidirectional RoBERTa decoder outperforms existing image captioning models, including VirTex and CPTR. The BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores of our best model are 0.870, 0.661, 0.443, and 0.331 on the English dataset, and 0.829, 0.647, 0.483, and 0.387 on the Vietnamese dataset.
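To make the encoder-decoder pairing concrete, the following is a minimal sketch of a ViT-encoder/RoBERTa-decoder captioning model, assuming the Hugging Face Transformers library and illustrative pretrained checkpoints; it is not the paper's implementation, and it treats RoBERTa as an autoregressive decoder with cross-attention rather than the bidirectional variant described above.

```python
# Minimal sketch (not the paper's code): pair a pretrained ViT encoder with a
# pretrained RoBERTa decoder for image captioning. Checkpoint names are
# illustrative assumptions, not the ones used in the paper.
from PIL import Image
from transformers import (
    VisionEncoderDecoderModel,
    ViTImageProcessor,
    RobertaTokenizer,
)

# Combine the two pretrained models; cross-attention layers in the decoder
# are added and randomly initialized, so fine-tuning is required.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k",  # assumed vision encoder checkpoint
    "roberta-base",                       # assumed language decoder checkpoint
)
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

# Generation needs these special-token ids set on the combined config.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.sep_token_id

# Caption a single image (after fine-tuning, e.g. on Flickr8k).
image = Image.open("example.jpg").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values
output_ids = model.generate(pixel_values, max_length=32, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```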