Book chapter 2022
Issue: Tran Khanh Dang, Josef Küng, Tai M. Chung (2022), Pages: 574–587
Journal: Communications in Computer and Information Science

Many current studies measure the similarity between documents written in the same language, such as Vietnamese-Vietnamese or English-English. Recently, however, a new form of copying has appeared: English sources are translated into Vietnamese and edited into an author's own manuscript, which is considered cross-language plagiarism. This study therefore applies a new approach: translate English documents into Vietnamese, then compare each translated document with documents that were modified or copied from a translation. The work focuses on three stages: translating English documents into Vietnamese, preprocessing the documents, and determining the similarity between documents. Similarity is determined with two measures: cosine similarity based on Term Frequency (TF) and Inverse Document Frequency (IDF), and word order similarity in the text. Combining the two measures gives a more accurate and convincing similarity result. The data were collected across 7 related topics, with 15 documents ranging in length from 2,000 to more than 8,000 words. A document translation and similarity checking system was successfully built on the Google Translate Application Programming Interface (API), and Precision and Recall measures show very positive results of over 80%.
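
The abstract does not give the exact formulas, so the following is only a minimal sketch of how a TF-IDF cosine score and a word order score might be combined. Everything in it is an assumption made for illustration: the smoothed IDF, the word order measure (one minus the normalized difference of first-occurrence position vectors), the weight `alpha`, and all function names. The translation step is omitted, so both inputs are assumed to be already in the same language.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a stand-in for real preprocessing)."""
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of documents.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, so that terms
    shared by every document still carry non-zero weight.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def word_order_similarity(a, b):
    """Assumed word order measure: 1 - ||r_a - r_b|| / ||r_a + r_b||.

    r_x holds, for each word of the joint vocabulary, the 1-based position
    of its first occurrence in text x, or 0 if the word is absent.
    """
    toks_a, toks_b = tokenize(a), tokenize(b)
    joint = sorted(set(toks_a) | set(toks_b))
    if not joint:
        return 0.0

    def order_vector(toks):
        first = {}
        for pos, tok in enumerate(toks, start=1):
            first.setdefault(tok, pos)
        return [first.get(t, 0) for t in joint]

    r_a, r_b = order_vector(toks_a), order_vector(toks_b)
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(r_a, r_b)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(r_a, r_b)))
    return 1.0 - diff / total


def combined_similarity(a, b, alpha=0.5):
    """Weighted mix of the two scores; alpha is a hypothetical weight."""
    vec_a, vec_b = tfidf_vectors([a, b])
    return alpha * cosine(vec_a, vec_b) + (1.0 - alpha) * word_order_similarity(a, b)
```

In this sketch, two texts with the same words in a different order keep a cosine score of 1 but receive a lower word order score, which is the kind of case the combined measure is meant to separate. In practice the IDF weights would be computed over the whole collection rather than over the two documents being compared.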

Other articles
Issue: Leonard Barolli (2022), Pages: 140–150
Journal: Lecture Notes in Networks and Systems book series
Issue: Ngoc Le Anh, Seok-Joo Koh, Thi Dieu Linh Nguyen, Jaime Lloret, Thanh Tung Nguyen (2022), Pages: 410–417
Journal: Lecture Notes in Networks and Systems book series
Issue: Ngoc Le Anh, Seok-Joo Koh, Thi Dieu Linh Nguyen, Jaime Lloret, Thanh Tung Nguyen (2022), Pages: 437–444
Journal: Lecture Notes in Networks and Systems book series
Issue: Nguyen Hoang Phuong, Vladik Kreinovich (2022), Pages: 251–264
Journal: Studies in Computational Intelligence

