Plagiarism is a serious problem in academic and educational contexts. Plagiarism detection tools are of interest to editors, writers, teachers, and students, who use them to ensure that papers are original. Algorithms that check document similarity with high accuracy are therefore essential. In this study, we present a method for checking the similarity of students’ assignments against archived assignments. The method computes the similarity between each pair of sentences and integrates semantic checking. For the semantic checking, we collect and build a dataset of synonyms, antonyms, and auxiliary verbs with negative meanings. We also manually construct a dataset to evaluate the similarity computation and the retrieval of similar assignments. Experimental results on 200 documents show an average precision of 0.9666 and a recall of 0.9100 in identifying similar assignments with various levels of revision.
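As an illustration of the general idea of sentence-pair comparison with semantic checking, the following is a minimal sketch and not the method evaluated in this study. It makes several assumptions: Jaccard word overlap stands in for the similarity measure, which the abstract does not specify; the tiny `SYNONYMS` and `NEGATIONS` tables are hypothetical placeholders for the collected dataset; and all function names are illustrative.

```python
import re

# Hypothetical toy resources; the study builds a much larger dataset.
SYNONYMS = {"huge": "big", "large": "big", "quick": "fast", "rapid": "fast"}
NEGATIONS = {"not", "never", "no"}


def split_sentences(text):
    """Split text into sentences on ., ! and ? (a deliberately naive rule)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokens(sentence):
    """Lowercase the sentence and map each word to a canonical synonym."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [SYNONYMS.get(w, w) for w in words]


def sentence_similarity(a, b):
    """Jaccard overlap of content words, set to 0 when polarity differs."""
    ta, tb = tokens(a), tokens(b)
    # Assumption: an odd number of negation words flips the meaning.
    negated_a = sum(w in NEGATIONS for w in ta) % 2
    negated_b = sum(w in NEGATIONS for w in tb) % 2
    if negated_a != negated_b:
        return 0.0
    sa, sb = set(ta) - NEGATIONS, set(tb) - NEGATIONS
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def document_similarity(submitted, archived):
    """Average, over submitted sentences, of the best match in the archive."""
    sub, arc = split_sentences(submitted), split_sentences(archived)
    if not sub or not arc:
        return 0.0
    best = [max(sentence_similarity(s, t) for t in arc) for s in sub]
    return sum(best) / len(best)
```

Under these assumptions, a sentence rewritten with synonyms still matches its source, while a sentence whose polarity is flipped by a negation word is treated as different even though most of its words overlap.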
Can Tho University Journal of Science