Text and Non-text Segmentation based on Connected Component Features

1 (2015) Pages:

Authors: Le Viet Phuong, Trần Cao Đệ, Jean-Marc Ogier, Muriel Visani, Nibal Nayef

Venue: International Conference on Document Analysis and Recognition (ICDAR’2015)

Link:

Abstract

Document image segmentation is crucial to OCR and other digitization processes. In this paper, we present a learning-based approach for text and non-text separation in document images. The training features are extracted at the level of connected components, a middle ground between the slow, noise-sensitive pixel level and the segmentation-dependent zone level. To handle connected components of all types, shapes and sizes, we extract a powerful set of features based on the size, shape, stroke width and position of each connected component. AdaBoost with decision trees is used to label the connected components. Finally, the classification of connected components into text and non-text is corrected based on classification probabilities and on size and stroke width analysis of each component's nearest neighbors. The performance of our approach has been evaluated on two standard datasets: UW-III and the ICDAR-2009 competition for document layout analysis. Our results demonstrate that the proposed approach achieves competitive performance for segmenting text and non-text in document images of variable content and degradation.
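To make the connected-component level concrete, here is a minimal sketch of extracting components from a binary image and computing a few size, shape and position features for each one. It is not the authors' implementation: the function names, the 8-connectivity choice, the feature names and the toy image are all assumptions made for illustration, and stroke width is left out entirely. In a pipeline like the one the abstract describes, feature vectors of this kind would then be passed to a boosted decision-tree classifier.

```python
from collections import deque


def connected_components(image):
    """Return the 8-connected foreground components of a binary image.

    `image` is a list of rows containing 0 (background) or 1 (foreground).
    Each component is returned as a list of (row, col) pixel coordinates.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill collects the pixels of one component.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def component_features(pixels, height, width):
    """Compute illustrative size, shape and position features (assumed set)."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    box_h = max(ys) - min(ys) + 1
    box_w = max(xs) - min(xs) + 1
    return {
        "area": len(pixels),                        # size: foreground pixel count
        "height": box_h,                            # size: bounding-box height
        "width": box_w,                             # size: bounding-box width
        "aspect_ratio": box_w / box_h,              # shape: elongation
        "extent": len(pixels) / (box_h * box_w),    # shape: bounding-box fill ratio
        "center_y": (min(ys) + max(ys)) / 2 / height,  # position, normalized by page height
        "center_x": (min(xs) + max(xs)) / 2 / width,   # position, normalized by page width
    }


# Toy page: a 2x2 blob and a 3-pixel vertical stroke.
page = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]
components = connected_components(page)
features = [component_features(p, len(page), len(page[0])) for p in components]
```

Working per component rather than per pixel keeps the number of items to classify small, and it avoids depending on a prior zone segmentation, which is the trade-off the abstract points to.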

Other articles

Time-efficient Logo Spotting using Text/Non-text Separation as Preprocessing and Approximate Nearest Neighbor Search

(2016) Pages:

Authors: Le Viet Phuong, Trần Cao Đệ, Jean-Marc Ogier, Muriel Visani, Nibal Nayef

Venue: CIFED, Toulouse, France, 8-11/3/2016


Logo Spotting on Document Images using Local Features

1 (2015) Pages:

Authors: Le Viet Phuong, Trần Cao Đệ, Đặng Quốc Bảo

Venue: The Fifth Symposium on Information and Communication Technology


Logo Spotting and Recognition for Document Retrieval

22 (2014) Pages: 3056-3061

Authors: Le Viet Phuong, Nibal Nayef, Muriel Visani, Jean-Marc Ogier, Trần Cao Đệ

Venue: The 22nd International Conference on Pattern Recognition, Stockholm, 24-28 August 2014

