A robust technique for character string extraction from complex document images


Please use this identifier to cite or link to this item: http://asiair.asia.edu.tw/ir/handle/310904400/8854

Title:	A robust technique for character string extraction from complex document images
Authors:	Chen, Yen-Lin
Contributors:	Department of Computer Science and Information Engineering
Keywords:	Holograms;Image enhancement;Information technology;Background textures;Character strings;Complex document images;Document images;Homogeneous objects;Local features;New techniques;Object planes;Regions of interests;Robust techniques;Text extractions
Date:	2008
Issue Date:	2010-04-08 12:22:32 (UTC+0)
Publisher:	Asia University
Abstract:	A new technique for segmenting and extracting character strings from a variety of real-life complex document images is proposed in this study. The proposed text extraction technique first decomposes the document image into distinct object planes in order to extract and separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. A text extraction procedure is then applied to the resulting planes to extract character strings with different characteristics from the corresponding planes. The document image is processed regionally and adaptively according to its local features, so the detailed characteristics of the extracted textual objects are well preserved, especially small characters with thin strokes. Experimental results and comparisons with the existing technique demonstrate the effectiveness and advantages of the proposed approach in extracting character strings with various illuminations, sizes, and font styles from various types of complex document images.
Relation:	Proceedings - International Symposium on Information Technology 2008, ITSim:1-9
Appears in Collections:	[Department of Computer Science and Information Engineering] Conference Papers

Files in This Item:

File	Description	Size	Format
		0Kb	Unknown	455	View/Open
195.doc		30Kb	Microsoft Word	419	View/Open
