From nobody Wed Mar 27 09:00 EST 1996
Date: Wed, 27 Mar 1996 09:00:38 -0500 (EST)
From: uid no body
To: techreps@cs.buffalo.edu
Subject: techrep: POST request
Content-Type: text
Content-Length: 2809

ContactPerson: taohong@cedar.buffalo.edu
Remote host: nair.cedar.buffalo.edu
Remote ident: taohong

### Begin Citation ### Do not delete this line ###
%R 96-05
%U thesis.ps
%A Tao Hong
%T Degraded Text Recognition using Visual and Linguistic Context
%D March 27, 1996
%I Department of Computer Science, SUNY Buffalo
%K OCR, Document Recognition, Contextual Analysis
%Y I.5 PATTERN RECOGNITION
%X To improve the performance of an OCR system on degraded images of text, postprocessing techniques are critical. The objective of postprocessing is to correct errors or resolve ambiguities in OCR results by using contextual information. Postprocessing operates at different levels, depending on the extent of context used. In current commercial OCR systems, word-level postprocessing methods, such as dictionary lookup, have been applied successfully. However, many OCR errors cannot be corrected by word-level postprocessing. To overcome this limitation, passage-level postprocessing, in which global contextual information is utilized, is necessary. This thesis addresses problems in degraded text recognition and discusses potential solutions through passage-level postprocessing. The objective is to develop a postprocessing methodology from a broader perspective. In this work, two classes of inter-word contextual constraints, visual constraints and linguistic constraints, are exploited extensively. On a text page with hundreds of words, many word images are visually similar to one another. Formally, six types of visual inter-word relations are defined. If the word images in the text have been interpreted correctly, relations at the image level must be consistent with the relations at the symbolic level.
Because OCR results often violate this consistency, methods of visual consistency analysis are designed to detect and correct OCR errors. Linguistic knowledge sources such as lexicography, syntax, and semantics can also be used to detect and correct OCR errors. Here, we focus on the word candidate selection problem: the OCR system provides several alternatives for each word, and the objective of postprocessing is to choose the correct one among them. Two approaches to linguistic analysis, statistical and structural, are proposed for candidate selection: a word-collocation-based relaxation algorithm and a probabilistic lattice parsing algorithm. Some OCR errors are not easily recoverable by either visual consistency analysis or linguistic consistency analysis. Integrating image analysis with language-level analysis provides a natural way to handle such difficult words.
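
The candidate selection problem described above can be illustrated with a small sketch. It is not the algorithm from the thesis, whose details the abstract does not give. It assumes that each word position carries a set of OCR candidates with confidence scores, that a table of collocation strengths between adjacent words is available, and that a candidate's score is repeatedly boosted by the support it receives from its neighbors' candidates. The names relax and select, the update rule, and the sample data are all hypothetical.

```python
from typing import Dict, List, Tuple

Scores = List[Dict[str, float]]            # one {candidate: score} dict per word position
Collocations = Dict[Tuple[str, str], float]  # (left word, right word) -> strength


def relax(candidates: Scores, collocation: Collocations, iterations: int = 5) -> Scores:
    """Iteratively re-score candidates using support from adjacent positions.

    Assumed update rule (illustrative only): new = old * (1 + support),
    followed by normalization within each word position.
    """
    scores = [dict(pos) for pos in candidates]
    for _ in range(iterations):
        updated = []
        for i, pos in enumerate(scores):
            new_pos = {}
            for word, score in pos.items():
                support = 0.0
                for j in (i - 1, i + 1):
                    if 0 <= j < len(scores):
                        for other, other_score in scores[j].items():
                            # Collocation pairs are keyed in reading order.
                            pair = (other, word) if j < i else (word, other)
                            support += other_score * collocation.get(pair, 0.0)
                new_pos[word] = score * (1.0 + support)
            total = sum(new_pos.values())
            # Keep the old scores if everything collapsed to zero.
            updated.append({w: s / total for w, s in new_pos.items()} if total > 0 else pos)
        scores = updated
    return scores


def select(scores: Scores) -> List[str]:
    """Pick the highest-scoring candidate at each word position."""
    return [max(pos, key=pos.get) for pos in scores]


# Hypothetical OCR output: "eat" initially outranks the correct word "cat".
ocr_candidates = [
    {"the": 1.0},
    {"cat": 0.4, "eat": 0.6},
    {"sat": 0.5, "sal": 0.5},
]
strengths = {("the", "cat"): 1.0, ("cat", "sat"): 1.0}
decision = select(relax(ocr_candidates, strengths))
```

In this example the neighboring words "the" and "sat" support "cat", so it overtakes "eat" even though the OCR confidence alone favored "eat". A structural approach such as lattice parsing would instead search for a path through the candidates that forms a grammatical sentence.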