From nobody Wed Mar 27 09:00 EST 1996
Date: Wed, 27 Mar 1996 09:00:38 -0500 (EST)
From: uid no body <nobody>
To: techreps@cs.buffalo.edu
Subject: techrep: POST request
Content-Type: text
Content-Length: 2809

ContactPerson: taohong@cedar.buffalo.edu
Remote host: nair.cedar.buffalo.edu
Remote ident: taohong
### Begin Citation ### Do not delete this line ###
%R 96-05
%U thesis.ps
%A Tao Hong
%T Degraded Text Recognition using Visual and Linguistic Context
%D March 27, 1996
%I Department of Computer Science, SUNY Buffalo
%K OCR, Document Recognition, Contextual Analysis
%Y I.5     PATTERN RECOGNITION
%X To improve the performance of an OCR system on degraded images of text, 
postprocessing techniques are critical. The objective of postprocessing 
is to correct errors or to resolve ambiguities in OCR results by using  
contextual information. Depending on the extent of context used,  
there are different levels of postprocessing.  
In current commercial OCR systems, word-level postprocessing methods,
such as dictionary lookup, have been applied successfully.  However,
many OCR errors cannot be corrected by word-level postprocessing. To
overcome this limitation, passage-level postprocessing, in which
global contextual information is utilized, is necessary.
This thesis addresses problems in degraded text recognition and
discusses potential solutions through passage-level postprocessing.
The objective is to develop a postprocessing methodology from a
broader perspective. In this work, two classes of inter-word
contextual constraints, visual constraints and linguistic
constraints, are exploited extensively.
On a text page with hundreds of words, many word image instances 
are visually similar to one another. 
Formally, six types of visual inter-word relations are defined.  
Relations at the image level must be consistent with the relations at the  
symbolic level if word images in the text have been interpreted correctly. 
Because OCR results often violate this consistency,  
methods of visual consistency analysis are designed to detect and correct  
OCR errors. 
Linguistic knowledge sources, such as lexicography, syntax, and  
semantics, can be used to detect and correct OCR errors. 
Here, we focus on the word candidate selection problem. 
In this setting, an OCR system provides several candidates for each word,  
and the objective of postprocessing is to choose the correct one among  
them. 
Two approaches to linguistic analysis, statistical and structural,  
are proposed for the problem of candidate selection: 
a word-collocation-based relaxation algorithm and a probabilistic  
lattice parsing algorithm, respectively. 
Some OCR errors are not easily recoverable by either visual 
consistency analysis or linguistic consistency analysis alone. 
Integration of image analysis and language-level analysis  
provides a natural way to handle difficult words.