From seni Tue Nov 21 12:09 EST 1995
Date: Tue, 21 Nov 1995 12:08:59 -0500
From: Giovanni Seni
To: milun@cs.Buffalo.EDU
Subject: abstract for TR 95-38
Content-Type: text
Content-Length: 2445

A critical feature of any computer system is its interface with the user. This has led to the development of user interface technologies such as mouse, touchscreen, and pen-based input devices. Since handwriting is one of the most familiar communication media, pen-based interfaces combined with automatic handwriting recognition offer a very easy and natural input method. Pen-based interfaces are also essential in mobile computing because they are scalable. Recent advances in pen-based hardware and wireless communication have been influential factors in the renewed interest in on-line recognition systems.

On-line handwriting recognition is fundamentally a pattern classification task: the objective is to take an input pattern, the handwritten signal collected on-line via a digitizing device, and classify it as one of a pre-specified set of words (i.e., the system's lexicon). Because exact recognition is very difficult, a lexicon is used to constrain the recognition output to a known vocabulary. Lexicon size affects recognition performance because the larger the lexicon, the larger the number of words that can be confused with one another.

Most of the research efforts in this area have been devoted to the recognition of isolated characters or run-on hand-printed words. A smaller number of recognition systems have been devised for cursive words, a difficult task due to the letter segmentation problem (partitioning the word into letters) and large variation at the letter level. Most existing systems restrict the working dictionary size to fewer than a few thousand words.

This research focused on the problem of cursive word recognition.
In particular, I investigated how to deal efficiently with large lexicon sizes, the role of dynamic information over traditional feature-analysis models in the recognition process, the incorporation of letter context and the avoidance of error-prone segmentation of the script by means of an integrated segmentation and recognition approach, and the use of domain information in the postprocessing stage. These ideas were used to good effect in a recognition system that I developed; this system, operating on a 21,000-word lexicon, was able to correctly recognize 88.1% (top-10) and 98.6% (top-10) of the writer-independent and writer-dependent test set words, respectively.

----------------------------------END----------------------------

Thank you Davin.
-gs
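
For readers unfamiliar with the terms, the sketch below illustrates what constraining recognition output to a lexicon and reporting a top-k rate can look like. It is only an illustration under stated assumptions, not the method of TR 95-38: a plain edit distance stands in for the recognizer's matching score, and the names edit_distance, rank_lexicon, and top_k_rate, the toy lexicon, and the sample hypotheses are all hypothetical.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here only as a stand-in matching score."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character of a
                curr[j - 1] + 1,     # insert a character of b
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def rank_lexicon(hypothesis: str, lexicon: Sequence[str], k: int) -> List[str]:
    """Return the k lexicon words closest to the raw hypothesis."""
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(lexicon, key=lambda w: (edit_distance(hypothesis, w), w))[:k]


def top_k_rate(samples: Sequence[Tuple[str, str]],
               lexicon: Sequence[str], k: int) -> float:
    """Fraction of (hypothesis, truth) pairs whose truth is in the top k."""
    hits = sum(truth in rank_lexicon(hyp, lexicon, k) for hyp, truth in samples)
    return hits / len(samples)


if __name__ == "__main__":
    # Hypothetical toy data: noisy letter strings paired with the true word.
    lexicon = ["hand", "band", "land", "hard", "word", "ward"]
    samples = [("hamd", "hand"), ("wurd", "word"), ("lard", "land")]
    print(rank_lexicon("hamd", lexicon, 3))
    print(top_k_rate(samples, lexicon, 1))
    print(top_k_rate(samples, lexicon, 3))
```

In this toy data the top-3 rate is higher than the top-1 rate, because similar lexicon entries such as "word" and "ward" compete for first place. The same effect is why a larger lexicon leaves more words that can be confused.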