DocLab for web October 2006

11/9/2008

DOCLAB

Rensselaer Polytechnic Institute

Jonsson Engineering Center

Troy, NY 12180-3590

George Nagy

tel: (518) 276-6078

fax: (518) 276-6261

email: nagy@ecse.rpi.edu

Department of Electrical, Computer, and Systems Engineering

Department of Computer Science

New York State Center for Automation Technologies

OVERVIEW

FACULTY

CURRENT AND FORMER STUDENTS

CURRENT PROJECTS

COMPLETED PROJECTS

SPONSORS AND COLLABORATORS

RECENT PUBLICATIONS

OVERVIEW

DocLab offers facilities for document image analysis (DIA) and optical character recognition (OCR) to graduate and undergraduate students in computer engineering and computer science. We conduct research on the extraction of information from both hard-copy and computer-generated images. Our emphasis has been on printed text, but we are also interested in hand print and script, tables, business forms, engineering drawings, utility maps, ballots, and 2-D and 3-D images. We focus on the development and experimental verification of algorithms, but we have occasionally delivered commercial-grade software modules.

DocLab was established in 1985. We are located in the Department of Electrical, Computer, and Systems Engineering (Head: Kim Boyer) in the Jonsson Engineering Center. We receive technical, administrative, and financial support from the New York State Center for Automation Technologies at Rensselaer (Director: Prof. John Wen). We are also affiliated with the Rensselaer Center for Image Processing Research and with the Rensselaer Center for Open Source Software (Profs. B. Roysam and M. Krishnamoorthy).

We are interested in all aspects of Document Image Analysis, Optical Character Recognition, and Graphics Conversion:

style and language context

optical scanner and facsimile characterization

document and character defect modeling

format, layout, and functional component analysis

character segmentation and classification

consolidation of information from semiformatted web data

ontologies from tables and table ontologies

error recovery and error classification

efficient interaction for document and picture recognition

Applications include:

automated distribution and filing systems

op-scan election technologies and camera-based ballot interpretation

transaction processing, postal address reading

digital libraries, information retrieval

personal record keeping

document markup for the web

automated construction of web ontologies

conversion of medical images to symbolic form

All of our projects must meet the academic requirements of graduate research and scholarly publication, and must lead commercial initiatives by several years. The common thread among our projects is recognition systems that improve with use.

Faculty:

George Nagy, Professor, Computer Engineering

PhD Cornell University, 1962 (neural networks)

tel: (518) 276-6078; fax: (518) 276-6162; email: nagy@ecse.rpi.edu

Nagy has published over a hundred research and survey papers on OCR and DIA. Some of the ideas that he and his colleagues originated have been widely adopted: prototype-based compression for printed text, self-corrective (adaptive) classification, probabilistic decision trees, and X-Y trees for document format representation. He is also known for his early work with R. Casey at the IBM T.J. Watson Research Center on Chinese character recognition and on context-based character recognition through substitution ciphers.

Mukkai Krishnamoorthy, Associate Professor, Computer Science

PhD Indian Institute of Technology, Kanpur, 1976

tel: (518) 276-6993; fax: (518) 276-4033; email: moorthy@cs.rpi.edu

Krishnamoorthy's research interests are software engineering, graph algorithms, compilers, and combinatorics. In document image analysis, he has contributed significantly to syntactic segmentation and the analysis of printed tables.

Current Graduate Students:

MS Candidates:
Raghav Padmanabhan	Query by Table
Ramana Chakradhar
Anne Miller	Usability test for op-scan election ballots

Former DocLab students (in image analysis and OCR):

Piyushee Jha	MS May 2008	Table analysis for generating ontologies
B.S. Yanamadala	MS Dec 2007	Hypothesis tests for classifier evaluation
Xiaoli Zhang	PhD May 2007	Style quantification
Srinivas Andra	PhD Dec 2006	Style-constrained character recognition with SVMs
Ashutosh Joshi	PhD May 2006	Symbolic Indirect Correlation
Abhishek Gattani	MS August 2005	Mobile interactive visual pattern recognition
Jie Zou	PhD May 2004	Interactive visual pattern recognition
Tong Zhang	PhD May 2004	Volumetric change detection in concrete under compression
Adnan El-Nasan	PhD December 2003	On-line handwritten word recognition
Grant Deffenbaugh	PhD December 2002	Vision-based robot gripper
Asad Abu-Tarif	PhD December 2002	Volumetric brain image registration
Harsha Veeramachaneni	PhD December 2002	Style-conscious classification
Prateek Sarkar	PhD May 2000	Style consistency in pattern fields
Elisa Barney Smith	PhD December 1998	Characterization of scanning defects
Sutha Sivasubramaniam	MS December 1998	Icon - label association in digitized maps
Yihong Xu	PhD August 1998	Prototype extraction and adaptive OCR
Asad Abu-Tarif	MS May 1998	Table processing and understanding
Kerim Kalafala	MS December 1997	Identification of street lines and intersections in a scanned urban topographic map
Andrew Shapira	PhD December 1997	Cycle parity generators and a general random number library
	MS 1990	Terrain visibility
Shuvayu Kanjilal	MS May 1997	Directory assistance query generation
Ed Green	PhD May 1996	Table image analysis
Tasso Anagnostopoulos	MS May 1996	Preclassifier for OCR
Edison de Jesus	PhD U. Campinas 1995-96	Schematic diagram conversion
Xiaoyin Wang	MS December 1995	Reliable n-tuple features for OCR
Dz Mou Jung	PhD May 1995	Joint feature and classifier design for OCR based on a small training set
	MS 1990	Comparison of algorithms for terrain visibility
Trevor Salla	MS May 1995	The validation and evaluation of synthetic image sets in OCR
Prateek Sarkar	MS December 1994	Random phase spatial sampling effects in digitized patterns
James Waclawik	MS December 1991	Parallel extraction of hierarchic projection profiles for very large binary images
Junichi Kanai	PhD December 1990	Knowledge-based document image analysis system
Mahesh Viswanathan	PhD December 1990	A syntactic approach to document representation and labeling
Mathews Thomas	MS May 1988	Knowledge representation schemes for a document analysis system
Marina Maculotti	Laurea (U. Genoa) 1988	Strutture dati ed algoritmi per il riconoscimento di testi con formule matematiche (Data structures and algorithms for the recognition of text with mathematical formulas)


CURRENT PROJECTS

CERVITOR

We develop computational methods that help doctors organize, represent, and query information in large image databases. The repertoire of images under consideration (all images that reside in a cervigram archive of 60,000 images from NCI/NLM) forms a narrow image domain with limited and predictable variability. In such a domain, explicit representation of domain knowledge narrows the semantic gap between the raw sensory recordings of a scene (i.e., raw image data) and the objects and processes inferred from images (i.e., semantic interpretation). We explore, in the domain of cervical images, an information hierarchy that proceeds from raw image data to low-level image features, to recognition of objects, and finally to knowledge. (NSF-supported joint project with Profs. D. Lopresti, X. Huang, G. Tang at Lehigh U.)

PERFECT

PERFECT is an acronym that stands for "Paper and Electronic Records for Elections: Cultivating Trust." We study the reliable processing of paper ballots and other hardcopy election records. Participating institutions include Lehigh University, Boise State University, Muhlenberg College, and Rensselaer Polytechnic Institute. Of current interest in DocLab is the development and evaluation of a camera-based portable ballot counter. (NSF-supported joint project with Profs. C. Borick, Z. Munson, D. Lopresti, E. Barney Smith.)

TANGO (Table ANalysis for Generating Ontologies).

TANGO aims to develop new approaches to conceptual-model-based extraction and table recognition in order to: (i) understand a table’s structure and its conceptual content; (ii) discover the constraints that hold between concepts extracted from the table; (iii) match the recognized concepts with ones recognized in other tables; (iv) merge the resulting structures to create a domain ontology; and (v) adjust the created domain ontology so that it is a clean, complete, accurate, and redundancy-free conceptualization of the source tables. (NSF-supported joint project with Profs. D. Embley, D. Lonsdale et al. at BYU.)

Style context and adaptation in OCR and ICR

We achieve near single-font and single-writer classification performance in a multifont or multi-writer environment by taking advantage of style consistency among characters in the same field, word, line, or document.
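The idea of exploiting style consistency within a field can be illustrated with a toy sketch. It is not the lab's published method: the two styles, the one-dimensional Gaussian features, the class means, and the function names below are all assumptions made up for illustration. The sketch compares labeling each pattern on its own (averaging over styles) with labeling the whole field under the constraint that every pattern comes from the same style.

```python
import itertools
import math

# Hypothetical class means for two styles (fonts); values are illustrative only.
STYLE_MEANS = {
    "font_A": {"0": 0.0, "1": 2.0},
    "font_B": {"0": 2.0, "1": 4.0},
}
CLASSES = ["0", "1"]
SIGMA = 0.5  # assumed common standard deviation


def log_gauss(x, mean, sigma=SIGMA):
    """Log density of a 1-D Gaussian."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def classify_singlet(x):
    """Label one pattern alone, marginalizing over styles with equal priors."""
    def score(label):
        return logsumexp([log_gauss(x, means[label]) for means in STYLE_MEANS.values()])
    return max(CLASSES, key=score)


def classify_field(field):
    """Label a whole field, assuming all of its patterns share one style."""
    def score(labels):
        per_style = [
            sum(log_gauss(x, means[c]) for x, c in zip(field, labels))
            for means in STYLE_MEANS.values()
        ]
        return logsumexp(per_style)
    return max(itertools.product(CLASSES, repeat=len(field)), key=score)


field = [2.1, 3.9]
print([classify_singlet(x) for x in field])  # ['1', '1']
print(classify_field(field))                 # ('0', '1')
```

In this toy setting the feature value 2.1 is ambiguous on its own, because it looks like a "1" in font_A and a "0" in font_B. Once the second pattern (3.9) makes font_B the only plausible style for the field, the joint decision resolves the first pattern as "0". The exhaustive search over label sequences is exponential in field length and is used here only to keep the sketch short.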

Electronic Ink

We apply string matching algorithms to on-line handwriting analysis. Instead of training a recognizer, we compare bigram or longer segments of features of an unknown word to segments of a reference string of words at precomputed positions. Symbolic indirect correlation is our approach to recognizing sequences of features whose relative order, but not relative position, is preserved.
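As a rough illustration of the matching step only, the sketch below finds every maximal common segment of at least two symbols (a bigram or longer) between the feature string of an unknown word and a reference feature string, and reports where each occurs. The single-letter feature symbols, the function name, and the example strings are hypothetical. The sketch does not implement symbolic indirect correlation itself, which goes on to compare such matches with matches among the corresponding lexical strings.

```python
def matching_segments(unknown, reference, min_len=2):
    """Return (unknown_start, reference_start, length) for each maximal
    common segment of at least min_len symbols."""
    n, m = len(unknown), len(reference)
    # suffix[i][j] = length of the common segment ending at
    # unknown[i-1] and reference[j-1]
    suffix = [[0] * (m + 1) for _ in range(n + 1)]
    matches = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if unknown[i - 1] != reference[j - 1]:
                continue
            suffix[i][j] = suffix[i - 1][j - 1] + 1
            # A segment is maximal if it cannot be extended to the right.
            extendable = i < n and j < m and unknown[i] == reference[j]
            length = suffix[i][j]
            if not extendable and length >= min_len:
                matches.append((i - length, j - length, length))
    return matches


# Each letter stands for one quantized feature (hypothetical encoding).
print(matching_segments("abcd", "xxabcyycd"))  # [(0, 2, 3), (2, 7, 2)]
```

The output says that symbols 0 to 2 of the unknown string match the reference starting at position 2, and symbols 2 to 3 match it starting at position 7. The two matches preserve their relative order but not their relative spacing, which is the property the text describes.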

Camera Assisted Visual Interactive Recognition (CAVIAR)

We have developed algorithms and software for a camera-based recognition system. CAVIAR draws on sequential pattern recognition, image database, expert systems, pen computing, and digital camera technology. It recognizes wild flowers more accurately than machine vision and faster than most laypersons. The novelty of the approach is that human perceptual ability is exploited through interaction with the image of the unknown object. The computer remembers the characteristics of all previously seen classes, and displays the top-ranking candidates based on already detected features.

The interaction is based on the few primitive actions that can be executed easily with a stylus and a small, touch-sensitive display. The richness of the interaction results from its interpretation. When automated segmentation fails, the operator need only point to an incorrectly segmented part of the picture. Standard color, shape and texture features are instantly re-computed and the new top candidates are displayed for operator acceptance or further search.
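The interaction loop described above can be sketched as a nearest-prototype ranking that is simply re-run whenever the operator corrects the segmentation. The class names, the three-component feature vectors (standing in for color, shape, and texture), and the function name are hypothetical placeholders, not CAVIAR's actual features or classifier.

```python
import math

# Hypothetical stored prototypes: one (color, shape, texture) vector per class.
PROTOTYPES = {
    "daisy": (0.9, 0.2, 0.4),
    "buttercup": (0.8, 0.9, 0.3),
    "violet": (0.3, 0.1, 0.8),
}


def top_candidates(features, prototypes, k=3):
    """Rank classes by Euclidean distance between the features of the
    unknown object and each stored prototype; return the k nearest."""
    ranked = sorted(prototypes, key=lambda name: math.dist(features, prototypes[name]))
    return ranked[:k]


# Features computed from a faulty automatic segmentation.
print(top_candidates((0.5, 0.15, 0.65), PROTOTYPES))  # ['violet', 'daisy', 'buttercup']

# Features re-computed after the operator points at the badly segmented part.
print(top_candidates((0.85, 0.25, 0.4), PROTOTYPES))  # ['daisy', 'buttercup', 'violet']
```

The operator never supplies a label directly in this sketch. The correction changes only the features, and the machine re-ranks the candidates for the operator to accept or reject.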

We have developed an MS-Windows style prototype, using a public domain Intel computer vision library. Collaborators have ported the system to a digital camera and a pocket computer. Possible modes of deployment include web cameras with server-mediated classification, camera-back interaction, PDA-camera combinations, and self-contained stationary systems for industrial or luggage inspection. Our principal research objective is to establish a sound basis for partitioning the necessary tasks between the operator and the machine. We also hope to find partners to apply CAVIAR to industrial classification and training, and to K-12 and university-level education.
A current application is conversion of cervicograms to symbolic form.


COMPLETED PROJECTS

Topographic Map Conversion

Registration of multimodal brain images

Analysis of X-ray microtomographs of fracture in concrete

Segmentation and labeling of digitized pages from technical journals.

Modeling random-phase spatial sampling of printed characters.

Integrated feature and classifier design.

Decision-tree classifiers.

Preclassifiers for OCR.

Isolated hand-printed digit recognition.

Validation of pseudo-random character defect models for OCR.

SPONSORS AND COLLABORATORS

IRST, Trento, Italy
Hitachi Central Research Laboratory, Tokyo
Lucent Bell Laboratories
IBM T.J. Watson Research Center
Nortel Research, Montreal
National Imagery and Mapping Agency
Panasonic Information and Networking Technologies
Elsag-Bailey, Genoa
Palo Alto Research Center (PARC)
Xerox Webster Research Center

University of Nebraska - Lincoln
Brigham Young University, Provo, Utah
Lehigh University – Bethlehem, PA
Pace University, White Plains
University of Salerno, Italy
Queen's University, Kingston, Ontario
Information Science Research Institute, UNLV
Center for Image Analysis, Uppsala
US Department of Education
College Library Technology and Cooperation Program

SELECTED PUBLICATIONS (since 2003)

(For additional items, please see G. Nagy’s list of publications.)

G. Nagy, S. Veeramachaneni, “Adaptive and interactive approaches to document analysis,” in Machine Learning in Document Analysis and Recognition (S. Marinai, H. Fujisawa, editors), Springer, Studies in Computational Intelligence, Vol. 90, ISBN 978-3-540-76279-9, pp. 221-257, 2008.

D. Lopresti, G. Nagy, S. Seth, X. Zhang, “Multi-character Field Recognition for Arabic and Chinese Handwriting,” in Arabic & Chinese Handwriting Recognition (D. Doermann, S. Jaeger, editors), Springer LNCS # 4768, pp. 218-230, 2008.

S. Veeramachaneni and G. Nagy, “Analytical results on style-constrained Bayesian classification of pattern fields,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, #7, pp. 1280-1285, July 2007.

G. Nagy, “Digitizing, coding, annotating, disseminating, and preserving documents,” Procs. IWRIDL-2006 workshop on Digital Libraries, Kolkata, India, 2006, ACM 1-59593-608-4, 2007 (invited).

J. Zou and G. Nagy, “Visible models for interactive pattern recognition,” Pattern Recognition Letters Vol. 28, pp 2335-2342, 2007.

J. Zou, Q. Ji, and G. Nagy, “A Comparative Study of Local Matching Approach for Face Recognition,” IEEE Transactions on Image Processing, Vol. 16, #10, pp. 2617-2628, October 2007.

D. Lopresti, D.W. Embley, M. Hurst, and G. Nagy, "Table Processing Paradigms: A Research Survey," International Journal of Document Analysis and Recognition, vol 8, no. 2-3, Springer, June 2006.

G. Nagy and D. Lopresti, "Interactive Document Processing and Digital Libraries," Procs. 2nd IEEE International Conference on Document Image Analysis for Libraries, Lyon, France, April 27-28, pp. 1-9, IEEE Computer Society Press 2006.

D.W. Embley, D. Lopresti, and G. Nagy, "Notes on Contemporary Table Recognition," Document Analysis Systems VII, 7th International Workshop, Procs. DAS 2006, H. Bunke and A. L. Spitz, Eds., vol. 3872, LNCS, pp. 164-175, Springer, Nelson, New Zealand, February 13-15, 2006.

Ashutosh Joshi, George Nagy, Daniel P. Lopresti and Sharad C. Seth, "A Maximum-Likelihood Approach to Symbolic Indirect Correlation," 18th ICPR, vol. 3, pp. 99-103, 2006.

Xiaoli Zhang and George Nagy, "Style Quantification of Scanned Multi-source Digits," 18th ICPR, vol. 2, pp. 1018-1021, 2006.

Srinivas Andra and George Nagy, "Combining Dichotomizers for MAP Field Classification," 18th ICPR, vol. 4, pp. 210-214, 2006.

D. Lopresti, A. Joshi, and G. Nagy, "Match Graph Generation for Symbolic Indirect Correlation," Procs. SPIE Symposium on Document Recognition and Retrieval, vol. SPIE 6067, San Jose, CA, SPIE/IST, January 2006.

Yuri A. Tijerino, D.W. Embley, Deryle W. Lonsdale, and G. Nagy, "Towards ontology generation from tables," World Wide Web Journal, vol. 6, #3, Springer-Verlag, September 2005.

D. Lopresti and G. Nagy, "Mobile Interactive Support System for Time-Critical Document Exploitation," Symposium on Document Image Understanding, College Park, MD, November 2005.

G. Nagy, "Interactive, Mobile, Distributed Pattern Recognition," Proceedings of the International Conference on Image Analysis and Processing (ICIAP05), vol. LNCS 3617, pp. 37-49, Cagliari, Lecture Notes in Computer Science, Springer, August 2005 (invited).

A. Joshi and G. Nagy, "Online Handwriting Recognition Using Time-Order of Lexical and Signal Co-Occurrences," Proceedings of 12th Biennial Conference of the International Graphonomics Society, pp. 201-205, Salerno, Italy, June 2005.

A. Evans, J. Sikorski, P. Thomas, S-H Cha, C. Tappert, G. Zou, A. Gattani, and G. Nagy, "Computer Assisted Visual Interactive Recognition (CAVIAR) Technology," 2005 IEEE International Conference on Electro-Information Technology, Lincoln, NE, May 2005 (Proceedings on CD-ROM only).

P. Sarkar and G. Nagy, "Style consistent classification of isogenous patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, #1, pp. 88-98, January 2005.

S. Veeramachaneni and G. Nagy, "Style context with second order statistics," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, #1, pp. 14-22, January 2005.

D. Lopresti and G. Nagy, "Chipless RFID for paper documents," Proceedings of Document Recognition and Retrieval XII, vol. 5676, pp. 208-215, San Jose, CA, SPIE, January 2005.

S. Veeramachaneni and G. Nagy, "Adaptive classifiers for multisource OCR," International Journal of Document Analysis and Recognition, vol. 6, #3, March 2004.