11/9/2008

 

DOCLAB

 

 

Rensselaer Polytechnic Institute

Jonsson Engineering Center

Troy, NY 12180-3590

 

George Nagy

 

tel: (518) 276-6078

fax: (518) 276-6261

email: nagy@ecse.rpi.edu

 

Department of Electrical, Computer, and Systems Engineering

Department of Computer Science

 

New York State Center for Automation Technologies

 

OVERVIEW

FACULTY

CURRENT AND FORMER STUDENTS

CURRENT PROJECTS

COMPLETED PROJECTS

SPONSORS AND COLLABORATORS

RECENT PUBLICATIONS


 

OVERVIEW

 

DocLab offers facilities for document image analysis (DIA) and optical character recognition (OCR) to graduate and undergraduate students in computer engineering and computer science. We conduct research on the extraction of information from both hard-copy and computer-generated images. Our emphasis has been on printed text, but we are also interested in hand print and script, tables, business forms, engineering drawings, utility maps, ballots, and 2-D and 3-D images. We focus on the development and experimental verification of algorithms, but we have occasionally delivered commercial-grade software modules.

 

DocLab was established in 1985. We are located in the Department of Electrical, Computer, and Systems Engineering (Head: Kim Boyer) in the Jonsson Engineering Center. We receive technical, administrative, and financial support from the New York State Center for Automation Technologies at Rensselaer (Director: Prof. John Wen). We are also affiliated with the Rensselaer Center for Image Processing Research and with the Rensselaer Center for Open Source Software (Profs. B. Roysam and M. Krishnamoorthy).

 

We are interested in all aspects of Document Image Analysis,
Optical Character Recognition, and Graphics Conversion:

 

style and language context

optical scanner and facsimile characterization

document and character defect modeling

format, layout, and functional component analysis

character segmentation and classification

consolidation of information from semiformatted web data

ontologies from tables and table ontologies

error recovery and error classification

efficient interaction for document and picture recognition

 

Applications include:

 

automated distribution and filing systems

op-scan election technologies and camera-based ballot interpretation

transaction processing, postal address reading

digital libraries, information retrieval

personal record keeping

document markup for the web

automated construction of web ontologies

conversion of medical images to symbolic form

 

All of our projects must meet the academic requirements of graduate research and scholarly publication, and must lead commercial initiatives by several years. The common thread among our projects is recognition systems that improve with use.


Faculty:

 

George Nagy, Professor, Computer Engineering

PhD Cornell University, 1962 (neural networks)

tel: (518) 276-6078; fax: (518) 276-6162; email: nagy@ecse.rpi.edu

 

Nagy has published over a hundred research and survey papers on OCR and DIA. Some of the ideas that he and his colleagues originated have been widely adopted: prototype-based compression for printed text, self-corrective (adaptive) classification, probabilistic decision trees, and X-Y trees for document format representation. He is also known for his early work with R. Casey at the IBM T.J. Watson Research Center on Chinese character recognition and on context-based character recognition through substitution ciphers.

 

Mukkai Krishnamoorthy, Associate Professor, Computer Science

PhD Indian Institute of Technology, Kanpur, 1976

tel: (518) 276-6993; fax: (518) 276-4033; email: moorthy@cs.rpi.edu

 

Krishnamoorthy's research interests are software engineering, graph algorithms, compilers, and combinatorics. In document image analysis, he has contributed significantly to syntactic segmentation and the analysis of printed tables.



Current Graduate Students:

 

MS Candidates:

 

Raghav Padmanabhan

Query by Table

Ramana Chakradhar

 

Anne Miller

Usability test for op-scan election ballots

 

 

 

Former DocLab students (in image analysis and OCR):

 

Piyushee Jha

MS May 2008

Table analysis for generating ontologies

B.S. Yanamadala

MS Dec 2007

Hypothesis tests for classifier evaluation

Xiaoli Zhang

PhD May 2007

Style quantification

Srinivas Andra

PhD Dec 2006

Style-constrained character recognition with SVMs

Ashutosh Joshi

PhD May 2006

Symbolic Indirect Correlation

Abhishek Gattani

MS August 2005

Mobile interactive visual pattern recognition

Jie Zou

PhD May 2004

Interactive visual pattern recognition

Tong Zhang

PhD May 2004

Volumetric change detection in concrete under compression

Adnan El-Nasan

PhD December 2003

On-line handwritten word recognition

Grant Deffenbaugh

PhD December 2002

Vision-based robot gripper

Asad Abu-Tarif

PhD December 2002

Volumetric brain image registration

Harsha Veeramachaneni

PhD December 2002

Style-conscious classification

Prateek Sarkar

PhD May 2000

Style consistency in pattern fields

Elisa Barney Smith

PhD December 1998

Characterization of scanning defects

Sutha Sivasubramaniam

MS December 1998

Icon - label association in digitized maps

Yihong Xu

PhD August 1998

Prototype extraction and adaptive OCR

Asad Abu-Tarif

MS May 1998

Table processing and understanding

Kerim Kalafala

MS December 1997

Identification of street lines and intersections in a scanned urban topographic map

Andrew Shapira

PhD December 1997

Cycle parity generators and a general random number library

           

MS 1990

Terrain visibility

Shuvayu Kanjilal

MS May 1997

Directory assistance query generation

Ed Green

PhD May 1996

Table image analysis

Tasso Anagnostopoulos

MS May 1996

Preclassifier for OCR

Edison de Jesus

PhD U. Campinas 1995-96

Schematic diagram conversion

Xiaoyin Wang

MS December 1995

Reliable n-tuple features for OCR

Dz Mou Jung

PhD May 1995

Joint feature and classifier design for OCR based on a small training set

 

MS 1990

Comparison of algorithms for terrain visibility

Trevor Salla

MS May 1995

The validation and evaluation of synthetic image sets in OCR

Prateek Sarkar

MS December 1994

Random phase spatial sampling effects in digitized patterns

James Waclawik

MS December 1991

Parallel extraction of hierarchic projection profiles for very large binary images

Junichi Kanai

PhD December 1990

Knowledge-based document image analysis system

Mahesh Viswanathan

PhD December 1990

A syntactic approach to document representation and labeling

Mathews Thomas

MS May 1988

Knowledge representation schemes for a document analysis system

Marina Maculotti

Laureate (U. Genoa) 1988

Strutture dati ed algoritmi per il riconoscimento di testi con formule matematiche (Data structures and algorithms for the recognition of texts with mathematical formulas)



CURRENT PROJECTS

CERVITOR

We develop computational methods that help doctors organize, represent, and query information in large image databases. The repertoire of images under consideration (all images that reside in a cervigram archive of 60,000 images from NCI/NLM) forms a narrow image domain with limited and predictable variability. In such a domain, explicit representation of domain knowledge narrows the semantic gap between the raw sensory recordings of a scene (i.e., raw image data) and the objects and processes implied by the images (i.e., semantic interpretation). We explore, in the domain of cervical images, an information hierarchy that proceeds from raw image data to low-level image features, to recognition of objects, and finally to knowledge. (NSF-supported joint project with Profs. D. Lopresti, X. Huang, G. Tang at Lehigh U.)

 

PERFECT

PERFECT stands for "Paper and Electronic Records for Elections: Cultivating Trust." We study the reliable processing of paper ballots and other hardcopy election records. Participating institutions include Lehigh University, Boise State University, Muhlenberg College, and Rensselaer Polytechnic Institute. Of current interest in DocLab is the development and evaluation of a camera-based portable ballot counter. (NSF-supported joint project with Profs. C. Borick, Z. Munson, D. Lopresti, E. Barney Smith.)

TANGO (Table ANalysis for Generating Ontologies).

TANGO aims to apply conceptual-model-based extraction and table recognition in new ways to: (i) understand a table’s structure and its conceptual content; (ii) discover the constraints that hold between concepts extracted from the table; (iii) match the recognized concepts with ones recognized in other tables; (iv) merge the resulting structures to create a domain ontology; and (v) adjust the created domain ontology so that it is a clean, complete, accurate, and redundancy-free conceptualization of the source tables. (NSF-supported joint project with Profs. D. Embley, D. Lonsdale et al. at BYU.)

Style context and adaptation in OCR and ICR

We achieve near single-font and single-writer classification performance in a multifont or multi-writer environment by taking advantage of style consistency among characters in the same field, word, line, or document.
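As a rough illustration of why style consistency helps, the sketch below labels a two-character field jointly instead of one character at a time. It is a toy under stated assumptions, not DocLab's classifier: the classes, the two styles, the Gaussian means, and every function name are hypothetical, and each character is reduced to one scalar feature.

```python
import itertools
import math

# Hypothetical toy model (all values are assumptions for illustration):
# one scalar feature per character, unit-variance Gaussian whose mean
# depends on both the class and the style (font or writer).
CLASSES = ["A", "B"]
STYLES = ["style1", "style2"]
STYLE_PRIOR = {"style1": 0.5, "style2": 0.5}
MEANS = {
    ("A", "style1"): 0.0, ("B", "style1"): 2.0,
    ("A", "style2"): 1.5, ("B", "style2"): 3.5,
}


def gauss(x, mean, sigma=1.0):
    """Gaussian density of x for the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def classify_singlet(x):
    """Style-unaware baseline: label one character, averaging over styles."""
    def score(c):
        return sum(STYLE_PRIOR[s] * gauss(x, MEANS[(c, s)]) for s in STYLES)
    return max(CLASSES, key=score)


def field_likelihood(features, labels):
    """p(x_1..x_n | c_1..c_n) = sum_s P(s) * prod_i p(x_i | c_i, s).

    The whole field shares a single (unknown) style s.
    """
    total = 0.0
    for s in STYLES:
        p = STYLE_PRIOR[s]
        for x, c in zip(features, labels):
            p *= gauss(x, MEANS[(c, s)])
        total += p
    return total


def classify_field(features):
    """Pick the label sequence that maximizes the joint field likelihood
    (equal class priors assumed; exhaustive search, fine for short fields)."""
    candidates = itertools.product(CLASSES, repeat=len(features))
    return max(candidates, key=lambda labels: field_likelihood(features, labels))


field = [1.8, 3.6]  # both characters drawn near the style2 means for A and B
print([classify_singlet(x) for x in field])
print(classify_field(field))
```

In this toy, the first character is ambiguous on its own, but the second character makes style2 far more plausible for the whole field, which in turn resolves the first character.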

Electronic Ink

We apply string matching algorithms to on-line handwriting analysis. Instead of training a recognizer, we compare bigram or longer segments of features of an unknown word to segments of a reference string of words at precomputed positions. Symbolic indirect correlation is our approach to recognizing sequences of features whose relative order, but not relative position, is preserved.
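The segment-comparison step can be pictured with a small sketch. It only finds matching runs of two or more feature symbols between an unknown word and a reference string; it does not implement symbolic indirect correlation itself. The single-letter feature codes and the function name `common_segments` are assumptions made for illustration.

```python
def common_segments(query, reference, min_len=2):
    """Return maximal matching runs as (query_start, ref_start, length).

    query and reference are sequences of quantized feature symbols
    (here, characters of a string). Only runs of at least min_len
    symbols (bigrams or longer by default) are reported.
    """
    n, m = len(query), len(reference)
    # run[i][j] = length of the matching run ending at query[i-1], reference[j-1]
    run = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if query[i - 1] == reference[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1

    matches = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            length = run[i][j]
            # A run is maximal when it cannot be extended by one more symbol.
            extends = i < n and j < m and run[i + 1][j + 1] > 0
            if length >= min_len and not extends:
                matches.append((i - length, j - length, length))
    return matches


# Hypothetical feature strings: each letter stands for one quantized feature.
unknown_word = "abcd"
reference_string = "xxabcyycdz"
print(common_segments(unknown_word, reference_string))
```

Each reported triple says where a segment of the unknown word recurs in the reference string. Matching such positions against where known letter sequences occur in the reference transcript is the part this sketch leaves out.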

Camera Assisted Visual Interactive Recognition (CAVIAR)

We have developed algorithms and software for a camera-based recognition system. CAVIAR draws on sequential pattern recognition, image database, expert systems, pen computing, and digital camera technology. It recognizes wild flowers more accurately than machine vision and faster than most laypersons. The novelty of the approach is that human perceptual ability is exploited through interaction with the image of the unknown object. The computer remembers the characteristics of all previously seen classes, and displays the top-ranking candidates based on already detected features.

The interaction is based on the few primitive actions that can be executed easily with a stylus and a small, touch-sensitive display. The richness of the interaction results from its interpretation. When automated segmentation fails, the operator need only point to an incorrectly segmented part of the picture. Standard color, shape and texture features are instantly re-computed and the new top candidates are displayed for operator acceptance or further search.

We have developed an MS-Windows style prototype, using a public domain Intel computer vision library. Collaborators have ported the system to a digital camera and a pocket computer. Possible modes of deployment include web cameras with server-mediated classification, camera-back interaction, PDA-camera combinations, and self-contained stationary systems for industrial or luggage inspection. Our principal research objective is to establish a sound basis for partitioning the necessary tasks between the operator and the machine. We also hope to find partners to apply CAVIAR to industrial classification and training, and to K-12 and university-level education.
A current application is conversion of cervicograms to symbolic form.



COMPLETED PROJECTS

Topographic Map Conversion

Registration of multimodal brain images

Analysis of X-ray microtomographs of fracture in concrete

Segmentation and labeling of digitized pages from technical journals

Modeling random-phase spatial sampling of printed characters

Integrated feature and classifier design

Decision-tree classifiers

Preclassifiers for OCR

Isolated hand-printed digit recognition

Validation of pseudo-random character defect models for OCR

SPONSORS AND COLLABORATORS


SELECTED PUBLICATIONS (since 2003)

(For additional items, please see G. Nagy’s list of publications.)

G. Nagy, S. Veeramachaneni, “Adaptive and interactive approaches to document analysis,” in Machine Learning in Document Analysis and Recognition (S. Marinai, H. Fujisawa, editors), Springer, Studies in Computational Intelligence, Vol. 90, ISBN 978-3-540-76279-9, pp. 221-257, 2008.

D. Lopresti, G. Nagy, S. Seth, X. Zhang, “Multi-character Field Recognition for Arabic and Chinese Handwriting,” in Arabic & Chinese Handwriting Recognition (D. Doermann, S. Jaeger, editors), Springer LNCS # 4768, pp. 218-230, 2008.

S. Veeramachaneni and G. Nagy, “Analytical results on style-constrained Bayesian classification of pattern fields,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, #7, pp. 1280-1285, July 2007.

G. Nagy, “Digitizing, coding, annotating, disseminating, and preserving documents,” Procs. IWRIDL-2006 workshop on Digital Libraries, Kolkata, India, 2006, ACM 1-59593-608-4, 2007 (invited).

J. Zou and G. Nagy, “Visible models for interactive pattern recognition,” Pattern Recognition Letters, Vol. 28, pp. 2335-2342, 2007.

J. Zou, Q. Ji, and G. Nagy, “A Comparative Study of Local Matching Approach for Face Recognition,” IEEE Transactions on Image Processing, Vol. 16, #10, pp. 2617-2628, October 2007.

D. Lopresti, D.W. Embley, M. Hurst, and G. Nagy, "Table Processing Paradigms:  A Research Survey," International Journal of Document Analysis and Recognition, vol 8, no. 2-3, Springer, June 2006.

G. Nagy and D. Lopresti, "Interactive Document Processing and Digital Libraries," Procs. 2nd IEEE International Conference on Document Image Analysis for Libraries, Lyon, France, April 27-28, pp. 1-9, IEEE Computer Society Press 2006.

D.W. Embley, D. Lopresti, and G. Nagy, "Notes on Contemporary Table Recognition," Document Analysis Systems VII, 7th International Workshop, Procs. DAS 2006, H. Bunke and A. L. Spitz, Eds., vol. 3872, LNCS, pp. 164-175, Springer, Nelson, New Zealand, February 13-15, 2006.

Ashutosh Joshi, George Nagy, Daniel P. Lopresti and Sharad C. Seth, "A Maximum-Likelihood Approach to Symbolic Indirect Correlation," 18th ICPR, vol. 3, pp. 99-103, 2006.

Xiaoli Zhang and George Nagy, "Style Quantification of Scanned Multi-source Digits," 18th ICPR, vol. 2, pp. 1018-1021, 2006.

Srinivas Andra and George Nagy, "Combining Dichotomizers for MAP Field Classification," 18th ICPR, vol. 4, pp. 210-214, 2006.

D. Lopresti, A. Joshi, and G. Nagy, "Match Graph Generation for Symbolic Indirect Correlation," Procs. SPIE Symposium on Document Recognition and Retrieval, vol.SPIE 6067, San Jose, CA, SPIE/IST, January 2006.

Yuri A. Tijerino, D.W. Embley, Deryle W. Lonsdale, and G. Nagy, "Towards ontology generation from tables," World Wide Web Journal, vol. 6, #3, Springer-Verlag, September 2005.

D. Lopresti and G. Nagy, "Mobile Interactive Support System for Time-Critical Document Exploitation," Symposium on Document Image Understanding, College Park, MD, November 2005.

G. Nagy, "Interactive, Mobile, Distributed Pattern Recognition," Proceedings of the International Conference on Image Analysis and Processing (ICIAP05), vol.LNCS 3617, pp. 37-49, Cagliari, Lecture Notes in Computer Science, Springer, August 2005 (invited).

A. Joshi and G. Nagy, "Online Handwriting Recognition Using Time-Order of Lexical and Signal Co-Occurrences," Proceedings of 12th Biennial Conference of the International Graphonomics Society, pp. 201-205, Salerno, Italy, June 2005.

A. Evans, J. Sikorski, P. Thomas, S-H Cha, C. Tappert, G. Zou, A. Gattani, and G. Nagy, "Computer Assisted Visual Interactive Recognition (CAVIAR) Technology," 2005 IEEE International Conference on Electro-Information Technology, Lincoln, NE, May 2005 (Proceedings on CD-ROM only). [slides]

P. Sarkar and G. Nagy, "Style consistent classification of isogenous patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, #1, pp. 88-98, January 2005.

S. Veeramachaneni and G. Nagy, "Style context with second order statistics," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, #1, pp. 14-22, January 2005.

 D. Lopresti and G. Nagy, "Chipless RFID for paper documents," Proceedings of Document Recognition and Retrieval XII, vol.5676, pp. 208-215, San Jose, CA, SPIE, January 2005.

S. Veeramachaneni and G. Nagy, "Adaptive classifiers for multisource OCR," International Journal of Document Analysis and Recognition, vol. 6, #3, March 2004.

T. Zhang, E. Nagy, E. Landis, and G. Nagy, "3D Multiple Crack System: Extraction and Analysis using Motion-based Image Processing Technique," Proceedings of 17th ASCE Engineering Mechanics Conference, University of Delaware, Newark, DE, June 13-16, 2004.

J. Zou and G. Nagy, "Evaluation of model-based interactive pattern recognition," Proceedings of International Conference on Pattern Recognition XVII, vol.II, pp. 311-314, Cambridge, UK, August 2004.

T. Zhang and G. Nagy, "Surface tortuosity and its application to analyzing cracks in concrete," Proceedings of International Conference on Pattern Recognition XVII, vol.II, pp. 851-854, Cambridge, UK, August 2004.

G. Nagy, "Visual pattern recognition in the years ahead," Proceedings of International Conference on Pattern Recognition XVII, vol.IV, pp. 7-10, Cambridge, UK, August 2004 (invited). [slides]

G. Nagy, "Classifiers that improve with use," Proceedings of Conference on Pattern Recognition and Multimedia, pp. 79-86, IEICE, Tokyo, February 2004 (invited).  [slides]

G. Nagy and P. Sarkar, "Document style census for OCR," Proceedings of First International Workshop on Document Image Analysis for Libraries (DIAL04), pp. 134-147, Palo Alto, CA, IEEE Computer Society Press, January 2004.

G. Nagy, A. Joshi, M. Krishnamoorthy, D.P. Lopresti, S. Mehta, and S. Seth, "A nonparametric classifier for unsegmented text," Proceedings of Document Recognition and Retrieval XI, vol.5296, pp. 102-108, San Jose, CA, SPIE, January 2004. [slides]

Yuri A. Tijerino, David W. Embley, Deryle W. Lonsdale, and G. Nagy, "Ontology generation from tables," Proceedings of 4th International Conference on Web Information Systems Engineering (WISE03), pp. 242-249, Rome, Italy, December 2003.

G. Nagy and S. Veeramachaneni, "A Ptolemaic model for OCR," Proceedings of International Conference on Document Analysis and Recognition, pp. 1060-1064, Edinburgh, UK, August 2003.

A. El Nasan, G. Nagy, and S. Veeramachaneni, "Handwriting recognition using position sensitive n-gram matching," Proceedings of International Conference on Document Analysis and Recognition, pp. 577-582, Edinburgh, UK, August 2003.

E. Nagy, T. Zhang, W.R. Franklin, G. Nagy, and E. Landis, "3D Analysis of Tomographic Images," Proceedings of 16th ASCE Engineering Mechanics Conference, University of Washington, Seattle, July 16-18 2003.

G. Nagy, S.C. Seth, S.K. Mehta, and Y. Lin, "Indirect Symbolic Correlation approach to unsegmented text recognition," Proceedings of Workshop on Document Image Analysis and Retrieval, Madison, WI, June 2003 (CD-ROM only).

D. Blostein, R. Harrap, G. Nagy, and R. Zanibbi, "Document representations," Proceedings of Fifth IAPR Workshop on Graphics Recognition (GREC 03), pp. 3-12, Barcelona, July 2003.