E-book, English, 464 pages
Chaudhuri Digital Document Processing
1st edition 2007
ISBN: 978-1-84628-726-8
Publisher: Springer
Format: PDF
Copy protection: 1 - PDF Watermark
Major Directions and Recent Advances
Series: Advances in Computer Vision and Pattern Recognition
This book brings all the major and frontier topics in the field of document analysis together into a single volume, creating a unique reference source that will be invaluable to a large audience of researchers, lecturers and students working in this field. With chapters written by some of the most distinguished researchers active in this field, this book addresses recent advances in digital document processing research and development.
Further Information & Material
Preface
Contents
Contributors
1 Reading Systems: An Introduction to Digital Document Processing
1.1 Introduction
1.2 Text Sensing
1.3 Sensor Scope
1.4 Sensor Grid
1.5 Pre-processing
1.6 Invariance to Affine Transforms
1.7 Invariance to Ink-Trace Thickness
1.8 Shape Features
1.9 Processing Type
1.10 Computing Architecture
1.11 Computing Strategy
1.12 Knowledge Base
1.13 Cognitive Reliability
1.14 Response in Case of Difficult Input
1.15 Classification Accuracy
1.16 Energy and Mental Concentration
1.17 Processing Speed
1.18 Volume Processing
1.19 Summary of Human Versus Machine Reading
1.20 Conclusion
References
2 Document Structure and Layout Analysis
2.1 Introduction
2.2 Pre-processing
2.3 Representing Document Structure and Layout
2.4 Document Layout Analysis
2.5 Understanding Document Structure
2.6 Performance Evaluation
2.7 Handwritten Document Analysis
2.8 Summary
References
3 OCR Technologies for Machine Printed and Hand Printed Japanese Text
3.1 Introduction
3.2 Pre-Processing
3.3 Feature Extraction
3.4 Classification
3.5 Dimension Reduction
3.6 Performance Evaluation of OCR Technologies
3.7 Learning Algorithms
3.8 Conclusion
References
4 Multi-Font Printed Tibetan OCR
4.1 Introduction
4.2 Properties of Tibetan Characters and Scripts
4.3 Isolated Tibetan Character Recognition
4.4 Tibetan Document Segmentation
4.5 Experiment Results
4.6 Summary
Acknowledgments
References
5 On OCR of a Printed Indian Script
5.1 Introduction
5.2 Origin and Properties of Indian Scripts
5.3 Document Pre-Processing
5.4 Character Recognition
5.5 Performance Analysis
5.6 Conclusion
Acknowledgments
References
6 A Bayesian Network Approach for On-line Handwriting Recognition
6.1 Introduction
6.2 Modelling of Character Components and Their Relationships
6.3 Recognition and Training Algorithms
6.4 Experimental Results and Analysis
6.5 Conclusions
References
7 New Advances and New Challenges in On-Line Handwriting Recognition and Electronic Ink Management
7.1 Introduction
7.2 On-Line Handwriting Recognition Systems
7.3 New Trends in On-Line Handwriting Recognition
7.4 New Trends in Electronic Ink Management Systems
7.5 Conclusion, Open Problems and New Challenges
References
8 Off-Line Roman Cursive Handwriting Recognition
8.1 Introduction
8.2 Methodology
8.3 Emerging Topics
8.4 Outlook and Conclusions
Acknowledgment
References
9 Robustness Design of Industrial Strength Recognition Systems
9.1 Characterization of Robustness
9.2 Complex Recognition System: Postal Address Recognition
9.3 Performance Influencing Factors
9.4 Robustness Design Principles
9.5 Robustness Strategy for Implementation
9.6 Conclusions
Acknowledgments
References
10 Arabic Cheque Processing System: Issues and Future Trends
10.1 Introduction
10.2 Datasets
10.3 Legal Amount Processing
10.4 Courtesy Amount Processing
10.5 Conclusion and Future Perspective
References
11 OCR of Printed Mathematical Expressions
11.1 Introduction
11.2 Identification of Expressions in Document Images
11.3 Recognition of Expression Symbols
11.4 Interpretation of Expression Structure
11.5 Performance Evaluation
11.6 Conclusion and Future Research
References
12 The State of the Art of Document Image Degradation Modelling
12.1 Introduction
12.2 Document Image Degradations
12.3 The Measurement of Image Quality
12.4 Document Image Degradation Models
12.5 Applications of Models
12.6 Public-Domain Software and Image Databases
12.7 Open Problems
Acknowledgments
References
13 Advances in Graphics Recognition
13.1 Introduction
13.2 Application Scenarios
13.3 Early Processing
13.4 Symbol Recognition and Indexing
13.5 Architectures and Meta-data Modelling
13.6 On-Line Graphics Recognition and Sketching Interfaces
13.7 Performance Evaluation
13.8 An Application Scenario: Interpretation of Architectural Sketches
13.9 Conclusions: Sketching the Future
Acknowledgment
References
14 An Introduction to Super-Resolution Text
14.1 Introduction
14.2 Super-Resolution: An Analytical Model
14.3 MISO Super-Resolution: A Closer Look
14.4 Case Study: SURETEXT – Camera-Based SRText
14.5 Conclusions
Acknowledgment
References
15 Meta-Data Extraction from Bibliographic Documents for the Digital Library
15.1 Introduction
15.2 The Users’ Needs
15.3 Bibliographic Elements as Descriptive Meta-Data
15.4 Meta-Data Extraction in Bibliographic Documents
15.5 General Overview of the Work
15.6 Bibliographic Element Recognition for Library Management
15.7 Bibliographic Reference Structure in Technological Watch
15.8 Citation Analysis in Research Piloting and Evaluation
15.9 Conclusion
References
16 Document Information Retrieval
16.1 Introduction
16.2 Document Retrieval Based on the Vector-Space Model
16.3 Applications
16.4 Summary and Conclusion
References
17 Biometric and Forensic Aspects of Digital Document Processing
17.1 Introduction
17.2 Image Pre-processing and Interactive Tools
17.3 Discriminating Elements and Their Similarities
17.4 Writer Verification
17.5 Signature Verification
17.6 Concluding Remarks
References
18 Web Document Analysis
18.1 Introduction
18.2 Web Content Extraction, Repurposing and Mining
18.3 Web Image Analysis
18.4 Web Document Modelling and Annotation
18.5 Concluding Remarks
References
19 Semantic Structure Analysis of Web Documents
19.1 Introduction
19.2 Related Work
19.3 Semantic Structure of Web Documents
19.4 Vision-based Page Segmentation (VIPS)
19.5 Determining Topic Coherency of Web Page Segments
19.6 Extracting Semantic Structure by Integrating Visual and Content Information
19.7 Advantages of the Integrated Approach
19.8 Conclusions and Discussion
References
20 Bank Cheque Data Mining: Integrated Cheque Recognition Technologies
20.1 Introduction
20.2 Challenges of the Cheque Processing Industry
20.3 Payee Name Recognition
20.4 Cheque Mining with A2iA CheckReader™
20.5 Conclusions
References
Index




