CONTENTS
1 INTRODUCTION : WHAT IS PATTERN RECOGNITION? = 1
1.1 Pattern Spaces = 8
1.2 Pattern Recognition and Perception = 12
1.3 Statistical and Syntactic Approaches = 14
1.4 Main Application Areas = 27
1.5 Organization of This Book = 30
2 OVERVIEW = 33
2.1 The Subproblems of Pattern Recognition = 38
2.1.1 Acquisition (Transducers, Measurement, A / D Conversion) = 39
2.1.2 Preprocessing (Signal Conditioning, Filtering, Enhancement) = 54
2.1.3 Segmentation (Low-Level Feature Extraction) = 68
2.1.4 Feature Extraction (High-Level Features : Description : Data Compression) = 74
2.1.5 Classification (Statistical and Structural Methods) = 75
2.1.6 Postprocessing (Contextual Disambiguation) = 80
2.1.7 Learning Methods in Pattern Recognition = 81
2.1.8 Performance Evaluation and Standards = 82
2.2 Two Meanings of "Dimension" = 87
2.2.1 Object-Space Dimension = 87
2.2.2 Pattern-Space Dimensions = 89
2.3 Combinatorial Aspects of Pattern Recognition Problems = 92
2.4 Summary = 98
3 SEGMENTATION = 100
3.1 Edge-Detection Methods = 107
3.1.1 Gradient = 109
3.1.2 Laplacian = 116
3.1.3 Pseudo-Laplacian = 121
3.1.4 Other Theoretical Models = 124
3.1.5 Line Detectors = 131
3.1.6 A Note On Hardware Implementation = 132
3.2 Segmenting Texture-Free Regions = 136
3.2.1 Top-Down vs. Bottom-Up Methods = 136
3.2.2 Histogram-Based Methods = 136
3.2.3 A Graph-Theoretical Method = 137
3.3 Connected Components = 141
3.4 Segmentation of Time Series = 142
3.4.1 EKG Segmentation = 143
3.4.2 Speech Segmentation = 144
3.5 Summary and Conclusions = 148
4 FEATURES = 149
4.1 What Are Features? = 150
4.1.1 Feature Vectors, Feature Spaced = 152
4.1.2 Feature Subset Selection = 153
4.2 Masks : Weighted and Boolean ; Global and Peephole = 155
4.3 Shape Characterization = 158
4.3.1 Shape Elements for OCR = 160
4.3.2 4-, 8-, and 6-Connectivity = 161
4.3.3 Chain Codes, Derivatives of Chain Codes, and Shape Codes = 165
4.3.4 Quasi-Topological Codes = 167
4.3.5 Two-Dimensional Shape Descriptors = 182
4.4 Coefficient Expansions = 196
4.4.1 Moment Analysis = 197
4.4.2 Fourier Analysis = 199
4.4.3 Hadamard Transforms = 201
4.4.4 K-L Methods = 208
4.5 Features in Speech Recognition = 215
4.5.1 Frequency Domain = 216
4.5.2 Time Domain = 219
4.5.3 Dynamic Time-Warping = 219
4.6 Fuzzy Features = 220
4.7 Conclusion = 223
5. PATTERNS AND TEXTURES : CHARACTERIZATION AND SEGMENTATION = 225
5.1 The Concept of Texture = 231
5.1.1 Regular and Irregular Textures = 232
5.1.2 Psychophysical Investigations and Artificial Textures = 238
5.2 Structural and Statistical - Again! = 240
5.3 Statistical Methods = 246
5.3.1 Image-Based Statistical Methods = 249
5.3.2 Model-Based Statistical Methods = 252
5.3.3 Transform-Based statistical Methods = 254
5.4 Structural Methods : Gestalts and Graphs = 270
5.4.1 Tsuji and Tomita's Method = 271
5.4.2 A Hybrid Method = 276
5.5 Concluding Remarks = 279
6. STATISTICAL DECISION METHODS : SOME BACKGROUND = 281
6.1 Probability and Probability Density Functions = 281
6.2 Sample and Population Variance - Covariance Matrix = 286
6.3 Distance Measures = 289
6.4 Clusters = 294
6.4.1 Clustering Techniques = 297
6.4.2 Determination of Subclasses by Clustering and by Nearest-Neighbor Methods = 307
6.4.3 Cluster Validity = 308
6.5 Classification and Clustering in Hyperspace = 311
6.5.1 Discriminants = 311
6.5.2 Linear, Polynomial, and Piecewise-Linear Separability = 315
6.5.3 Then-Class Problem = 320
6.5.4 Two-Threshold Rules : Fisher's "z" = 322
6.5.5 Piecewise-Linear Discrimination = 325
6.6 Cover's Inequalities = 328
6.7 Conclusion = 330
7 BAYES' OPTIMAL DECISIONS = 332
7.1 Multivariate Normal Distribution = 335
7.2 Parameter Estimation = 340
7.3 Bayes' Optimal Decisions = 345
7.4 Normal Populations with Equal Variance - Covariance Matrices = 350
7.5 Fisher's Method = 352
7.5.1 Geometry of Fisher's Rule = 355
7.6 Normal Populations with Unequal Variance - Covariance Matrices = 357
7.6.1 The Case of g Groups = 360
7.6.2 Unequal Covariance Matrices = 363
7.7 Bayes' Rule with Empirical p. d. f. and Unknown Distributions = 363
7.7.1 Kernel Methods = 364
7.7.2 Bayes' Optimal Decisions : The General Case = 367
7.8 Conclusion = 369
8 BEYOND BAYES = 370
8.1 Problems with the Bates Approach = 370
8.2 Nearest-Neighbor Classification and Clustering = 372
8.3 Evaluation of a Classification Rule : Estimation of Misclassification Probabilities = 378
8.3.1 Holdout Method = 379
8.3.2 The Feature Selection Problem = 380
8.3.3 Error/Reject Tradeoff as an Estimation Tool = 385
8.4 Hierarchical and Sequential Methods = 388
8.4.1 Hierarchical Pattern Recognition Systems = 380
8.5 Time- and Frequency-Domain Problems = 398
8.5.1 Discrete Markov Models = 399
8.5.2 Dynamic Time Warping and Hidden Markov Models in Speech Recognition = 403
8.6 Fuzzy Methods = 409
8.7 Conclusion = 413
9 Formal Linguistic Methods = 414
9.1 State Machines, Graphs, and Grammars = 415
9.1.1 State Machines = 415
9.1.2 Trees and Graphs = 417
9.1.3 Primitives, Strings, Languages = 420
9.2 Chomsky Theory = 420
9.2.1 The Chomsky Hierarchy = 421
9.2.2 Two-Dimensional Languages = 427
9.3 Stochastic Grammars = 431
9.4 Regular Languages and State Machines = 434
9.4.1 Moore and Mealy Machines = 436
9.5 Grammatical Inference = 438
9.6 Merging State Machines = 452
9.7 Summary = 454
10 STRUCTURAL AND HYBRID METHODS = 455
10.1 A Dictionary-Based Method = 455
10.2 State Machine Methods = 456
10.2.1 Feature Strings = 456
10.2.2 Decision Trees and Graphs = 459
10.2.3 Moor or Mealy? = 465
10.2.4 Using State Reduction in Finite-State Machines = 466
10.2.5 Output-Extended Machines for Pattern Recognition = 471
10.3 Hybrid Methods : Combination of Structural and Statistical Methods = 474
10.3.1 Advantages and Drawbacks of the Two Approaches = 475
10.3.2 Hybrid Methods = 476
10.4 Attributed Grammars = 478
10.5 Structural Analysis of Time Series : EKG and EEG = 483
10.5.1 The EEG = 485
10.6 Conclusion = 490
11 LEARNING TECHNIQUES = 491
11.1 The Basic Iterative Learning Procedure = 491
11.1.1 Error Rate Estimation = 492
11.1.2 Types of Learning = 494
11.2 Learning Statistical Discriminants = 497
11.2.1 How do We Find Piecewise-Linear Discriminants? = 497
11.2.2 Learning with Unlabeled Samples Using the Reject/Error Tradeoff = 498
11.2.3 Piecewise-Linear Discrimination = 498
11.3 Learning in Structural Recognition = 501
11.3.1 Grammatical Inference = 501
11.3.2 Construction of State Machines from Observed Samples = 501
11.3.3 Generalization by State Reduction = 503
11.3.4 Introducing Constraints into State Machines = 507
11.3.5 Subsets and Subclasses = 511
11.4 Sample Size Considerations in Sequential Learning = 511
11.5 What About Neural Nets? = 512
11.5.1 Backpropagation = 514
11.6 Conclusion = 516
12 EPILOGUE : WHERE DO WE GO FROM HERE = 517
APPENDIX : SOME HARDWARE CONSIDERATIONS = 520
SELECTED PROBLEMS = 525
REFERENCES = 545
INDEX = 565