Pattern recognition/classification is a broad field that has developed to address problems as diverse as handwritten character recognition, military target recognition, face recognition, manufactured product inspection, speech recognition, and breast cancer screening, among others. Fortunately, general divisions can be made to distinguish the many application-specific approaches used to solve these problems.

One major division is between supervised and unsupervised
classification techniques. Supervised methods require that the
classifications of the samples be known and provided to the
classifier. Unsupervised methods, on the other hand, do not require
*a priori* knowledge of sample classifications and are useful
when a goal of the work is to discover structure and organization
among unclassified samples. All images used in the work below were
generated under known conditions and supervised methods were used
throughout.
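The contrast between the two regimes can be sketched on synthetic data. The samples, the nearest-mean rule, and the k-means initialization below are illustrative choices for this sketch, not methods taken from the work itself:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two synthetic 2-D classes (hypothetical data, for illustration only)
class_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
class_b = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(50, 2))

# Supervised: class labels are known, so class means come directly
# from the labeled samples.
mean_a, mean_b = class_a.mean(axis=0), class_b.mean(axis=0)

def nearest_mean(x):
    """Assign x to the class whose mean is closest (0 = a, 1 = b)."""
    x = np.asarray(x, dtype=float)
    return int(np.linalg.norm(x - mean_b) < np.linalg.norm(x - mean_a))

# Unsupervised: labels withheld; k-means (Lloyd's algorithm) must
# discover the two groups from the samples alone.
samples = np.vstack([class_a, class_b])
centers = samples[[0, -1]]          # simple deterministic initialization
for _ in range(20):
    # Distance from every sample to every center, then reassign/update.
    d = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=2)
    assign = d.argmin(axis=1)
    centers = np.array([samples[assign == k].mean(axis=0) for k in range(2)])
```

After the loop, the two recovered centers land near the true class means even though no labels were supplied, which is exactly the "discover structure among unclassified samples" role described above.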

A second division of pattern recognition techniques can be made
between parametric and non-parametric techniques. Parametric pattern
recognition includes approaches in which the forms of the
distributions of the features (i.e., the probability density
functions) are known or assumed *a priori*. If the
distribution of features is known, then Bayesian decision theory can
immediately be brought to bear on the problem [10, p. 10].
If the distribution is only assumed, then the parameters of that
distribution must be estimated (e.g., the mean vector and covariance
matrix for a Gaussian distribution). After the parameters are
obtained, Bayesian theory may again be used. The primary advantage of
parametric pattern recognition is that the classifiers are grounded in
rigorous statistics and their performance can therefore be predicted
and bounded. The obvious disadvantage of parametric approaches is
that for real-world data, it is rare that the distribution of features
can be assumed and extremely rare that it is known *a priori*.
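The parametric route can be illustrated with a minimal sketch: assuming (for illustration only) Gaussian class-conditional densities, the mean vector and covariance matrix of each class are estimated from labeled samples, and Bayes' rule assigns a new sample to the class with the largest posterior. The synthetic data and equal priors are assumptions of the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical labeled training samples drawn from two Gaussian classes
train_a = rng.multivariate_normal([0, 0], np.eye(2) * 0.3, size=200)
train_b = rng.multivariate_normal([2, 2], np.eye(2) * 0.3, size=200)

def fit_gaussian(samples):
    """Estimate the parameters of the assumed Gaussian distribution."""
    return samples.mean(axis=0), np.cov(samples, rowvar=False)

def log_likelihood(x, mean, cov):
    """Log of the multivariate Gaussian density at x, up to constants
    shared by both classes (which cancel in the decision rule)."""
    diff = x - mean
    return -0.5 * (diff @ np.linalg.solve(cov, diff)
                   + np.log(np.linalg.det(cov)))

params = [fit_gaussian(train_a), fit_gaussian(train_b)]
priors = np.array([0.5, 0.5])       # equal priors assumed for the sketch

def bayes_classify(x):
    """Bayes decision rule: maximize log prior + log likelihood."""
    scores = [np.log(p) + log_likelihood(np.asarray(x, dtype=float), m, c)
              for p, (m, c) in zip(priors, params)]
    return int(np.argmax(scores))
```

Once the parameters are estimated, the decision rule itself is fixed by the theory; this is the sense in which parametric classifiers inherit statistical guarantees.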

In contrast, non-parametric pattern recognition makes no assumptions regarding the distributions of the features. As a consequence, the statistical analyses available with parametric methods are not applicable, and it is more difficult to place bounds on the performance of the classifiers. The lack of assumptions does, however, accommodate more real-world data sets. Non-parametric classifiers, although not as rigorous as their parametric counterparts, have been shown to perform well on a variety of pattern recognition tasks. All classifiers used in this work fall into the non-parametric category.
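A k-nearest-neighbor rule, a standard non-parametric classifier used here purely as an illustration (not necessarily one of the classifiers in this work), shows how classification can proceed from distances alone, with no distributional assumptions:

```python
import numpy as np
from collections import Counter

def knn_classify(x, train_points, train_labels, k=3):
    """Non-parametric k-NN: label x by majority vote among the k
    training samples closest to it; no density model is assumed."""
    d = np.linalg.norm(train_points - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(d)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return int(votes.most_common(1)[0][0])

# Hypothetical training data: two loose groups with no assumed form
points = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                   [2.0, 2.0], [2.1, 1.9], [1.8, 2.2]])
labels = np.array([0, 0, 0, 1, 1, 1])
```

Because nothing is assumed about how the points are distributed, the same rule works unchanged on data for which no parametric form could be justified; the trade-off is that no closed-form error bound follows from it.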

A third way of distinguishing pattern recognition systems is based on whether models are available for representing the objects, where `object' refers to that which is being recognized: time series, patterns in images, etc. When trying to recognize man-made objects such as manufactured parts or aircraft, it is possible to construct canonical models, either real or simulated, that exactly represent each of the object classes. The classification system can then use these models to assign a classification to a new, unknown sample. When the input data are not as well defined, or when there is heterogeneity within the classes, as with the biological samples described below, a model-based approach is not appropriate.
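The model-based idea can be sketched with normalized correlation against canonical templates; the toy bar images and the scoring function below are illustrative assumptions, not part of the work described here:

```python
import numpy as np

def match_score(sample, template):
    """Normalized correlation between a sample and a canonical model;
    1.0 indicates a perfect match (up to brightness and contrast)."""
    s = (sample - sample.mean()) / (sample.std() + 1e-12)
    t = (template - template.mean()) / (template.std() + 1e-12)
    return float((s * t).mean())

# Hypothetical canonical models: a vertical bar and a horizontal bar
vertical = np.zeros((5, 5)); vertical[:, 2] = 1.0
horizontal = np.zeros((5, 5)); horizontal[2, :] = 1.0
models = {"vertical": vertical, "horizontal": horizontal}

def model_classify(image):
    """Assign the class of the best-matching canonical model."""
    return max(models, key=lambda name: match_score(image, models[name]))
```

This works precisely because each class is represented exactly by its model; when within-class variation cannot be captured by any single canonical form, as with the biological samples below, this matching step breaks down.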

When the exact form of the objects to be recognized cannot be modeled, a common approach is to describe the objects with a set of numeric features. While no systematic methods exist for selecting the features to use with a particular class of data, the features are usually chosen by an `expert' to try to capture useful information about each class of pattern. A fundamental motivation for this step is to describe each object, images in the case of this work, as concisely as possible (i.e., with as few numbers as possible). After the features have been calculated, each image is represented by a point in the *n*-dimensional feature space, where *n* is the number of features used. The goal then is to devise methods (i.e., classifiers) for separating the various classes of input data from one another in that feature space.
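The feature-based representation can be sketched as follows; the three features chosen here (mean intensity, contrast, gradient energy) are illustrative stand-ins for expert-selected features, not the features used in this work:

```python
import numpy as np

def extract_features(image):
    """Describe an image concisely as a single point in a 3-D feature
    space: mean intensity, contrast (standard deviation), and mean
    gradient energy (a crude texture measure)."""
    gy, gx = np.gradient(image.astype(float))
    return np.array([image.mean(),
                     image.std(),
                     np.mean(gx**2 + gy**2)])

# Hypothetical images: a flat gray field and a checkerboard texture
flat = np.full((8, 8), 0.5)
checker = np.indices((8, 8)).sum(axis=0) % 2

# Each image collapses to one point; images of different character
# should land in different regions of the feature space.
p_flat, p_checker = extract_features(flat), extract_features(checker)
```

Here 64-pixel images are each reduced to 3 numbers, and the two classes separate cleanly along the contrast and texture axes; devising classifiers that draw such boundaries in feature space is the goal stated above.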