
Impact of this Work

The work described here represents the first known attempt at providing a quantitative, systematic approach to the description of protein localization patterns. While the underlying methods (fluorescence microscopy, pattern recognition) are established, it is the multi-disciplinary nature of the overall approach that is novel.

The results presented here are important for at least three reasons. First, a set of numeric features has been defined that captures useful information about the localization patterns of proteins. The usefulness of these features was judged by their ability to discriminate several classes of localization patterns from one another. Defining such features is important primarily because it has not been done before, but also because it allows quantitative analysis to be applied to those patterns in place of the existing subjective analysis. A useful analogy is the advance in sequence analysis that followed the development of quantitative comparison methods. Analysis of new protein or nucleic acid sequences initially relied on visual inspection for regions of identity or homology to previously known sequences. Even after computerized methods for comparing sequences were developed, the statistical significance of matches was not always evaluated. Today, it is a simple matter to sequence a gene or cDNA and send the resulting sequence to a server that compares it to existing sequences from a wide variety of organisms. The results of this comparison can provide almost immediate insight into the possible structure and function of the new protein. With the work described here, one can anticipate a time when visual comparison and analysis of protein localization patterns will be as rare as visual analysis of protein or nucleic acid sequences.

The second important contribution of this work is a significant improvement over existing methods of describing the localization of proteins. Even the current best efforts at describing protein localization (see Section 1.4), while attempting to incorporate a systematic approach, are still fundamentally subjective. Because the assignment of a protein to a localization pattern currently depends on the investigator making that assignment, it is difficult, if not impossible, to compare the localizations of different proteins. In contrast, the methods described here can discriminate between subtly different localization patterns, even within the same organelle (giantin and GPP130 in the Golgi). Even more important than their ability to discriminate, however, is that the features tested here provide a consistent method of describing protein localization.

A third, practical impact of this work is that it demonstrates the feasibility of classifying individual cells (and populations of cells) based on protein localization patterns. Whereas pattern recognition has been applied sporadically to specific problems involving fluorescence and microscopy, this work is the first to develop methods intended to be applicable to protein localization patterns in general. Using this general approach, the best classifier/feature combination provided adequate performance for single cells: barely acceptable for some classes, but very good for others. Unless the worst-case performance can be improved, however, highly reliable classification of single cells may be limited to problems in which the features and the classifier can be tailored to a relatively small number of specific patterns. Such a limitation, if real, does not preclude the goal of using this approach to describe protein localization, so long as sets of patterns can be classified instead. As discussed in some detail in Chapter 3, many biological applications of these methods lend themselves to the classification of entire populations based on their constituent members. This approach mirrors what biologists might do when subjectively assessing the localization of a new protein, i.e., they might observe several identically prepared cells on a single coverslip before assigning them to a class. The set classification methods investigated here are impressive not only because of their high rates of correct classification, but also because they facilitate the discrimination of patterns that produced significant confusion for the single-cell approach. For both of these reasons, classification of cell populations is perhaps the most significant result to come from this work.
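The idea of classifying a population from its constituent members can be illustrated with a minimal sketch. It assumes, purely for illustration, that the set label is taken as the plurality vote over single-cell predictions; the set classification methods actually investigated are those described in Chapter 3, and the function name `classify_set` and the example labels are hypothetical.

```python
from collections import Counter


def classify_set(single_cell_labels):
    """Assign one label to a set of cells by plurality vote.

    Assumption: each cell has already been given a label by some
    single-cell classifier; this function only combines those labels.
    """
    if not single_cell_labels:
        raise ValueError("need at least one single-cell prediction")
    counts = Counter(single_cell_labels)
    # most_common(1) returns the label with the highest count; on a tie,
    # the label that appeared first in the input wins.
    label, _ = counts.most_common(1)[0]
    return label


# Five cells from one hypothetical coverslip, two of them misclassified.
predictions = ["giantin", "GPP130", "giantin", "giantin", "GPP130"]
print(classify_set(predictions))
```

Under this assumption, occasional single-cell errors are outvoted as long as the correct class remains the most frequent prediction in the set, which is one way patterns that are confused at the single-cell level can still be separated at the population level.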


Copyright ©1999 Michael V. Boland
1999-09-18