
Abstract

Assessment of cellular protein localization (i.e., where within a cell does each protein carry out its function?) is becoming increasingly important. Even after the entire human genome has been sequenced, many years will be required to study the structure, function, and localization of each protein. Localization information is important because it provides a context for a protein's structural and functional information. For example, two proteins that possess similar structure and function may in fact be found in distinct compartments within the cell and therefore may be involved in unrelated cellular processes. The work described below is the first to address the subcellular localization of proteins in a quantitative manner. Localization data were collected for several proteins using fluorescence microscopy. The resulting patterns were described using a variety of numeric features, including Zernike moments, Haralick's texture features, and biologically motivated features developed specifically to describe protein localization. To test the usefulness of these features, they were used as inputs to two classifiers developed in the field of pattern recognition: a back-propagation neural network and a k-nearest neighbor classifier. Features were deemed good descriptors of protein localization if they allowed these classification techniques to distinguish the various protein localization patterns from one another. For data containing ten different patterns, the neural network classifier was able to correctly recognize previously unseen cells with an accuracy of 83%. The same classifier was then used to recognize previously unseen sets of ten homogeneously prepared cells with 99% accuracy. Because experiments involving cells are frequently carried out on populations rather than on individuals, this latter result indicates that a single classification can be assigned to each experimental population with a high degree of accuracy. 
These results have an impact in at least two areas. First, they indicate that it is possible to describe protein localization quantitatively. Second, these methods will allow the automation of screening processes that currently require human intervention. Specifically, it will be possible to automatically identify, from among many experimental samples, those populations of cells in which a labeled protein localizes in a desired manner.
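
As a rough illustration of the classification step described above, the following Python sketch applies a k-nearest neighbor classifier to per-cell feature vectors and then assigns one label to a set of cells. It is a sketch under stated assumptions, not the method used in this work: the two-element feature vectors, the labels, the function names `knn_classify` and `classify_set`, and the value of `k` are all hypothetical, and the abstract does not say how a set of ten cells is assigned its class. Plurality voting over the per-cell results is assumed here because it is one simple way in which set-level accuracy can exceed single-cell accuracy: occasional single-cell errors are outvoted.

```python
from collections import Counter
import math


def knn_classify(train, query, k=3):
    """Label one feature vector by a vote among its k nearest training examples.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def classify_set(train, cells, k=3):
    """Label a set of cells by plurality vote over per-cell labels.

    Plurality voting is an assumption of this sketch, not a detail
    taken from the abstract.
    """
    per_cell = [knn_classify(train, cell, k) for cell in cells]
    return Counter(per_cell).most_common(1)[0][0]


# Toy training data: hypothetical 2-element feature vectors and labels.
train = [
    ((0.10, 0.20), "nuclear"),
    ((0.15, 0.25), "nuclear"),
    ((0.20, 0.10), "nuclear"),
    ((0.80, 0.90), "mitochondrial"),
    ((0.85, 0.80), "mitochondrial"),
    ((0.90, 0.95), "mitochondrial"),
]

# A set in which one cell resembles the other class; the set-level
# label is decided by the majority of the per-cell labels.
cells = [(0.82, 0.88), (0.90, 0.90), (0.18, 0.20)]
set_label = classify_set(train, cells)
```

In practice the feature vectors would hold the numeric features named above (Zernike moments, Haralick's texture features, and so on) rather than two toy values.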


Copyright ©1999 Michael V. Boland
1999-09-18