Murphy Lab

 Carnegie Mellon University
 Computational Biology Department
 Center for Bioimage Informatics
 Biological Sciences Department
 Biomedical Engineering Department
 Machine Learning Department

Murphy Lab - Object-level Recognition of Protein Subcellular Location Patterns

Meel Velliste, graduate student in Biomedical Engineering
Michael V. Boland, former graduate student in Biomedical Engineering


We have previously described methods for automatically determining the location class of a protein from fluorescence microscope images of that protein. The classifiers we developed have proven capable of recognizing the patterns of all major subcellular structures and organelles with high accuracy. However, the features that described the patterns were calculated at the level of the whole cell. This becomes a problem when trying to recognize a pattern that is a mixture of two or more fundamental patterns, as in the case of a protein that localizes to more than one organelle; such cases will arise frequently in a proteome-wide analysis of location patterns. The feature values of a mixed pattern are not similar to those of any of its constituent fundamental patterns, so, for example, a classifier trained to recognize the patterns of the Golgi apparatus and lysosomes would fail to recognize a mixed Golgi-lysosome pattern as either Golgi or lysosomes. A separate classifier would have to be trained for every possible combination of fundamental patterns, which of course would be

  1. not feasible due to the vast number of possible combinations, and
  2. useless, because one would like to obtain the classifications of the individual component patterns rather than a different classification for every possible combination of patterns.
The goal of this project is to develop a feature representation and a classification scheme capable of recognizing components of patterns independently.
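The problem with cell-level features can be seen in a toy illustration (the feature values below are hypothetical, chosen only to show the geometry): a mixed pattern's feature vector lies between the two fundamental classes and is not close to either, so a nearest-class decision fails.

```python
import numpy as np

# Hypothetical cell-level feature vectors for two fundamental patterns.
golgi = np.array([1.0, 0.2])
lysosome = np.array([0.1, 1.0])

# A protein localizing to both organelles contributes fluorescence to both
# patterns, so its cell-level features land between the two classes.
mixed = 0.5 * (golgi + lysosome)

dist_to_golgi = np.linalg.norm(mixed - golgi)
dist_to_lysosome = np.linalg.norm(mixed - lysosome)
# The mixture is equally far from both classes, so a classifier trained
# only on the fundamental patterns has no good answer for it.
```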


This goal might be achievable if the features were calculated differently: a separate feature vector for each individual object rather than one per cell (an object here is defined as a contiguous region of above-threshold pixels). Classification can then be performed on each constituent object, and the results aggregated over the whole cell as a percentage membership in each base class. For example, if 20% of the fluorescence intensity in the image belongs to objects classified as part of a Golgi pattern and the remaining 80% to objects classified as lysosomal, one would know that the protein localizes mainly to lysosomes but is also found in the Golgi.
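The object-finding and aggregation steps can be sketched as follows. This is a minimal illustration, not the lab's implementation: the threshold, the 4-connectivity rule, and the per-object classifier passed in as `classify_object` are all assumptions made for the example.

```python
import numpy as np

def find_objects(image, threshold):
    """Label contiguous (4-connected) regions of above-threshold pixels."""
    mask = image > threshold
    labels = np.zeros(image.shape, dtype=int)
    n = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue
        n += 1
        stack = [start]
        while stack:  # flood fill one object
            r, c = stack.pop()
            if not (0 <= r < mask.shape[0] and 0 <= c < mask.shape[1]):
                continue
            if not mask[r, c] or labels[r, c]:
                continue
            labels[r, c] = n
            stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return labels, n

def aggregate(image, labels, n, classify_object):
    """Classify each object, weighting each class by its share of the
    total fluorescence intensity in the image."""
    votes = {}
    for i in range(1, n + 1):
        obj = labels == i
        cls = classify_object(image, obj)  # hypothetical per-object classifier
        votes[cls] = votes.get(cls, 0.0) + float(image[obj].sum())
    total = sum(votes.values())
    return {c: v / total for c, v in votes.items()}

# Toy image with one bright and one dim object; the stand-in classifier
# decides by mean brightness (a real one would use many object features).
img = np.zeros((10, 10))
img[1:3, 1:3] = 10.0   # total intensity 40
img[6:8, 6:8] = 2.0    # total intensity 8
labels, n = find_objects(img, 1.0)
by_brightness = lambda im, m: "golgi" if im[m].mean() > 5 else "lysosome"
fractions = aggregate(img, labels, n, by_brightness)
# fractions gives each class's share of fluorescence (here 40/48 vs 8/48)
```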


This approach was applied to the same set of 2D images of HeLa cells used in our previous 2D classification work. The cell-level patterns could be recognized with 61% accuracy. Considering that random guessing among the ten classes would give 10% accuracy, and that object-level features do not take into account the relationships between the different objects in a cell, this result is surprisingly close to our previously achieved classification accuracy of 83% with cell-level features.


Although there is room for improvement in the single-cell classification accuracy, the results are encouraging because this method will allow classification of the location patterns of proteins that localize to more than one organelle or subcellular structure. As we have noted in our previous work, classification accuracy can be improved by classifying sets of images rather than single images; this is often possible because multiple images are typically acquired from the same specimen.
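The set-level classification mentioned above can be done by a simple plurality vote over the single-image results; the sketch below (an assumption about the aggregation rule, for illustration) shows why occasional single-image errors get outvoted when the classifier is right more often than not.

```python
from collections import Counter

def classify_image_set(single_image_labels):
    """Plurality vote over the per-image classifications of one specimen."""
    return Counter(single_image_labels).most_common(1)[0][0]

# Ten images of the same specimen, eight classified correctly:
set_label = classify_image_set(["lysosome"] * 8 + ["golgi"] * 2)
# set_label is "lysosome": the two misclassified images are outvoted.
```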

Last Updated: 01 Dec 2004

Copyright © 1996-2016 by the Murphy Lab, Carnegie Mellon University