
Current State of Protein Localization

Based on literature searches, it is fair to say that there is currently no body of work dedicated to the quantitative description of protein localization. This is not to say that biologists are uninterested in describing where proteins localize. On the contrary, it is frequently important to describe how a new protein localizes within a cell. Unfortunately, these descriptions are currently subjective and therefore not necessarily comparable between investigators. Even if investigator bias could be minimized, there is no standard set of classifications to use when assigning a localization pattern to a protein. In fact, the complexity and diversity of protein localization patterns preclude the development of a set of categories simple enough for investigators to assign patterns subjectively. As such, the localization of a new protein is usually described in general terms, and usually within a particular sub-domain of cell biology.

These and other problems can be seen in the Swiss-PROT database. Swiss-PROT is an annotated protein sequence database that includes information on protein structure, function, post-translational modification, variants, etc. It also includes one field dedicated to subcellular localization. If the localization field of the Swiss-PROT database is summarized by calculating the frequency of each unique localization term (footnote 1.1), a number of problems can be seen. First, the descriptions of localization are not systematic. Many of the entries contain subjective commentary from whoever entered them into the database. For example, there are many one-time entries like the following: IS PRESENT PREDOMINANTLY IN THE CYTOPLASM, BUT IS ALSO FOUND IN SMALL QUANTITIES IN THE NUCLEUS, and LOCALIZED THROUGHOUT THE CELL BUT IS MORE CONCENTRATED AT THE NUCLEUS.
A second problem with the Swiss-PROT localization information is that there is no way of knowing whether multiple terms are being used to describe the same pattern. For example, are INNER SIDE OF THE MEMBRANE, INNER SURFACE OF CELL MEMBRANE, and INNER SURFACE OF PLASMA MEMBRANE really describing different patterns? Probably not, but there is really no way of knowing given only these terms. A third problem is that the most frequently used terms in the localization field are not very specific. Of the 28759 terms found in the database, 3606 are CYTOPLASMIC, 2689 are INTEGRAL MEMBRANE PROTEIN, and 2792 are NUCLEAR. While these terms represent important general categories of localization, they are hardly adequate for a more refined approach. Development of a systematic method of describing protein localization patterns would overcome these problems.
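The frequency summary of the localization field described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to produce the counts in the text: the function name `term_frequencies` and the toy list of entries are assumptions made for the example, and only the three counts and the total of 28759 terms come from the text.

```python
from collections import Counter


def term_frequencies(entries):
    """Count how often each unique localization term occurs.

    Terms are compared after stripping surrounding whitespace and
    upper-casing; empty entries are ignored. (Hypothetical helper.)
    """
    return Counter(e.strip().upper() for e in entries if e.strip())


# Toy input for illustration only; NOT real database contents.
toy_entries = [
    "CYTOPLASMIC",
    "NUCLEAR",
    "cytoplasmic",
    "INTEGRAL MEMBRANE PROTEIN",
    "INNER SURFACE OF CELL MEMBRANE",
    "INNER SURFACE OF PLASMA MEMBRANE",
    " NUCLEAR ",
    "CYTOPLASMIC",
]

freqs = term_frequencies(toy_entries)
for term, count in freqs.most_common():
    print(f"{count:3d}  {term}")

# Counts quoted in the text for the three most frequent terms.
reported = {
    "CYTOPLASMIC": 3606,
    "NUCLEAR": 2792,
    "INTEGRAL MEMBRANE PROTEIN": 2689,
}
total_terms = 28759

# Fraction of all terms covered by these three general categories.
share = sum(reported.values()) / total_terms
print(f"Three most frequent terms cover {share:.1%} of all terms")
```

Note that simple counting treats INNER SURFACE OF CELL MEMBRANE and INNER SURFACE OF PLASMA MEMBRANE as distinct terms, which is exactly the ambiguity described above: a frequency table cannot tell whether two strings describe the same pattern.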

A popular approach to describing protein localization is to show that a newly discovered protein colocalizes with one or more other, better characterized proteins. One way to accomplish this is via fluorescence microscopy (see Section 1.6 for an introduction). By labeling two or more proteins in the same cell with different fluorescent dyes, it is possible to generate a set of images, each of which depicts the localization of a single protein (see Figure 1.4). If these images are then overlaid, it is possible to determine (subjectively) the degree to which the proteins colocalize. While such experiments provide clear visual evidence of colocalization, they are not able to describe localization in a larger context. A colocalization experiment cannot, for instance, easily provide a list of all (or even most) of the proteins that localize in a manner similar to the one under study. Furthermore, in a case where the new protein does not colocalize with the known proteins, the experiment provides no information other than a list of proteins with which the newly discovered protein does not colocalize. In such a case, the experiment may have to be repeated using another set of 'known' proteins for comparison. Finally, while colocalization experiments could theoretically be used to compare the localization of each protein to all other proteins, the combinatorics of the problem quickly make such an approach intractable. All pairwise comparisons of just 100 proteins would require 4950 colocalization experiments (100 x 99 / 2), and there are an estimated 70,000 to 100,000 proteins encoded by the human genome. Colocalization will certainly continue to play a role in experiments that look at the localization of a small number of proteins, but it is not a reasonable approach to characterizing even a small fraction of a genome. New approaches to the systematic study of protein localization are clearly needed.
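The combinatorial argument above can be checked with a short calculation. The sketch below assumes, for simplicity, that each experiment compares exactly two proteins, so the number of experiments is the number of unordered pairs, n(n-1)/2. The function name `pairwise_experiments` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def pairwise_experiments(n_proteins):
    """Number of distinct unordered protein pairs: n * (n - 1) / 2.

    Assumes one colocalization experiment per pair of proteins.
    """
    return comb(n_proteins, 2)


# 100 proteins, then the lower and upper genome estimates from the text.
for n in (100, 70_000, 100_000):
    print(f"{n:>7,d} proteins -> {pairwise_experiments(n):>13,d} experiments")
```

For 100 proteins this gives the 4950 experiments quoted in the text; for the genome-scale estimates the count runs into the billions, because it grows quadratically with the number of proteins.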

Figure 1.4: Two images of the same cell collected at different excitation/emission wavelengths. LAMP2 (A) and Menkes protein (B) were labeled with different fluorescent antibodies in CHO cells and imaged. These two proteins display very little, if any, colocalization. Scale bar $= 10\,\mu$m. (Images courtesy of Cinnamon Lane)

Copyright ©1999 Michael V. Boland