Murphy Lab - Extracting and Classifying Fluorescence Microscope Images of Cells from Online Journals


Jie Yao, graduate student at the Center for Automated Learning and Discovery
Meel Velliste, graduate student in Biomedical Engineering

Introduction

We are interested in creating a self-populating knowledge base that automatically extracts and stores assertions about protein subcellular location from the published literature. Such a knowledge base can serve not only as a resource for biologists but also as a test bed for knowledge reasoning systems that generate new hypotheses under uncertainty.

Approach

As a starting point, we have developed an automated system to find fluorescence microscope images in online journal articles. Our system includes:

  1. a web robot to download articles from PubMed matching a keyword query,
  2. a tool to extract figures and captions from PDF files,
  3. an algorithm for splitting figures into individual panels,
  4. a program for distinguishing fluorescence microscope images from other types of images,
  5. a program to find scale information in the images and their corresponding captions,
  6. a tool to remove annotations (such as characters and arrows) from the fluorescence microscope images,
  7. a segmentation program to isolate individual cells in images containing multiple cells,
  8. a classifier that can rank the returned images by their likelihood of belonging to a particular location class.
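
As an illustration of step 3, the sketch below splits a figure into panels by looking for all-white gutters, first between rows and then between columns within each row band. This is only one plausible approach and is not necessarily the algorithm used in the system; the function names, the grayscale list-of-lists image format, and the `white` threshold are all assumptions made for the example.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale rows, 0 = black, 255 = white (assumed format)
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def _runs(is_blank: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) index ranges of consecutive non-blank entries."""
    runs, start = [], None
    for i, blank in enumerate(is_blank):
        if not blank and start is None:
            start = i                      # a run of content begins
        elif blank and start is not None:
            runs.append((start, i))        # a gutter ends the run
            start = None
    if start is not None:
        runs.append((start, len(is_blank)))
    return runs


def split_panels(img: Image, white: int = 250) -> List[Box]:
    """Split a figure into panels along all-white horizontal and vertical gutters.

    Hypothetical helper: `white` is an assumed brightness threshold.
    """
    boxes: List[Box] = []
    blank_rows = [all(p >= white for p in row) for row in img]
    for top, bottom in _runs(blank_rows):
        band = img[top:bottom]
        width = len(band[0])
        # A column is a gutter if it is white across the whole row band.
        blank_cols = [all(row[x] >= white for row in band) for x in range(width)]
        for left, right in _runs(blank_cols):
            boxes.append((top, left, bottom, right))
    return boxes


# Toy 5x7 figure: two dark panels in the top band, one wide panel below.
W, D = 255, 0
figure = [
    [D, D, D, W, D, D, D],
    [D, D, D, W, D, D, D],
    [W, W, W, W, W, W, W],
    [D, D, D, D, D, D, D],
    [D, D, D, D, D, D, D],
]
panels = split_panels(figure)
```

On the toy figure this yields three boxes, one per panel. Real journal figures would need more care, for example gutters that are not pure white or panels that touch.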

Results

Evaluation of each part of the system showed good precision (the number of correct results out of all results returned) and reasonable recall (the number of correct results out of all possible correct results). To demonstrate the usefulness of the system, a search was performed using the keyword "Tubulin": 8 of the top 10 images returned were images of tubulin.
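
The two measures defined above can be computed as in the following minimal sketch, which also ranks images by classifier score before taking the top 10. The image names, scores, and relevance labels are invented purely for illustration and are chosen so that the top-10 precision matches the 8-out-of-10 figure; they are not the actual search results.

```python
from typing import List, Set, Tuple


def rank_by_score(scored: List[Tuple[str, float]]) -> List[str]:
    """Return image names ordered from highest to lowest score."""
    return [name for name, _ in sorted(scored, key=lambda s: s[1], reverse=True)]


def precision(returned: List[str], relevant: Set[str]) -> float:
    """Correct results out of all results returned."""
    if not returned:
        return 0.0
    return sum(1 for r in returned if r in relevant) / len(returned)


def recall(returned: List[str], relevant: Set[str]) -> float:
    """Correct results out of all possible correct results."""
    if not relevant:
        return 0.0
    return len(set(returned) & relevant) / len(relevant)


# Hypothetical data: 12 images with made-up scores, given in unsorted order.
scored = [(f"img{i:02d}", 1.0 - 0.05 * i) for i in reversed(range(12))]
# Hypothetical ground truth: the images that really show the queried pattern.
relevant = {"img00", "img01", "img02", "img04", "img05",
            "img06", "img08", "img09", "img11"}

top10 = rank_by_score(scored)[:10]
p_at_10 = precision(top10, relevant)   # 8 of the top 10 are relevant
r_at_10 = recall(top10, relevant)      # 8 of the 9 relevant images are found
```

In this made-up example the precision of the top 10 is 0.8, while recall is 8/9 because one relevant image falls outside the top 10.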

Conclusions

When combined with utilities that extract assertions from figure captions and body text, this fully automated online image extractor will provide a useful tool for harnessing the vast amount of information about protein subcellular location available in online journal articles.

References

R. F. Murphy, M. Velliste, J. Yao, and G. Porreca (2001). Searching Online Journals for Fluorescence Microscope Images Depicting Protein Subcellular Location Patterns. Proc IEEE Int Symp Bio-Informat Biomed Eng (BIBE 2001) 2; pp. 119-128.




Last Updated: 01 Dec 2004




Copyright © 1996-2016 by the Murphy Lab, Carnegie Mellon University