Murphy Lab

 Carnegie Mellon University
 Computational Biology Department
 Center for Bioimage Informatics
 Biological Sciences Department
 Biomedical Engineering Department
 Machine Learning Department

Bioinformatics 2013 Local Features for Improved Generalization

The software and raw data used for the following paper can be downloaded below:

Luis Pedro Coelho, Joshua D. Kangas, Armaghan Naik, Elvira Osuna-Highley, Estelle Glory-Afshar, Margaret Fuhrman, Ramanuja Simha, Peter B. Berget, Jonathan W. Jarvik, and Robert F. Murphy (2013) Local Features Provide Better Generalization of Subcellular Location Classifiers to New Proteins. Bioinformatics 29: 2343-2349.

The following files are compressed with tar and gzip.

Source Code: LocalFeaturesSource.tgz

Raw Data:

  • RandTag Widefield Images (2D 3T3 RT Set 3): 2D3T3RTset3.tgz (2.0 GB)
  • RandTag Confocal Images (2D 3T3 RT Set 4): 2D3T3RTset4.tgz (612 MB)
  • Human Protein Atlas: http://murphylab.web.cmu.edu/software/2012_PLoS_ONE_Reannotation
  • 2D HeLa: http://murphylab.web.cmu.edu/data/2Dhela_images.html
  • LOCATE endogenous and LOCATE transfected: http://locate.imb.uq.edu.au/info_files/SubCellLoc.zip
  • LOCATE Confocal: http://locate.imb.uq.edu.au
  • IICBU 2008 Benchmark: http://ome.grc.nia.nih.gov/iicbu2008

Recreating Results from Raw Image Data

To recreate the results from the article (i.e., the figures and tables) from the raw image data:

Download the source code file above and expand it in the desired directory:
tar -xzf LocalFeaturesSource.tgz

Download the raw images for the RandTag and HPA datasets above and expand them into the data subdirectory.
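The expansion step above can be sketched as follows for the two RandTag archives. This is a minimal sketch under stated assumptions: the archives have been downloaded into the top-level source directory, and doitall.sh expects their contents directly under data/. The helper name expand_into_data is hypothetical and not part of the distributed source. The HPA archive is omitted because its file name is not given on this page.

```shell
# Hypothetical helper: expand each downloaded archive into ./data.
# Assumption: archives are in the current directory and their contents
# belong directly under data/ (check doitall.sh if in doubt).
expand_into_data() {
    mkdir -p data
    for archive in "$@"; do
        if [ -f "$archive" ]; then
            # -C switches to data/ before extracting
            tar -xzf "$archive" -C data
        else
            echo "missing: $archive" >&2
        fi
    done
}

expand_into_data 2D3T3RTset3.tgz 2D3T3RTset4.tgz
```

Archives that have not been downloaded yet are reported on standard error and skipped, so the helper can be rerun after each download.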

Run the following command, which downloads the rest of the images and runs the complete analysis (this takes over a day on a single processor):
source doitall.sh
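Because the run takes over a day, it may be convenient to start it detached from the terminal and keep a log. The following is an optional sketch, not part of the distributed instructions: the helper name run_detached and the log file name doitall.log are hypothetical, and it assumes bash is available and that sourcing doitall.sh inside a fresh bash process behaves the same as sourcing it in an interactive shell.

```shell
# Hypothetical wrapper: source a script in a detached bash process and
# write everything it prints (stdout and stderr) to a log file.
# Prints the process ID of the background job.
run_detached() {
    script=$1
    log=$2
    nohup bash -c 'source "$1"' _ "$script" >"$log" 2>&1 &
    echo $!
}

# Only start the run when doitall.sh is present in the current directory.
if [ -f ./doitall.sh ]; then
    run_detached ./doitall.sh doitall.log
fi
```

Progress can then be followed with tail -f doitall.log.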

System requirements

  • python
  • python-pip
  • python-virtualenv
  • dvipng
This package has been tested using Python 2.7 under CentOS on a 64-bit architecture.
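A quick way to confirm that the requirements listed above are installed is to look for their commands on the PATH. This is a sketch under assumptions: the helper name check_tools is hypothetical, and the command names pip and virtualenv are what the python-pip and python-virtualenv packages usually install, which may differ on some systems.

```shell
# Hypothetical helper: report which command-line tools are on PATH.
# Returns 0 when all are found and 1 when at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed command names for the packages listed above.
check_tools python pip virtualenv dvipng || echo "install the missing tools before running doitall.sh"
```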

Last Updated: 21 Feb 2014

