Cytometry Development Workshop
Carnegie Mellon University
Computational Biology Department
Center for Bioimage Informatics
Biological Sciences Department
Biomedical Engineering Department
Machine Learning Department
Lab Research Topics
Software and Documentation
Michael Boland's thesis
Welcome to the Murphy lab at Carnegie Mellon University. The lab
is a multidisciplinary environment with people working on
projects in computational cell biology.
|May 15, 2023||A new release (v2.10) of our open source CellOrganizer system is available both as Matlab code and in a Docker container with a Jupyter Notebook interface that does not require a Matlab license. This release includes an extensive tutorial on using the Docker/Jupyter version. The tutorial consists of 21 Jupyter notebooks covering the major functionalities.
New features include
- functions for directly constructing SPHARM-RPDM representations from individual 3D objects,
- calculation of Jaccard index (in addition to Hausdorff distance) to measure the quality of a given shape representation, and
- ability to generate and customize movies illustrating temporal evolution of shape and organization.
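As a rough illustration of the Jaccard index mentioned above, the sketch below compares a binary 3D object with a reconstruction of it. This is not CellOrganizer code: the function name `jaccard_index`, the toy cubes, and the convention for two empty masks are assumptions made only for this example.

```python
import numpy as np

def jaccard_index(mask_a, mask_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two binary masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("masks must have the same shape")
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Assumed convention: two empty masks are treated as identical.
        return 1.0
    return float(np.logical_and(a, b).sum() / union)

# Toy 3D example (hypothetical data): a cube and a copy shifted by one voxel.
original = np.zeros((10, 10, 10), dtype=bool)
original[2:6, 2:6, 2:6] = True          # 4 x 4 x 4 = 64 voxels
reconstructed = np.zeros_like(original)
reconstructed[3:7, 2:6, 2:6] = True     # overlaps the original in 48 voxels

score = jaccard_index(original, reconstructed)
print(f"Jaccard index: {score:.2f}")    # 48 / 80 = 0.60
```

A value of 1 means the reconstructed shape matches the original voxel for voxel, and values near 0 mean little overlap. Unlike Hausdorff distance, which reports the worst-case boundary deviation, the Jaccard index summarizes overall volume agreement.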
|January 30, 2023||In collaboration with Chad Pearson's group at the University of Colorado Anschutz Medical Campus, we have developed an offshoot of the CellOrganizer project that creates generative models of cell shape and basal body organization in
Tetrahymena thermophila. It is described in a paper to be published in the forthcoming special issue of Molecular Biology of the Cell on "Quantitative Biology". The new program, TetAlyze, takes 3D confocal images of individual unsynchronized cells expressing an mCherry-tagged basal body marker and segments basal bodies, determines whether they are mature or newly-replicated, and aligns them into ciliary rows. Using this approach, we found novel patterns of ciliary arrangement and basal body replication through the cell cycle. TetAlyze also constructs a dynamic generative model that can generate movies of individual cells progressing through the cell cycle. TetAlyze is available from github.
|January 23, 2023||A critical task in cell analysis and modeling is the automated segmentation of microscope images into individual cells. Traditionally, the quality of algorithms for this task has been evaluated by comparison with results produced by humans. However, this does not allow the routine evaluation of segmentation quality for individual images in large scale studies, which can be especially valuable for identifying low quality or otherwise problematic images. For multichannel images, we have developed a set of metrics that measure segmentation quality without relying on comparison with human results. These are described in a paper to be published in the forthcoming special issue of Molecular Biology of the Cell on "Quantitative Biology," and an open source tool, CellSegmentationEvaluator, that implements them is available from github. Written in Python, it is provided in two forms. The first is a function that takes as inputs an image and its corresponding set of cell masks and returns the metrics and quality score; it can be called from the command line or from a provided Jupyter Notebook. It supports both 2D and 3D images. The second is a pipeline that finds the best method for a given collection of images from among a set of existing trained segmentation models.
|October 17, 2022||Our paper "Improving and evaluating deep learning models of cellular organization" has been accepted by Bioinformatics. It begins by describing novel metrics for evaluating synthetic cell images produced by the innovative deep learning, label-free microscopy approach to generating organelle patterns from brightfield images developed by Ounkomol, Seshamani, Maleckar, Collman and Johnson (yes, that's my former PhD student, Greg Johnson!). We then describe ways to improve their generative models and introduce an alternative modeling approach that gave the best performance on our metrics. All results and source code are available as open source. The initial preprint version of the manuscript is available at bioRxiv.
|May 13, 2021||The early access version of a paper describing a new method for categorical matrix completion is now available from Bioinformatics. It outperforms previously published methods and we demonstrate that it yields improved results when used for active learning.
|September 15, 2020||Continuing our collaboration with the Wuelfing lab, our paper on suppression of killing ability of cytotoxic T cells upon exposure to the tumor microenvironment has been published in Science Signaling. CellOrganizer was used to construct and compare maps of the distributions of signaling molecules for cytotoxic T cells before and after exposure to tumors.
|March 19, 2020||Our work in collaboration with colleagues at the University of Freiburg has been published in Molecular Biology of the Cell. It describes construction of a model of the changes in shape and mitochondrial distribution that occur during differentiation of PC12 cells. It builds on our prior work on robust cell shape models and describes a method for building dynamic models of cell shape using large collections of static cell images from separate samples at different points in time. A similar approach is used to compare collections for samples treated with different drugs.
|October 30, 2019||Our collaborative study on manipulating the localization of adaptor proteins in the T cell central supramolecular signaling complex has been published in eLife. It is the fifth paper published through our longstanding collaboration with Prof. Christoph Wuelfing at the University of Bristol.
|February 17, 2019||Slides from my talk on "Self-driving instruments: Active Machine Learning for Biological Discovery" at the 2019 Annual Meeting of the American Association for the Advancement of Science are now available.
|January 17, 2019||Our work on inferring the assembly order of influenza virus RNAs was featured on the cover of PLoS Computational Biology.
|December 7, 2018||Our paper on comparison and refinement of methods for cell shape modeling is available as early access.
|October 9, 2018||Our new Master of Science in Automated Science program has launched!
|September 20, 2018||An interesting conference on Image-based Modeling and Simulation of Morphogenesis will be held in Dresden, Germany from 13-15 March 2019. The application deadline is 20 November 2018. Registration is only 140 Euro and costs for accommodation and meals will be covered by the Max Planck Institute.
|September 18, 2018||Dr. Murphy will be giving a talk in a
methods for reconstructing molecular dynamics in single cell" to be
held in Pisa, Italy from 15-19 October 2018.
|June 28, 2017|
||Our paper in collaboration with Jörn Dengjel's group was published today in Autophagy with a cover image from former student Greg Johnson's analysis of autophagosome/autophagolysosome pH and spatial distribution!
|June 26, 2017||Release 2.6 of CellOrganizer is now available. New features include the ability to learn and use models of protein distribution used in our study of T cell protein dynamics. Also included is support for generating images in SBML Spatial 3 Level 1 draft 0.90 or OME-TIFF format.
|June 1, 2016||Release 2.5 of CellOrganizer is now available. New features include a 10-fold speedup in training diffeomorphic models.
|April 19, 2016||Our
paper in collaboration with
Christoph Wülfing's group was published today in Science Signaling.
It describes development of computational methods to construct and compare
spatiotemporal "maps" of the subcellular distribution of actin and eight
of its regulators during costimulatory antigen presentation to T cells.
The models are based on movies (3D images over time) of individual cells
proteins, but variability from cell to cell and noise in the images make it
difficult to understand the sequence of events that are occurring.
The work was part of the activities of the NIH-supported National Center for Multiscale Modeling of Biological Systems.
|February 9, 2016||My colleagues and I published a
paper in eLife
describing the first robotically driven experimentation system to determine
the effects of a large number of drugs on many proteins without doing
all possible experiments; it reduced the number of experiments needed to
produce an accurate model by 70%. The motivation is that in most areas of
biological experimentation, the number of possible experiments far exceeds
the number of experiments that can reasonably be performed. As we have
proposed previously, active machine learning is likely the only solution
to this problem. However, our previous approaches had only been tested
using synthetic or previously acquired data
(Naik et al 2013,
Kangas et al 2014,
Temerinac-Ott et al 2015).
Our eLife paper went beyond this by choosing which experiments to do entirely
by computer. The experiments were then carried out using liquid-handling
robots and an automated microscope.
A novelty of the new work was that the learner had to identify potentially
new phenotypes on its own as part of the learning process. To do this, it
clustered the images to form phenotypes. The phenotypes were then used to
form a predictive model, so the learner could guess the outcomes of
unmeasured experiments. The basis of the model was to identify sets of
proteins that responded similarly to sets of drugs, so that it could
predict the same prevailing trend in the unmeasured experiments. The
algorithm was able to learn a 92% accurate model for how 96 drugs
affected 96 proteins, from only 29% of the experiments conducted.
|February 4, 2016||A collaborative paper arising from the
National Center for Multiscale Modeling of Biological Systems (MMBioS)
was published today in PLoS Computational Biology. It describes approaches to perform simulations
of cellular biochemistry to efficiently estimate rate constants for rare events, including in spatially-realistic geometries.
It makes use of a number of tools being developed with National Institutes of Health support to MMBioS, including WESTPA, BioNetGen, CellOrganizer, CellBlender, and MCell.
The work with CellOrganizer was done by Dr. Devin Sullivan while he was a Ph.D. student in our group. MMBioS is a joint center between the University of Pittsburgh, Carnegie Mellon University, the Pittsburgh Supercomputing
Center, and the Salk Institute for Biological Studies.
|December 2, 2015||Our paper
on modeling the relationships between microtubules and proteins that are
found in punctate structures was
published today by PLoS Computational Biology.
Using images generously provided by the Human Protein Atlas, we constructed a generative model of the position of punctate structures relative to the distances to the cell and nuclear boundaries and the nearest microtubule. We used the parameters of this model to show that eleven different punctate structures, including many of the types of vesicles found in the endomembrane system, could be distinguished. We were then able to tentatively assign detailed subcellular locations for hundreds of proteins whose locations had not been previously well-characterized.
|September 9, 2015||Our paper
on construction of generative models of joint cell and nuclear shape was
published online today by Molecular Biology of the Cell in "MBoC In Press".
It describes the use of diffeomorphic methods to demonstrate a statistically
significant relationship between cell and nuclear shape in three different
cultured cell lines. This was done by measuring the extent to which cell
shape can be predicted from nuclear shape (and vice versa). The correlation
was observed to be affected by altering protein C1QBP or by the addition of
various drugs. We also describe a generative model of the kinetics of shape
change that permits synthetic movies of shape dynamics to be created. The
software will be available in the next release of the open source
CellOrganizer system. Regular publication of the paper will take place
in the 2nd Special Issue on Quantitative Biology.
|September 3, 2015||The National Institute for Mathematical and Biological Synthesis has announced support for a
working group on Spatial Cell Simulation based on a proposal that I submitted with Jim Faeder
of the Department of Computational and Systems Biology at the University of Pittsburgh. Systems biology emphasizes the creation of mathematical or computational models
of biological systems such as cells and tissues, as a means both to integrate all available information and to make predictions about unmeasured mechanisms or behaviors.
The working group will address critical challenges currently faced in creating mathematical/computational simulations of the inner workings and dynamics of eukaryotic cells that reflect realistic cell architecture, especially for accurately simulating changes in cell shape and organization over time.
The issues to be addressed include methods for simulation that can consider dynamic cell and organelle shapes and positions and methods for learning joint probability distributions
for thousands of cellular components. These are very relevant for our work as project leaders in the
National Center for Multiscale Modeling of Biological Systems.
The working group will meet 2-3 times per year to develop new approaches to these problems, implement them in software,
develop proposals for future funding of such research, and develop training materials for biomedical researchers. The first meeting will be December 1-3, 2015.
Scientists interested in contributing to the effort are encouraged to contact us (firstname.lastname@example.org, email@example.com).
|July 9, 2015||Our paper
from RECOMB 2015 on combining active learning with kernelized Bayesian matrix factorization was published today
in BMC Bioinformatics. In it we evaluate the method we proposed in
Naik et al. 2013 for deciding when to stop an
active learning campaign on four drug-target interaction datasets. We show that our method results in
substantial savings in the number of experiments required to make accurate drug-target predictions.
|February 6, 2015||Today was a busy day. Congratulations to Drs. Devin Sullivan and Aparna Kumar for successfully defending their Ph.D. theses. Devin is on his way to Stockholm, Sweden for a postdoctoral fellowship with Dr. Emma Lundberg at the Science for Life Labs, KTH Royal Institute of Technology, where he will be working on the Subcell Atlas of the Human Protein Project. Aparna has accepted a position as a Data Scientist at Dow Jones in New York City. Devin and Aparna are the 25th and 26th Ph.D. to graduate from Murphylab.|
|December 8, 2014||Our group published a paper with implications for cancer research today in the U.S.
Proceedings of the National Academy of Sciences. It describes a new method for identifying proteins that differ significantly in subcellular location between normal and cancerous tissue and
applies it to images of four human tissues from the Human Protein Atlas. The proteins identified may help improve cancer detection and diagnosis, and may increase our understanding of the
|October 5, 2014||The image analysis and modeling team at the NIH-supported
National Center for Multiscale Modeling of Biological
Systems is seeking new partners for collaborative or service
projects with researchers at the Center. We are seeking investigators who wish to use our CellOrganizer system for
learning and using generative models of cell size, shape and subcellular
organization (or to help with further development).
We can provide extensive training to
external personnel, consultation on appropriate methods and design of studies, help
with local installation of any desired software, and access to computational
resources at the Center for image analysis, modeling and simulation. CellOrganizer learns
modular models of things such as cell shape, nuclear shape, vesicular organelle
distribution and microtubule distribution directly from 2D or 3D images and can
produce specific instances of cell geometries without the need to create them
by hand or to segment microscope images (see Buck et al, 2012
for an overview). Through Center
funding, pipelines have been created whereby these geometries can be combined
with biochemical models to perform spatially realistic cell simulations with a
minimum of effort (Center resources can be provided to run these using the cell
simulation engine MCell). The biochemical models can be encoded in
SBML (i.e., investigator created or downloaded from models databases) or can be
generated by BioNetGen (a powerful rule-based
modeling package). This combination
of CellOrganizer and MCell
allows investigators to explore the effect of different cell geometries on
their models (e.g., to independently explore different modes of variation in the
generative models, such as variation in organelle number vs. shape). Existing generative models of 3T3 cells,
HeLa cells, and C2C12 cells can be used so that making
extensive image collections can be avoided.
If interested, please contact firstname.lastname@example.org
or fill out the form at the MMBioS web site. We would be happy to further explain the
capabilities of the current system and discuss development of new capabilities.
|May 22, 2014||Our paper on using active learning to identify drug-target interactions using PubChem data has been published in BMC Bioinformatics.
|April 29, 2014||CellOrganizer 2.1 released.
|April 17, 2014||A new service for content-based image retrieval, CellSearcher, has been released. It allows users to upload cell images and find images in other databases that are similar in subcellular pattern (using the OMERO.searcher system).
|February 19, 2014||The MMBioS center, a collaboration between the University of Pittsburgh, Pittsburgh Supercomputing Center, Salk Institute and Carnegie Mellon is featured in a video created by the Biophysical Society for the 'Biophysical Society TV' shown at their annual meeting. The video is also available at YouTube. The Technology Research and Development project (TR&D3) that we lead is described starting at 4:18. The open source CellOrganizer system plays a central role in this project.
|December 17, 2013||Our paper characterizing new algorithms for active learning for drug discovery in the absence of compound or target features has been published in PLoS ONE.
The algorithms seek to learn the effects of many compounds on many targets, and address the case in which the effect of a given compound on a given target is represented as one of a number of different categorical phenotypes (rather than just as a score measuring extent of an expected effect).
We introduce measures of uniqueness and responsiveness to characterize the nature of a given experimental space, and show in simulated experiments that our active learner significantly outperforms random choice, and does so for essentially all values of uniqueness and responsiveness.
We also introduce a stopping rule approach for estimating the lower limit of the true accuracy of an actively learned model, permitting decisions to be made about when to stop a campaign of active learning-driven experimentation.
Lastly, we show using Connectivity Map data that accurate models of the effects of drugs on gene expression in various cell lines can be constructed without the need to perform experiments for all possible combinations of drugs and cell lines.
|September 30, 2013||CellOrganizer 2.0 released. New shape space modeling capabilities, SBML-spatial outputs, and reporter tools.
|July 10, 2013
||OMERO.searcher Local Client v1.3 released, along with contentDBs for three new databases (The Human Protein Atlas, The Cell Library, and PSLID RandTag2).
|July 8, 2013||A new article in Bioinformatics describes a more demanding paradigm for subcellular location classification than has previously been used, which uses different sets of proteins for training and testing. New publicly available datasets were created to test this paradigm. Previously described classification methods did not perform well under this paradigm, but a combination of local and global features was shown to yield very good accuracies on a number of datasets.
|May 17, 2013
||CellOrganizer v1.9.0 released. Major addition is use of Bio-Formats to read input files.
|April 2, 2013
has been released. The primary goal of this release was to add the resolution of the
dataset to the model trainer graphical user interface.
|March 11, 2013
has been released. The primary new feature is the ability to generate cell and
nuclear shapes from diffeomorphic models.
|January 24, 2013
||Congratulations to Dr. Joshua Kangas for successfully defending his
thesis entitled, "Active Learning for Drug Discovery." Dr. Kangas will be joining a new startup, Quantitative Medicine, LLC, as cofounder and Chief Science Officer.
|January 15, 2013
||A review article from our group on automated image analysis
methods for high-content screening and analysis was awarded the
2013 JBC Authors' Choice Award
at the annual meeting of the Society for Laboratory Automation and Screening.
|January 9, 2013
||A new version of OMERO.searcher Local Client has been released,
along with a content database for the
database also released today.
This version permits searching of both
OMERO and non-OMERO databases and supports user-defined feature sets.
|January 9, 2013
||A significantly expanded collection of images and sequences from the
RandTag project has been
Automated analysis of the images of CD-tagged NIH 3T3 clones
in which the tagged gene has been identified permitted the assignment of
subcellular location for a number of previously unannotated or
|November 30, 2012
||Two articles in PLoS ONE describe results from our collaboration with
the Human Protein Atlas. In the first, analysis of images of eleven cultured cell lines
reveals that accounting for differences in cell shape and size reduces
apparent variation in microtubule distribution. Accounting for this,
three groups of cell lines remain distinguishable.
In the second,
computational analysis identified proteins whose annotations from visual
analysis were incorrect.
|November 28, 2012
||Congratulations to Dr. Jieyue Li for successfully defending his thesis entitled, "Automated Learning of Subcellular Location Patterns in Confocal Fluorescence Images from Human Protein Atlas." Dr. Li has accepted a position as Machine Learning Expert at ZestFinance in Los Angeles, California.
|September 4, 2012||
CellOrganizer v1.7.1 released. Support
added for exporting object files from TIF files of synthesized images.
|CellOrganizer v1.7 released. Support added for output as indexed images, blender object files, and SBML Spatial extension.
|OMERO.searcher v.1.1.2 released! Provides content-based searching of OMERO databases with local or remote images.
|CellOrganizer v1.6 released! Supports 2D/3D images and vesicle and microtubule pattern models.
|December 19, 2011||Dr. Murphy named to the NIH Council of Councils.
|September 7, 2011||Murphy Lab member Luis Pedro Coelho named to the 2012 class of Siebel Scholars. The Siebel Scholars program recognizes the most talented students at the world's leading graduate schools of business, bioengineering, and computer science.
|September 5, 2011||Video of Dr. Murphy's talk at the COMBINE 2011 meeting is available online.
|January 10, 2011||Work from Murphy group featured in Nature Biotechnology article on Computational Biology breakthroughs in 2010.
|September 18, 2010||Murphy Lab member Tao Peng wins the 2009 Research Award from Carnegie Mellon's Biomedical Engineering Department. One award is given each year to the BME graduate student judged to have the most outstanding research achievement.
|New release 2.0 of PatternUnmixer
(PUnmix). The new version supports reading images from OMERO servers, displaying
object distributions, checking for the presence of unknown patterns, and
exporting unmixing fractions. See the Software link.
|August 22, 2009||Murphy Lab member Luis Pedro Coelho wins the CPCB Outstanding Research Achievement Award.
|Collection of hand-segmented nuclear images
and python code for comparing segmentation methods released. See the Software link.
|New PSLID release containing images from
over 2,500 clones generated by the RandTag project.
|Releases of SLML Tools and PUnmix are available
under the Software link. These packages
implement learned, generative models of subcellular patterns and estimation of
pattern unmixing fractions, respectively. Matlab source code, as well as
compiled versions for Linux, Mac OS, and Windows, are available.
The primary focus of current work in the lab is on automated interpretation of fluorescence microscope images.
If you are interested in reading more about our work, a list of publications is
Slides from Dr. Murphy's tutorials at meetings like the ISAC Congress
and the SBS Conference are available under
the presentations link.
Data Available for Download
Select data generated from Murphy Lab projects is
available for download.