Cytometry Development Workshop
Subcellular Location Features descriptions
Zernike features - A set of circle polynomials in two polar variables: The moments themselves are complex numbers and are sensitive to rotation of the image, so the magnitudes of the moments were used as features. Zernike moments through degree 12 were used, providing a set of 49 rotation-invariant features. (Plots of all of the Zernike polynomials can be viewed here).
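The count of 49 can be checked by enumerating the moment indices. This is a minimal sketch that assumes the usual Zernike index constraints (order n from 0 to the maximum degree, repetition m with 0 <= m <= n and n - m even; negative m is omitted because its magnitude equals that of positive m). The function name is illustrative and is not taken from the original software.

```python
def zernike_indices(max_degree):
    """List the (n, m) index pairs with 0 <= m <= n and n - m even."""
    return [
        (n, m)
        for n in range(max_degree + 1)
        for m in range(n + 1)
        if (n - m) % 2 == 0
    ]


if __name__ == "__main__":
    # One magnitude feature per (n, m) pair through degree 12.
    print(len(zernike_indices(12)))
```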
Haralick texture features - Texture features based on a gray-level co-occurrence matrix of the image: Haralick has described 14 statistics that can be calculated from the co-occurrence matrix, of which 13 are used here.
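As an illustration of how a co-occurrence matrix yields texture statistics, the sketch below builds a normalized, symmetric matrix for a single pixel offset and computes two of Haralick's statistics (angular second moment and contrast). The offset, the number of gray levels, and all function names are assumptions made for this example; the settings used by the original software are not stated here.

```python
def cooccurrence_matrix(image, levels, dr=0, dc=1):
    """Normalized symmetric gray-level co-occurrence matrix for offset (dr, dc).

    `image` is a list of rows holding integer gray levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair in both directions
                counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def angular_second_moment(p):
    """Sum of squared matrix entries (high for uniform textures)."""
    return sum(v * v for row in p for v in row)


def contrast(p):
    """Sum of (i - j)^2 * p(i, j) (high for large local gray-level changes)."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))
```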
SLF1
SLF1.1 - The number of fluorescent objects in the image: Objects were identified by applying the Matlab bwlabel function to a binarized version of the processed image. The bwlabel function defines an object as a contiguous group of non-zero pixels in an 8-connected environment (i.e., a given pixel is adjacent to each of its eight neighbors).
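The 8-connected object definition can be sketched with a breadth-first flood fill. This is a plain-Python illustration of the idea, not the Matlab bwlabel implementation; the function name and the list-of-rows image representation are assumptions for the example.

```python
from collections import deque


def label_objects(binary):
    """Label 8-connected groups of non-zero pixels.

    Returns (labels, count), where labels has the same shape as `binary`
    and holds 0 for background or the 1-based object number.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit all eight neighbors, including diagonals.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Pixels that touch only at a corner end up in the same object, which is the practical difference from 4-connectivity.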
SLF1.2 - The Euler number of the image: The Matlab imfeature function was used to calculate the number of objects in the image minus the number of holes. A hole is defined as a contiguous group of zero-valued pixels contained entirely within an area of non-zero pixels.
SLF1.3 - The average number of above-threshold pixels per object: The mean number of non-zero pixels per object was calculated for the binarized image.
SLF1.4 - The variance of the number of above-threshold pixels per object. The variance of the number of non-zero pixels per object was also calculated.
SLF1.5 - The ratio of the size of the largest object to the smallest: This was defined as the number of pixels in the largest object divided by the number of pixels in the smallest object.
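Given the pixel count of each object, SLF1.3 through SLF1.5 reduce to simple summary statistics. The sketch below assumes the sample variance (N - 1 normalization, which is the default of Matlab's var); the source does not state which normalization was used, and the function name is illustrative.

```python
import statistics


def object_size_features(sizes):
    """Return (mean size, variance of sizes, largest/smallest size ratio).

    `sizes` is a non-empty list of per-object pixel counts.
    """
    mean_size = statistics.mean(sizes)
    # Sample variance is undefined for a single object; report 0.0 then.
    variance = statistics.variance(sizes) if len(sizes) > 1 else 0.0
    ratio = max(sizes) / min(sizes)
    return mean_size, variance, ratio
```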
SLF1.6 - The average object distance to the cellular center of fluorescence: The center of fluorescence (COF) of the whole cell was calculated and used to determine distances to the centers of fluorescence of each object in that cell. Centers of fluorescence were calculated as:
$$x_{COF} = \frac{\sum_{x,y} x \, f(x,y)}{\sum_{x,y} f(x,y)}, \qquad y_{COF} = \frac{\sum_{x,y} y \, f(x,y)}{\sum_{x,y} f(x,y)}$$
where x and y are the coordinates of each pixel (in either the entire cell or a particular object), and f(x,y) is the intensity of the pixel at (x,y).
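The center of fluorescence is an intensity-weighted mean of the pixel coordinates. A minimal sketch, assuming the image is a list of rows with x as the column index and y as the row index (the function name is illustrative):

```python
def center_of_fluorescence(image):
    """Return (x, y) of the intensity-weighted centroid of `image`."""
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(image):
        for x, intensity in enumerate(row):
            total += intensity
            sum_x += x * intensity
            sum_y += y * intensity
    if total == 0:
        raise ValueError("image has no fluorescence")
    return sum_x / total, sum_y / total
```

The same function applies to a single object if all pixels outside that object are set to zero.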
SLF1.7 - The variance of object distances from the image COF. The variance was calculated using the COF determined for SLF1.6.
SLF1.8 - The ratio of the largest to the smallest object to image COF distance: This was calculated as the distance from the image COF to the furthest object in the cell divided by the distance from the image COF to the closest object.
SLF1.9 - The fraction of the non-zero pixels in a cell that are along an edge: Edge detection was performed on each image using the Canny method as implemented in the Matlab edge function. Canny's method calculates the gradient of the image using the derivative of a Gaussian filter. It then assigns edges to strong and weak categories. Weak edges are only included in the final output if they are connected to strong edges. This approach is less sensitive to noise in the image than other edge detection methods. The area of the binarized edge image was then divided by the area of the binarized cell image.
SLF1.10 - Measure of edge intensity homogeneity: Each image (I) was convolved separately with the kernels N and W to find the intensity gradients in two orthogonal directions ($G_N$ and $G_W$). The intensity of the gradient at all points in the image was calculated using
$$\sqrt{G_N^2 + G_W^2}$$
and an eight-bin histogram was calculated for the values in this edge intensity image. The final feature was calculated as the fraction of all values that fall in the first two bins of this histogram.
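The edge intensity feature can be sketched as follows. The kernel values are an assumption (Prewitt-style north and west kernels), since the kernels N and W are named but not shown above. The border handling (interior pixels only) and the histogram range (minimum to maximum of the data) are also assumptions. The kernel is applied without flipping, which changes only the sign of each gradient and so leaves the magnitude unaffected.

```python
import math

# Assumed Prewitt-style kernels; the source names N and W but does not show them.
N = [[1, 1, 1], [0, 0, 0], [-1, -1, -1]]
W = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]


def apply_kernel(image, kernel):
    """Apply a 3x3 kernel at every interior pixel (no padding)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            total = 0
            for i in range(3):
                for j in range(3):
                    total += kernel[i][j] * image[r + i - 1][c + j - 1]
            out_row.append(total)
        out.append(out_row)
    return out


def histogram(values, bins=8):
    """Counts in `bins` equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


def edge_intensity_homogeneity(image):
    """Fraction of gradient magnitudes falling in the first two of eight bins."""
    g_n = apply_kernel(image, N)
    g_w = apply_kernel(image, W)
    magnitudes = [
        math.hypot(a, b)
        for row_n, row_w in zip(g_n, g_w)
        for a, b in zip(row_n, row_w)
    ]
    counts = histogram(magnitudes)
    return (counts[0] + counts[1]) / len(magnitudes)
```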
SLF1.11 - Measure of edge direction homogeneity 1: The overall gradient at each point in the image, G, was then calculated from the convolved images $G_N$ and $G_W$ calculated for SLF1.10 using
$$G = \operatorname{atan2}(G_N, G_W)$$
The value of each pixel in the image G is therefore the direction (from -π to π) of the intensity gradient at that point in the image I. An eight-bin histogram was then calculated using all of the values in the gradient image G. Images with patterns containing edges oriented predominantly along a particular direction (some patterns of actin filaments, for example) result in edge gradient histograms in which a few bins will dominate. The final feature was calculated as the ratio of the largest to smallest value in the histogram.
SLF1.12 - Measure of edge direction homogeneity 2: The ratio of the largest to the next largest value in the eight-bin histogram used for SLF1.11 was calculated. This feature was included to overcome problems that may arise with values of the first measure of edge direction homogeneity becoming very large when the minimum value of the histogram is small.
SLF1.13 - Measure of edge direction difference: For the eight-bin histogram used for SLF1.11, the difference between the bins for an angle and for that angle plus π was calculated by summing bins 1 through 4 and subtracting the sum of bins 5 through 8. This difference was normalized by the sum of all eight bins.
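The three edge direction features (SLF1.11 through SLF1.13) all derive from one eight-bin histogram of gradient directions. The sketch below assumes the direction is the four-quadrant arctangent of the two gradient components and that the bins evenly divide the range from -π to π; both are assumptions, as are the function names. Empty bins are handled by returning infinity for the affected ratio, which is a choice made for this example.

```python
import math


def direction_histogram(g_n, g_w, bins=8):
    """Histogram of atan2(g_n, g_w) over equal-width bins on [-pi, pi]."""
    counts = [0] * bins
    width = 2 * math.pi / bins
    for a, b in zip(g_n, g_w):
        angle = math.atan2(a, b)
        counts[min(int((angle + math.pi) / width), bins - 1)] += 1
    return counts


def direction_features(counts):
    """Return (largest/smallest, largest/next largest, normalized difference)."""
    ordered = sorted(counts, reverse=True)
    largest, second, smallest = ordered[0], ordered[1], ordered[-1]
    ratio_max_min = largest / smallest if smallest else math.inf
    ratio_max_next = largest / second if second else math.inf
    # Bins 1-4 minus bins 5-8, normalized by the sum of all bins.
    difference = (sum(counts[:4]) - sum(counts[4:])) / sum(counts)
    return ratio_max_min, ratio_max_next, difference
```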
SLF1.14 - The fraction of the convex hull area occupied by protein fluorescence: The convex hull of the protein localization image was calculated using the convhull function in Matlab and converted to a binary image. The area of the binarized protein image was then divided by the area of the convex hull image.
SLF1.15 - The roundness of the convex hull: The roundness was defined as $\text{perimeter}^2 / (4\pi \cdot \text{area})$. This value approaches 1 as the shape approaches a circle.
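A roundness measure with this limiting behavior can be sketched for a hull given as a polygon. The sketch assumes roundness is the squared perimeter divided by 4π times the area, a form that equals 1 for a circle. It also assumes the hull is supplied as an ordered list of vertices rather than as a binary image; the function names are illustrative.

```python
import math


def polygon_perimeter_and_area(vertices):
    """Perimeter and area (shoelace formula) of a simple polygon."""
    perimeter = 0.0
    twice_area = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0
    return perimeter, abs(twice_area) / 2.0


def roundness(vertices):
    """Squared perimeter over 4*pi*area (assumed definition)."""
    perimeter, area = polygon_perimeter_and_area(vertices)
    return perimeter ** 2 / (4.0 * math.pi * area)
```

Under this definition a square scores 4/π, and a regular polygon with many sides scores just above 1.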
SLF1.16 - The eccentricity of the convex hull: The eccentricity of the ellipse that is equivalent, based on second-order moments, to the protein image convex hull was calculated from the central moments of the protein image convex hull.
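One common way to obtain an eccentricity from second-order central moments is sketched below. It is an assumption that this matches the definition used for SLF1.16, whose exact formula is not reproduced here. The sketch takes the two eigenvalues of the second-moment matrix as proportional to the squared semi-axes of the equivalent ellipse. The pixel-list input and the function name are also illustrative.

```python
import math


def eccentricity(pixels):
    """Eccentricity of the moment-equivalent ellipse of a set of (x, y) pixels."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    # Second-order central moments.
    mu20 = sum((x - mean_x) ** 2 for x, _ in pixels)
    mu02 = sum((y - mean_y) ** 2 for _, y in pixels)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in pixels)
    spread = math.sqrt((mu20 - mu02) ** 2 + 4 * mu11 ** 2)
    major = mu20 + mu02 + spread  # proportional to squared semi-major axis
    minor = mu20 + mu02 - spread  # proportional to squared semi-minor axis
    if major == 0:
        return 0.0
    return math.sqrt(max(0.0, 1.0 - minor / major))
```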
SLF2.17 - The average object distance from the COF of the DNA image: As for SLF1.6, the distances from a reference point of objects in the protein image are calculated. However, in this case the center of fluorescence of the DNA image is used in place of the center of fluorescence of the protein image.
SLF2.18 - The variance of object distances from the DNA COF. This feature is analogous to SLF1.7 except that the DNA COF is used as the reference point.
SLF2.19 - The ratio of the largest to the smallest object to DNA COF distance. This feature is analogous to SLF1.8 except that the DNA COF is used as the reference point.
SLF2.20 - The distance between the protein COF and the DNA COF: The distance between the COF of a protein image and its corresponding DNA image is calculated.
SLF2.21 - The ratio of the area occupied by protein to that occupied by DNA: The number of pixels in the binarized protein image is divided by the number of pixels in the binarized DNA image.
SLF2.22 - The fraction of the protein fluorescence that co-localizes with DNA: The fraction of pixels in the binarized protein image that overlap with pixels in the binarized DNA image is calculated.
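SLF2.21 and SLF2.22 are both pixel counts on the two binarized images. A minimal sketch, assuming the images are same-sized lists of rows in which non-zero means above threshold (the function name is illustrative):

```python
def protein_dna_area_features(protein, dna):
    """Return (protein area / DNA area, fraction of protein pixels over DNA)."""
    protein_area = 0
    dna_area = 0
    overlap = 0
    for protein_row, dna_row in zip(protein, dna):
        for p, d in zip(protein_row, dna_row):
            if p:
                protein_area += 1
            if d:
                dna_area += 1
            if p and d:
                overlap += 1
    return protein_area / dna_area, overlap / protein_area
```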
SLF7: SLF7.80-84 were defined based on the morphological skeleton of objects obtained by thinning using a homotopic interval.
SLF7.79 - The fraction of cellular fluorescence not included in objects: This was added to measure the amount of fluorescence that is not contained in fluorescent objects.
SLF7.80 - The average length of the morphological skeleton of objects.
SLF7.81 - The ratio of object skeleton length to the area of the convex hull of the skeleton, averaged over all objects.
SLF7.82 - The fraction of object pixels contained within the skeleton, averaged over all objects.
SLF7.83 - The fraction of object fluorescence contained within its skeleton, averaged over all objects.
SLF7.84 - The ratio of the number of branch points in skeleton to length of skeleton, averaged over all objects: A point was defined as a branch point if 3 or more of its neighbors were contained within the skeleton.
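The branch point rule for SLF7.84 can be sketched directly on a binary skeleton image. The sketch assumes that neighbors are the eight surrounding pixels and that skeleton length is the number of skeleton pixels; neither detail is stated above, and the function name is illustrative.

```python
def branch_point_ratio(skeleton):
    """Ratio of branch points to skeleton length for one object.

    A skeleton pixel is a branch point if 3 or more of its eight
    neighbors are also skeleton pixels.
    """
    rows, cols = len(skeleton), len(skeleton[0])
    length = 0
    branch_points = 0
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            length += 1
            neighbors = sum(
                1
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr or dc)
                and 0 <= r + dr < rows
                and 0 <= c + dc < cols
                and skeleton[r + dr][c + dc]
            )
            if neighbors >= 3:
                branch_points += 1
    return branch_points / length if length else 0.0
```

Read literally, the rule also counts skeleton pixels directly adjacent to a junction, because diagonal neighbors contribute to the count.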
SLF31: SLF31.1-18 are parameter-free threshold adjacency statistics.
SLF31.1 - pftas:center_0
SLF31.2 - pftas:center_1
SLF31.3 - pftas:center_2
SLF31.4 - pftas:center_3
SLF31.5 - pftas:center_4
SLF31.6 - pftas:center_5
SLF31.7 - pftas:center_6
SLF31.8 - pftas:center_7
SLF31.9 - pftas:center_8
SLF31.10 - npftas:center_0
SLF31.11 - npftas:center_1
SLF31.12 - npftas:center_2
SLF31.13 - npftas:center_3
SLF31.14 - npftas:center_4
SLF31.15 - npftas:center_5
SLF31.16 - npftas:center_6
SLF31.17 - npftas:center_7
SLF31.18 - npftas:center_8
SLF33: SLF33.126-161 are parameter-free threshold adjacency statistics calculated at mean and mean-margin instead of center.
SLF33.126 - pftas:mu_margin_0
SLF33.127 - pftas:mu_margin_1
SLF33.128 - pftas:mu_margin_2
SLF33.129 - pftas:mu_margin_3
SLF33.130 - pftas:mu_margin_4
SLF33.131 - pftas:mu_margin_5
SLF33.132 - pftas:mu_margin_6
SLF33.133 - pftas:mu_margin_7
SLF33.134 - pftas:mu_margin_8
SLF33.135 - npftas:mu_margin_0
SLF33.136 - npftas:mu_margin_1
SLF33.137 - npftas:mu_margin_2
SLF33.138 - npftas:mu_margin_3
SLF33.139 - npftas:mu_margin_4
SLF33.140 - npftas:mu_margin_5
SLF33.141 - npftas:mu_margin_6
SLF33.142 - npftas:mu_margin_7
SLF33.143 - npftas:mu_margin_8
SLF33.144 - pftas:mu_0
SLF33.145 - pftas:mu_1
SLF33.146 - pftas:mu_2
SLF33.147 - pftas:mu_3
SLF33.148 - pftas:mu_4
SLF33.149 - pftas:mu_5
SLF33.150 - pftas:mu_6
SLF33.151 - pftas:mu_7
SLF33.152 - pftas:mu_8
SLF33.153 - npftas:mu_0
SLF33.154 - npftas:mu_1
SLF33.155 - npftas:mu_2
SLF33.156 - npftas:mu_3
SLF33.157 - npftas:mu_4
SLF33.158 - npftas:mu_5
SLF33.159 - npftas:mu_6
SLF33.160 - npftas:mu_7
SLF33.161 - npftas:mu_8
SLF34: SLF34.164-173 are overlap features.
SLF34.164 - overlap:prot-to-ref-overlap
SLF34.165 - overlap:fraction-above-thresh-prot-in-above-thresh-ref
SLF34.166 - overlap:fraction-of-protein-in-above-thresh-ref
SLF34.167 - overlap:fraction-of-proc-protein-in-above-thresh-ref
SLF34.168 - overlap:fraction-above-thresh-ref-in-above-thresh-prot
SLF34.169 - overlap:correlation:binprot-binref
SLF34.170 - overlap:correlation:prot-binref
SLF34.171 - overlap:correlation:prot-ref
SLF34.172 - overlap:median-prot-dist-ref
SLF34.173 - overlap:mean-prot-dist-ref
SLF9.1 - The number of fluorescent objects in the image: A 3D object is defined as a group of contiguous, above-threshold voxels in a 26-connected environment.
SLF9.2 - The Euler number of the image: This is the difference between the number of objects and the number of holes in the image.
SLF9.3 - The average object volume: The volume of an object is defined as the number of voxels in the object.
SLF9.4 - The standard deviation of object volumes.
SLF9.5 - The ratio of the maximum object volume to the minimum object volume.
SLF9.6 - The average object distance to the protein COF.
SLF9.7 - The standard deviation of object distances from the protein COF.
SLF9.8 - The ratio of the largest to the smallest object to protein COF distance.
SLF9.9 - The average object distance to the COF of the DNA image.
SLF9.10 - The standard deviation of object distances from the COF of the DNA image.
SLF9.11 - The ratio of the largest to the smallest object to DNA COF distance.
SLF9.12 - The distance between the protein COF and the DNA COF.
SLF9.13 - The ratio of the volume occupied by protein to that occupied by DNA.
SLF9.14 - The fraction of the protein fluorescence that co-localizes with DNA.
SLF9.15 - The average horizontal distance of objects to the protein COF.
SLF9.16 - The standard deviation of object horizontal distances from the protein COF.
SLF9.17 - The ratio of the largest to the smallest object to protein COF horizontal distance.
SLF9.18 - The average vertical distance of objects to the protein COF.
SLF9.19 - The standard deviation of object vertical distances from the protein COF.
SLF9.20 - The ratio of the largest to the smallest object to protein COF vertical distance.
SLF9.21 - The average object horizontal distance from the DNA COF.
SLF9.22 - The standard deviation of object horizontal distances from the DNA COF.
SLF9.23 - The ratio of the largest to the smallest object to DNA COF horizontal distance.
SLF9.24 - The average object vertical distance from the DNA COF.
SLF9.25 - The standard deviation of object vertical distances from the DNA COF.
SLF9.26 - The ratio of the largest to the smallest object to DNA COF vertical distance.
SLF9.27 - The horizontal distance between the protein COF and the DNA COF.
SLF9.28 - The signed vertical distance between the protein COF and the DNA COF: This was specifically defined to be a signed quantity in order to distinguish proteins distributed near the top of the cell from ones near the bottom.
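The horizontal and signed vertical distances between two 3D centers of fluorescence (SLF9.27 and SLF9.28) can be sketched as below. The sketch assumes each COF is an (x, y, z) tuple with z as the vertical axis, and that the sign is taken as protein minus DNA; the sign convention and the function name are assumptions, not taken from the original software.

```python
import math


def cof_offsets(protein_cof, dna_cof):
    """Return (horizontal distance, signed vertical distance) between COFs."""
    dx = protein_cof[0] - dna_cof[0]
    dy = protein_cof[1] - dna_cof[1]
    dz = protein_cof[2] - dna_cof[2]  # positive when the protein COF has the larger z
    return math.hypot(dx, dy), dz
```

Keeping the sign of the vertical component is what allows a protein concentrated above the nucleus to be told apart from one concentrated below it.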