README.md: 28 additions & 24 deletions
@@ -29,20 +29,21 @@
29
29
30
30
Cell health can be altered by genetic and chemical perturbations.
31
31
An increased understanding of these perturbation mechanisms is directly relevant for drug discovery and personalized medicine.
32
-
Here and in an accompanying paper, we present a novel cell imaging assay to measure 70 different aspects of cell health, such as proliferation, apoptosis, and cell cycle stalling.
33
-
However, this assay requires expensive reagents and does not scale well.
34
-
Therefore, we also developed a machine learning solution to predict cell health readouts directly from the inexpensive and high-throughput Cell Painting imaging assay.
32
+
Here and in an accompanying paper, we present two novel cell imaging assays that together measure 70 different aspects of cell health, such as proliferation, apoptosis, and cell cycle stalling.
33
+
However, these assays require expensive reagents and do not scale well.
34
+
Therefore, we also developed a machine learning solution to predict cell health readouts directly from a separate assay, known as Cell Painting.
35
+
In contrast to the Cell Health assays, Cell Painting is inexpensive, high-throughput, and unbiased (reagents are not targeted).
35
36
We predict many cell health indicators with high performance, but other readouts could not be predicted.
36
37
We validated our predictions by using orthogonal readouts and by applying the models to a large set of 1,500 drugs from the Drug Repurposing Hub.
37
38
Cell health predictions for drugs can be browsed at https://broad.io/cell-health-app.
38
-
We confirmed mitotic arrest and reactive oxygen species phenotypes via PLK and proteasome inhibition, respectively.
39
+
We confirmed mitotic arrest, reactive oxygen species, and DNA damage in G1 cell cycle based phenotypes via PLK, proteasome, and aurora kinase/tubulin inhibition, respectively.
39
40
In the future, we can use this approach to determine the cell health consequences of any perturbation in cells.
40
41
We conducted this project using open science principles with open data and open source code.
41
42
42
-
The following repository stores a complete analysis pipeline using Cell Painting data to predict readouts from several cell health assays.
43
+
The following repository stores a complete analysis pipeline using Cell Painting data to predict readouts from the Cell Health assays.
43
44
44
-
We first developed a customized microscopy assay we call "Cell Health".
45
-
The Cell Health assay is comprised of two different reagent panels: "Cell cycle" and "viability".
45
+
We first developed the customized microscopy assays we collectively call "Cell Health".
46
+
The Cell Health assays are comprised of two different reagent panels: "Cell cycle" and "viability".
46
47
Together, these two panels use reagents which mark different cell health phenotypes.
47
48
48
49
| Assay/Dye | Phenotype | Panel |
@@ -55,16 +56,16 @@ Together, these two panels use reagents which mark different cell health phenoty
55
56
| pH3 | Cell division | Cell cycle |
56
57
| gH2Ax | DNA damage | Cell cycle |
57
58
58
-
We hypothesized that we can use unbiased and high dimensional Cell Painting profiles to predict the readouts of each individual assay.
59
+
We hypothesized that we can use unbiased and high dimensional Cell Painting profiles to predict cell health readouts.
59
60
60
61
## Approach
61
62
62
-
This overview figure outlines the Cell Health assay, the Cell Painting assay, and our machine learning approach.
63
+
This overview figure outlines the Cell Health assays, the Cell Painting assay, and our machine learning approach.
> (a) Example images and workflow from the Cell Health assay.
68
+
> (a) Example images and workflow from the Cell Health assays.
68
69
> We apply a series of manual gating strategies (see Methods) to isolate cell subpopulations and to generate cell health readouts for each perturbation.
69
70
> (top) In the “Cell Cycle” panel, in each nucleus we measure Hoechst, EdU, PH3, and gH2AX.
70
71
> (bottom) In the “Cell Viability” panel, we capture digital phase contrast images, measure Caspase 3/7, DRAQ7, CellROX, and (b) Example Cell Painting image across five channels, plus a merged representation across channels.
@@ -94,7 +95,7 @@ All data are publicly available.
94
95
| Data | Level | Location | Notes |
95
96
| :--- | :---- | :--------| :---- |
96
97
| Cell health readouts | Raw |[1.generate-profiles/data/raw](1.generate-profiles/data/raw)| Per cell health panel (cell cycle and viability) per cell line |
97
-
| Cell health readouts | Normalized |`1.generate-profiles/data/raw/normalized_cell_health_labels.tsv`||
98
+
| Cell health readouts | Normalized |[1.generate-profiles/data/labels/normalized_cell_health_labels.tsv](1.generate-profiles/data/labels)||
98
99
| Cell health signatures | Consensus |[1.generate-profiles/data/consensus](1.generate-profiles/data/consensus)||
99
100
100
101
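As a minimal sketch of how the normalized labels file listed in the table above could be read, assuming it is a plain tab-separated file with a header row. The column names `perturbation` and `readout_a` in the example are hypothetical placeholders, not the file's real headers.

```python
import csv
import io


def read_tsv(handle):
    """Parse a tab-separated file object into a list of row dictionaries."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical two-row example; the real column names may differ.
example = io.StringIO("perturbation\treadout_a\nGENE1\t0.5\nGENE2\t-1.2\n")
rows = read_tsv(example)
print(rows[0]["perturbation"], rows[0]["readout_a"])

# With the repository checked out, the same function could be applied as:
# path = "1.generate-profiles/data/labels/normalized_cell_health_labels.tsv"
# with open(path, newline="") as handle:
#     rows = read_tsv(handle)
```

Values come back as strings, so numeric readouts would need an explicit `float(...)` conversion before analysis.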
#### Drug Repurposing Hub
@@ -134,12 +135,13 @@ The full analysis pipeline consists of the following steps:
134
135
135
136
| Order | Module | Description |
136
137
| :---- | :----- | :---------- |
137
-
| 0 | Download cell painting data | Retrieve single cell profiles archived on Figshare |
138
-
| 1 | Generate profiles | Generate and process cell painting and cell health assay readouts |
139
-
| 2 | Determine replicate reproducibility | Determine the extent to which the CRISPR perturbations result in reproducible signatures |
140
-
| 3 | Train machine learning models to predict cell health assays | Train and visualize regression models using cell painting data to predict cell health assay readouts |
141
-
| 4 | Apply the models | Apply the trained models to the Drug Repurposing Hub data to predict drug perturbation effect |
142
-
| 5 | Validate the models | Use orthogonal readouts to validate the Drug Repurposing Hub predictions |
138
+
|[0.download-data](0.download-data/)| Download cell painting data | Retrieve single cell profiles archived on Figshare |
139
+
|[1.generate-profiles](1.generate-profiles/)| Generate profiles | Generate and process cell painting and cell health assay readouts |
140
+
|[2.replicate-reproducibility](2.replicate-reproducibility/)| Determine replicate reproducibility | Determine the extent to which the CRISPR perturbations result in reproducible signatures |
141
+
|[3.train](3.train/)| Train machine learning models to predict cell health assays | Train and visualize regression models using cell painting data to predict cell health assay readouts |
142
+
|[4.apply](4.apply/)| Apply the models | Apply the trained models to the Drug Repurposing Hub data to predict drug perturbation effect |
143
+
|[5.validate-repurposing](5.validate-repurposing/)| Validate the models | Use orthogonal readouts to validate the Drug Repurposing Hub predictions |
144
+
|[6.ml-robustness](6.ml-robustness)| Interrogate robustness of ML predictions | Assess sample size, feature groups, and cell line holdouts to probe ML robustness |
143
145
144
146
Each analysis module should be run in order.
145
147
View each module for specific instructions on how to reproduce results.
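The run order can be recovered from the numeric prefix of each module directory. The sketch below only sorts the directory names from the table above. It does not run any analysis, and the helper names `module_order` and `ordered_modules` are illustrative, not part of the repository.

```python
import re

# Module directory names as listed in the pipeline table (deliberately shuffled).
MODULES = [
    "3.train",
    "0.download-data",
    "6.ml-robustness",
    "1.generate-profiles",
    "5.validate-repurposing",
    "2.replicate-reproducibility",
    "4.apply",
]


def module_order(name):
    """Return the integer prefix of a module directory name such as '3.train'."""
    match = re.match(r"(\d+)\.", name)
    if match is None:
        raise ValueError(f"no numeric prefix in {name!r}")
    return int(match.group(1))


def ordered_modules(names):
    """Sort module directory names by their numeric prefix."""
    return sorted(names, key=module_order)


if __name__ == "__main__":
    for name in ordered_modules(MODULES):
        print(name)
```

Sorting on the parsed integer rather than the raw string keeps the order correct if a module numbered 10 or higher is ever added.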
@@ -189,7 +191,7 @@ However, there are many cell line specific differences.
189
191
### Model Interpretation
190
192
191
193
Because we used a logistic regression classifier, we can readily interpret the output features.
192
-
These features were derived from CellProfiler and represent different measurements of cell morphology
194
+
These features were derived from CellProfiler and represent different measurements of cell morphology.
193
195
Shown above is a summary of coefficients from all 70 cell health models.
194
196
We observed that each contributes to classifying various facets of cell health.
195
197
Many different categories of cell morphology features contribute to cell health predictions.
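One way to summarize coefficients by feature category is sketched below. It is not the repository's own code. It assumes CellProfiler-style feature names of the form `Compartment_FeatureGroup_...`. The feature names and coefficient values in the example are made up for illustration.

```python
from collections import defaultdict


def summarize_by_group(coefficients):
    """Sum absolute coefficients per (compartment, feature group) pair."""
    totals = defaultdict(float)
    for feature, weight in coefficients.items():
        parts = feature.split("_")
        if len(parts) < 2:
            # Skip names that do not follow the assumed naming pattern.
            continue
        compartment, group = parts[0], parts[1]
        totals[(compartment, group)] += abs(weight)
    return dict(totals)


# Hypothetical coefficients for a single model; not real results.
example = {
    "Nuclei_AreaShape_Area": 0.5,
    "Nuclei_AreaShape_Eccentricity": -0.25,
    "Cells_Texture_Entropy_DNA_5_0": 0.75,
    "Cytoplasm_Intensity_MeanIntensity_RNA": -1.0,
}

summary = summarize_by_group(example)
for (compartment, group), total in sorted(summary.items()):
    print(compartment, group, total)
```

Summing absolute values treats positive and negative coefficients as equally informative. A signed sum or a mean would answer a different question.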
@@ -211,20 +213,22 @@ These data represent ~1,500 compound perturbations in ~6 dose points in A549 cel
211
213
Collapsing the Drug Repurposing Hub Cell Painting data into UMAP coordinates, we observed many associated Cell Health predictions.
212
214
For example, predicted G1 Cell Count and predicted ROS had clear gradients in UMAP space.
213
215
However, there is not exactly a 1-1 relationship.
214
-
The control proteasome inhibitors (DMSO and Bortezomib) are known to induce ROS, while PLK inhibitors are known to induce cell death by blocking mitosis entry.
216
+
The proteasome inhibitors (Bortezomib and MG-132) are known to induce ROS, while PLK inhibitors are known to induce cell death by blocking mitosis entry.
215
217
A single PLK inhibitor (HMN-214) showed a strong dose relationship with predicted G1 count.
> Applying cell health models to Cell Painting data from The Drug Repurposing Hub.
220
-
> (a) We apply a Uniform Manifold Approximation (UMAP) to Drug Repurposing Hub consensus profiles of 1,571 compounds across 6 doses.
221
-
> The models were not trained using the Drug Repurposing Hub data.
222
-
> The point color represents the output of the cell health model trained to predict the number of cells in G1 phase (G1 cell count).
223
-
> (b) The same UMAP dimensions, but colored by the output of the Cell Health model trained to predict reactive oxygen species (ROS).
221
+
> Validating Cell Health models applied to Cell Painting data from the Drug Repurposing Hub.
222
+
> (a) The results of the dose alignment between the PRISM assay and the Drug Repurposing Hub data.
223
+
> This view indicates that there was not a one-to-one matching between perturbation doses.
224
+
> (b) Comparing viability estimates from the PRISM assay to the predicted number of live cells in the Drug Repurposing Hub.
225
+
> The PRISM assay estimates viability by measuring barcoded A549 cells after an incubation period.
224
226
> (c) Drug Repurposing Hub profiles stratified by G1 cell count and ROS predictions.
225
227
> Bortezomib and MG-132 are proteasome inhibitors and are used as positive controls; DMSO is a negative control.
226
228
> We also highlight all PLK inhibitors in the dataset.
227
229
> (d) HMN-214 is an example of a PLK inhibitor that shows strong dose response for G1 cell count predictions.
230
+
> (e) Tubulin and aurora kinase inhibitors are predicted to have high Number of gH2AX spots in G1 cells compared to other compounds and controls.
231
+
> (f) Barasertib (AZD1152) is an aurora kinase inhibitor that is predicted to have a strong dose response for Number of gH2AX spots in G1 cells predictions.