LiveCellMiner Extension for SciXMiner
This repository contains the SciXMiner extension LiveCellMiner, which provides tools for the qualitative and quantitative analysis of cells undergoing mitosis. On the basis of time series of 2D microscopy images with a nuclear marker, cells are detected, tracked and analyzed. For valid division cycles, image patches of each frame are extracted and segmented to obtain quantitative features for each time point. Cells can then be temporally aligned using a set of manual and automatic tools, and various visualization options allow comparisons between different treatments.
Citation
If you find this work useful, please cite the following paper:
@article{moreno2022livecellminer,
title={LiveCellMiner: A New Tool to Analyze Mitotic Progression},
author={Moreno-Andr{\'e}s, D. and Bhattacharyya, A. and Scheufen, A. and Stegmaier, J.},
journal={PLOS ONE},
volume={17},
number={7},
pages={e0270923},
year={2022},
publisher={Public Library of Science San Francisco, CA USA}
}
The LiveCellMiner toolbox is an extension of the MATLAB toolbox SciXMiner [1] and requires several additional tools to function properly. The following list provides an overview of the required third-party tools and may serve as a checklist for the individual installation steps. Details on each step are provided in the next paragraphs.
- SciXMiner (General purpose MATLAB-based GUI for data mining) and LiveCellMiner toolbox (extension to SciXMiner containing all functionality of our proposed tool).
- XPIWIT (ITK-based image processing tool that is used for detection and segmentation of nuclei).
- Cellpose MATLAB add-on (pretrained CNN model for nucleus segmentation).
- Ultrack (Python-based tracking tool installed as a virtual environment in Anaconda / Miniconda).
The first step is to ensure that SciXMiner is properly installed on your system (see [1] for installation instructions or simply use the installers provided on the SciXMiner Repository). Moreover, LiveCellMiner uses the third party tools XPIWIT [2], Cellpose [3] and Ultrack [7] that also have to be installed prior to using the software.
To install XPIWIT, download the precompiled binaries for your operating system from our bitbucket repository and extract the archive to a destination of your choice. Remember where you save XPIWIT, as you will later need to point LiveCellMiner to this directory so that it can access the XPIWIT files. On Unix systems, make sure that the executables XPIWIT and XPIWIT.sh have execution permissions.
To use Cellpose in LiveCellMiner, we make use of the MATLAB implementation provided by the authors. The functionality requires a MATLAB version R2023b or higher, the Medical Imaging Toolbox and the Cellpose Add-On to be installed as described here.
We occasionally experienced problems with executing this release on MATLAB R2023b and therefore also offer an interface to the Python implementation of Cellpose. If you want to use the Python installation of Cellpose, please set up a new environment for it using the commands listed below.
Instructions on the python-based Cellpose installation (Optional)
conda create --name cellpose python=3.10
conda activate cellpose
pip install cellpose[gui]==3.1.1.2
To add GPU support, also execute the following commands (make sure to select the CUDA version that fits your installed driver or update the drivers accordingly):
pip uninstall torch
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu126
To install an environment for Ultrack (Bragantini et al., 2024), we recommend installing Anaconda or Miniconda (used to set up virtual environments for Python). Extensive installation instructions are provided in the respective readme files of the original repositories. After having installed either Anaconda or Miniconda, open a Terminal window or the Anaconda prompt and create a new environment for ultrack with the following commands to set up the minimal working installation:
conda create -n ultrack python=3.11 higra gurobi pytorch pyqt -c pytorch -c gurobi -c conda-forge
conda activate ultrack
pip install ultrack
Remember the location where the ultrack environment is installed (e.g., /path/to/miniconda/envs/ultrack), as this will be required later to point LiveCellMiner to it.
Hint: We occasionally observed problems when installing on Windows ("Intel MKL FATAL ERROR: Cannot load mkl_intel_thread.dll."). A working environment for Windows 11 is uploaded to the repository as ultrack_environment.yml, which can be used to directly install all above-mentioned dependencies via conda env create -f ultrack_environment.yml.
Moreover, make sure to install the Gurobi ILP solver as explained in the original repository or based on our step-by-step instructions below. If Gurobi DLLs are not found automatically on Windows-based systems, try installing Gurobi from https://www.gurobi.com/downloads/gurobi-software/, which should contain all required DLLs.
Instructions on the Gurobi installation
Here are the steps that worked for us and may serve as an initial idea of how to get it running:
- Register at Gurobi to get access to a free license for academic purposes.
- Request license (“Named-User Academic”).
- Go to “Licenses” and hit “Show Installation Instructions for this License”.
- Run the Gurobi installation command in your CMD or Terminal (e.g., “grbgetkey 46cf7d04-xx99-8523-9874-2267c7c03c88”).
- Save the gurobi.lic file to the destination of your choice.
- In order for the ultrack Python environment to find the valid license file, it needs to be placed in a default search location such as your home directory (e.g., C:\Users\myusername\gurobi.lic (Windows) or /Users/myusername/gurobi.lic (macOS)).
- Check if Gurobi was successfully installed by activating the environment in the Anaconda prompt (conda activate ultrack) and then calling ultrack check_gurobi. If this command prints a valid license association, you're good to go for the next steps.
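Before running ultrack check_gurobi, it can help to confirm that a license file is actually present where it will be looked for. The following is a minimal sketch, not part of LiveCellMiner or Ultrack: the function name find_gurobi_license is hypothetical, and besides the home directory mentioned above it also checks the GRB_LICENSE_FILE environment variable, which Gurobi documents as a way to point to a license file in a non-default location.

```python
import os
from pathlib import Path


def find_gurobi_license(home=None, env=None, exists=os.path.isfile):
    """Return the first existing gurobi.lic candidate, or None.

    Hypothetical helper: checks GRB_LICENSE_FILE first (if set), then
    <home>/gurobi.lic. `exists` is injectable so the lookup can be tested
    without touching the disk.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    candidates = []
    if env.get("GRB_LICENSE_FILE"):
        candidates.append(str(env["GRB_LICENSE_FILE"]))
    candidates.append(str(home / "gurobi.lic"))

    for candidate in candidates:
        if exists(candidate):
            return candidate
    return None
```

Calling find_gurobi_license() without arguments inspects the current user's environment. A return value of None suggests that the license file still has to be copied to the home directory.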
After having installed the above-mentioned requirements, download the LiveCellMiner toolbox from this repository and copy the contents of the Source folder into a new folder called livecellminer that is to be created in the application_specials folder of SciXMiner. The Source files should afterwards be contained in a folder /path/to/scixminer/application_specials/livecellminer/.
Next, open up MATLAB and start SciXMiner using the command scixminer from the command prompt. To enable the LiveCellMiner toolbox, use Extras -> Choose application-specific extension packages... from the SciXMiner menu. Restart SciXMiner and you should see a new menu entry called LiveCellMiner.
To be able to use the external tools XPIWIT and Ultrack from within the toolbox (required to import new projects), SciXMiner has to be provided with the paths to the third-party software that should be used for processing. The paths can be set using the menu entry LiveCellMiner -> Import -> Set External Dependencies. In particular, these are the path to the XPIWIT executable (e.g., D:/XPIWIT/Bin/XPIWIT.exe (Windows) or /your/path/to/XPIWIT/Bin/XPIWIT.sh (Unix)) and the path to the Python executable of the Ultrack environment (e.g., C:/Environments/ultrack/python.exe on Windows systems or /your/path/to/Environments/ultrack/python on Unix systems).
Once you've successfully installed SciXMiner as well as the LiveCellMiner extension, you can already have a look at the example data. Download the archive to your computer and extract its contents. You can now load the project file (*.prjz file in the root folder of the archive). The data set has already been synchronized, so you can directly jump to the analysis functionality of LiveCellMiner as described in the Data Selection and LiveCellMiner Settings below. This example contains the data from Figure 4 (last column), and a good exercise to get started is to recreate the plots of the figure as described in the following steps:
- Navigate to the time series selection using the dropdown menu in the main window of SciXMiner by selecting the entry Time series: General options.
- Select the time series you want to display (e.g., Area or MeanIntensity as in Figure 4A,B).
- Select the output variable using the dropdown menu Selection of output variable and set it to 6-OligoID. This will display different lines for each of the oligos.
- Now you can start plotting the selected time series, e.g., as line plots using LiveCellMiner -> Show -> Comb. Line Plots from the menu.
Note that rows C - G contain additional features that have to be precomputed first.
-
Figure 4C: This should display a normalized version of the mean intensity. Select the original time series MeanIntensity and then call LiveCellMiner -> Process -> Perform Feature Normalization from the menu and select Interphase Mean. You should now see a new time series called MeanIntensity-NormalizedIntMean. Select this one and plot it as before to generate the same plot as shown in Figure 4C.
-
Figure 4D: This feature is a ratio of two other time series. In this case, the ratio of the minor and the major axis. Select the two time series called MinorAxisLength and MajorAxisLength, which should result in 8,9 showing up in the edit field on the right. Make sure to swap the order, i.e., modify the edit field entry to 9,8 as the order determines which time series is used as the numerator and the denominator, respectively. Finally, call LiveCellMiner -> Process -> Compute Time Series Ratio and visualize the new time series called Ratio-MinorAxisLength-vs-MajorAxisLength as before to reproduce Figure 4D.
-
Figure 4F,G: These panels visualize single features, i.e., first navigate to the single features selection view using the dropdown menu in the main window of SciXMiner by selecting the entry Single features. You'll notice that there is only one single feature present so far. To precompute some additional features, call LiveCellMiner -> Process -> Compute Additional Single Features. You can now select the feature IPToMALength_Minutes or MeanOrientationDiffPMA for visualization. Single features can be visualized as box plots, violin plots or histograms. Simply choose the visualization you like from the menu, e.g., LiveCellMiner -> Show -> Comb. Violin Plots to reproduce Figure 4F,G.
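The two derived time series used for Figure 4C and 4D can be illustrated outside of MATLAB with a small sketch. It assumes that the Interphase Mean normalization divides a time series by its mean over the interphase frames; the README does not spell out the exact formula, so treat this as an illustration rather than the toolbox implementation. The function names are hypothetical.

```python
def series_ratio(numerator, denominator):
    """Element-wise ratio of two equally long time series.

    Frames with a zero denominator yield NaN instead of raising an error.
    """
    if len(numerator) != len(denominator):
        raise ValueError("time series must have the same length")
    return [n / d if d != 0 else float("nan")
            for n, d in zip(numerator, denominator)]


def normalize_by_interphase_mean(series, interphase_frames):
    """Divide a time series by its mean over the given interphase frames.

    Assumption: `interphase_frames` holds the 0-based indices of frames
    that were classified as interphase during synchronization.
    """
    values = [series[i] for i in interphase_frames]
    mean = sum(values) / len(values)
    return [v / mean for v in series]
```

As in the menu dialog, the order of the arguments to series_ratio determines which time series is the numerator, e.g., series_ratio(minor_axis_length, major_axis_length) for the ratio shown in Figure 4D.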
The example data set also contains the raw image snippets and their segmentation of all extracted cells, so you can also have a look at the current synchronization results using the manual synchronization GUI that can be started using LiveCellMiner -> Show -> Synchronization GUI.
The first required step is to extract the cell trajectories of single mitotic cells from the raw images. The input folders and processing parameters can be set up with a dedicated GUI. Open the GUI from the LiveCellMiner menu via LiveCellMiner -> Import -> Import New Experiment (LCMProjectImporter). The popup that opens should look similar to the following one:
The typical steps for importing new data are the following:
- Select one or more input folders. LiveCellMiner assumes that different experiments have different root folders and that subfolders in each experiment folder contain the individual imaged positions (e.g., images of the first position of an experiment called Experiment01 would reside in a folder named Experiment01/P0001/).
- Select the output folder to save all processing results. This should be the root folder; subfolders will be generated automatically for each experiment and the contained positions.
- Set up the project-specific meta information in the overview table. dT (min) is the time interval between two consecutive frames in minutes, Spacing (µm) is the physical spacing of the images in microns, Chromatin Suffix is a suffix to select the chromatin channel (e.g., *_Ch1), 2nd Ch. Suffix is a suffix to select the second channel if available (e.g., *_Ch2), and Preview Image is an integer value to specify which frame to use for previewing segmentation results (a value between 1 and the maximum number of frames).
- Set up the minimum and maximum diameter of the nuclei contained in the images. The easiest way to do this is to use the Measure Min/Max Diameter button to open up a visualization of an exemplary image (change Preview Image in the overview table if you want to select a different image). Now draw the diameter of the smallest visible nucleus and subsequently the diameter of the largest visible nucleus. The length in pixels will be automatically converted to minimum/maximum diameters measured in microns. You can preview the current settings using the Preview LoG/TWANG button. This only works if XPIWIT is already installed, and you may need to point the GUI to the correct location of XPIWIT.exe (Windows) or XPIWIT.sh (Unix).
- Optionally change other processing parameters for segmentation and tracking. If you want to reprocess some parts of the data, select the desired reprocessing option.
- Once all parameters are set as desired, continue by saving the project import configuration using the button Save Project Settings. This will produce a *.mat file containing all input folders, parameter settings and meta information.
- Start the actual processing by pressing the Start Processing button. Now the image data should be automatically imported using the selected parameters.
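The conversion behind the Measure Min/Max Diameter step can be sketched as follows. This is an illustration only: it assumes that Spacing (µm) denotes an isotropic pixel size in microns per pixel and that each drawn diameter is given by its two end points in pixel coordinates. The function names are hypothetical.

```python
import math


def line_length_px(start, end):
    """Euclidean length in pixels of a line drawn from `start` to `end`."""
    return math.hypot(end[0] - start[0], end[1] - start[1])


def diameters_in_microns(smallest_line, largest_line, spacing_um):
    """Convert two drawn diameters from pixels to microns.

    Each line is a pair of (x, y) end points; `spacing_um` is the assumed
    isotropic pixel size in microns. Returns (min_diameter, max_diameter).
    """
    d_small = line_length_px(*smallest_line) * spacing_um
    d_large = line_length_px(*largest_line) * spacing_um
    # Guard against the two lines being drawn in the opposite order.
    return min(d_small, d_large), max(d_small, d_large)
```

For example, a 5 px line and a 20 px line at a spacing of 0.5 µm per pixel correspond to diameters of 2.5 µm and 10 µm.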
The subsequent automatic steps involve the following operations:
- Detecting potential cell centroids and creating a preliminary segmentation in all frames separately [2],
- (Optionally) run a CNN-based segmentation using Cellpose [3] on all frames individually for improved segmentation results,
- Perform tracking of the detected nuclei and identify cell trajectories that match the required length of the analysis time window (default: 30 frames before and 60 frames after the detected cell division).
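The length criterion in the last step can be expressed as a simple check. The sketch below assumes that a track is described by its first and last frame and the frame of the detected division, and that the window must lie completely inside the tracked range; the exact boundary convention used by LiveCellMiner is not stated here, so the inclusive bounds are an assumption.

```python
def covers_analysis_window(first_frame, last_frame, division_frame,
                           frames_before=30, frames_after=60):
    """Check whether a track spans the full analysis window.

    The window reaches from `frames_before` frames before the division to
    `frames_after` frames after it (defaults follow the text above).
    """
    window_start = division_frame - frames_before
    window_end = division_frame + frames_after
    return first_frame <= window_start and window_end <= last_frame


def select_complete_tracks(tracks, frames_before=30, frames_after=60):
    """Keep only (first, last, division) tuples that cover the window."""
    return [t for t in tracks
            if covers_analysis_window(*t, frames_before=frames_before,
                                      frames_after=frames_after)]
```

A division that happens too close to the beginning or the end of a recording is therefore discarded, because the frames needed before or after the division are missing.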
For each tracked cell in each frame, a small patch (default: 96x96 px) is extracted from the raw images and segmented using the TWANG segmentation algorithm or using the precomputed segmentation obtained with Cellpose [3]. The segmentation and the raw image snippet are then used to extract classical features (area, centroid, major/minor axes, orientation, circularity, mean intensity, intensity std. dev., std. dev. of the gradient magnitude) and a set of Haralick features [4] obtained from the gray-level co-occurrence matrix with a discretization of the image to 64 intensity levels and by omitting all transitions from the background label to any other label (to focus only on texture features within each nucleus). Moreover, we extract CNN-based features for each image snippet using the features of the last fully-connected layer of a GoogLeNet [5] that was pretrained on the ImageNet database, yielding a 1000-dimensional feature vector for each cell in each frame.
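The idea of restricting the texture statistics to the nucleus can be illustrated with a small pure-Python sketch. It is not the LiveCellMiner implementation: it assumes 8-bit input intensities, a single horizontal pixel offset, symmetric counting, and it computes only the contrast feature as one representative of the Haralick set.

```python
def quantize(image, levels=64, max_value=255):
    """Map intensities in [0, max_value] to integer levels 0..levels-1."""
    return [[min(levels - 1, int(v * levels / (max_value + 1))) for v in row]
            for row in image]


def masked_glcm(quantized, mask, levels=64, offset=(0, 1)):
    """Count co-occurring gray levels for pixel pairs inside the mask.

    `quantized` must already contain integer levels (see `quantize`).
    Pairs where either pixel belongs to the background (mask value 0)
    are skipped, so only transitions within the nucleus are counted.
    """
    rows, cols = len(quantized), len(quantized[0])
    d_row, d_col = offset
    glcm = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if not (0 <= r2 < rows and 0 <= c2 < cols):
                continue
            if not (mask[r][c] and mask[r2][c2]):
                continue  # omit transitions that touch the background
            a, b = quantized[r][c], quantized[r2][c2]
            glcm[a][b] += 1
            glcm[b][a] += 1  # symmetric counting
    return glcm


def glcm_contrast(glcm):
    """Haralick contrast: squared level difference weighted by frequency."""
    total = sum(sum(row) for row in glcm)
    if total == 0:
        return 0.0
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(glcm)
                   for j, count in enumerate(row))
    return weighted / total
```

A full feature set would typically accumulate the matrix over several offsets and directions and derive further statistics such as energy, correlation and homogeneity from the normalized matrix.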
The tracked cells and associated classical features are stored in a SciXMiner project file (*.prjz) that can be opened with the SciXMiner MATLAB toolbox. The raw image snippets, segmentation and CNN features are saved in an HDF5 file for easier access (*_ImageData.h5).
Notes
- Make sure to have a consistent folder structure. The software assumes the following arrangement of your data: Experiment / Position / ImageFiles. This information will be used to group the extracted data, e.g., to allow selecting only a subset of experiments, data that was acquired with a specific microscope or a particular position.
- Depending on the number of images per position, the processing can take a while and usually requires about 1-2 hours for one position depending on the hardware available.
- There are three tracking methods available in the import dialog. The first option uses ultrack by ... et al. and represents the latest algorithm. Moreover, we provide classical tracking methods: a nearest-neighbor based linking of centroid positions and a segmentation-based tracking that uses the spatio-temporal overlap of the cell segmentations. The latter algorithm requires that the temporal resolution is high enough to guarantee spatial overlaps between neighboring frames.
- The pre-segmentation based on the external tool Cellpose can be enabled/disabled in the import settings dialog. A fast alternative to running Cellpose on the original images is to first run the software with the default segmentation based on XPIWIT and apply the refined segmentation only after the tracking by calling LiveCellMiner -> Import -> Resegment Image Snippets. Available options for the resegmentation are a watershed-based classical method and a Cellpose-based deep learning approach using pre-trained segmentation models. Note that the changed segmentation also changes the features contained in the project. Calling this function therefore duplicates the project file (*.prjz) and saves a new one next to the old one called *_RecomputedSegmentation.prjz. The Cellpose MATLAB implementation requires a MATLAB version R2023b or higher, the Medical Imaging Toolbox and the Cellpose Add-On to be installed as described here.
- The MATLAB GUI may lose focus or windows could be hidden behind the main window. In that case, just use the task bar or any other window manager of your operating system to get back to the GUI window.
- Use the importer GUI to toggle between the MATLAB or the Python-based implementation of Cellpose.
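To give an impression of the classical nearest-neighbor tracking option mentioned in the notes, the following sketch links centroids of two consecutive frames greedily by distance. It is a simplified illustration, not the code used by the toolbox: the function name and the max_distance parameter are assumptions, and cell divisions (one parent, two daughters) are not handled.

```python
import math


def link_nearest_neighbors(prev_centroids, next_centroids, max_distance):
    """Greedily link centroids of consecutive frames by distance.

    Returns a dict mapping indices in `prev_centroids` to indices in
    `next_centroids`. Each centroid is used at most once and links longer
    than `max_distance` are rejected.
    """
    # All candidate pairs, sorted from the closest to the farthest.
    pairs = sorted(
        (math.dist(p, q), i, j)
        for i, p in enumerate(prev_centroids)
        for j, q in enumerate(next_centroids)
    )
    links, used_prev, used_next = {}, set(), set()
    for distance, i, j in pairs:
        if distance > max_distance:
            break  # all remaining pairs are even farther apart
        if i in used_prev or j in used_next:
            continue
        links[i] = j
        used_prev.add(i)
        used_next.add(j)
    return links
```

Centroids in the next frame that remain unlinked would start new tracks, which is also where a division-aware tracker has to decide whether a new object is a daughter of a nearby parent.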
Deprecated Import Settings of Previous Versions of LiveCellMiner
The first required step is to extract the cell trajectories of single mitotic cells from the raw images. There are three steps involved in this process: 1. detecting potential cell centroids in all frames separately [2], 2. optionally running a CNN-based segmentation using Cellpose [3] on all frames individually for improved segmentation results, 3. tracking the detected centroids and identifying cell trajectories that match the required length of the analysis time window (default: 30 frames before and 60 frames after the detected cell division).
For each selected cell in each frame, a small patch (default: 96x96 px) is extracted from the raw images and segmented on-the-fly using a classical method based on binary thresholding and a seeded watershed or using the precomputed segmentation obtained with Cellpose [3]. The segmentation and the raw image snippet are then used to extract classical features (area, centroid, major/minor axes, orientation, circularity, mean intensity, intensity std. dev., std. dev. of the gradient magnitude) and a set of Haralick features [4] obtained from the gray-level co-occurrence matrix with a discretization of the image to 64 intensity levels and by omitting all transitions from the background label to any other label (to focus only on texture features within each nucleus). Moreover, we extract CNN-based features for each image snippet using the features of the last fully-connected layer of a GoogLeNet [5] that was pretrained on the ImageNet database, yielding a 1000-dimensional feature vector for each cell in each frame.
The tracked cells and associated classical features are stored in a SciXMiner project file (*.prjz) that can be opened with the SciXMiner MATLAB toolbox. The raw image snippets are saved in MATLAB files for easier access (*_RawImagePatches.mat, *_RawImagePatches2.mat and *_MaskImagePatches.mat). The CNN features are also saved as a separate MATLAB file (*_MaskedImageCNNFeatures.mat).
The import of a single folder as well as a batch job that processes all subfolders in a provided directory can be started with LiveCellMiner -> Import -> Import New Experiment. The script lets you decide whether to process a single position, an entire experiment with multiple positions or multiple experiments with all contained positions.
Notes
- Make sure to have a consistent folder structure. The software assumes the following arrangement of your data: Microscope / Experiment / Position / ImageFiles. This information will be used to group the extracted data, e.g., to allow selecting only a subset of experiments, data that was acquired with a specific microscope or a particular position.
- Depending on the number of images per position, the processing can take a while and usually requires about 1-2 hours for one position depending on the hardware available.
- There are two tracking methods available in the import dialog. Option "0" uses a nearest-neighbor based linking based on centroid positions. Option "1" is based on the spatio-temporal overlap of the cell segmentations and requires that the temporal resolution is high enough to guarantee spatial overlaps of neighboring frames.
- The pre-segmentation based on the external tool Cellpose, the writing of separate image snippets and potential image file selection filters can be enabled/disabled in the settings dialog that opens upon calling the script callback_livecellminer_batch_processing.m.
- If the detection and tracking was performed with XPIWIT in the first place and the segmentation quality is limited, you can re-run the segmentation on the extracted image patches only, using a more complex algorithm. Available options are a watershed-based classical method and a Cellpose-based deep learning approach using pre-trained segmentation models. To trigger the re-segmentation, simply call LiveCellMiner -> Import -> Convert Image Snippets to HDF5 from the main menu and select which segmentation you want to pursue. Note that the changed segmentation also changes the features contained in the project. Calling this function therefore duplicates the project file (*.prjz) and saves a new one next to the old one called *_RecomputedSegmentation.prjz. The Cellpose MATLAB implementation requires a MATLAB version R2023b or higher, the Medical Imaging Toolbox and the Cellpose Add-On to be installed as described here.
Projects are at first created for each experiment and position individually. To be able to analyze the data in a single project, e.g., to compare different positions, microscopes or experiments, the individual projects need to be fused. The fusion can be initiated with the command LiveCellMiner -> Import -> Fuse Experiments. You have the option to select which projects you want to fuse. The fused project will be located in the root folder you selected and is indicated by the suffix *_FusedProjects.prjz.
Due to the different durations of pro-, prometa- and metaphase, the LiveCellMiner toolbox provides different ways of synchronizing the individual trajectories. All functions related to the data synchronization are summarized in the menu entry LiveCellMiner -> Align. The following enumeration describes the individual menu items and their functionality:
-
LiveCellMiner -> Align -> Perform Auto Sync: This function automatically tries to identify the interphase -> prophase transition (IP) as well as the metaphase/early anaphase -> late anaphase transition (MA) points. These two transition points are used to align the different cell trajectories such that interphasic and postmitotic frames are aligned properly. There are currently four options for automatic alignment:
- Classical: this approach uses the classical features area, circularity, mean intensity and intensity std. dev. to identify the interphase -> prophase transition by searching for two clusters that minimize the within-class variance using the temporally constrained combinatorial clustering (TC3) method [6]. The division time point that was identified during tracking is assumed to be correct and is only shifted by one additional time point if it still corresponds to early anaphase. This is accomplished by a heuristic that checks if the areas of both daughter cells are smaller than the area before the division. If so, the synchronization time point is increased by one frame. Otherwise, it is left unchanged, as presumably no early anaphase was visible or the segmentation could already identify two separate objects.
- Classical + Auto Rejection: Same as before, but uses an LSTM network that was trained on sequences of CNN features to predict if the current cell is a valid mitotic trajectory or an erroneous track. Erroneous tracks are directly rejected.
- LSTM + HMM + Auto Rejection: Uses an LSTM network that was trained on sequences of CNN features to predict the synchronization time points as well as identifying which of the cell tracks are valid/invalid. The prediction of the synchronization time points is post-processed with a Hidden Markov Model (HMM) that only allows valid state transitions (e.g., state sequences 1 -> 2 -> 3 for a valid track or 0 for an invalid track). Transition probabilities are manually specified and based on the predicted states of the LSTM, the Viterbi algorithm is used to identify the most likely hidden state sequence.
- Hybrid (Classical+Auto Rejection for MA; LSTM + HMM for IP): Combination of Classical + Auto Rejection and LSTM + HMM + Auto Rejection. The IP transition is determined via the LSTM + HMM approach, the MA transition based on the Classical approach. Moreover, trajectories deemed invalid are automatically rejected.
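The core idea behind the Classical option, finding a temporally contiguous two-cluster split with minimal within-class variance, can be sketched with an exhaustive search over all split points. This is only the two-cluster special case written for illustration; it is not the TC3 reference implementation from [6], and feature scaling, which would matter when combining area with intensity features, is left out.

```python
def within_segment_cost(segment):
    """Sum of squared deviations from the per-feature mean of a segment."""
    total = 0.0
    for dim in range(len(segment[0])):
        values = [row[dim] for row in segment]
        mean = sum(values) / len(values)
        total += sum((v - mean) ** 2 for v in values)
    return total


def best_two_segment_split(features):
    """Return the frame index t that best separates two temporal clusters.

    `features` is a list of per-frame feature vectors. Frames [0, t) form
    the first cluster and frames [t, n) the second one; t minimizes the
    summed within-segment squared deviation.
    """
    n = len(features)
    if n < 2:
        raise ValueError("need at least two frames to split")
    best_t, best_cost = None, float("inf")
    for t in range(1, n):
        cost = within_segment_cost(features[:t]) + within_segment_cost(features[t:])
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t
```

Because clusters are forced to be contiguous in time, the search space is just the set of possible split points, which keeps the procedure cheap even for long trajectories.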
-
LiveCellMiner -> Align -> Auto-Sync Overview: displays some information about the number of (presumably) correct synchronizations and the number of ambiguous/rejected tracks. Note that this feature only works if the project already contains OligoIDs (see LiveCellMiner -> Process -> Add OligoID Output Variable).
-
LiveCellMiner -> Align -> Synchronization GUI: This menu item opens a simple graphical user interface that can be used to manually identify the state transitions. It displays a set of 10 cells at a time, where two cells displayed one above the other are the daughters of the same division. The annotation procedure is as follows: (1) left-click on the first frame considered as prophase to mark the interphase -> prophase transition and (2) left-click on the first late anaphase frame to mark the meta-/early anaphase -> late anaphase transition. All intermediate frames are classified accordingly, and the annotations on one of the daughter cells are directly copied to the other daughter to obtain a consistent alignment. To redo an annotation, simply start again with the identification of the IP transition followed by the MA transition. If a complete trajectory should be rejected because no mitotic event is present or the tracking was erroneous, click on an arbitrary frame of that cell with the right mouse button (both daughters should be highlighted in red afterwards). Stages are colored in green, magenta, cyan and red for inter, (pro, prometa, meta, early ana), (late ana, telo, inter) and incorrect cells, respectively. Use the arrow keys to switch to the previous/next 10 cells. The keys 1, 2, 3 toggle the visualization between raw, mask and masked raw images. It's usually advisable to start with the automatic synchronization and then focus on correcting only the errors. The following keyboard commands are available for the annotation GUI:
- 1, 2, 3, 4, 5: Toggles the visualization of raw, mask, raw+mask overlay. Options 4, 5 are only available if a second channel was analyzed (2nd channel raw and overlay of 1st and 2nd channel in red and green, respectively).
- Left Arrow: Load previous montage
- Right Arrow: Load next montage
- Up Arrow: Mark current images as suitable training data and load next montage
- Down Arrow: Mark current images as unsuitable training data
- Left click: Set IP and MA transition points
- Right click: Reject the current cell track
- G: Add currently visible cells to the confirmed ground truth (used for the LSTM classifier)
- H: Show this help dialog with an overview of the available keyboard shortcuts.
- Hint: In case key presses show no effect, left click once on the title bar of the figure window and try hitting the button again. This only happens if the window loses the focus.
-
LiveCellMiner -> Align -> Alignment Classifier -> Randomly Select Cells: Randomly selects a desired number of cell pairs, e.g., a number of 100 would select 100 randomly drawn cell pairs and result in 200 selected cells in total. This feature can be useful for classifier training to obtain a random sample of the entirety of cells present in the project. The random seed is renewed each time you call the menu entry, i.e., each run should generate a new sequence of selected indices. If you want to reproduce a previously made selection, just remember the random seed and the number of cell pairs you want to select and parametrize the dialog accordingly. The selected cells are directly copied to the LiveCellMiner selection and are displayed when opening the manual correction GUI. To display the selected indices, you can type ind_auswahl in MATLAB's command line.
-
LiveCellMiner -> Align -> Alignment Classifier -> Select All Annotated Cells: Performs a selection of all cells that have been previously annotated. Use LiveCellMiner -> Align -> Synchronization GUI to visualize those cells in the graphical user interface.
-
LiveCellMiner -> Align -> Alignment Classifier -> Reset Manual Annotations: Clears all previously performed annotations to start with a fresh annotation session.
-
LiveCellMiner -> Align -> Alignment Classifier -> Find Inconsistent Synchronizations: This function selects all cells where the automatic synchronization was unsure or where the identified synchronization states are incomplete. The selection variable ind_auswahl of SciXMiner is set to these inconsistent cells and these cells can subsequently be displayed and corrected via the manual synchronization tool (see LiveCellMiner -> Align -> Synchronization GUI).
-
LiveCellMiner -> Align -> Alignment Classifier -> Save Training Data: Function to save the currently labeled training data to a dedicated file (*.cdd). All cells that were manually checked via the Synchronization GUI and verified by confirming via the Up Arrow are considered valid training data and saved to the database. If cells visualized in the Synchronization GUI window should not contribute to the training data set, simply hit the Down Arrow. By default, cells are not part of the training set, i.e., you have to explicitly mark them as being suitable as ground truth after checking / adjusting the manual synchronization. Whether a cell was manually checked is stored in the single feature manuallyConfirmed (value 1 if the cell was manually verified, value 0 otherwise). After calling Save Training Data, the first popup dialog asks for a file to save the data to. If you select an already existing file, you have the option to either extend or overwrite the dataset. If you want to keep previous annotations, make sure to extend it or choose a different file name to start a new training database. After the data is saved, the script will automatically ask if you'd like to train a new classifier with the annotated training data. This retraining can also be triggered manually by calling LiveCellMiner -> Align -> Alignment Classifier -> Train LSTM Classifier from the menu as described in the next point.
-
LiveCellMiner -> Align -> Alignment Classifier -> Train LSTM Classifier: Function to update or train new LSTM classifiers used for rejecting incorrect tracks and to perform the automatic synchronization. The script requires a training database containing manually synchronized cells as described in the previous points. The initial popup dialog asks for such a training database (*.cdd file) and upon providing a suitable one, the classifier and regression LSTMs are then retrained on the selected data. The software will always ask for a name of the classifier when (re-)training and when being applied to the image data. So make sure to store separate classifiers, e.g., for different magnifications or microscope settings. Hint: to only annotate a selected set of cells, use Edit -> Select -> Data points using classes ... to preselect the desired cells.
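The HMM post-processing used by the LSTM + HMM + Auto Rejection option of Perform Auto Sync can be illustrated with a small Viterbi sketch. It assumes three ordered states (1, 2, 3) with per-frame state probabilities as they might be produced by a classifier, a forced start in state 1, and transitions that either stay in a state or advance by one. The stay_prob parameter is a hypothetical stand-in for the manually specified transition probabilities, and the invalid-track state 0 is not modeled.

```python
import math


def viterbi_left_to_right(frame_probs, stay_prob=0.9):
    """Most likely state sequence under a left-to-right transition model.

    `frame_probs` holds one list of state probabilities per frame.
    Returns 1-based state labels, one per frame.
    """
    n_states = len(frame_probs[0])
    neg_inf = float("-inf")

    def log(p):
        return math.log(p) if p > 0 else neg_inf

    def log_transition(i, j):
        if j == i:
            # The last state has no successor, so it always stays.
            return log(stay_prob) if i < n_states - 1 else 0.0
        if j == i + 1:
            return log(1.0 - stay_prob)
        return neg_inf  # skipping or going back is not allowed

    # Every sequence has to start in the first state.
    score = [log(frame_probs[0][0])] + [neg_inf] * (n_states - 1)
    back_pointers = []
    for probs in frame_probs[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log_transition(i, j))
            new_score.append(score[best_i] + log_transition(best_i, j) + log(probs[j]))
            pointers.append(best_i)
        score = new_score
        back_pointers.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [s + 1 for s in path]
```

Compared to taking the most probable state in each frame independently, this decoding cannot jump back to an earlier stage, so isolated misclassifications within a stage are smoothed out.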
Data selection is an important step when analyzing a particular subset of the data. It can be accomplished using the standard SciXMiner data selection procedures; the most useful one here is the class-based selection via Edit -> Select -> Data points using classes ..., which opens the dialog depicted in the following screenshot:
Use these listboxes to select the subset of the data that you want to analyze in more detail. For instance, select experiments that were acquired with a particular microscope, a subset of OligoIDs, a specific experiment or individual plates. Any combination of these output variables can be used as well. It is also possible to manually add further output variables using the standard SciXMiner procedure based on the variables code_alle, zgf_y_bez and bez_code (see official documentation [1]). After selection, the selected cell indices are summarized in the selection variable ind_auswahl, and subsequent visualizations or manual corrections are only performed on the selected cells. Note that additional features are automatically computed for all cells to avoid missing feature values. The SciXMiner project overview shows the current selection as depicted in the following screenshot:
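Conceptually, the class-based selection keeps every cell whose class code matches one of the chosen values for each output variable. The following Python sketch illustrates this logic outside of SciXMiner. Only the names code_alle, bez_code and ind_auswahl are taken from the text above; the function select_by_classes, the table layout and the example values are assumptions made for illustration and are not part of the toolbox.

```python
def select_by_classes(code_alle, bez_code, wanted):
    """Return the 1-based indices of all cells that match the requested classes.

    code_alle : list of rows (one per cell) holding one integer class code
                per output variable (assumed layout)
    bez_code  : list of output variable names, in the column order of code_alle
    wanted    : dict mapping an output variable name to the set of accepted codes
    """
    columns = {name: i for i, name in enumerate(bez_code)}
    ind_auswahl = []
    # Indices start at 1 to mirror MATLAB-style indexing.
    for cell_index, row in enumerate(code_alle, start=1):
        if all(row[columns[name]] in codes for name, codes in wanted.items()):
            ind_auswahl.append(cell_index)
    return ind_auswahl


# Hypothetical project with six cells and three output variables.
bez_code = ["Microscope", "Experiment", "OligoID"]
code_alle = [
    [1, 1, 1],
    [1, 1, 2],
    [1, 2, 1],
    [2, 3, 1],
    [2, 3, 2],
    [1, 2, 2],
]

# Keep cells from microscope 1 that belong to experiment 1 or 2.
ind_auswahl = select_by_classes(
    code_alle, bez_code, {"Microscope": {1}, "Experiment": {1, 2}}
)
print(ind_auswahl)
```

An empty `wanted` dictionary selects all cells, which corresponds to leaving every listbox unrestricted.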
As SciXMiner is a general purpose data mining tool, the specifiers depicted in this overview might appear cryptic to some non-data scientists. The following bullet points explain the individual depicted values:
-
Time series are time resolved features extracted for each cell. For instance, measuring the mean intensity over all tracked frames would be a time series. The number 29 specifies the number of available time series and below, the number of selected time series is depicted. The number of available measurements is reflected by 90 sample points, i.e., for each time series feature, there are 90 measurements available.
-
Single Features are features of a tracked cell that are not time-resolved. For instance, these could be average values, feature values at a particular point in time or measures like the number of frames from the IP to the MA transition. In the example, there is only one single feature and it is also selected.
-
Output Variables are intended to group the cells. In this case, the groups All, Microscope, Experiment and Position exist. These could be used, e.g., to select cells that were acquired with a particular microscopy platform or only cells of a specific experiment.
-
Data Points correspond to the number of tracked cells in the case of LiveCellMiner. In this example, there are 1606 cells present, and out of these a subset of 67 cells is currently selected, e.g., to visualize just a subset of the data.
In addition to selecting data points (i.e., cells), it is possible to select which of the feature time series should be visualized. This can be performed in the dialog Time series: General options:
The listbox enumerates all available time series and allows you to select a time series for visualization, analysis or the computation of additional features. It is also possible to select multiple time series by pressing the CTRL key while selecting. Each time series is then visualized separately.
The dropdown menu Selection of output variable controls how the data are split for visualization. It uses the same classes as the class-based data point selection, e.g., to group data points by microscope, experiment, cell ID or OligoID. For instance, if All is selected, all cells are plotted in a single plot. If Experiment is selected, the cells of each experiment are plotted together, with one subplot per experiment. Also useful is, e.g., OligoID, which summarizes the results of the same treatment. These selections affect all available visualizations. The two edit fields Time series segment from can be used to constrain the time window, e.g., to focus on the first 40 time points for the manual synchronization. To visualize the entire time series again, hit the Complete time series button.
-
Summary Output Variable: Specifies which output variable is used for grouping. This can be useful to summarize repeats of the same experiment but not the oligos of all experiments. Use LiveCellMiner -> Process -> Add Repeats Output Variable to specify the related repeats.
-
Recovery Measure Mode: See documentation of the menu entry LiveCellMiner -> Process -> Compute Rel. Recovery Time Series.
-
Smoothing Method: The smoothing method to be used for smoothing time series features. See the valid options of the MATLAB function smooth.
-
Smoothing Window: The smoothing window to be used for smoothing time series features.
-
IP Transition: The position of the interphase -> prophase transition in the aligned visualizations. E.g., 30 means that all interphase frames are placed before frame 30.
-
MA Transition: The position of the metaphase -> anaphase transition in the aligned visualizations. E.g., 60 means that all frames after the metaphase -> anaphase transition are placed after frame 60.
-
Aligned Length: The total number of frames in the aligned plots. IP Transition and MA Transition are within this range. E.g., 0 -> IP Transition -> MA Transition -> end.
-
Sister Chromatin Dist. Threshold: Used to refine the synchronization time point based on the distance of the sister chromatin masses. The value is provided in microns. Note that it's important that projects were processed with the correct physical spacing set, as otherwise length and area measures are not comparable.
-
Frame Step Width: The interval between frame labels (x-axis of the plots). Only effective if Relative Frame Numbers? is enabled.
-
Error Bar Step: The frequency at which error bars are plotted. Used to avoid cluttered visualizations. Only effective if Show Error Bars? is enabled.
-
Rel. Regression Time Range: Range of frames relative to the synchronization time point that will be used for computing regression-based single features to be visualized as box plots. E.g., 1-5 would indicate to use frames 1-5 after the mitotic event for computing the slope of the regression line.
-
Temporal Resolution (min): The time interval in minutes between two frames. Used for properly converting number of frames into time in minutes.
-
Multiple Comparison: Selects the correction mode used for multiple comparison testing.
-
Color Map: Selects the colormap used for line plots and heatmaps. In addition to some basic MATLAB colormaps, it is also possible to define custom colormaps. Custom colormaps are located in the folder toolbox/luts/ and are simple MATLAB scripts that return a list of colors based on a requested number of colors. If you want to define your own colormap, copy one of the template files, change the base colors to your needs and change the function name to the new name of the colormap. After restarting SciXMiner, the new colormap should appear in the dropdown menu. If the number of requested colors (e.g., the number of lines in a plot) is lower than the number of base colors, the script directly returns a set of maximally distant base colors. Otherwise, linear interpolation between the base colors is performed.
-
Summarize Experiments?: If enabled, the oligos of different experiments will be summarized.
-
Align Plots?: If enabled, the synchronization time points will be used for visualization. Otherwise, the original unaligned time series will be used.
-
Time in Minutes?: If enabled, the line plots and heat map visualizations use time in minutes as the unit. If disabled, the frame numbers are visualized.
-
Relative Frame Numbers?: If enabled, frame numbers in all plots are displayed relative to the IP and MA transition points. The transition points are labeled with 0, and frames before/after the transitions carry negative/positive values that indicate the number of frames before/after the event. If disabled, absolute frame numbers are used.
-
Show Error Bars?: If enabled, error bars will be plotted in line plots.
-
Dark Mode?: If enabled, a dark color scheme is used for visualizations.
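The behavior described for the Color Map setting above (return base colors directly if few colors are requested, otherwise interpolate linearly) can be illustrated with a small Python sketch. This is not the code of the MATLAB scripts in toolbox/luts/. The function name sample_colormap is hypothetical, and picking evenly spread indices is only one plausible reading of "maximally distant base colors". The sketch assumes at least two base colors.

```python
def sample_colormap(base_colors, num_colors):
    """Return num_colors RGB tuples derived from a list of base colors."""
    n_base = len(base_colors)
    if num_colors <= 0:
        return []
    if num_colors == 1:
        return [base_colors[0]]

    # Positions spread evenly from the first to the last base color.
    positions = [i * (n_base - 1) / (num_colors - 1) for i in range(num_colors)]

    if num_colors < n_base:
        # Fewer colors requested than base colors available:
        # return base colors that lie as far apart as possible (assumption).
        return [base_colors[round(p)] for p in positions]

    # Otherwise interpolate linearly between neighboring base colors.
    colors = []
    for p in positions:
        lower = min(int(p), n_base - 2)
        t = p - lower
        colors.append(
            tuple(
                (1 - t) * a + t * b
                for a, b in zip(base_colors[lower], base_colors[lower + 1])
            )
        )
    return colors


# Hypothetical base colors: red, green, blue.
base = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
two_colors = sample_colormap(base, 2)    # picks the two outermost base colors
five_colors = sample_colormap(base, 5)   # interpolates between neighbors
print(two_colors)
print(five_colors)
```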
In addition to the standard visualization possibilities available in SciXMiner, all visualizations specific for the LiveCellMiner toolbox are summarized in the menu entry LiveCellMiner -> Show. The visualizations include:
-
Heatmaps: Show heatmaps of the selected time series. Separate figures are created for each of the time series, and subplots are created according to the selected output variable and the optional Summarize experiments? setting, which can be adjusted in the Time series: General options and LiveCellMiner dialogs, respectively.
-
Line Plots: Show mean +/- std. dev. line plots of the selected time series. Separate figures are created for each of the time series, and subplots are created according to the selected output variable and the optional Summarize experiments? setting, which can be adjusted in the Time series: General options and LiveCellMiner dialogs, respectively.
-
Comb. Line Plots: Show average time series +/- std. err. of the selected time series. Separate figures are created for each of the time series. Instead of subplots, all lines are summarized in a single plot for direct comparison. Averages and std. err. are summarized according to the selected output variable and the optional Summarize experiments? setting, which can be adjusted in the Time series: General options and LiveCellMiner dialogs, respectively. Error bars can be turned on/off with the Show Error Bars? checkbox, and their frequency can be controlled with the edit field Error Bar Step; both are located in the LiveCellMiner settings dialog.
-
Comb. Box Plots: Shows box plots of the selected single features. Note that single features are not available in the initial project and have to be computed separately (see section on Data Processing). Values are summarized according to the selected output variable and the optional Summarize experiments setting that can be adjusted in the Single features and LiveCellMiner dialogs, respectively.
-
Comb. Violin Plots: Similar to the box plot functionality, but violin plots are used instead to additionally show the distribution of the values.
-
Comb. Histogram Plots: Shows histogram plots of the selected single features. Note that single features are not available in the initial project and have to be computed separately (see section on Data Processing). Values are summarized according to the selected output variable and the optional Summarize experiments setting that can be adjusted in the Single features and LiveCellMiner dialogs, respectively.
The following figure showcases the five different visualization possibilities:
The selected data points used for visualization can be grouped in different ways. All selected cells with the same value of the output variable (see section on Data Selection) are plotted in a single figure. Moreover, the LiveCellMiner settings dialog contains the option Summarize experiments?, which summarizes data points of different experiments that share the same output variable. This can be useful, e.g., to visualize the response to a particular treatment across experiments. When disabled, all experiments are visualized individually. The three panels in the following figure were obtained for a selection of four experiments, where experiments 1-3 were repeats and experiment 4 is a separate experiment that also uses a different modality (confocal instead of a widefield microscope). The settings for combining the experiments (from left to right) are: (1) Summarize experiments=true, Summary Output Variable=Experiment, (2) Summarize experiments=false, Summary Output Variable=ExperimentsCombined, (3) Summarize experiments=false, Summary Output Variable=Experiment.
The depicted plots were generated with the setting Align plots? in the LiveCellMiner settings dialog enabled, i.e., the identified synchronization time points are used for temporal alignment (default). Disabling this flag displays the unaligned time series. The edit field Aligned Length in the LiveCellMiner settings dialog specifies the number of frames to use for the aligned heat maps and line plots (default value: 120). The values IP Transition and MA Transition indicate at which frame the respective transitions identified during the manual/auto synchronization should be placed (default values: 30 and 60, respectively). If you expect very long IP -> MA durations, increase the total number of aligned frames and increase the MA Transition value.
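To make the role of Aligned Length, IP Transition and MA Transition more tangible, the following Python sketch places a single time series on a common aligned axis. It is a minimal illustration under stated assumptions and is not taken from the toolbox source: the function name align_series, the 0-based indexing, the truncation of surplus frames and the use of None for empty slots are all assumptions.

```python
def align_series(values, ip_frame, ma_frame,
                 ip_pos=30, ma_pos=60, aligned_length=120):
    """Place one cell's time series on a common aligned axis (0-based indices).

    values   : list of feature values, one per tracked frame
    ip_frame : index of the first frame after the IP transition
    ma_frame : index of the first frame after the MA transition
    Slots without data remain None. Assumes 0 < ip_pos < ma_pos < aligned_length.
    """
    aligned = [None] * aligned_length

    # Interphase frames are right-aligned so that they end just before ip_pos.
    before = values[:ip_frame][-ip_pos:]
    aligned[ip_pos - len(before):ip_pos] = before

    # Frames between IP and MA start at ip_pos; surplus frames are cut off.
    between = values[ip_frame:ma_frame][:ma_pos - ip_pos]
    aligned[ip_pos:ip_pos + len(between)] = between

    # Frames from the MA transition onwards start at ma_pos.
    after = values[ma_frame:][:aligned_length - ma_pos]
    aligned[ma_pos:ma_pos + len(after)] = after
    return aligned


# Hypothetical cell with 90 frames, IP transition at frame 40, MA at frame 55.
values = list(range(90))
aligned = align_series(values, ip_frame=40, ma_frame=55)
print(aligned[28:32], aligned[58:62])
```

In this sketch, a cell with more than 30 frames between the IP and MA transition would lose the surplus frames, which is why longer IP -> MA durations call for a larger MA Transition value and a larger aligned length.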
The LiveCellMiner toolbox also includes a few additional algorithms for postprocessing the time series and to extract single features from the time series. The following functions are currently available:
-
Compute 2nd Channel Features: If a project has an additional second channel and if this information was extracted during the import of the project, additional features can be computed on this channel. The segmentation is initialized with the nucleus segmentation of the first channel and can be modified using the morphological operators dilation and erosion. The parameter dialog lets you specify the mode (extension, shrinking and toroidal), the structuring element to use (see strel definition of MATLAB) and the radius of the structuring element. Extension dilates the nucleus segmentation, Shrinking performs an erosion of the nucleus segmentation and Toroidal dilates the nucleus segmentation and then subtracts the original segmentation to obtain a toroidal region surrounding the initial segmentation. All morphological operations are based on the selected structuring element. The additional time series are added to the time series overview and can be visualized/analyzed in the same way as all other time series. It is also possible to extract multiple 2nd channel time series with different structuring elements. Currently, the features mean intensity, max intensity, intensity standard deviation and the mean intensity ratios between first and second channel are computed. If additional features should be added, you can implement them in the file callback_livecellminer_extract_second_channel_features.m.
-
Compute Additional Single Features: Combines multiple frames of selected features to single features that can be analyzed using the boxplot and histogram visualizations. Currently, the following single features are included: IPToMALength_Frames (number of frames between the interphase-prophase and the metaphase-anaphase transition), IPToMALength_Minutes (same as before but scaled with the time step in minutes), InterphaseMeanIntensity (average intensity of each cell measured in the interphase frames), AccumulatedOrientationDiffPMA (sum of orientation angle changes between the start of the prophase and the time point before the chromatin masses separate).
-
Compute Stage-Dependent Mean Features: Computes three new single features that reflect the arithmetic mean feature value of the phases before IP, between IP - MA and after MA.
-
Compute Linear Regression Slope: Uses a specified time window and the currently selected time series to compute a linear regression on the feature values of the selected frames for each of the cells. The slope of the regression curve is stored as a single feature for each of the cells. This can be used, e.g., to estimate the initial recovery rate of time series features right after the division. The time window of interest can be selected in the LiveCellMiner settings dialog and is named Rel. Regression Time Range and the values are relative with respect to the division event, i.e., selecting a range of 1 - 5 would select the five frames after the division.
-
Compute Interphase Recovery Feature: This function computes a new time series based on the currently selected time series. The average value of the interphasic frames is identified and different options to compute the recovery time series can be selected in the LiveCellMiner settings dialog (Recovery Measure Mode). The available options are:
- Current vs. Target Ratio: Ratio of the current value and the average value of the interphase. Larger than 1 if the current value is higher than the target value and less than 1 if the current value is smaller than the target value.
- Always Increasing Ratio: Similar to Current vs. Target Ratio but always <= 1, i.e., it is ensured that the larger value is always put in the denominator.
- Signed Percentage Deviation: the signed percentage deviation of the current value with respect to the interphase average value.
- Absolute Percentage Deviation: the absolute percentage deviation of the current value with respect to the interphase average value.
-
Compute Sister Distance: Creates a new time series feature that reflects the distance of two daughters to one another based on the Euclidean distance between their centroids. Note that this feature is by design 0 for all stages before the division happens.
-
Compute Single Feature Before IP: Creates a new single feature that reflects the arithmetic mean of a selected time series for a desired number of frames before the interphase-to-prophase (IP) transition.
-
Compute Single Feature After MA: Creates a new single feature that reflects the arithmetic mean of a selected time series for a desired number of frames after the metaphase-to-anaphase (MA) transition.
-
Compute Frame-based Grouping Feature: Computes a new single feature that groups individual cells according to the time point at which the IP or MA transition happens. The input dialog asks for the frame ranges to be used for the grouping. You can either specify custom ranges (e.g., [0, 100, 200, 400]) or just specify the number of desired bins, which splits the frame range between the minimum and maximum frame into even parts. The third parameter lets you choose if IP (=0) or MA (=1) should be used for the grouping. The new single feature is called FrameGrouping_NumBins=XX_MA=XX. To visualize individual groups, the single feature can be converted into an output variable (from the menu use Edit -> Convert -> Selected Single Features -> Output Variables). Now you can use the usual selection strategies to select and visualize subgroups.
-
Compute Time Series Ratio: Creates a new time series feature that reflects the ratio of the first two selected time series.
-
Add OligoID Output Variable: This function adds a new output variable containing the OligoIDs, i.e., each cell is associated with an ID that uniquely identifies the oligo/treatment that was applied to the cell. The information used for this function is read from a text file that has to be located in the same folder as the experiment raw data and should have the same name as the experiment folder with the extension .txt. The text file is expected to contain a three-column CSV table with the column names in the first line and a separate line for each of the positions of the experiment. For instance, the file content for 16 positions and four different treatments could look like this:
Position OligoID GeneSymbol
0001 Oligo1 Oligo1
0002 Oligo1 Oligo1
0003 Oligo1 Oligo1
0004 Oligo1 Oligo1
0005 OLI-2 OLI-2
0006 OLI-2 OLI-2
0007 OLI-2 OLI-2
0008 OLI-2 OLI-2
0009 olig_3 olig_3
0010 olig_3 olig_3
0011 olig_3 LSD1-6
0012 olig_3 olig_3
0013 TT2T TT2T
0014 TT2T TT2T
0015 TT2T TT2T
0016 TT2T TT2T
Always make sure that you use consistent OligoIDs. Any difference in spelling or case is treated as a different treatment and cannot be combined with the corresponding treatment later on.
-
Add Repeats Output Variable: Allows you to specify which experiments should be combined. For instance, if you have three repeats of one experiment, you can specify the same unique ID for all of these experiments, to optionally summarize the measurements. See options Summarize experiments? and Summary Output Variable in the LiveCellMiner settings dialog.
-
Select Data Points using Feature Range: can be used to select data points based on a value range of a single feature. This can, for instance, be useful to select only cells for comparison that exhibit comparable mean intensities during the interphase.
-
Select Single Daughter (Odd/Even/Random): Creates a subset of the current selection that contains only one daughter of each daughter pair. Selection strategies can be according to odd (1,3,5, ...), even (2,4,6, ...) or random (1,3,4,7, ...).
-
Perform Feature Normalization: Normalizes the selected time series by dividing each frame's value by the value after the cell division, such that the first late anaphase time point is assigned the value 1 and all other values are expressed relative to this time point.
-
Smooth Selected Time Series: Smooths the selected time series with the method selected in the LiveCellMiner settings dialog. See documentation of the MATLAB function smooth for details on the available methods. Moreover, the size of the smoothing window can be adjusted in the edit field Smoothing Window located in the LiveCellMiner settings dialog.
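The four options of the Recovery Measure Mode listed above under Compute Interphase Recovery Feature can be written down compactly. The following Python sketch is a minimal illustration that assumes strictly positive feature values; the function names and the mode identifiers are hypothetical and do not correspond to the MATLAB implementation.

```python
def recovery_measure(current, target, mode):
    """Compare a current value with the interphase average (target)."""
    if mode == "current_vs_target_ratio":
        return current / target
    if mode == "always_increasing_ratio":
        # The larger value always goes into the denominator, so the result is <= 1.
        return min(current, target) / max(current, target)
    if mode == "signed_percentage_deviation":
        return 100.0 * (current - target) / target
    if mode == "absolute_percentage_deviation":
        return abs(100.0 * (current - target) / target)
    raise ValueError("unknown mode: " + mode)


def recovery_series(values, interphase_frames, mode):
    """Compute the recovery time series of one cell.

    values            : feature values, one per frame
    interphase_frames : indices of the frames that belong to the interphase
    """
    target = sum(values[i] for i in interphase_frames) / len(interphase_frames)
    return [recovery_measure(v, target, mode) for v in values]


# Hypothetical intensity values; the first two frames are interphase frames.
values = [100, 100, 50, 80, 125]
ratio = recovery_series(values, [0, 1], "current_vs_target_ratio")
increasing = recovery_series(values, [0, 1], "always_increasing_ratio")
signed = recovery_series(values, [0, 1], "signed_percentage_deviation")
absolute = recovery_series(values, [0, 1], "absolute_percentage_deviation")
print(ratio, increasing, signed, absolute)
```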
LiveCellMiner can export CSV tables and image galleries and can automatically generate reports. This functionality is summarized in the menu entry LiveCellMiner -> Export. The possibilities include:
-
Export -> Auto-Generate Report: Creates an experiment report as a website that can be viewed in a browser. Feature images are precomputed for faster analysis.
-
Export -> Export Individual Gallery for Selected Cells: Allows for different selection options and writes separate images for each of the selected cells. Options include export as 3D stack with time being the third dimension or 2D gallery with frames stacked horizontally. Moreover, both mask and raw images can be exported.
-
Export -> Export Aligned Gallery for Selected Cells: Creates a high-resolution gallery of all selected cells using the current alignment information. Note that this image can get quite large if a large number of cells is selected!
-
Export -> Export Selected Cells as CSV: Exports features of the selected cells to CSV-based spreadsheet files that can be further processed with external tools like Excel.
-
Export -> Export Selected Cells as Raw Image, Mask and CSV: Exports separate images for each of the selected cells to be used for further processing.
LiveCellMiner provides several statistical tests that can be applied to selected single features and time series. Note that the software does not check whether a statistical test is appropriate for the selected set of features, i.e., familiarize yourself with the methodology and make sure to select the appropriate statistical test for your application.
The following functions are included:
- Test for Normal Distribution: Performs a test for each of the selected output classes if the feature follows a normal distribution (the respective test method to use can be specified in the SciXMiner dropdown menu Data mining: Statistical options).
-
Apply Two-Sample t-Test (parametric, SF): Performs a two-sample t-test (e.g., to compare a single feature of two different oligos) on a selected single feature (SF). This also produces result tables in *.csv format with a matrix of pairwise tests (p-values and a boolean test result indicating whether the hypothesis was rejected or not). The result tables are placed in the folder that contains the SciXMiner project.
- Apply Wilcoxon Rank Sum Test (non-parametric, SF): Same as the Two-Sample t-Test but using a Wilcoxon rank sum test instead.
- Apply ANOVA (parametric, SF) and Apply Kruskal-Wallis (non-parametric, SF): Possibility to compute an ANOVA or Kruskal-Wallis on the selected single feature (e.g. to compare multiple oligos including a visualization of the confidence intervals and the resulting significances). The multiple testing correction method to use can be specified in the LiveCellMiner settings dialog as well and is named Multiple comparison. See https://de.mathworks.com/help/stats/multcompare.html for further information.
- Compute Median Absolute Deviation (MAD, SF): Computes the global median value of the selected feature across all classes and the individual median values of each class. The average deviation of the individual median values to the global median (MAD) is computed and used as a threshold. Hits that exhibit a median value that deviates more than 3xMAD can be considered as strong hits.
- Apply Two-Way ANOVA: Possibility to perform two-way ANOVA with interaction, where the dependent variable is the selected time series feature and the independent variables are the currently selected classes (e.g., oligos, experiments, microscope, ...) vs. a set of selected time points (e.g., an interphase time point, the MA-transition time point and a later value a few minutes after MA-transition defined by the user). Note that the time point selection uses absolute frame numbers in this case. To get the absolute frame numbers displayed in the line plots and heat map visualizations, make sure to uncheck the Relative Frame Numbers? checkbox located in the general LiveCellMiner settings dialog. The multiple testing correction method to use can be specified in the LiveCellMiner settings dialog as well and is named Multiple comparison. See https://de.mathworks.com/help/stats/multcompare.html for further information.
Note that all above-mentioned statistical tests make use of the general SciXMiner selection, i.e., you can use the Edit -> Select -> Data Points using Classes ... menu entry to perform the desired selection of cells to be used for the statistical analysis. Moreover, make sure to specify the correct output variable (dropdown menu entitled Selection of output variable that you can find in the Single features and Time series: General options dialog).
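The hit criterion described above under Compute Median Absolute Deviation can be sketched in a few lines of Python. The sketch interprets the deviation measure as the median of the absolute deviations between the class medians and the global median, which is an assumption, since the description above speaks of an average deviation. The function name mad_hits and the example data are hypothetical.

```python
from statistics import median


def mad_hits(values_by_class, factor=3.0):
    """Flag classes whose median deviates strongly from the global median.

    values_by_class : dict mapping a class name (e.g., an OligoID) to the
                      single feature values of all cells in that class
    Returns the global median, the MAD and the sorted list of hit classes.
    """
    class_medians = {name: median(vals) for name, vals in values_by_class.items()}

    # Global median over the pooled values of all classes.
    pooled = [v for vals in values_by_class.values() for v in vals]
    global_median = median(pooled)

    # Absolute deviation of each class median from the global median.
    deviations = {name: abs(m - global_median) for name, m in class_medians.items()}
    mad = median(deviations.values())

    hits = sorted(name for name, d in deviations.items() if d > factor * mad)
    return global_median, mad, hits


# Hypothetical single feature values for five treatments.
example = {
    "ctrl": [9, 10, 11],
    "oligoA": [10, 11, 12],
    "oligoB": [8, 9, 10],
    "oligoC": [9, 10, 11],
    "oligoD": [19, 20, 21],
}
global_median, mad, hits = mad_hits(example)
print(global_median, mad, hits)
```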
-
Moreno-Andrés, D., Bhattacharyya, A., Scheufen, A., & Stegmaier, J. (2022). LiveCellMiner: A New Tool to Analyze Mitotic Progression. PLOS ONE, 17(7), e0270923.
-
Mikut, R., Bartschat, A., Doneit, W., Ordiano, J. Á. G., Schott, B., Stegmaier, J., ... & Reischl, M. (2017). The MATLAB Toolbox SciXMiner: User's Manual and Programmer's Guide. arXiv preprint arXiv:1704.03298.
-
Bartschat, A., Hübner, E., Reischl, M., Mikut, R., & Stegmaier, J. (2016). XPIWIT—an XML Pipeline Wrapper for the Insight Toolkit. Bioinformatics, 32(2), 315-317.
-
Stringer, C., Michaelos, M., & Pachitariu, M. (2020). Cellpose: A Generalist Algorithm for Cellular Segmentation. bioRxiv.
-
Haralick, R. M., Shanmugam, K., & Dinstein, I. H. (1973). Textural Features for Image Classification. IEEE Transactions on Systems, Man, and Cybernetics, (6), 610-621.
-
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going Deeper with Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1-9).
-
Zhong, Q., Busetto, A. G., Fededa, J. P., Buhmann, J. M., & Gerlich, D. W. (2012). Unsupervised Modeling of Cell Morphology Dynamics for Time-Lapse Microscopy. Nature Methods, 9(7), 711-713.
-
Bragantini, J., Theodoro, I., Zhao, X., Huijben, T. A., Hirata-Miyasaki, E., VijayKumar, S., ... & Royer, L. A. (2024). Ultrack: pushing the limits of cell tracking across biological scales. bioRxiv.