Below are a few of the active software projects being undertaken by the Munsky group. For full information and most recent versions, please visit our GitHub page at: https://github.com/MunskyGroup
MATLAB Stochastic System Identification Toolkit (SSIT) for modeling single-cell fluorescence microscopy data
Authors: Huy Vo, Joshua Cook, Brian Munsky
The SSIT allows users to specify and solve the chemical master equation (CME) for discrete stochastic models, especially those used for the analysis of single-cell gene regulation.
To learn more about the Finite State Projection (FSP) theory that underlies the SSIT, please see the slides from our Nov. 3, 2022 BPPB Seminar.
The SSIT includes command line tools and a graphical user interface to:
- Build, save, and load models
- Generate synthetic data from models using Stochastic Simulations
- Solve models using the Finite State Projection algorithm
- Compute sensitivity of FSP solutions to parameter variations
- Load experimental smFISH data
- Compute/maximize the likelihood of data given a model
- Run the Metropolis-Hastings algorithm to estimate parameter uncertainties given single-cell data
- Compute the Fisher Information Matrix for CME models
- Search experiment design space to find optimally informative experiments
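To illustrate the Finite State Projection idea mentioned in the list above, here is a minimal Python sketch. It is not the SSIT implementation, which is written in MATLAB. It assumes a simple birth-death process (constant production, first-order decay), truncates the state space to a fixed number of states, and integrates the truncated CME with a hand-written RK4 stepper. The function name `fsp_birth_death` and all parameter values are hypothetical. Probability that flows out of the truncated state space is lost, so `1 - sum(p)` serves as the FSP error estimate.

```python
import math


def fsp_birth_death(k_birth, k_decay, n_states, t_final, dt=0.001):
    """Integrate the truncated CME of a birth-death process with RK4.

    States are n = 0 .. n_states-1, starting from n = 0 at t = 0.
    Births out of the last state leave the projection, so the missing
    probability 1 - sum(p) is the FSP truncation error.
    """
    def rhs(p):
        dp = [0.0] * n_states
        for n in range(n_states):
            outflow = (k_birth + k_decay * n) * p[n]
            inflow = 0.0
            if n > 0:
                inflow += k_birth * p[n - 1]            # birth from n-1
            if n + 1 < n_states:
                inflow += k_decay * (n + 1) * p[n + 1]  # decay from n+1
            dp[n] = inflow - outflow
        return dp

    p = [0.0] * n_states
    p[0] = 1.0
    for _ in range(int(round(t_final / dt))):
        k1 = rhs(p)
        k2 = rhs([x + 0.5 * dt * a for x, a in zip(p, k1)])
        k3 = rhs([x + 0.5 * dt * a for x, a in zip(p, k2)])
        k4 = rhs([x + dt * a for x, a in zip(p, k3)])
        p = [x + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p, 1.0 - sum(p)


# Hypothetical rates: production 10 per unit time, decay 1 per molecule.
p, fsp_error = fsp_birth_death(10.0, 1.0, n_states=40, t_final=5.0)
mean_count = sum(n * pn for n, pn in enumerate(p))
```

For this process the exact solution is a Poisson distribution with mean `(k_birth / k_decay) * (1 - exp(-k_decay * t))`, which gives a simple check on the sketch. A production solver would use a stiff ODE integrator or a matrix exponential and would expand the state space when the error exceeds a tolerance.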
Dependencies
For all basic functionalities:
- MATLAB R2021b or later.
- Symbolic Computing Toolbox.
- Global Optimization Toolbox.
- Parallel Computing Toolbox.
- Tensor Toolbox for MATLAB (TTB). Make sure to add the TTB to the MATLAB path before running the SSIT.
Installation
Clone this package to a local folder on your computer. Then add the path to that folder (with subfolders) into MATLAB’s search path. You can then call all functions from MATLAB.
Getting Started
The SSIT provides two basic interaction options: (1) command line tools and (2) a graphical user interface.
GUI Version
To get started with the GUI, compile and launch the toolkit with the following commands:
src2app;
A = SSIT_GUI;
You should then see the model loading and building page of the graphical interface, and you are off to the races…
Command Line Version
To get started with the command line tools, navigate to the directory “CommandLine” and open one of the tutorial scripts “example_XXX.m”. Or you can start creating and solving models as follows.
Example for generating an FSP model and fitting it to smFISH data for Dusp1 activation following glucocorticoid stimulation:
Define SSIT Model
Model = SSIT;
Model.species = {'x1';'x2'};
Model.initialCondition = [0; 0];
Model.propensityFunctions = {'kon * IGR * (2-x1)'; 'koff * x1'; 'kr * x1'; 'gr * x2'};
Model.stoichiometry = [1,-1,0,0; 0,0,1,-1];
Model.inputExpressions = {'IGR','a0 + a1 * exp(-r1 * t) * (1-exp(-r2 * t)) * (t>0)'};
Model.parameters = ({'koff',0.14; 'kon',0.14; 'kr',25; 'gr',0.01; 'a0',0.006; 'a1',0.4; 'r1',0.04; 'r2',0.1});
Model.initialTime = -120; % large negative time to simulate steady state at t=0
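To make the roles of the propensity functions, stoichiometry matrix, and input expression concrete, the model definition above can be mirrored in a short Python sketch. This is only an illustration and does not use the SSIT: the names `PARAMS`, `input_signal`, `propensities`, and `apply_reaction` are hypothetical, while the parameter values and expressions are copied from the MATLAB definition.

```python
import math

# Parameter values copied from Model.parameters.
PARAMS = {"koff": 0.14, "kon": 0.14, "kr": 25, "gr": 0.01,
          "a0": 0.006, "a1": 0.4, "r1": 0.04, "r2": 0.1}

# Rows are species (x1, x2); columns are the four reactions,
# in the same order as Model.propensityFunctions.
STOICHIOMETRY = [[1, -1, 0, 0],
                 [0, 0, 1, -1]]


def input_signal(t, p=PARAMS):
    """Time-varying input IGR(t); the (t>0) factor switches it on at t = 0."""
    if t <= 0:
        return p["a0"]
    return p["a0"] + p["a1"] * math.exp(-p["r1"] * t) * (1 - math.exp(-p["r2"] * t))


def propensities(x1, x2, t, p=PARAMS):
    """Reaction rates at state (x1, x2) and time t."""
    igr = input_signal(t, p)
    return [p["kon"] * igr * (2 - x1),  # gene activation
            p["koff"] * x1,             # gene deactivation
            p["kr"] * x1,               # RNA production
            p["gr"] * x2]               # RNA degradation


def apply_reaction(state, j):
    """Return the new state after reaction j fires once."""
    return tuple(s + row[j] for s, row in zip(state, STOICHIOMETRY))


rates_before_stimulus = propensities(0, 0, t=-10.0)
state_after_activation = apply_reaction((0, 0), 0)
```

Before stimulation only the basal activation term `kon * a0 * 2` is nonzero, and firing the first reaction moves the state from `(0, 0)` to `(1, 0)`.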
Load and Fit smFISH Data
Model = Model.loadData('../ExampleData/DUSP1_Dex_100nM_Rep1_Rep2.csv',{'x2','RNA_nuc'});
Model.tSpan = unique([Model.initialTime,Model.dataSet.times]);
fitOptions = optimset('Display','iter','MaxIter',100);
[pars,likelihood] = Model.maximizeLikelihood([],fitOptions);
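The quantity being maximized in the fitting step can be sketched in a few lines of Python. This is a simplified illustration, not the SSIT code: it assumes a single time point, and it uses a Poisson distribution as a stand-in for the FSP solution so that the example stays self-contained. The names `log_likelihood` and `poisson_pmf` and the example counts are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(observed_counts, pmf, floor=1e-12):
    """Log-likelihood of per-cell RNA counts given a model distribution.

    observed_counts: iterable of integer counts, one per cell.
    pmf: list where pmf[n] is the model probability of observing n RNA.
    floor: lower bound on probabilities, to avoid log(0) for counts
           that fall outside the computed distribution.
    """
    histogram = Counter(observed_counts)
    total = 0.0
    for n, n_cells in histogram.items():
        prob = pmf[n] if n < len(pmf) else 0.0
        total += n_cells * math.log(max(prob, floor))
    return total


def poisson_pmf(mean, n_max):
    """Poisson probabilities for n = 0 .. n_max (stand-in for an FSP solution)."""
    return [math.exp(-mean) * mean ** n / math.factorial(n)
            for n in range(n_max + 1)]


# Hypothetical counts from five cells, scored under two candidate models.
data = [0, 1, 1, 2, 3]
score_close = log_likelihood(data, poisson_pmf(1.4, 20))
score_far = log_likelihood(data, poisson_pmf(5.0, 20))
```

A parameter search then amounts to adjusting the model parameters until this score stops improving. With data at several time points, the per-time-point scores are summed.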
Update Model and Make Plots of Results
Model.parameters(:,2) = num2cell(pars);
Model.makeFitPlot
You should arrive at a fit of the model to the experimentally measured Dusp1 mRNA distributions looking something like this:
Fluorescence In Situ Hybridization (FISH) – automated image processing
Authors: Luis U. Aguilera, Linda Forero-Quintero, Eric Ron, Joshua Cook, Brian Munsky
Description
Repository to automatically process Fluorescence In Situ Hybridization (FISH) images. This repository uses PySMB to allow the user to transfer data between Network-attached storage (NAS) and a remote or local server. Then it uses Cellpose to detect and segment cells on microscope images. Big-FISH is used to quantify the number of spots per cell. Data is processed using Pandas data frames for single-cell and cell population statistics.
Code architecture
Code overview
Cell segmentation
* The code segments the nucleus and cytosol of each cell in the images. Segmentation is performed using Cellpose together with optimization routines that aim to maximize the number of cells detected in the image.
Spot detection
* Spot detection is achieved using Big-FISH. Customization is added in this code to detect spots in multiple color channels. Additionally, this repository contains algorithms to measure spots that are co-detected in different color channels.
Spot counting
* The code quantifies the number of spots per cell and allows the visualization of these numbers as a function of cell size.
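As a rough illustration of this step, the sketch below counts spots per cell from a labeled segmentation mask and records each cell's area in pixels. It does not use the repository's code or the Big-FISH API. The function name `spots_per_cell`, the mask layout (0 for background, positive integers for cells), and the toy data are all assumptions.

```python
def spots_per_cell(spots, cell_labels):
    """Count spots and pixels for every labeled cell.

    spots: list of (row, col) spot coordinates.
    cell_labels: 2D list of ints; 0 is background, k > 0 marks cell k.
    Returns {label: {"spots": int, "area": int}}, including cells
    that contain no spots.
    """
    summary = {}
    for row in cell_labels:
        for label in row:
            if label != 0:
                entry = summary.setdefault(label, {"spots": 0, "area": 0})
                entry["area"] += 1
    for r, c in spots:
        label = cell_labels[r][c]
        if label != 0:  # spots on background are ignored
            summary[label]["spots"] += 1
    return summary


# Toy mask with two cells and four detected spots (one on background).
mask = [[1, 1, 0, 2],
        [1, 1, 0, 2],
        [0, 0, 0, 2]]
detected = [(0, 0), (1, 1), (0, 3), (2, 1)]
result = spots_per_cell(detected, mask)
```

Plotting `spots` against `area` for each entry gives the spot count as a function of cell size.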
Spot intensity quantification
* The code quantifies the intensity of each spot using the disk-and-ring mask method developed by Morisaki and Stasevich, Methods Mol Biol. 2022.
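One common reading of a disk-and-ring measurement is the mean intensity inside a small disk centered on the spot minus the mean intensity in a surrounding ring, which serves as the local background. The sketch below implements that reading in plain Python. Whether the repository uses exactly these radii or this background estimate is an assumption, and the function name `disk_ring_intensity` is hypothetical.

```python
def disk_ring_intensity(image, center, disk_radius=1, ring_inner=2, ring_outer=3):
    """Background-corrected spot intensity.

    image: 2D list of pixel values.
    center: (row, col) of the spot.
    Returns mean(disk) - mean(ring), where the disk holds pixels within
    disk_radius of the center and the ring holds pixels whose distance
    lies between ring_inner and ring_outer (inclusive).
    """
    cr, cc = center
    disk, ring = [], []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            d2 = (r - cr) ** 2 + (c - cc) ** 2
            if d2 <= disk_radius ** 2:
                disk.append(value)
            elif ring_inner ** 2 <= d2 <= ring_outer ** 2:
                ring.append(value)
    if not disk or not ring:
        raise ValueError("disk or ring mask is empty; check radii and center")
    return sum(disk) / len(disk) - sum(ring) / len(ring)


# Toy 7x7 image: uniform background of 10 with a bright spot of 110.
image = [[10] * 7 for _ in range(7)]
for r, c in [(3, 3), (2, 3), (4, 3), (3, 2), (3, 4)]:
    image[r][c] = 110
spot_intensity = disk_ring_intensity(image, center=(3, 3))
```

Leaving a gap between the disk and the ring keeps light from the spot itself out of the background estimate.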
Data management
* A complete data-frame for all processed images and cells is generated. This data-frame contains information about the location and intensity of each detected spot.
Data reproducibility report
* To increase reproducibility a metadata report is generated. This report contains information about the list of images processed, the specific parameters used to process the data, the user that processed the data, and the version of the modules and packages used.
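A metadata report of this kind can be sketched with the Python standard library alone. The example below is an assumption about what such a report might contain, based on the description above. It is not the repository's report format, and the function names, field names, and example inputs are hypothetical.

```python
import getpass
import platform
from datetime import datetime, timezone
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or a placeholder if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def build_metadata_report(image_files, parameters, packages):
    """Collect who processed which images, with what settings and versions."""
    try:
        user = getpass.getuser()
    except Exception:  # no resolvable user, e.g. in some containers
        user = "unknown"
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "python": platform.python_version(),
        "images": list(image_files),
        "parameters": dict(parameters),
        "package_versions": {name: package_version(name) for name in packages},
    }


def format_report(report):
    """Render the report as plain text, one item per line."""
    lines = [f"Created (UTC): {report['created_utc']}",
             f"User: {report['user']}",
             f"Python: {report['python']}",
             "Images:"]
    lines += [f"  {name}" for name in report["images"]]
    lines.append("Parameters:")
    lines += [f"  {key}: {value}" for key, value in report["parameters"].items()]
    lines.append("Package versions:")
    lines += [f"  {key}: {value}" for key, value in report["package_versions"].items()]
    return "\n".join(lines)


# Hypothetical inputs for illustration only.
report = build_metadata_report(["img_001.tif"], {"threshold": 400}, ["cellpose"])
text = format_report(report)
```

Writing `text` to a file next to the processed data keeps the settings and the results together.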
Data visualization and publication quality images.
* Plotting a complete field of view
* Plotting the detected spots and transcription sites in a selected cell.
* Plotting all color channels for a selected cell.
* Plotting all z-slices for a selected cell.
Installation
Installation on a local computer
To install this repository and all its dependencies, we recommend installing Anaconda.
- Clone the repository.
git clone --depth 1 https://github.com/MunskyGroup/FISH_Processing.git
- To create a virtual environment, navigate to the location of the requirements file, and use:
conda create -n FISH_processing python=3.8 -y
source activate FISH_processing
- (Optional) To install PyTorch for GPU usage in Cellpose on Linux or Windows, check the PyTorch installation instructions for the version specific to your computer. For example:
conda install pytorch cudatoolkit=10.2 -c pytorch -y
- (Optional) To install PyTorch for CPU usage in Cellpose on Mac, check the PyTorch installation instructions for the version specific to your computer. For example:
conda install pytorch -c pytorch
- To include the rest of the requirements use:
pip install -r requirements.txt
Installation on the Keck-Cluster (Rocky Linux 8)
The following instructions are intended for using the code on the Keck Cluster.
- Clone the repository to the cluster.
git clone --depth 1 https://github.com/MunskyGroup/FISH_Processing.git
- Move to the directory
cd FISH_Processing
- Create an environment from this YAML file.
conda env create -f FISH_env.yml
Using this repository
Most of the code is accessible as notebooks or executable scripts.
To use the code locally in an interactive environment, use the notebooks folder.
To process images, use the notebook FISH pipeline.
After processing the images, use the notebook FISH pipeline to analyze multiple datasets.
Executable code is located in the cluster folder.
- A Bash script is used to execute a python script containing the image processing pipeline. Please adapt these scripts to your specific configuration and target folders.
Miscellaneous instructions:
To log in to the NAS, provide a configuration YAML file with the following format:
user:
    username: user_name
    password: user_password
    remote_address: remote_ip_address
    domain: remote_domain
To create an environment file (YAML), use:
conda env export > FISH_env.yml
Additional steps to deactivate or remove the environment from the computer:
- To deactivate the environment, use
conda deactivate
- To remove the environment use:
conda env remove -n FISH_processing
To create the documentation use the following modules.
pip install sphinx
pip install sphinx_rtd_theme
pip install Pygments
Licenses for dependencies
Please check this file with the licenses for BIG-FISH, Cellpose, and PySMB.
Citation
If you use this repository, make sure to cite BIG-FISH and Cellpose:
rSNAPsim – RNA Sequence to NAscent Protein Simulation
Project Goal
Provide a Python module that takes nucleotide sequence as an input and does the following:
- Chooses a file or pulls a file from GenBank
- Analyzes the sequence and identifies proteins
- Detects or adds fluorescent tags
- Simulates translation trajectories and converts them to intensity vectors in arbitrary units (A.U.) under various conditions
- Constructs with rare codons only or common codons, FRAP or harringtonine assays
- Provides analyses of the trajectories
- Allows the user to save or export the data
- Commandline / GUI implementations
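The conversion from translation trajectories to intensity vectors can be illustrated with a simplified sketch. It assumes that each ribosome contributes one unit of intensity for every tag epitope it has already translated, so that a spot's intensity is the sum over all ribosomes on the mRNA. This does not call the rSNAPsim API: the function names, the position convention (codon indices), and the toy trajectory are hypothetical.

```python
def intensity_from_ribosomes(ribosome_positions, epitope_positions):
    """Intensity (A.U.) of one mRNA at one time point.

    ribosome_positions: codon index of each ribosome on the mRNA.
    epitope_positions: codon indices at which tag epitopes are encoded.
    Each ribosome contributes one unit per epitope at or before its position.
    """
    return sum(
        sum(1 for epitope in epitope_positions if epitope <= position)
        for position in ribosome_positions
    )


def intensity_trajectory(position_trajectory, epitope_positions):
    """Convert a time series of ribosome positions to an intensity vector."""
    return [intensity_from_ribosomes(frame, epitope_positions)
            for frame in position_trajectory]


# Toy example: three epitopes and four time frames of ribosome positions.
epitopes = [2, 5, 8]
frames = [[1], [3], [6, 2], [9, 6, 3]]
intensities = intensity_trajectory(frames, epitopes)
```

In this toy example the intensity rises as ribosomes move past more epitopes and as additional ribosomes load onto the mRNA.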
Documentation
Tutorials, module documentation, installation, and more [LINK TO MUNSKY GROUP WEBSITE]
Dependencies:
Installation
Within a conda environment:
conda install eigen
pip install rsnapsim-ssa-cpp
pip install rsnapsim
Within a Google Colab:
!apt install libeigen3-dev
!ln -sf /usr/include/eigen3/Eigen /usr/include/Eigen
!pip install rsnapsim-ssa-cpp
!pip install rsnapsim
!pip install --upgrade rsnapsim
Compilation of the C++
The C++ code should compile automatically when you pip install the ssa-cpp module. If it does not, here are some common errors:
cannot include eigen3/Eigen/Dense
- This means Eigen was not installed correctly by the conda installation. You may have to download Eigen manually and pass its path to the setup.py command:
python setup.py build_ext --inplace -I[PATH TO EIGEN FOLDER]
gcc not found
Example Colab Notebooks
- Simulating Translation
- Simulating Constructs with Different codon usages
- Intensity Analyses
- Harringtonine / FRAP simulations
- Model Maker/ Designer
- MW/Diffusion Calculations
Future work
- Example notebooks of all functions
rSNAPed: RNA Sequence to NAscent Protein Experiment Designer
Authors: Luis U. Aguilera, William Raymond, Tatsuya Morisaki, Brooke Silagy, Timothy J. Stasevich, and Brian Munsky.
Description
rSNAPed is a library to simulate single-molecule gene expression experiments to test machine learning and computational pipelines. The code generates simulated intensity translation spots using rSNAPsim. Cell segmentation is performed using Cellpose. Spot detection and tracking is achieved using Trackpy. If you use rSNAPed, please make sure you properly cite Cellpose, Trackpy, and rSNAPsim.
Summary of uses
- Simulating the single-molecule translation for any gene.
- Design of single-molecule gene expression experiments.
- Tracking for single-molecule translation (RNA + nascent protein) spots.
- Tracking for single-molecule RNA spots.
Ethical Considerations and Content Policy
You must accept our Content Policy when using this library:
- All simulated images generated with this software are intended to be used to test machine learning or computational algorithms.
- All images generated with this software should always be labeled with the specific terms “simulated data” or “simulated images”.
- All datasets resulting from a simulated image should explicitly be reported with the term “simulated data”.
- Under no circumstances should a simulated image or dataset generated with rSNAPed be used to misrepresent real data.
- For public or private use, you must disclose that the generated images are simulated data and give proper credit to rSNAPed.
Test the codes in Google Colab
- How to simulate your cell
- Harringtonine experiment
- Manual particle tracking
- Automated cell segmentation and particle tracking
- Multiplexing experiments
Simulating single-molecule translation
The code generates videos of the simulated cell and a data frame containing the position and intensity of each spot. This simulation can be used to train new algorithms.
Local installation using PIP
- Create a virtual environment using:
conda create -n rsnaped_env python=3.8.5 -y
source activate rsnaped_env
- Open the terminal and use pip for the installation:
pip install rsnaped
Local installation from the GitHub repository
- To create a virtual environment, navigate to the location of the requirements file and use:
conda create -n rsnaped_env python=3.8.5 -y
source activate rsnaped_env
- (Optional) To install PyTorch with GPU support for Cellpose on Linux or Windows, check the PyTorch installation instructions for the version specific to your computer. For example:
conda install pytorch cudatoolkit=10.2 -c pytorch -y
- (Optional) To install the CPU version of PyTorch for Cellpose on Mac, check the PyTorch installation instructions for the version specific to your computer. For example:
conda install pytorch -c pytorch
- To include the rest of the requirements use:
pip install -r requirements.txt
Additional steps to deactivate or remove the environment from the computer:
- To deactivate the environment use
conda deactivate
- To remove the environment use:
conda env remove -n rsnaped_env
References for main dependencies
rSNAPsim: Aguilera, Luis U., et al. “Computational design and interpretation of single-RNA translation experiments.” PLoS computational biology 15.10 (2019): e1007425.
Trackpy: Dan Allan, et al. (2019, October 16). soft-matter/trackpy: Trackpy v0.4.2 (Version v0.4.2). Zenodo. http://doi.org/10.5281/zenodo.3492186
Cellpose: Stringer, Carsen, et al. “Cellpose: a generalist algorithm for cellular segmentation.” Nature Methods 18.1 (2021): 100-106.
Licenses for dependencies
For a complete list containing the complete licenses for the dependencies, check file: Licenses_Dependencies.md.
- License for rSNAPsim: MIT. Copyright © 2018 Dr. Luis Aguilera, William Raymond
- License for Trackpy: BSD-3-Clause. Copyright © 2013-2014 trackpy contributors https://github.com/soft-matter/trackpy. All rights reserved.
- License for Cellpose: BSD 3-Clause. Copyright © 2020 Howard Hughes Medical Institute
Cite as
Luis Aguilera, William Raymond, Tatsuya Morisaki, Brooke Silagy, Timothy J. Stasevich, & Brian Munsky. (2022). rSNAPed. RNA Sequence to NAscent Protein Experiment Designer. (v0.1-beta.2). Zenodo. https://doi.org/10.5281/zenodo.6967555
Below is the GitHub repository holding all the links to the Colab notebooks and files needed during the course.
Authors
Brian Munsky, Luis Aguilera, William Raymond, Joshua Cook, Michael May, Zachary Fox, Eric Ron, Keisha Cook, Kaan Ocal, Ania Baetica, and Ana Carolina Padua.
uqbio.summer.school@gmail.com • 2023 Undergraduate Summer School Schedule • UQ-Bio • Munsky Group
Modules
Module 0 (Online) : Getting Started with Basic Scientific Computing in Python.
Module 1 : Optical Microscopy Experiments and Image Processing .
Module 2 : Multivariable Statistics and Machine Learning for Biological Data.
Module 3 : Stochastic Simulations of Biological Processes.
Module 4 : Master Equation Analyses of Biological Processes.
UQ-Bio23 Drug Discovery Challenge
Drug Discovery Challenge Presentation