This repository contains tools and scripts designed for analyzing and modeling biological data, with specific components for graph-based models and neural networks. It includes shell scripts, Python and R files, and Jupyter notebooks that facilitate the setup, analysis, and visualization of different data-driven models.
- Shell Scripts (`makeGraph.sh`, `makeMetaCells.sh`, `runGraphModel.sh`, `runNeuralNetwork.sh`, etc.)
  - Contains bash scripts to automate various stages of data processing and model execution.
- Python Scripts (in `src` directory)
  - `dataAnalysisPipeline.ipynb`: Jupyter notebook for the primary data analysis pipeline.
  - `graphModelFunctions.py`, `graphModels.py`: Modules that define functions and implementations for graph-based modeling.
  - `neuralNetworkFunctions.py`, `neuralNetworks.py`: Modules for creating and training neural network models.
  - `makeTFAPlots.py`, `plotLosses.py`: Scripts for plotting and visualizing model performance metrics.
- R Scripts (in `src` and `src/archive` directories)
  - `getVariableGenes.r`, `makeH5adFiles.r`, `mapToEnsembl.r`: R scripts that assist in data preparation and transformation for analysis.
  - `makeMetaCells.r`, `makeGraph.py`: Scripts focused on preparing graph data and metacells for further analysis.
- Notebooks and Tutorials (in `src` and `src/archive`)
  - `ATACseq.ipynb`, `DataAnalysisPipeline.ipynb`, and others in the `archive` directory offer preliminary and additional analysis workflows.
  - Tutorials include `CellOracle GRN models.ipynb`, which demonstrates how to build gene regulatory network models using CellOracle.
- Dependencies: Ensure all required Python and R libraries are installed. Refer to `pythonRequirements.txt` and `requirements.R` if available, or manually install based on script imports.
- Data Preparation: Use the scripts in `src` to preprocess and prepare data. For example, `makeH5adFiles.r` prepares files in `.h5ad` format.
- Run Models: Execute the models using provided shell scripts:
  - `runGraphModel.sh` to initiate graph-based models.
  - `runNeuralNetwork.sh` or other neural network scripts for deep learning models.
- Visualization: Generate plots and performance metrics using scripts like `makeTFAPlots.py` and `plotLosses.py`.
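
If `pythonRequirements.txt` is not available, the Dependencies step falls back to reading the script imports. The sketch below is one possible way to collect them; it is not part of this repository. The function names `imported_modules` and `third_party_imports` are hypothetical, and the code assumes Python 3.10 or newer (for `sys.stdlib_module_names`) and that the Python scripts sit directly in the directory being scanned.

```python
import ast
import sys
from pathlib import Path


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b as c" contributes "a"
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # "from a.b import c" contributes "a"; relative imports are skipped
            modules.add(node.module.split(".")[0])
    return modules


def third_party_imports(directory: str) -> list[str]:
    """List imports found in the .py files of a directory,
    excluding the standard library and the directory's own modules."""
    found = set()
    local = set()
    for path in Path(directory).glob("*.py"):
        local.add(path.stem)  # e.g. graphModels.py imports graphModelFunctions
        found |= imported_modules(path.read_text(encoding="utf-8"))
    # sys.stdlib_module_names requires Python 3.10+
    return sorted(found - local - set(sys.stdlib_module_names))
```

Run from the repository root, `print("\n".join(third_party_imports("src")))` would list the candidate packages. Import names do not always match the package names used for installation (for example, `sklearn` is installed as `scikit-learn`), so the output is a starting point rather than a ready-made requirements file. Notebooks (`.ipynb`) and R scripts are not covered by this sketch.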