A computational tool for modeling, generating, and analyzing exerkines (beneficial signaling molecules released in response to exercise).
Exersomes combines Go-based data retrieval with Python analysis to study exercise-induced signaling molecules. It provides high-throughput processing of gene and protein sequences and supports pathway analysis of exerkines, their ligands, and their receptors. The project aims to help explore and generate biologically relevant sequences that could aid in understanding gene expression during physical activity.
- Fast concurrent retrieval of gene/protein data from NCBI databases
- High-throughput processing and BLAST similarity searches
- Network analysis and pathway visualization
- Gene and protein sequence generation using transformer models
- Support for variable-length RNA and protein sequence analysis
- Fetches gene data and annotations from NCBI and Ensembl databases
- Performs high-throughput processing of gene sequences and utilizes BLAST for similarity searches
- Generates gene and protein sequences using a DGM transformer model
- Supports padding and attention masking for variable-length RNA and protein sequences
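The padding and attention masking mentioned above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: the amino-acid vocabulary, `PAD_ID`, `UNK_ID`, `encode`, and `pad_batch` are hypothetical names chosen for this example. Each sequence is padded to the length of the longest one in the batch, and the mask marks real tokens with 1 and padding with 0.

```python
from typing import List, Tuple

# Hypothetical vocabulary: ID 0 is reserved for padding, IDs 1-20 for the
# standard amino acids, and one extra ID for anything else.
PAD_ID = 0
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_IDS = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}
UNK_ID = len(AMINO_ACIDS) + 1


def encode(sequence: str) -> List[int]:
    """Convert a protein sequence to a list of integer token IDs."""
    return [TOKEN_IDS.get(residue, UNK_ID) for residue in sequence.upper()]


def pad_batch(sequences: List[str]) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every sequence to the batch maximum and build the mask."""
    encoded = [encode(s) for s in sequences]
    max_len = max((len(ids) for ids in encoded), default=0)
    input_ids, attention_mask = [], []
    for ids in encoded:
        n_pad = max_len - len(ids)
        input_ids.append(ids + [PAD_ID] * n_pad)
        # 1 = real token the model should attend to, 0 = padding to ignore.
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return input_ids, attention_mask


ids, mask = pad_batch(["MKT", "MKTAY"])
print(ids)
print(mask)
```

The same idea applies to RNA sequences with a nucleotide vocabulary in place of `AMINO_ACIDS`.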
- Go 1.16+
- NCBI Entrez Direct utilities
- Python 3.8+
- Python packages: pandas, matplotlib, seaborn, networkx
- Install Go dependencies:
cd exersomes
go mod tidy
- Install Python dependencies:
pip install -r requirements.txt
- Install NCBI Entrez Direct utilities:
sh -c "$(curl -fsSL https://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/install-edirect.sh)"
- Run the complete workflow:
make all
- Run individual steps:
make build    # Build Go code
make run      # Fetch data from NCBI
make analyze  # Run Python analysis
make test     # Run unit tests
- gene_references.tsv: Basic gene information
- protein_info.tsv: Protein details and properties
- protein_sequences.fasta: Protein sequences in FASTA format
- pathway_maps.tsv: Gene pathway associations
- functional_insights.tsv: Functional annotations from literature and GO
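As a rough sketch of how the FASTA output listed above might be consumed, the following standard-library parser maps each record ID to its sequence. The function names `parse_fasta` and `read_fasta` are hypothetical and not part of Exersomes, and the record IDs in the usage lines are made up. It assumes the record ID is the first whitespace-delimited word after `>`.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record ID (first word after '>') to its full sequence."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequences may wrap across several lines; join them.
            records[current_id] += line
    return records


def read_fasta(path: str) -> Dict[str, str]:
    """Read a FASTA file from disk, e.g. protein_sequences.fasta."""
    with open(path, encoding="utf-8") as handle:
        return parse_fasta(handle)


# Usage with in-memory lines (illustrative records only).
example = [">seq1 example protein", "MKT", "AY", "", ">seq2", "GG"]
print(parse_fasta(example))
```

For larger analyses, a dedicated library parser may be a better fit than this hand-rolled version.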
- Go Module: Fast concurrent retrieval of gene/protein data from NCBI
- Python Analysis: Processing and machine learning on the retrieved data
- Visualization: Network analysis and pathway visualization
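To illustrate the kind of network analysis the visualization component refers to, here is a minimal standard-library sketch that links genes sharing a pathway. It is not the project's code: the column names `gene` and `pathway`, the helper names, and the sample rows are all assumptions, since the actual layout of pathway_maps.tsv is not documented here.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set, Tuple


def pathway_index(rows: Iterable[Mapping[str, str]],
                  gene_col: str = "gene",
                  pathway_col: str = "pathway") -> Dict[str, Set[str]]:
    """Group genes by pathway. Column names are assumed, not confirmed."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for row in rows:
        index[row[pathway_col]].add(row[gene_col])
    return dict(index)


def shared_pathway_edges(
    index: Mapping[str, Set[str]]
) -> Dict[Tuple[str, str], Set[str]]:
    """Return gene pairs that co-occur in a pathway, with the shared pathways."""
    edges: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for pathway, genes in index.items():
        ordered = sorted(genes)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                edges[(first, second)].add(pathway)
    return dict(edges)


# Placeholder data standing in for pathway_maps.tsv (not real annotations).
SAMPLE_TSV = (
    "gene\tpathway\n"
    "GENE_A\tPATHWAY_1\n"
    "GENE_B\tPATHWAY_1\n"
    "GENE_A\tPATHWAY_2\n"
)
rows = csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t")
index = pathway_index(rows)
edges = shared_pathway_edges(index)
print(edges)
```

The resulting gene pairs could then be added as edges to a networkx graph for layout and plotting, since networkx is among the listed Python dependencies.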
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
If you use Exersomes in your research, please cite:
Gomez, DJ. et al. (2025). Exersomes: A computational tool for analyzing exercise-induced signaling molecules. [Software]. Available from https://github.com/gomezdj/Exersomes