Welcome to the NAG2G (Node-Aligned Graph-to-Graph) repository! NAG2G is a state-of-the-art neural network model for retrosynthesis prediction.
- 🗓️ May 2025: 🌿 The `with_stereoisomerism` branch is now publicly available. 📝 This branch was completed in 2023 but wasn't released until now; it was opened in response to community interest.
  New in this branch:
  - 💊 Enhanced Stereochemistry Support
    - Direct prediction of stereochemical features (e.g., chirality) from model outputs
    - No post-processing required for stereochemical reconstruction
  - ⇄ Unified Bidirectional Synthesis
    - Single model supports both retrosynthesis and forward synthesis
- 🗓️ August 2024: 💻 Initial codebase released (`main` branch)
- 🗓️ February 2024: 🧪 Paper published in JACS Au
- 🗓️ September 2023: 📄 Preprint available on arXiv

To begin working with NAG2G, you'll need to set up your environment. Below is a step-by-step guide to get you started:
```bash
# Install Uni-Core
git clone https://github.com/dptech-corp/Uni-Core
cd Uni-Core
pip install .
cd -

# Install Unimol plus
cd unimol_plus
pip install .
cd -

# Install additional dependencies
pip install rdchiral transformers tokenizers omegaconf rdkit
```
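
After installation, it can help to confirm that the dependencies are visible to the Python interpreter you plan to use. The following is a minimal sketch, not part of the repository: the helper `missing_modules` is hypothetical, and the import names in `REQUIRED` are assumptions (in particular, `unicore` is assumed to be the module that Uni-Core installs; the import name for Unimol plus is not listed because it is not stated here).

```python
import importlib.util

# Assumed import names; pip package names and import names can differ.
# "unicore" is an assumption about what Uni-Core installs.
REQUIRED = ["unicore", "rdchiral", "transformers", "tokenizers", "omegaconf", "rdkit"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report which of the assumed modules are not importable in this environment.
missing = missing_modules(REQUIRED)
print("missing:", missing if missing else "none")
```

If any name is reported as missing, re-run the corresponding install step in the same environment before continuing.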
You can obtain the USPTO-50k dataset and the pretrained model weights for USPTO-50k from Google Drive:
To validate the NAG2G model with the provided weights, follow the instructions below:
When using a dataset that does not include reactants, you need to modify the `valid.sh` script. Specifically, add the `--no_reactant` option at line 95 of the script.

When using your own dataset, please modify the `data_path` in the `valid.sh` script.
```bash
# Execute the validation script with the specified checkpoint file
sh valid.sh path2weight/NAG2G_unimolplus_uspto_50k_20230513-222355/checkpoint_last.pt
```
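
Because a mistyped checkpoint path may only surface after the script has started, one option is to check the file first and then launch the same command from Python. This is a hedged sketch: `build_valid_command` is a hypothetical helper, not part of NAG2G, and it only assembles the `sh valid.sh <checkpoint>` invocation shown above.

```python
from pathlib import Path


def build_valid_command(checkpoint, script="valid.sh"):
    """Return the argv list equivalent to `sh valid.sh <checkpoint>`.

    Raises FileNotFoundError if the checkpoint file does not exist.
    """
    ckpt = Path(checkpoint)
    if not ckpt.is_file():
        raise FileNotFoundError(f"checkpoint not found: {ckpt}")
    return ["sh", script, str(ckpt)]


# Example (run from the directory that contains valid.sh):
#   import subprocess
#   subprocess.run(build_valid_command("path2weight/.../checkpoint_last.pt"), check=True)
```

Passing the command as a list avoids shell quoting problems if the checkpoint path contains spaces.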
If you need to regenerate the dataset, please refer to the code inside the `data_preprocess` directory.
```bash
cd data_preprocess
python lmdb_preprocess <input_csv> <output_lmdb>
```
Two sample CSV files are provided for reference:
- `sample.csv`: This sample includes given reactants.
- `sample_without_reactants.csv`: This sample does not include given reactants.
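
When preparing your own input, comparing its header and first rows against the two samples is a quick way to see which layout it follows. The sketch below uses only the standard library. `peek_csv` is a hypothetical helper, and it makes no assumption about the column names, which should be read from the sample files themselves.

```python
import csv
from itertools import islice


def peek_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) from a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, n_rows))
    return header, rows


# Example:
#   header, rows = peek_csv("sample.csv")
#   print(header)
#   for row in rows:
#       print(row)
```

Running it on both `sample.csv` and `sample_without_reactants.csv` shows how the two layouts differ before you pass a file to the preprocessing step.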
For any questions or issues, please open an issue on our GitHub repository.
Thank you for your interest in NAG2G!