
Official code for 'A User-Tunable Machine Learning Framework for Step-Wise Synthesis Planning'


MSRG/mhnpath


A User-Tunable Machine Learning Framework for Step-Wise Synthesis Planning


This repository contains the official implementation of A User-Tunable Machine Learning Framework for Step-Wise Synthesis Planning, available on arXiv.


We introduce MHNpath, a machine learning-driven retrosynthetic tool designed for computer-aided synthesis planning. Leveraging modern Hopfield networks and novel comparative metrics, MHNpath efficiently prioritizes reaction templates, improving the scalability and accuracy of retrosynthetic predictions. The tool incorporates a tunable scoring system that allows users to prioritize pathways based on cost, reaction temperature, and toxicity, thereby facilitating the design of greener and cost-effective reaction routes. We demonstrate its effectiveness through case studies involving complex molecules from ChemByDesign, showcasing its ability to predict novel synthetic and enzymatic pathways. Furthermore, we benchmark MHNpath against existing frameworks, replicating experimentally validated "gold-standard" pathways from PaRoutes. Our case studies reveal that the tool can generate shorter, cheaper, moderate-temperature routes employing green solvents, as exemplified by compounds such as dronabinol, arformoterol, and lupinine.


Getting Started

  1. Setup Environment

    Make the project directory your current working directory (this is important), then create the conda environment and install the dependencies:

    conda create -n "mhnpath" python=3.8
    conda activate mhnpath
    pip install -r requirements.txt

    To use the pricing feature, obtain API keys from one or more of Mcule, Molport, and Chemspace, and add them to the config.yaml file. We highly recommend doing this for the most accurate results.

  2. Download Data and Models

    Go to Figshare Dataset, click on Download All, and save the zip file with the default name 28673540.zip.
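
Before running the extraction step, it can help to confirm that the download completed. The following is a minimal sketch using only the Python standard library; the helper name `check_archive` is hypothetical (it is not part of this repository), and it assumes the zip file was saved in the current working directory under its default name.

```python
import zipfile
from pathlib import Path


def check_archive(path="28673540.zip"):
    """Return the member names of the downloaded archive, or raise if it is unusable.

    Hypothetical helper, not part of the MHNpath code base.
    """
    archive = Path(path)
    if not archive.is_file():
        raise FileNotFoundError(f"{archive} not found; download it from Figshare first")
    if not zipfile.is_zipfile(archive):
        raise ValueError(f"{archive} is not a valid zip file; the download may be incomplete")
    with zipfile.ZipFile(archive) as zf:
        # testzip() returns the name of the first corrupt member, or None if all pass
        bad_member = zf.testzip()
        if bad_member is not None:
            raise ValueError(f"corrupt member in archive: {bad_member}")
        return zf.namelist()


# Example use (assumes the archive sits in the current directory):
#     names = check_archive()
#     print(len(names), "files in archive")
```

If the check raises, downloading the archive again is usually the simplest fix.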

  3. Extract and Organize Files

    Run the following command to unzip and move the data/models to the required locations:

    python extract.py

  4. Inference

    To perform inference, run the following command, replacing "compound" with the SMILES string of your target product:

    python tree_search_global_greedy.py -product "compound" -n_enz 5 -n_syn 5 -max_depth 5 -json_pathway "tree.json" -device "cuda"

    Parameters:

    • product : SMILES string of the target product. (Required)
    • n_enz : Number of enzyme reaction rules to consider. (Optional, default: 3)
    • n_syn : Number of synthetic reaction rules to consider. (Optional, default: 3)
    • max_depth : Maximum depth for the tree search. (Optional, default: 3)
    • json_pathway : Filename for saving the resulting pathway tree in JSON format. (Optional, default: "tree.json")
    • device : Device to run the model on; either "cpu" or "cuda". (Optional, default: "cpu")
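
When inference is launched from another Python script, the command line can be assembled from the parameters listed above. This is a minimal sketch: the function names `build_inference_command` and `run_inference` are hypothetical and not part of the repository, the flags and defaults are taken from the list above, and the script is assumed to be run from the project directory.

```python
import subprocess
import sys


def build_inference_command(product, n_enz=3, n_syn=3, max_depth=3,
                            json_pathway="tree.json", device="cpu"):
    """Assemble the documented command line as an argument list.

    Hypothetical helper; defaults mirror those documented in the README.
    """
    if device not in ("cpu", "cuda"):
        raise ValueError("device must be 'cpu' or 'cuda'")
    return [
        sys.executable, "tree_search_global_greedy.py",
        "-product", product,            # SMILES string of the target (required)
        "-n_enz", str(n_enz),           # number of enzyme reaction rules
        "-n_syn", str(n_syn),           # number of synthetic reaction rules
        "-max_depth", str(max_depth),   # maximum tree-search depth
        "-json_pathway", json_pathway,  # output file for the pathway tree
        "-device", device,
    ]


def run_inference(product, **options):
    """Run the search in a subprocess and raise if it exits with an error."""
    subprocess.run(build_inference_command(product, **options), check=True)


# Example use, with ethanol ("CCO") as an illustrative SMILES string:
#     run_inference("CCO", n_enz=5, n_syn=5, max_depth=5, device="cuda")
```

Passing the arguments as a list avoids shell quoting problems with SMILES strings, which often contain parentheses and other characters that a shell would interpret.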

  5. Training

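    To train using the same hyperparameters as in our experiments, run the following commands: one for the enzymatic data (enz_final) and one for each of the five synthetic data splits (syn1_final to syn5_final). Each command redirects its output to a text file named after the experiment: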

    python mhnreact/train.py --concat_rand_template_thresh 3 --exp_name enz_final --ssretroeval True --csv_path data/enz_mhn_shuffled.csv --save_model True --seed 0 --epoch 11 --dropout 0.01 --lr 1e-4 --hopf_beta 0.035 --hopf_association_activation 'Tanh' --norm_input False --temp_encoder_layers 2 --batch_size 32 > enz_final.txt
    python mhnreact/train.py --concat_rand_template_thresh 3 --exp_name syn1_final --ssretroeval True --csv_path data/syn_mhn_split_1.csv --save_model True --seed 0 --epoch 11 --dropout 0.01 --lr 1e-4 --hopf_beta 0.035 --hopf_association_activation 'Tanh' --norm_input False --temp_encoder_layers 2 --batch_size 32 > syn1_final.txt
    python mhnreact/train.py --concat_rand_template_thresh 3 --exp_name syn2_final --ssretroeval True --csv_path data/syn_mhn_split_2.csv --save_model True --seed 0 --epoch 11 --dropout 0.01 --lr 1e-4 --hopf_beta 0.035 --hopf_association_activation 'Tanh' --norm_input False --temp_encoder_layers 2 --batch_size 32 > syn2_final.txt
    python mhnreact/train.py --concat_rand_template_thresh 3 --exp_name syn3_final --ssretroeval True --csv_path data/syn_mhn_split_3.csv --save_model True --seed 0 --epoch 11 --dropout 0.01 --lr 1e-4 --hopf_beta 0.035 --hopf_association_activation 'Tanh' --norm_input False --temp_encoder_layers 2 --batch_size 32 > syn3_final.txt
    python mhnreact/train.py --concat_rand_template_thresh 3 --exp_name syn4_final --ssretroeval True --csv_path data/syn_mhn_split_4.csv --save_model True --seed 0 --epoch 11 --dropout 0.01 --lr 1e-4 --hopf_beta 0.035 --hopf_association_activation 'Tanh' --norm_input False --temp_encoder_layers 2 --batch_size 32 > syn4_final.txt
    python mhnreact/train.py --concat_rand_template_thresh 3 --exp_name syn5_final --ssretroeval True --csv_path data/syn_mhn_split_5.csv --save_model True --seed 0 --epoch 11 --dropout 0.01 --lr 1e-4 --hopf_beta 0.035 --hopf_association_activation 'Tanh' --norm_input False --temp_encoder_layers 2 --batch_size 32 > syn5_final.txt
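
The six commands above differ only in the experiment name, the CSV path, and the log file, so they can be generated from one shared set of hyperparameters. This is a minimal sketch rather than part of the repository: the names `SHARED_ARGS`, `RUNS`, `training_commands`, and `run_all` are hypothetical. The flags are emitted in a different order than above, which assumes the script accepts them in any order, as argparse-style parsers do.

```python
import subprocess
import sys

# Hyperparameters shared by all six runs, copied from the commands above.
SHARED_ARGS = [
    "--concat_rand_template_thresh", "3",
    "--ssretroeval", "True",
    "--save_model", "True",
    "--seed", "0",
    "--epoch", "11",
    "--dropout", "0.01",
    "--lr", "1e-4",
    "--hopf_beta", "0.035",
    "--hopf_association_activation", "Tanh",
    "--norm_input", "False",
    "--temp_encoder_layers", "2",
    "--batch_size", "32",
]

# Experiment name -> training CSV, one enzymatic run plus five synthetic splits.
RUNS = {"enz_final": "data/enz_mhn_shuffled.csv"}
RUNS.update({f"syn{i}_final": f"data/syn_mhn_split_{i}.csv" for i in range(1, 6)})


def training_commands():
    """Yield (argument list, log file name) for each of the six runs."""
    for exp_name, csv_path in RUNS.items():
        cmd = [sys.executable, "mhnreact/train.py",
               "--exp_name", exp_name, "--csv_path", csv_path, *SHARED_ARGS]
        yield cmd, f"{exp_name}.txt"


def run_all():
    """Run the six trainings one after another, writing stdout to each log file."""
    for cmd, log_name in training_commands():
        with open(log_name, "w") as log:
            subprocess.run(cmd, stdout=log, check=True)


# Example use, from the project directory:
#     run_all()
```

Running the trainings sequentially keeps GPU memory use at the level of a single run; with `check=True`, a failure in one run stops the sequence instead of continuing silently.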

Credits

This code base builds on the following repositories, and we thank their authors for maintaining them:

Citation

If you find MHNpath helpful, please consider citing:

@misc{prakash2025usertunablemachinelearningframework,
      title={A User-Tunable Machine Learning Framework for Step-Wise Synthesis Planning}, 
      author={Shivesh Prakash and Hans-Arno Jacobsen and Viki Kumar Prasad},
      year={2025},
      eprint={2504.02191},
      archivePrefix={arXiv},
      primaryClass={cs.CE},
      url={https://arxiv.org/abs/2504.02191}, 
}
