Commit: Unified OCP Trainer (#520)
* initial single trainer commit

* more general evaluator

* backwards tasks

* debug config

* predict support, evaluator cleanup

* cleanup, remove hpo

* loss bugfix, cleanup hpo

* backwards compatibility for old configs

* backwards breaking fix

* eval fix

* remove old imports

* default for get task metrics

* rebase cleanup

* config refactor support

* black

* reorganize free_atoms

* output config fix

* config naming

* support loss mean over all dimensions

* config backwards support

* equiformer can now run

* add example equiformer config

* handle arbitrary torch loss fns

* correct primary metric def

* update s2ef portion of OCP tutorial

* add type annotations

* cleanup

* Type annotations

* Abstract out _get_timestamp

* don't double ids when saving prediction results

* clip_grad_norm should be float

* model compatibility

* evaluator test fix

* lint

* remove old models

* pass calculator test

* remove DP, cleanup

* remove comments

* eqv2 support

* odac energy trainer merge fix

* is2re support

* cleanup

* config cleanup

* oc22 support

* introduce collater to handle otf_graph arg

* organize methods

* include parent in targets

* shape flexibility

* cleanup debug lines

* cleanup

* normalizer bugfix for new configs

* calculator normalization fix, backwards support for ckpt loads

* New weight_decay config -- defaults in BaseModel, extendable by others (e.g. EqV2); see the YAML sketch below

* Doc update

* Throw a warning instead of a hard error for optim.weight_decay

* EqV2 readme update

* Config update

* don't need transform on inference lmdbs with no ground truth

* remove debug configs

* ocp-2.0 example.yml

* take out ocpdataparallel from fit.py

* linter

* update tutorials

---------

Co-authored-by: Janice Lan <[email protected]>
Co-authored-by: Richard Barnes <[email protected]>
Co-authored-by: Abhishek Das <[email protected]>
4 people authored Jan 5, 2024
1 parent e7a8745 commit 1382a35
Showing 84 changed files with 4,727 additions and 10,088 deletions.
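
The central config change in this commit (the `weight_decay` bullet above): `weight_decay` moves from the top level of the `optim` section into `optim.optimizer_params`, so that models can own the default (in BaseModel, extendable by e.g. EquiformerV2). A minimal before/after sketch of the migration, with the YAML indentation that the flattened diff view below does not show:

```yaml
# Before this commit: weight_decay at the top level of optim
optim:
  optimizer: AdamW
  optimizer_params: {"amsgrad": True}
  weight_decay: 0

# After this commit: weight_decay nested under optimizer_params
optim:
  optimizer: AdamW
  optimizer_params:
    amsgrad: True
    weight_decay: 0  # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
```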
12 changes: 6 additions & 6 deletions DATASET.md
@@ -340,7 +340,7 @@ Please consider citing the following paper in any research manuscript using the



-```
+```bibtex
@article{ocp_dataset,
author = {Chanussot*, Lowik and Das*, Abhishek and Goyal*, Siddharth and Lavril*, Thibaut and Shuaibi*, Muhammed and Riviere, Morgane and Tran, Kevin and Heras-Domingo, Javier and Ho, Caleb and Hu, Weihua and Palizhati, Aini and Sriram, Anuroop and Wood, Brandon and Yoon, Junwoong and Parikh, Devi and Zitnick, C. Lawrence and Ulissi, Zachary},
title = {Open Catalyst 2020 (OC20) Dataset and Community Challenges},
@@ -462,12 +462,12 @@ The Open Catalyst 2022 (OC22) dataset is licensed under a [Creative Commons Attr
Please consider citing the following paper in any research manuscript using the OC22 dataset:


-```
+```bibtex
@article{oc22_dataset,
author = {Tran*, Richard and Lan*, Janice and Shuaibi*, Muhammed and Wood*, Brandon and Goyal*, Siddharth and Das, Abhishek and Heras-Domingo, Javier and Kolluru, Adeesh and Rizvi, Ammar and Shoghi, Nima and Sriram, Anuroop and Ulissi, Zachary and Zitnick, C. Lawrence},
-title = {The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysis},
-year = {2022},
-journal={arXiv preprint arXiv:2206.08917},
+title = {The Open Catalyst 2022 (OC22) dataset and challenges for oxide electrocatalysts},
+journal = {ACS Catalysis},
+year={2023},
}
```

@@ -513,7 +513,7 @@ The OpenDAC 2023 (ODAC23) dataset is licensed under a [Creative Commons Attribut
Please consider citing the following paper in any research manuscript using the ODAC23 dataset:


-```
+```bibtex
@article{odac23_dataset,
author = {Anuroop Sriram and Sihoon Choi and Xiaohan Yu and Logan M. Brabson and Abhishek Das and Zachary Ulissi and Matt Uyttendaele and Andrew J. Medford and David S. Sholl},
title = {The Open DAC 2023 Dataset and Challenges for Sorbent Discovery in Direct Air Capture},
20 changes: 13 additions & 7 deletions README.md
@@ -11,28 +11,34 @@ library of state-of-the-art machine learning algorithms for catalysis.
</div>

It provides training and evaluation code for tasks and models that take arbitrary
-chemical structures as input to predict energies / forces / positions, and can
-be used as a base scaffold for research projects. For an overview of tasks, data, and metrics, please read our papers:
+chemical structures as input to predict energies / forces / positions / stresses,
+and can be used as a base scaffold for research projects. For an overview of
+tasks, data, and metrics, please read our papers:
- [OC20](https://arxiv.org/abs/2010.09990)
- [OC22](https://arxiv.org/abs/2206.08917)
- [ODAC23](https://arxiv.org/abs/2311.00341)

-Projects developed on `ocp`:
+Projects and models built on `ocp`:

-- CGCNN [[`arXiv`](https://arxiv.org/abs/1710.10324)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/main/ocpmodels/models/cgcnn.py)]
- SchNet [[`arXiv`](https://arxiv.org/abs/1706.08566)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/main/ocpmodels/models/schnet.py)]
-- DimeNet [[`arXiv`](https://arxiv.org/abs/2003.03123)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/main/ocpmodels/models/dimenet.py)]
-- ForceNet [[`arXiv`](https://arxiv.org/abs/2103.01436)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/main/ocpmodels/models/forcenet.py)]
- DimeNet++ [[`arXiv`](https://arxiv.org/abs/2011.14115)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/main/ocpmodels/models/dimenet_plus_plus.py)]
-- SpinConv [[`arXiv`](https://arxiv.org/abs/2106.09575)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/main/ocpmodels/models/spinconv.py)]
- GemNet-dT [[`arXiv`](https://arxiv.org/abs/2106.08903)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/gemnet)]
- PaiNN [[`arXiv`](https://arxiv.org/abs/2102.03150)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/painn)]
- Graph Parallelism [[`arXiv`](https://arxiv.org/abs/2203.09697)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/gemnet_gp)]
- GemNet-OC [[`arXiv`](https://arxiv.org/abs/2204.02782)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/gemnet_oc)]
- SCN [[`arXiv`](https://arxiv.org/abs/2206.14331)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/scn)]
- AdsorbML [[`arXiv`](https://arxiv.org/abs/2211.16486)] [[`code`](https://github.com/open-catalyst-project/adsorbml)]
- eSCN [[`arXiv`](https://arxiv.org/abs/2302.03655)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/escn)]
- EquiformerV2 [[`arXiv`](https://arxiv.org/abs/2306.12059)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/tree/main/ocpmodels/models/equiformer_v2)]

+Older model implementations that are no longer supported:
+
+- CGCNN [[`arXiv`](https://arxiv.org/abs/1710.10324)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/e7a8745eb307e8a681a1aa9d30c36e8c41e9457e/ocpmodels/models/cgcnn.py)]
+- DimeNet [[`arXiv`](https://arxiv.org/abs/2003.03123)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/e7a8745eb307e8a681a1aa9d30c36e8c41e9457e/ocpmodels/models/dimenet.py)]
+- SpinConv [[`arXiv`](https://arxiv.org/abs/2106.09575)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/e7a8745eb307e8a681a1aa9d30c36e8c41e9457e/ocpmodels/models/spinconv.py)]
+- ForceNet [[`arXiv`](https://arxiv.org/abs/2103.01436)] [[`code`](https://github.com/Open-Catalyst-Project/ocp/blob/e7a8745eb307e8a681a1aa9d30c36e8c41e9457e/ocpmodels/models/forcenet.py)]


## Installation

See [installation instructions](https://github.com/Open-Catalyst-Project/ocp/blob/main/INSTALL.md).
32 changes: 0 additions & 32 deletions configs/is2re/100k/cgcnn/cgcnn.yml

This file was deleted.

32 changes: 0 additions & 32 deletions configs/is2re/10k/cgcnn/cgcnn.yml

This file was deleted.

32 changes: 0 additions & 32 deletions configs/is2re/all/cgcnn/cgcnn.yml

This file was deleted.

5 changes: 3 additions & 2 deletions configs/is2re/all/painn/painn_h1024_bs8x4.yml
@@ -20,7 +20,9 @@ optim:
load_balancing: atoms
num_workers: 2
optimizer: AdamW
-optimizer_params: {"amsgrad": True}
+optimizer_params:
+  amsgrad: True
+  weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
lr_initial: 1.e-4
scheduler: ReduceLROnPlateau
mode: min
@@ -31,4 +33,3 @@ optim:
ema_decay: 0.999
clip_grad_norm: 10
loss_energy: mae
-weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
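
A note on the recurring inline comment `2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2`: this is presumably a conversion between conventions, since PyTorch's AdamW applies decoupled weight decay scaled by the learning rate, whereas the TF reference value was an absolute per-step shrinkage. Under that assumed reading (not stated in the commit itself), equating the per-step decay gives the arithmetic in the comment:

```latex
% Assumed reading: per-step shrinkage = lr * lambda_AdamW = wd_TF
\lambda_{\text{AdamW}}
  = \frac{\mathrm{wd}_{\mathrm{TF}}}{\mathrm{lr}}
  = \frac{2 \times 10^{-6}}{1 \times 10^{-4}}
  = 2 \times 10^{-2}
```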
7 changes: 4 additions & 3 deletions configs/is2re/example.yml
@@ -95,9 +95,10 @@ optim:
# Learning rate. Passed as an `lr` argument when initializing the optimizer.
lr_initial: 1.e-4
# Additional args needed to initialize the optimizer.
-optimizer_params: {"amsgrad": True}
-# Weight decay to use. Passed as an argument when initializing the optimizer.
-weight_decay: 0
+optimizer_params:
+  amsgrad: True
+  # Weight decay to use. Passed as an argument when initializing the optimizer.
+  weight_decay: 0
# Learning rate scheduler. Should work for any scheduler specified in
# in torch.optim.lr_scheduler: https://pytorch.org/docs/stable/optim.html
# as long as the relevant args are specified here.
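
For context on how an `optim` block like the one in `example.yml` above typically reaches the optimizer, here is a hedged sketch — not the actual `ocp` trainer code. It assumes the trainer resolves the optimizer class by name from `torch.optim` and forwards `optimizer_params` as keyword arguments, which is consistent with the comments in `example.yml`; the `optim_config` dict and the `Linear` stand-in model are hypothetical.

```python
import torch

# Hypothetical config fragment mirroring the example.yml diff above.
optim_config = {
    "optimizer": "AdamW",
    "lr_initial": 1.0e-4,
    "optimizer_params": {"amsgrad": True, "weight_decay": 0},
}

model = torch.nn.Linear(8, 1)  # stand-in for a real OCP model

# Resolve the optimizer class by name from torch.optim; pass lr_initial
# as `lr` and forward everything in optimizer_params as keyword args.
optimizer_cls = getattr(torch.optim, optim_config["optimizer"])
optimizer = optimizer_cls(
    model.parameters(),
    lr=optim_config["lr_initial"],
    **optim_config.get("optimizer_params", {}),
)
```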
5 changes: 3 additions & 2 deletions configs/oc22/is2re/painn/painn.yml
@@ -20,7 +20,9 @@ optim:
load_balancing: atoms
num_workers: 2
optimizer: AdamW
-optimizer_params: {"amsgrad": True}
+optimizer_params:
+  amsgrad: True
+  weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
lr_initial: 1.e-4
scheduler: ReduceLROnPlateau
mode: min
@@ -31,4 +33,3 @@ optim:
ema_decay: 0.999
clip_grad_norm: 10
loss_energy: mae
-weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
5 changes: 3 additions & 2 deletions configs/oc22/s2ef/gemnet-oc/gemnet_oc.yml
@@ -65,7 +65,9 @@ optim:
num_workers: 2
lr_initial: 5.e-4
optimizer: AdamW
-optimizer_params: {"amsgrad": True}
+optimizer_params:
+  amsgrad: True
+  weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
warmup_steps: -1 # don't warm-up the learning rate
# warmup_factor: 0.2
lr_gamma: 0.8
@@ -81,4 +83,3 @@ optim:
max_epochs: 80
ema_decay: 0.999
clip_grad_norm: 10
-weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
5 changes: 3 additions & 2 deletions configs/oc22/s2ef/gemnet-oc/gemnet_oc_finetune.yml
@@ -65,7 +65,9 @@ optim:
num_workers: 2
lr_initial: 1.e-4
optimizer: AdamW
-optimizer_params: {"amsgrad": True}
+optimizer_params:
+  amsgrad: True
+  weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
warmup_steps: -1 # don't warm-up the learning rate
# warmup_factor: 0.2
lr_gamma: 0.8
@@ -94,7 +96,6 @@ optim:
max_epochs: 15
ema_decay: 0.999
clip_grad_norm: 10
-weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
loss_energy: mae
loss_force: l2mae
force_coefficient: 100
5 changes: 3 additions & 2 deletions configs/oc22/s2ef/gemnet-oc/gemnet_oc_oc20_oc22.yml
@@ -65,15 +65,16 @@ optim:
num_workers: 2
lr_initial: 5.e-4
optimizer: AdamW
-optimizer_params: {"amsgrad": True}
+optimizer_params:
+  amsgrad: True
+  weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
scheduler: ReduceLROnPlateau
mode: min
factor: 0.8
patience: 3
max_epochs: 80
ema_decay: 0.999
clip_grad_norm: 10
-weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
loss_energy: mae
loss_force: atomwisel2
force_coefficient: 1
(next file name not shown in this capture)
@@ -67,15 +67,16 @@ optim:
num_workers: 2
lr_initial: 5.e-4
optimizer: AdamW
-optimizer_params: {"amsgrad": True}
+optimizer_params:
+  amsgrad: True
+  weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
scheduler: ReduceLROnPlateau
mode: min
factor: 0.8
patience: 3
max_epochs: 80
ema_decay: 0.999
clip_grad_norm: 10
-weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
loss_energy: mae
loss_force: atomwisel2
force_coefficient: 1
5 changes: 3 additions & 2 deletions configs/oc22/s2ef/painn/painn.yml
@@ -22,7 +22,9 @@ optim:
eval_every: 5000
num_workers: 2
optimizer: AdamW
-optimizer_params: {"amsgrad": True}
+optimizer_params:
+  amsgrad: True
+  weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
lr_initial: 1.e-4
warmup_steps: -1 # don't warm-up the learning rate
# warmup_factor: 0.2
@@ -39,4 +41,3 @@ optim:
max_epochs: 80
ema_decay: 0.999
clip_grad_norm: 10
-weight_decay: 0 # 2e-6 (TF weight decay) / 1e-4 (lr) = 2e-2
43 changes: 0 additions & 43 deletions configs/oc22/s2ef/spinconv/spinconv.yml

This file was deleted.

36 changes: 0 additions & 36 deletions configs/oc22/s2ef/spinconv/spinconv_finetune.yml

This file was deleted.

(Diff view truncated; remaining changed files not shown.)
