Current Issue: Inputs, labels, and neighbors are currently loaded into RAM, which limits the maximum size of datasets that can be processed.
Proposed Solution: To address this, we plan to precompute the neighbors and store them as HDF5 files alongside the dataset, together with the max_r and min_r values used to compute them. In the input pipeline, two TFDatasets will be generated: one from the atoms file (most formats possible, or exclusively .traj) and another from the precomputed neighbors. These datasets will then be merged.
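A minimal sketch of how this pipeline could look, assuming h5py for the HDF5 file and `tf.data.Dataset.from_generator` plus `tf.data.Dataset.zip` for streaming and merging. All function names, the `idx/<n>` layout, and the brute-force neighbor search are hypothetical, and the sketch assumes min_r and max_r act as lower and upper distance cutoffs. The atoms generator is a stand-in for streaming structures from the atoms file.

```python
import os
import tempfile

import h5py
import numpy as np
import tensorflow as tf


def compute_neighbors(positions, min_r, max_r):
    """Brute-force neighbor list: index pairs (i, j) with min_r < d_ij < max_r."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    mask = (dist > min_r) & (dist < max_r)
    return np.argwhere(mask).astype(np.int32)  # shape (n_pairs, 2)


def write_neighbors(path, structures, min_r, max_r):
    """Precompute neighbors once and store them, with the cutoffs, in HDF5."""
    with h5py.File(path, "w") as f:
        # Assumed layout: cutoffs as file attributes, one dataset per structure.
        f.attrs["min_r"] = min_r
        f.attrs["max_r"] = max_r
        for n, positions in enumerate(structures):
            pairs = compute_neighbors(positions, min_r, max_r)
            f.create_dataset(f"idx/{n}", data=pairs)


def atoms_dataset(make_structures):
    """Stream positions one structure at a time (stand-in for the atoms file)."""
    return tf.data.Dataset.from_generator(
        make_structures,
        output_signature=tf.TensorSpec(shape=(None, 3), dtype=tf.float64),
    )


def neighbors_dataset(path):
    """Stream precomputed neighbor pairs from HDF5 without loading the whole file."""
    def gen():
        with h5py.File(path, "r") as f:
            for n in range(len(f["idx"])):
                yield f[f"idx/{n}"][:]

    return tf.data.Dataset.from_generator(
        gen,
        output_signature=tf.TensorSpec(shape=(None, 2), dtype=tf.int32),
    )


def demo_structures():
    """Hypothetical stand-in data: three small random structures."""
    rng = np.random.default_rng(0)
    for n_atoms in (4, 6, 5):
        yield rng.uniform(0.0, 3.0, size=(n_atoms, 3))


path = os.path.join(tempfile.mkdtemp(), "neighbors.h5")
write_neighbors(path, demo_structures(), min_r=0.5, max_r=2.0)

# Merge the two streams: each element is a (positions, neighbor_pairs) tuple.
merged = tf.data.Dataset.zip((atoms_dataset(demo_structures), neighbors_dataset(path)))
```

Because both datasets are backed by generators, only the structure currently being consumed needs to be in memory. The sketch relies on both streams yielding structures in the same order.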
Advantages of the Proposed Solution: This approach offers several advantages. Firstly, it eliminates the need to load both the dataset and its neighbors into RAM, thereby mitigating memory constraints. Secondly, if the same dataset is used for multiple training sessions with the same max_r and min_r values, the precomputing step can be skipped, resulting in a more efficient workflow.
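The skip-if-already-computed step could be sketched as follows, assuming the cutoffs are stored as HDF5 file attributes named `min_r` and `max_r`. The function name and the file layout are hypothetical.

```python
import math
import os
import tempfile

import h5py


def neighbors_are_cached(path, min_r, max_r):
    """Return True if `path` holds neighbors computed with the same cutoffs."""
    if not os.path.exists(path):
        return False
    with h5py.File(path, "r") as f:
        if "min_r" not in f.attrs or "max_r" not in f.attrs:
            return False
        same_min = math.isclose(float(f.attrs["min_r"]), min_r)
        same_max = math.isclose(float(f.attrs["max_r"]), max_r)
        return same_min and same_max


# Usage: create a file that only carries the cutoffs, then query it.
path = os.path.join(tempfile.mkdtemp(), "neighbors.h5")
with h5py.File(path, "w") as f:
    f.attrs["min_r"] = 0.5
    f.attrs["max_r"] = 6.0

reuse = neighbors_are_cached(path, min_r=0.5, max_r=6.0)
recompute = not neighbors_are_cached(path, min_r=0.5, max_r=5.0)
```

This check only compares the cutoffs. It does not detect a changed atoms file, so a real implementation would probably also store some identifier of the dataset.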