Save trained models, checkpoints and tensorboard logs to Google Cloud Storage #100
Hi @Toni-SM, thank you for the amazing work. Currently, I train an skrl agent using custom training on Google Vertex AI (i.e.
And in the
But this does not work in Google Cloud. I have 2 questions:
Currently, I create another method to save all files in
where the method is shown here:
This is a workaround for Question 1.
Replies: 1 comment
Hi @khanhphan1311

Since the Vertex AI environment variables for CustomJob are configured as follows (reference: baseOutputDirectory):

AIP_MODEL_DIR = <baseOutputDirectory>/model/
AIP_CHECKPOINT_DIR = <baseOutputDirectory>/checkpoints/
AIP_TENSORBOARD_LOG_DIR = <baseOutputDirectory>/logs/

To view Vertex AI TensorBoard in the Google Cloud console, follow this link: https://cloud.google.com/vertex-ai/docs/experiments/tensorboard-training#view_your_in_the

The following small changes can be applied to the skrl code to work with Google Cloud Storage.
These changes overwrite the configuration with Vertex AI environment variables.
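As a rough illustration of the idea (not the exact change to skrl), the sketch below resolves the three output directories from the Vertex AI environment variables and falls back to a local experiment directory when they are not set. The helper names `gcs_uri_to_fuse_path` and `resolve_output_dirs` are hypothetical, and mapping `gs://` URIs to the `/gcs/` Cloud Storage FUSE mount is an assumption about how the training container accesses the bucket.

```python
import os


def gcs_uri_to_fuse_path(uri: str) -> str:
    """Map gs://bucket/path to /gcs/bucket/path; leave any other path unchanged.

    Assumption: the bucket is reachable through the /gcs/ Cloud Storage FUSE mount.
    """
    prefix = "gs://"
    if uri.startswith(prefix):
        return "/gcs/" + uri[len(prefix):]
    return uri


def resolve_output_dirs(local_experiment_dir: str, environ=None) -> dict:
    """Return output directories, preferring the Vertex AI AIP_* variables.

    Hypothetical helper: when a variable is missing or empty, a path under
    the local experiment directory is used instead.
    """
    env = os.environ if environ is None else environ
    defaults = {
        "AIP_MODEL_DIR": local_experiment_dir,
        "AIP_CHECKPOINT_DIR": os.path.join(local_experiment_dir, "checkpoints"),
        "AIP_TENSORBOARD_LOG_DIR": local_experiment_dir,
    }
    return {
        name: gcs_uri_to_fuse_path(env.get(name) or default)
        for name, default in defaults.items()
    }
```

Outside Vertex AI none of the variables are set, so the same script keeps writing to the local experiment directory.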
Replace:
skrl/skrl/agents/torch/base.py
Line 86 in 00a2fd3
with