Save trained models, checkpoints and tensorboard logs to Google Cloud Storage #100

Answered by Toni-SM
khanhphan1311 asked this question in Q&A

Hi @khanhphan1311

Vertex AI configures the following environment variables for a CustomJob (reference: baseOutputDirectory):

AIP_MODEL_DIR = <baseOutputDirectory>/model/
AIP_CHECKPOINT_DIR = <baseOutputDirectory>/checkpoints/
AIP_TENSORBOARD_LOG_DIR = <baseOutputDirectory>/logs/
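
To try this outside Vertex AI, the mapping above can be reproduced locally. The following is only a minimal sketch: the helper name `vertex_ai_dirs` and the bucket path in the usage note are hypothetical and not part of skrl or of any Google Cloud library.

```python
def vertex_ai_dirs(base_output_directory: str) -> dict:
    """Build the three AIP_* values from a base output directory.

    Hypothetical helper: it only mirrors the mapping listed above so the
    variables can be simulated on a local machine.
    """
    # Drop any trailing slash so the joined paths never contain "//".
    base = base_output_directory.rstrip("/")
    return {
        "AIP_MODEL_DIR": f"{base}/model/",
        "AIP_CHECKPOINT_DIR": f"{base}/checkpoints/",
        "AIP_TENSORBOARD_LOG_DIR": f"{base}/logs/",
    }
```

For example, calling `os.environ.update(vertex_ai_dirs("gs://my-bucket/run-1"))` before starting training would set the same three variables that a CustomJob provides (the bucket name is a placeholder).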

The following small changes can be applied to the skrl code so that it works with Google Cloud Storage.
They override the configuration with the Vertex AI environment variables.

  • Replace:

    self.experiment_dir = os.path.join(directory, experiment_name)

    with

    self.experiment_dir = os.path.join(directory, experiment_name)
    if "AIP_CHECKPOINT_DIR" in os.e…
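
The replacement shown above is cut off, so its exact form is not reproduced here. As a rough illustration of the override idea only, the sketch below assumes the experiment directory is simply replaced by `AIP_CHECKPOINT_DIR` when that variable is set. The function name `resolve_experiment_dir` and its `environ` parameter are hypothetical and do not come from skrl.

```python
import os


def resolve_experiment_dir(directory, experiment_name, environ=os.environ):
    """Return the directory to write experiment output to.

    Assumption (not confirmed by the original answer): when the Vertex AI
    variable AIP_CHECKPOINT_DIR is present, it replaces the configured path.
    """
    # Default behaviour: the path built from the configuration.
    experiment_dir = os.path.join(directory, experiment_name)
    # Override with the Vertex AI location when running in a CustomJob.
    if "AIP_CHECKPOINT_DIR" in environ:
        experiment_dir = environ["AIP_CHECKPOINT_DIR"]
    return experiment_dir
```

Passing the environment as a parameter keeps the function easy to check locally with a plain dictionary instead of the real process environment.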

Answer selected by khanhphan1311