
Fix edge case in PyTorchPredictor.deserialize #2994

Merged
1 commit merged into awslabs:dev on Sep 5, 2023

Conversation

@lostella (Contributor) commented Sep 4, 2023

Description of changes: Since #2965 (possibly earlier), if a PyTorchPredictor object was serialized from CPU memory, then deserializing it with device="cuda" would not actually work. The following would happen:

  1. The predictor object is created, with the parameters of its torch.nn.Module model allocated on CPU (since that was the device the predictor was serialized with).
  2. When deserializing with device="cuda", torch.load with map_location="cuda" puts the state_dict values on GPU, as expected.
  3. torch.nn.Module.load_state_dict, however, copies the parameters back to CPU.
  4. Since the predictor has its .device attribute set to "cpu", prediction does not complain (data and model end up on the same device) but is really slow.

This PR makes sure that the predictor object being created is moved with .to(device) after step 1, so that step 3 actually keeps the parameters on GPU; see the sketch below.
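To illustrate why the order matters (a minimal sketch, not the actual GluonTS code; deserialize_sketch and its arguments are hypothetical): load_state_dict copies tensor values into the module's existing parameters in place, so the parameters stay on whatever device the module already lives on, and moving the module first is what makes map_location effective.

import torch

def deserialize_sketch(module: torch.nn.Module, state_dict_path, device) -> torch.nn.Module:
    device = torch.device(device)
    # The fix: move the freshly constructed module to the target device
    # *before* loading the state dict, so its parameters already live there.
    module.to(device)
    # map_location puts the loaded tensors on the target device...
    state_dict = torch.load(state_dict_path, map_location=device)
    # ...and load_state_dict copies values into the existing (on-device)
    # parameters, so they stay on `device` instead of being pulled back
    # to wherever the module was allocated.
    module.load_state_dict(state_dict)
    return module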

The same issue happens with CPU and GPU inverted, as in the following example:

import logging
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

from gluonts.model import Predictor
from gluonts.torch.model.wavenet import WaveNetEstimator

logging.basicConfig(level=logging.INFO)

# Toy dataset; the start period's freq matches the estimator's freq.
data = [
    {
        "start": pd.Period("2012-02-04", freq="H"),
        "target": np.ones(2000),
    }
]

estimator = WaveNetEstimator(
    freq="H",
    prediction_length=24,
    num_batches_per_epoch=2,
    trainer_kwargs=dict(max_epochs=1),
)

# Train (on GPU, when one is available), then round-trip through
# serialization asking for CPU.
predictor = estimator.train(data)

with tempfile.TemporaryDirectory() as td:
    predictor.serialize(Path(td))
    predictor = Predictor.deserialize(Path(td), device="cpu")
    print(f"predictor device is {predictor.device}")
    print(f"module parameters on {next(predictor.prediction_net.parameters()).device}")

The resulting model is expected to be on CPU. Output before the PR:

predictor device is cuda
module parameters on cuda:0

Output after the PR:

predictor device is cpu
module parameters on cpu
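Continuing the example above, a quick sanity check for this kind of regression (a hedged sketch, not the exact assertion from the PR's tests) is to verify after deserialization that the predictor's declared device matches where its parameters actually live:

import torch

# Hypothetical post-deserialization check: the declared device should match
# the device the module's parameters actually live on (comparing .type
# ignores the device index, e.g. "cuda" vs "cuda:0").
param_device = next(predictor.prediction_net.parameters()).device
assert torch.device(predictor.device).type == param_device.type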

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Please tag this PR with at least one of these labels to make our release process faster: BREAKING, new feature, bug fix, other change, dev setup

@lostella lostella requested a review from abdulfatir September 4, 2023 11:29
@lostella lostella added the labels torch (This concerns the PyTorch side of GluonTS) and bug fix (one of PR required labels) on Sep 4, 2023
@lostella lostella marked this pull request as draft September 4, 2023 13:43
@lostella lostella marked this pull request as ready for review September 4, 2023 15:19
@lostella (Contributor, Author) commented Sep 4, 2023

@abdulfatir now I'm wondering whether it makes sense to have a device option in PyTorchPredictor.deserialize at all, since doing

predictor = Predictor.deserialize(path, device=device)

is the same as doing

predictor = Predictor.deserialize(path)
predictor.to(device)

@lostella lostella added this to the v0.14 milestone Sep 4, 2023
@lostella lostella merged commit 25c76a2 into awslabs:dev Sep 5, 2023
@lostella lostella deleted the predictor-to-device branch September 5, 2023 15:46