To avoid CPU -> GPU data transport #11

Nanco-L (Maintainer) · 2021-06-15T08:48:45Z

Add map_location='cuda' to the torch.load call below. Use 'cpu' instead for CPU-only mode.

SIMPLE-NN_v2/simple_nn_v2/models/data_handler.py

Line 60 in 28007d6

return torch.load(self.filelist[idx])
Remove all .to(device) calls in the training code.
Set pin_memory=False in

SIMPLE-NN_v2/simple_nn_v2/models/data_handler.py

Line 223 in 28007d6

num_workers=inputs['neural_network']['workers'], pin_memory=True)

pin_memory only helps with CPU -> GPU data transfer, so it has no use once the data is already on the GPU.
Add .to(device) for scale and pca, since all data is loaded onto the target device and collate_fn also runs on that device.
Add .to(device) for all tensors created in the code.
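
The steps above could be sketched roughly as follows. This is a minimal illustration, not the actual SIMPLE-NN code: `FileDataset`, the toy `scale` values, the sample files, and the `"x"` key are all hypothetical stand-ins, and the device falls back to CPU so the snippet also runs without a GPU. It uses `num_workers=0` and does not cover worker processes.

```python
import os
import tempfile

import torch
from torch.utils.data import DataLoader, Dataset

# Fall back to CPU so the sketch also runs on a machine without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"


class FileDataset(Dataset):
    """Hypothetical stand-in for the dataset class in data_handler.py."""

    def __init__(self, filelist, device):
        self.filelist = filelist
        self.device = device

    def __len__(self):
        return len(self.filelist)

    def __getitem__(self, idx):
        # Step 1: deserialize straight onto the target device.
        return torch.load(self.filelist[idx], map_location=self.device)


# Step 4: scaling constants live on the same device as the data.
# (Toy values; the real scale and pca come from the project.)
scale = torch.tensor([1.0, 2.0, 4.0]).to(device)


def collate_fn(batch):
    # Runs on `device` because every sample is already there.
    x = torch.stack([item["x"] for item in batch])
    return x / scale


# Write a few toy samples to disk so the example is self-contained.
tmpdir = tempfile.mkdtemp()
filelist = []
for i in range(4):
    path = os.path.join(tmpdir, f"sample{i}.pt")
    torch.save({"x": torch.full((3,), float(i))}, path)
    filelist.append(path)

# Step 3: pin_memory=False, since no CPU -> GPU copy remains to speed up.
loader = DataLoader(
    FileDataset(filelist, device),
    batch_size=2,
    collate_fn=collate_fn,
    num_workers=0,
    pin_memory=False,
)

for batch in loader:
    # Step 2: no batch.to(device) is needed here.
    # Step 5: tensors created in the loop are placed on the device explicitly.
    weights = torch.ones(3, device=device)
    loss = (batch * weights).sum()
```

Creating new tensors with `device=device` directly, as in the loop, avoids allocating them on the CPU first and then copying them over.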

In my case, training speed increased by 5 to 10 times.