Is your feature request related to a problem? Please describe.
A TFTensor object is usually constructed from a CPU array, which requires copying the data from the device (GPU) to the host (CPU). This pipeline architecture is considerably slow for a large dataset (e.g. large images).
Describe the solution you'd like
My pipeline performs image processing with CUDA (via the managedCuda wrapper) and its libraries (NPP). At some point I would like to feed my CNN an image that is already stored on the GPU, whether as an NPP image, a CUDA array, or just a device pointer — call it d_array for the sake of convenience. Of course, one can copy it to the host to get a standard host array
d_array.CopyToHost(h_array);
and then define the usual
var tensor = new TFTensor(h_array);
Is there an option to obtain a device tensor d_tensor from d_array directly, avoiding the CopyToHost operation, and feed it to the CNN?
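For context, the current round-trip described above looks roughly like this — a minimal sketch assuming managedCuda's CudaDeviceVariable&lt;T&gt; and TensorFlowSharp's TFTensor; the image dimensions (width, height, channels) are illustrative placeholders:

```csharp
using ManagedCuda;
using TensorFlow;

// d_array holds the preprocessed image on the GPU
// (e.g. the output of an NPP routine). Dimensions are illustrative.
int width = 1024, height = 1024, channels = 3;
CudaDeviceVariable<float> d_array =
    new CudaDeviceVariable<float>(width * height * channels);

// Step 1: the device-to-host copy this issue wants to avoid.
float[] h_array = new float[width * height * channels];
d_array.CopyToHost(h_array);

// Step 2: wrap the host array as a tensor and feed it to the CNN.
var tensor = new TFTensor(h_array);
```

The request is to replace both steps with a single construction of a device-resident tensor from d_array, so no host buffer is allocated or filled.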