
Backprojector non-deterministically fails to allocate mem #8

Open
maxrohleder opened this issue Jul 14, 2021 · 1 comment

@maxrohleder
I noticed that when using the PyroNN layers for training, the training sporadically aborts because the layers don't seem to get the required memory. This error is displayed:

GPUassert: out of memory pyronn_layers/cc/kernels/cone_backprojector_3D_CudaKernel_hardware_interp.cu.cc 129

The responsible line of code in the pyronn-layers is here:

cudaArray *projArray;
static cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<float>();
gpuErrchk( cudaMalloc3DArray( &projArray, &channelDesc, projExtent, cudaArrayLayered ) );
auto pitch_ptr = make_cudaPitchedPtr( const_cast<float*>( sinogram_ptr ),
                                      detector_size.x * sizeof(float),
                                      detector_size.x,
                                      detector_size.y );
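
For reference, the GPUassert message format above is the output of the common gpuErrchk/gpuAssert error-checking idiom, which presumably looks roughly like this (a sketch, not the exact pyronn-layers definition):

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Prints "GPUassert: <error string> <file> <line>" and aborts when a CUDA call fails.
inline void gpuAssert(cudaError_t code, const char* file, int line, bool abort = true)
{
    if (code != cudaSuccess)
    {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) exit(code);
    }
}

// Wraps a CUDA runtime call so that failures report the call site (file and line).
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

So the error above just means that the wrapped cudaMalloc3DArray call returned cudaErrorMemoryAllocation, i.e. the allocation outside of Tensorflow's memory pool failed.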

Then I noticed these comments in the source, which (I believe) describe exactly the problem I am experiencing:

/*************** WARNING ******************
*
* Tensorflow allocates the whole GPU memory for itself and leaves only a small amount of slack memory.
* Using cudaMalloc and cudaMalloc3D will allocate memory in this small slack memory!
* Therefore, currently only small volumes can be used (they have to fit into the slack memory which TF does not allocate!).
*
* This is the kernel based on texture interpolation, thus the allocations are not within the Tensorflow-managed memory.
* If memory errors occur:
* 1. start Tensorflow with less GPU memory and allow growth
* 2. switch to software-based interpolation.
*
* TODO: use context->allocate_temp and context->allocate_persistent instead of cudaMalloc for the projection_matrices array
* : https://stackoverflow.com/questions/48580580/tensorflow-new-op-cuda-kernel-memory-managment
*
*/
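
For what it's worth, here is a rough sketch (not pyronn-layers code; function and variable names are placeholders) of what resolving that TODO could look like: allocating the projection_matrices buffer through OpKernelContext::allocate_temp, so that it comes out of the GPU memory Tensorflow already owns instead of the small slack left for cudaMalloc:

#include <cuda_runtime.h>
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/framework/tensor_shape.h"

using namespace tensorflow;

// Allocates a device buffer for the projection matrices (3x4 per projection)
// through the Tensorflow allocator and copies the host-side matrices into it.
// The caller (e.g. Compute()) must keep `matrices_tmp` alive for as long as the
// kernel uses the buffer.
void AllocateProjectionMatrices(OpKernelContext* context,
                                const float* host_matrices,
                                int number_of_projections,
                                Tensor* matrices_tmp)
{
    OP_REQUIRES_OK(context,
                   context->allocate_temp(DT_FLOAT,
                                          TensorShape({number_of_projections, 3, 4}),
                                          matrices_tmp));
    float* device_ptr = matrices_tmp->flat<float>().data();
    // A production version would issue this copy on Tensorflow's compute stream.
    cudaMemcpy(device_ptr, host_matrices,
               sizeof(float) * number_of_projections * 3 * 4,
               cudaMemcpyHostToDevice);
}

Note that the layered cudaArray required for hardware texture interpolation still has to be created with cudaMalloc3DArray and therefore still lives outside the Tensorflow-managed pool, which is presumably why the comment recommends switching to software-based interpolation when memory errors occur.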

Will this TODO be resolved anytime soon? I would really appreciate it and would love to help if I can.

Cheers,
Max

@maxrohleder (Author)

Note, I do have memory growth activated, as described in the potential challenges section of the readme, with this code:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        print(e)

Will now try whether the hardware_interp flag does the job...
