
Backprojector non-deterministically fails to allocate mem #8

Open
maxrohleder opened this issue Jul 14, 2021 · 1 comment

@maxrohleder
I noticed that when using the PyroNN layers for training, the training sporadically aborts because the layers don't seem to get the required memory. This error is displayed:

GPUassert: out of memory pyronn_layers/cc/kernels/cone_backprojector_3D_CudaKernel_hardware_interp.cu.cc 129

The responsible line of code in the pyronn-layers is here:

cudaArray *projArray;
static cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<float>();
gpuErrchk( cudaMalloc3DArray( &projArray, &channelDesc, projExtent, cudaArrayLayered ) );
auto pitch_ptr = make_cudaPitchedPtr( const_cast<float*>( sinogram_ptr ),
                                      detector_size.x * sizeof(float),
                                      detector_size.x,
                                      detector_size.y );
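
For reference, the GPUassert message format above is the output of the common gpuErrchk/gpuAssert error-checking idiom, which presumably looks roughly like this (a sketch, not the exact pyronn-layers definition):

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Prints "GPUassert: <error string> <file> <line>" and aborts when a CUDA call fails.
inline void gpuAssert(cudaError_t code, const char* file, int line, bool abort = true)
{
    if (code != cudaSuccess)
    {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) exit(code);
    }
}

// Wraps a CUDA runtime call so that failures report the call site (file and line).
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

So the error above just means that the wrapped cudaMalloc3DArray call returned cudaErrorMemoryAllocation, i.e. the allocation outside of Tensorflow's memory pool failed.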

Then I noticed these comments in the source, which (I believe) describe exactly the problem I am experiencing:

/*************** WARNING ******************
*
* Tensorflow allocates the whole GPU memory for itself and leaves only a small amount of slack memory.
* Using cudaMalloc and cudaMalloc3D will allocate memory in this small slack memory!
* Therefore, currently only small volumes can be used (they have to fit into the slack memory which TF does not allocate!).
*
* This is the kernel based on texture interpolation, thus the allocations are not within the Tensorflow-managed memory.
* If memory errors occur:
* 1. start Tensorflow with less GPU memory and allow growth
* 2. switch to software-based interpolation.
*
* TODO: use context->allocate_temp and context->allocate_persistent instead of cudaMalloc for the projection_matrices array
* : https://stackoverflow.com/questions/48580580/tensorflow-new-op-cuda-kernel-memory-managment
*
*/
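
For what it's worth, here is a rough sketch (not pyronn-layers code; function and variable names are placeholders) of what resolving that TODO could look like: allocating the projection_matrices buffer through OpKernelContext::allocate_temp, so that it comes out of the GPU memory Tensorflow already owns instead of the small slack left for cudaMalloc:

#include <cuda_runtime.h>
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/framework/tensor_shape.h"

using namespace tensorflow;

// Allocates a device buffer for the projection matrices (3x4 per projection)
// through the Tensorflow allocator and copies the host-side matrices into it.
// The caller (e.g. Compute()) must keep `matrices_tmp` alive for as long as the
// kernel uses the buffer.
void AllocateProjectionMatrices(OpKernelContext* context,
                                const float* host_matrices,
                                int number_of_projections,
                                Tensor* matrices_tmp)
{
    OP_REQUIRES_OK(context,
                   context->allocate_temp(DT_FLOAT,
                                          TensorShape({number_of_projections, 3, 4}),
                                          matrices_tmp));
    float* device_ptr = matrices_tmp->flat<float>().data();
    // A production version would issue this copy on Tensorflow's compute stream.
    cudaMemcpy(device_ptr, host_matrices,
               sizeof(float) * number_of_projections * 3 * 4,
               cudaMemcpyHostToDevice);
}

Note that the layered cudaArray required for hardware texture interpolation still has to be created with cudaMalloc3DArray and therefore still lives outside the Tensorflow-managed pool, which is presumably why the comment recommends switching to software-based interpolation when memory errors occur.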

Will this TODO be resolved anytime soon? I would really appreciate it and would love to help if I can.

Cheers,
Max

@maxrohleder (Author)

Note, I do have memory growth activated, as described in the potential challenges section of the readme, with this code:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        print(e)

Will now try whether the hardware_interp flag does the job...
