Ran out of memory #21

Open
GerrieWell opened this issue May 24, 2017 · 4 comments

W tensorflow/core/common_runtime/bfc_allocator.cc:274] **_*******************************xxx************************xx********************xxxxxxxxxxxxxxxxx
W tensorflow/core/common_runtime/bfc_allocator.cc:275] Ran out of memory trying to allocate 25.41MiB.  See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:993] Resource exhausted: OOM when allocating tensor with shape[150,25,74,24]
Traceback (most recent call last):
  File "main.py", line 100, in <module>
    main(args.dataset_path)
  File "main.py", line 20, in main
    train(model, dataset_path)
  File "main.py", line 43, in train
    model.fit_generator(Data_Generator.flow(f,flag = flag_train),one_epoch,epoch_num,validation_data=Data_Generator.flow(f,train_or_validation=which_val_data,flag=flag_val),nb_val_samples=nb_val_samples)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/keras/engine/training.py", line 1877, in fit_generator
    class_weight=class_weight)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/keras/engine/training.py", line 1621, in train_on_batch
    outputs = self.train_function(ins)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2103, in __call__
    feed_dict=feed_dict)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[150,74,24,25]
	 [[Node: conv2d_2/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](max_pooling2d_1/MaxPool, conv2d_2/kernel/read)]]
	 [[Node: add_9/_35 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_19807_add_9", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op u'conv2d_2/convolution', defined at:
  File "main.py", line 100, in <module>
    main(args.dataset_path)
  File "main.py", line 18, in main
    model = generate_model()
  File "/Volumes/more/source/cv/reid/Implementation-CVPR2015-CNN-for-ReID/CUHK03/model.py", line 61, in generate_model
    x1 = share_conv_2(x1)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/keras/engine/topology.py", line 578, in __call__
    output = self.call(inputs, **kwargs)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/keras/layers/convolutional.py", line 164, in call
    dilation_rate=self.dilation_rate)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2893, in conv2d
    data_format='NHWC')
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 639, in convolution
    op=op)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 308, in with_space_to_batch
    return op(input, num_spatial_dims, padding)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 631, in op
    name=name)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 129, in _non_atrous_convolution
    name=name)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 396, in conv2d
    data_format=data_format, name=name)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/gerrie/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
    self._traceback = _extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[150,74,24,25]
	 [[Node: conv2d_2/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](max_pooling2d_1/MaxPool, conv2d_2/kernel/read)]]
	 [[Node: add_9/_35 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_19807_add_9", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

It seems like the GPU ran out of memory. How can I solve this problem?

My device is:

CUHK03 git:(master) ✗  $ cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 650M (CC 3.0): 745.94 of 1023.7 MB (i.e. 72.9%) Free

The memory should be enough, since I've run other big projects using TensorFlow.
I'm on macOS 10.12 with the latest TensorFlow version.
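
As a general note (not something tried in this thread): with the TF 1.x / Keras 2 stack visible in the traceback, a common first mitigation for GPU OOM is to let TensorFlow allocate GPU memory on demand instead of reserving it all up front. A minimal sketch, placed before the model is built:

import tensorflow as tf
from keras.backend.tensorflow_backend import set_session

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory as needed rather than pre-allocating
set_session(tf.Session(config=config))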

GerrieWell (Author) commented:

Could you provide your trained model?

LG17 commented May 24, 2017:

It would be great if you could provide the weights file!

prashanthbasani (Contributor) commented:

I am also facing the same issue, with the error showing as:
tensorflow/core/framework/op_kernel.cc:1152] Resource exhausted: OOM when allocating tensor with shape[150,41,16,25]

prashanthbasani (Contributor) commented:

Change the batch size argument to 50 in the __init__ and flow functions of data_preparation.py.
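
For illustration, a rough sketch of that change (the actual names and signatures in data_preparation.py may differ; the shape[150, ...] in the OOM message suggests the current batch size is 150):

BATCH_SIZE = 50  # was 150; a smaller batch needs roughly 3x less activation memory per step

class Data_Generator:
    def __init__(self, batch_size=BATCH_SIZE):
        self.batch_size = batch_size

    def flow(self, f, flag='train', batch_size=BATCH_SIZE):
        # yield batches of `batch_size` image pairs instead of 150 at a time
        pass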
