How can I solve the error "errors_impl.InvalidArgumentError: assertion failed: [`predictions` out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...]" when running DeepLabv3+ eval.py while training on the MS COCO 2014 dataset with the MobileNetV2 model? #4709

feixuedudiao · 2018-07-06T09:29:48Z

When I train on the MS COCO 2014 dataset with the pretrained MobileNet model, I get the error "errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...]". I set the crop size to [641, 641] when running eval.py. How can I solve this problem? Thanks.


tensorflowbutler · 2018-07-06T19:01:32Z

Thank you for your post. We noticed you have not filled out the following fields in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.
What is the top-level directory of the model you are using
Have I written custom code
OS Platform and Distribution
TensorFlow installed from
TensorFlow version
Bazel version
CUDA/cuDNN version
GPU model and memory
Exact command to reproduce

feixuedudiao · 2018-07-07T00:58:31Z

oh, sorry.

System information
What is the top-level directory of the model you are using: deeplab
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 1.8.0 (Ubuntu)
Bazel version (if compiling from source):
CUDA/cuDNN version: V9.0 (Windows); V9.0 (Ubuntu)
GPU model and memory: Titan XP, 12G (Ubuntu)
Exact command to reproduce:

Describe the problem:

When I train on the MS COCO 2014 dataset with the pretrained MobileNet model, I get the error "errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...]". I set the crop size to [641, 641] when running eval.py. How can I solve this problem? Thanks.

Source code / logs
run eval.py in terminal on Ubuntu:
(tf3.6) root@DeepLearning:/home/video/videocom/sementatial_segmentation/Deeplabv/research/deeplab# sh ./local_test_mobilenetv2_coco_sunjf.sh

INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting evaluation at 2018-07-07-00:50:03
Traceback (most recent call last):
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...] [y (mean_iou/ToInt64_2:0) = ] [81]
[[Node: mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT64, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch/_2021, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_0, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_1, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_2, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch_1/_2023, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_4, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch_2/_2025)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/video/videocom/sementatial_segmentation/Deeplabv/code/models-master/research/deeplab/eval_sunjf_0612.py", line 202, in <module>
tf.app.run()
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "/home/video/videocom/sementatial_segmentation/Deeplabv/code/models-master/research/deeplab/eval_sunjf_0612.py", line 195, in main
eval_interval_secs=FLAGS.eval_interval_secs)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/evaluation.py", line 301, in evaluation_loop
timeout=timeout)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/contrib/training/python/training/evaluation.py", line 450, in evaluate_repeatedly
session.run(eval_ops, feed_dict)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 567, in run
run_metadata=run_metadata)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1043, in run
run_metadata=run_metadata)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1134, in run
raise six.reraise(*original_exc_info)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1119, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1191, in run
run_metadata=run_metadata)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 971, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...] [y (mean_iou/ToInt64_2:0) = ] [81]
[[Node: mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT64, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch/_2021, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_0, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_1, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_2, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch_1/_2023, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_4, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch_2/_2025)]]

MuadDev · 2019-03-13T13:00:19Z

Is there any progress on this subject?

MuadDev · 2019-03-13T13:05:38Z

I solved the above error by following this answer.

violet17 · 2019-03-19T06:28:57Z

#DrSleep/tensorflow-deeplab-resnet#107

violet17 · 2019-03-19T06:29:36Z

#warmspringwinds/tf-image-segmentation#33

violet17 · 2019-03-19T06:30:27Z

I am using Python 2.7 with TensorFlow 1.12.
I added this code before calculating mIoU:

indices = tf.cast(tf.less_equal(annotation_batch_tensor, number_of_classes - 1), tf.uint8)
annotation_batch_tensor = tf.multiply(annotation_batch_tensor, indices)
This is needed because later versions of TF added the following assertions:

labels = control_flow_ops.with_dependencies(
    [check_ops.assert_less(
        labels, num_classes_int64, message='labels out of bound')],
    labels)
predictions = control_flow_ops.with_dependencies(
    [check_ops.assert_less(
        predictions, num_classes_int64,
        message='predictions out of bound')],
    predictions)
from the comments of @xxxzhi in #2239.
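
To illustrate what those assertions check, here is a minimal plain-Python sketch (not TensorFlow code) that lists the class IDs violating `x < num_classes`. The helper name `out_of_bound_values` and the sample IDs are made up for illustration; 81 is the bound reported in the error message above. Note that the failing Assert node in the log uses `summarize=3`, so the printed `[0 0 0...]` is only the first three values of the tensor, not necessarily the offending ones.

```python
def out_of_bound_values(values, num_classes):
    """Return the sorted distinct class IDs that fail the check `v < num_classes`."""
    return sorted({v for v in values if v >= num_classes})


num_classes = 81  # the bound `y` shown in the error message

# Hypothetical flattened label map; 90 and 255 are made-up out-of-range IDs.
labels = [0, 0, 0, 17, 80, 90, 255]

bad = out_of_bound_values(labels, num_classes)
print(bad)  # prints [90, 255]
```

Running a check like this over the actual label (or prediction) values, for example after fetching them with `session.run`, should show which IDs trigger the assertion.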

Oddly, in TF 1.12 I can't use the solution from #2239. After using tf.gather, the shape of annotation_batch_tensor is unexpected and is not compatible with the labels input of slim.metrics.streaming_mean_iou.

Then I realized, from @amlarraz's comment in DrSleep/tensorflow-deeplab-resnet#107, that I must ignore all labels greater than or equal to number_of_classes.

I posted the same comment in warmspringwinds/tf-image-segmentation#33.
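
One caveat, sketched below in plain Python rather than TensorFlow: multiplying the labels by the mask maps every out-of-range label to class 0, so those pixels are scored as class 0 instead of being dropped. Giving them a weight of 0 leaves them out of the confusion matrix entirely. The function `mean_iou`, the void value 255, and the sample data are all assumptions made for illustration. `tf.metrics.mean_iou` and `slim.metrics.streaming_mean_iou` both take a `weights` argument that, as far as I know, can be used the same way.

```python
def mean_iou(labels, predictions, num_classes, weights=None):
    """Mean IoU over flat lists of class IDs; a weight of 0 drops that pixel."""
    if weights is None:
        weights = [1] * len(labels)
    # confusion[t][p] accumulates the weight of pixels labelled t and predicted p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p, w in zip(labels, predictions, weights):
        if w == 0:
            continue  # ignored pixel: it never touches the matrix
        confusion[t][p] += w
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(row[c] for row in confusion) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


num_classes = 3
labels = [0, 1, 2, 255, 255, 1]      # 255 is a hypothetical "void" label
predictions = [0, 1, 2, 1, 2, 1]

# Option 1 (what the multiply trick does): out-of-range labels become class 0.
zeroed = [t if t < num_classes else 0 for t in labels]
miou_zeroed = mean_iou(zeroed, predictions, num_classes)

# Option 2: keep the labels in range, but give the void pixels weight 0.
valid = [1 if t < num_classes else 0 for t in labels]
miou_ignored = mean_iou(zeroed, predictions, num_classes, weights=valid)

print(miou_zeroed, miou_ignored)
```

In this toy data every non-void pixel is predicted correctly, so the weighted variant reports 1.0, while the zeroed variant is pulled down to 0.5 because the two void pixels count as misclassified class-0 pixels.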

tensorflowbutler · 2020-01-29T23:18:29Z

Hi There,
We are checking to see if you still need help on this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing this.

tensorflowbutler assigned yhliang2018 Jul 6, 2018

tensorflowbutler added the stat:awaiting response Waiting on input from the contributor label Jul 6, 2018

tensorflowbutler removed the stat:awaiting response Waiting on input from the contributor label Jul 7, 2018

yhliang2018 assigned aquariusjay Jul 13, 2018

yhliang2018 added the stat:awaiting model gardener Waiting on input from TensorFlow model gardener label Jul 13, 2018

yhliang2018 removed their assignment Jul 13, 2018

tensorflowbutler closed this as completed Feb 7, 2020
