How can I solve the error "errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...]" when running DeepLabv3+ eval.py after training on the MS COCO 2014 dataset with the MobileNetV2 model? #4709

Closed
feixuedudiao opened this issue Jul 6, 2018 · 8 comments
Labels
stat:awaiting model gardener Waiting on input from TensorFlow model gardener

Comments

@feixuedudiao

When I train on the MS COCO 2014 dataset with the pretrained MobileNet model, I get the error "errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...]". I set the crop size to [641, 641] when running eval.py. How can I solve this problem? Thanks.

@tensorflowbutler tensorflowbutler added the stat:awaiting response Waiting on input from the contributor label Jul 6, 2018
@tensorflowbutler
Member

Thank you for your post. We noticed you have not filled out the following fields in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.
What is the top-level directory of the model you are using
Have I written custom code
OS Platform and Distribution
TensorFlow installed from
TensorFlow version
Bazel version
CUDA/cuDNN version
GPU model and memory
Exact command to reproduce

@feixuedudiao
Author

Oh, sorry.

System information
What is the top-level directory of the model you are using: deeplab
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 1.8.0 (Ubuntu)
Bazel version (if compiling from source):
CUDA/cuDNN version: V9.0 (Windows); V9.0 (Ubuntu)
GPU model and memory: Titan XP, 12G (Ubuntu)
Exact command to reproduce:

Describe the problem:

When I train on the MS COCO 2014 dataset with the pretrained MobileNet model, I get the error "errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...]". I set the crop size to [641, 641] when running eval.py. How can I solve this problem? Thanks.

Source code / logs
Running eval.py in a terminal on Ubuntu:
(tf3.6) root@DeepLearning:/home/video/videocom/sementatial_segmentation/Deeplabv/research/deeplab# sh ./local_test_mobilenetv2_coco_sunjf.sh

INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting evaluation at 2018-07-07-00:50:03
Traceback (most recent call last):
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...] [y (mean_iou/ToInt64_2:0) = ] [81]
[[Node: mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT64, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch/_2021, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_0, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_1, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_2, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch_1/_2023, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_4, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch_2/_2025)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/video/videocom/sementatial_segmentation/Deeplabv/code/models-master/research/deeplab/eval_sunjf_0612.py", line 202, in <module>
    tf.app.run()
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/video/videocom/sementatial_segmentation/Deeplabv/code/models-master/research/deeplab/eval_sunjf_0612.py", line 195, in main
    eval_interval_secs=FLAGS.eval_interval_secs)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/evaluation.py", line 301, in evaluation_loop
    timeout=timeout)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/contrib/training/python/training/evaluation.py", line 450, in evaluate_repeatedly
    session.run(eval_ops, feed_dict)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 567, in run
    run_metadata=run_metadata)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1043, in run
    run_metadata=run_metadata)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1134, in run
    raise six.reraise(*original_exc_info)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1119, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1191, in run
    run_metadata=run_metadata)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 971, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/usr/local/anaconda2/envs/tf3.6/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...] [y (mean_iou/ToInt64_2:0) = ] [81]
[[Node: mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT64, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch/_2021, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_0, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_1, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_2, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch_1/_2023, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/data_4, mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/Switch_2/_2025)]]
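
A note on reading the message above: the `[0 0 0...]` shown for `x` appears to be only the first three entries of the tensor (the node is created with `summarize=3`), so it does not show which values actually violate `x < 81`. One way to find them is to dump the label or prediction values for a failing image and count everything outside `[0, num_classes)`. The sketch below does this on a plain Python list; `out_of_range_values` is a hypothetical helper, and the sample values (including 255) are made up for illustration.

```python
from collections import Counter


def out_of_range_values(values, num_classes):
    """Return {value: count} for entries outside the valid range [0, num_classes)."""
    return dict(Counter(v for v in values if not 0 <= v < num_classes))


# Made-up flattened pixel values; only 81 and 255 break the x < 81 condition.
pixels = [0, 0, 0, 17, 80, 81, 255, 255]
print(out_of_range_values(pixels, num_classes=81))  # {81: 1, 255: 2}
```

An empty result for the labels would suggest looking at the predictions instead, and vice versa.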

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Waiting on input from the contributor label Jul 7, 2018
@yhliang2018 yhliang2018 added the stat:awaiting model gardener Waiting on input from TensorFlow model gardener label Jul 13, 2018
@yhliang2018 yhliang2018 removed their assignment Jul 13, 2018
@MuadDev

MuadDev commented Mar 13, 2019

Is there any progress on this subject?

@MuadDev

MuadDev commented Mar 13, 2019

I solved the above error by following this answer.

@violet17

I am using Python 2.7 with TensorFlow 1.12.
I added this code before calculating mIoU:

indices = tf.cast(tf.less_equal(annotation_batch_tensor, number_of_classes - 1), tf.uint8)
annotation_batch_tensor = tf.multiply(annotation_batch_tensor, indices)
This is needed because later versions of TF added these assertions:

labels = control_flow_ops.with_dependencies(
    [check_ops.assert_less(
        labels, num_classes_int64, message='labels out of bound')],
    labels)
predictions = control_flow_ops.with_dependencies(
    [check_ops.assert_less(
        predictions, num_classes_int64,
        message='predictions out of bound')],
    predictions)
This comes from @xxxzhi's comment in #2239.
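
To make the effect of the two added lines concrete, here is a minimal plain-Python sketch in which lists stand in for tensors. `clamp_labels` is a hypothetical helper, not a TensorFlow function, and 255 is only an assumed example of an ignore value. Note that this trick maps every out-of-range label to class 0 rather than excluding it, so those pixels are then scored as class 0.

```python
def clamp_labels(labels, number_of_classes):
    """Mimic the mask-and-multiply trick on a flat list of integer labels."""
    # 1 where the label is a valid class id, 0 otherwise (the `indices` mask).
    mask = [1 if v <= number_of_classes - 1 else 0 for v in labels]
    # Multiplying by the mask keeps valid labels and zeroes out the rest.
    return [v * m for v, m in zip(labels, mask)]


labels = [0, 5, 80, 81, 255]
print(clamp_labels(labels, number_of_classes=81))  # [0, 5, 80, 0, 0]
```

After this step every label is below `number_of_classes`, which is what the `assert_less` check requires.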

Strangely, in TF 1.12 I can't use the solution from #2239. After using tf.gather, the shape of annotation_batch_tensor is unexpected and is not compatible with the labels input of slim.metrics.streaming_mean_iou.

Then I realized I must ignore all labels greater than or equal to number_of_classes, following the comments of @amlarraz in DrSleep/tensorflow-deeplab-resnet#107.
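
For comparison, a stricter reading of "ignore" is to drop those pixels from the metric entirely instead of turning them into class 0. The following is a rough plain-Python sketch of mean IoU computed that way, not the TensorFlow implementation. `mean_iou_ignoring_invalid` is a hypothetical name, and predictions are assumed to already lie in `[0, number_of_classes)`.

```python
def mean_iou_ignoring_invalid(labels, predictions, number_of_classes):
    """Mean IoU over flat integer lists, skipping pixels whose label is out of range."""
    # confusion[t][p] counts pixels with true class t that were predicted as class p.
    confusion = [[0] * number_of_classes for _ in range(number_of_classes)]
    for t, p in zip(labels, predictions):
        if 0 <= t < number_of_classes:  # drop ignored / invalid labels
            confusion[t][p] += 1  # assumes p is a valid class id

    ious = []
    for c in range(number_of_classes):
        true_pos = confusion[c][c]
        labelled = sum(confusion[c])                  # pixels labelled c
        predicted = sum(row[c] for row in confusion)  # pixels predicted as c
        union = labelled + predicted - true_pos
        if union > 0:  # skip classes that never occur in labels or predictions
            ious.append(true_pos / union)
    return sum(ious) / len(ious) if ious else 0.0


# The last pixel carries an out-of-range label (255 here) and is not counted.
print(mean_iou_ignoring_invalid([0, 1, 1, 255], [0, 1, 0, 1], 2))  # 0.5
```

In TensorFlow itself, a similar effect is usually obtained by passing a 0/1 `weights` mask to the mean IoU metric; the labels at masked pixels still have to be remapped to a valid class id so that the bounds assertion passes.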

That is my comment in warmspringwinds/tf-image-segmentation#33

@tensorflowbutler
Member

Hi there,
We are checking to see if you still need help on this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing it.

6 participants