Training problem: after a few iterations training fails, NaN values appear and cause an AssertionError!!! #127

Open
xiaofengBian opened this issue Sep 17, 2022 · 1 comment

Comments

@xiaofengBian

C:\Users\bxf\anaconda3\envs\transt\python.exe C:/PyCharmProjects/TransT-main/ltr/run_training.py
Training: transt transt
WARNING: You are using tensorboardX instead sis you have a too old pytorch version.
loading annotations into memory...
Done (t=13.20s)
creating index...
index created!
number of params: 23016006
No matching checkpoint file found
[train: 1, 1 / 1000] FPS: 0.0 (0.0) , Loss/total: 12.99988 , Loss/ce: 0.69430 , Loss/bbox: 0.97997 , Loss/giou: 1.15687 , iou: 0.03106
[train: 1, 2 / 1000] FPS: 0.0 (5.1) , Loss/total: 13.18990 , Loss/ce: 0.67882 , Loss/bbox: 1.01086 , Loss/giou: 1.23913 , iou: 0.01553
[train: 1, 3 / 1000] FPS: 0.0 (5.1) , Loss/total: 13.00681 , Loss/ce: 0.69773 , Loss/bbox: 0.93112 , Loss/giou: 1.26818 , iou: 0.01083
[train: 1, 4 / 1000] FPS: 0.0 (5.3) , Loss/total: 12.93164 , Loss/ce: 0.69913 , Loss/bbox: 0.91258 , Loss/giou: 1.27109 , iou: 0.01094
[train: 1, 5 / 1000] FPS: 0.0 (4.9) , Loss/total: 12.94410 , Loss/ce: 0.69589 , Loss/bbox: 0.91288 , Loss/giou: 1.29008 , iou: 0.00936
[train: 1, 6 / 1000] FPS: 0.0 (5.1) , Loss/total: 12.90344 , Loss/ce: 0.69371 , Loss/bbox: 0.90170 , Loss/giou: 1.30679 , iou: 0.00780
Training crashed at epoch 1
Traceback for the error!
Traceback (most recent call last):
File "C:\PyCharmProjects\TransT-main\ltr\trainers\base_trainer.py", line 70, in train
self.train_epoch() # calls the train_epoch method defined in ltr/trainers/ltr_trainer.py
File "C:\PyCharmProjects\TransT-main\ltr\trainers\ltr_trainer.py", line 79, in train_epoch
self.cycle_dataset(loader) # calls the custom cycle_dataset method
File "C:\PyCharmProjects\TransT-main\ltr\trainers\ltr_trainer.py", line 60, in cycle_dataset
loss, stats = self.actor(data) # jumps into ltr/actors/tracking.py
File "C:\PyCharmProjects\TransT-main\ltr\actors\tracking.py", line 44, in __call__
loss_dict = self.objective(outputs, targets) # jumps to the forward method at line 182 of ltr/models/tracking/transt.py, which computes the loss
File "C:\Users\bxf\anaconda3\envs\transt\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "C:\PyCharmProjects\TransT-main\ltr\models\tracking\transt.py", line 204, in forward
losses.update(self.get_loss(loss, outputs, targets, indices, num_boxes_pos))
File "C:\PyCharmProjects\TransT-main\ltr\models\tracking\transt.py", line 180, in get_loss
return loss_map[loss](outputs, targets, indices, num_boxes)
File "C:\PyCharmProjects\TransT-main\ltr\models\tracking\transt.py", line 153, in loss_boxes
box_ops.box_cxcywh_to_xyxy(target_boxes))
File "C:\PyCharmProjects\TransT-main\util\box_ops.py", line 52, in generalized_box_iou
assert (boxes1[:, 2:] >= boxes1[:, :2]).all()
AssertionError
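
The assertion in the last frame requires x2 >= x1 and y2 >= y1 for every box in `boxes1`. Any comparison involving NaN evaluates to false, so a single NaN coordinate is enough to trip it, which would be consistent with the NaN mentioned in the title. The following is a minimal sketch in plain Python, not the TransT code: `check_xyxy_boxes` is a hypothetical helper and the sample boxes are made up. It only illustrates how one might tell NaN boxes apart from genuinely inverted ones before the assertion is reached.

```python
import math

def check_xyxy_boxes(boxes):
    """Return (index, reason) pairs for boxes that would fail
    the check x2 >= x1 and y2 >= y1 (hypothetical helper)."""
    problems = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        if not all(math.isfinite(v) for v in (x1, y1, x2, y2)):
            # NaN or inf coordinate: every ordering comparison with NaN is False
            problems.append((i, "non-finite"))
        elif x2 < x1 or y2 < y1:
            # Finite but inverted corners
            problems.append((i, "degenerate"))
    return problems

nan = float("nan")

# Made-up boxes in (x1, y1, x2, y2) format, for illustration only
boxes = [
    (0.1, 0.1, 0.5, 0.5),   # valid
    (nan, nan, nan, nan),   # what a diverged prediction could look like
    (0.6, 0.2, 0.4, 0.8),   # x2 < x1
]

print(nan >= nan)               # False, so the assertion fails on NaN boxes
print(check_xyxy_boxes(boxes))
```

If the report is "non-finite", the problem likely originates upstream of `generalized_box_iou` (for example in the loss or the optimizer step) rather than in the box conversion itself; the log alone does not show which.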

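@xiaofengBian changed the title from "Training problem: fails at the 7th iteration, NaN values appear and cause asserterror!!!" to "Training problem: after a few iterations training fails, NaN values appear and cause asserterror!!!" Sep 17, 2022
@xiaofengBian changed the title from "Training problem: after a few iterations training fails, NaN values appear and cause asserterror!!!" to "Training problem: after a few iterations training fails, NaN values appear and cause an AssertionError!!!" Sep 17, 2022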
@ChenJian7578

Has this been resolved? Also, if I want to train the model myself, how should the dataset paths and format be set up?
