Training with my own dataset fails with the following error:
2024-04-16 18:56:22 - DEBUG - Training epoch 0 with 0 samples
  File "/home/hyq/anaconda3/envs/cvnets/bin/cvnets-train", line 8, in <module>
    sys.exit(main_worker())
  File "/home/hyq/文档/ml-cvnets/main_train.py", line 235, in main_worker
    main(opts=opts, **kwargs)
  File "/home/hyq/anaconda3/envs/cvnets/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/hyq/文档/ml-cvnets/main_train.py", line 174, in main
    training_engine.run(train_sampler=train_sampler)
  File "/home/hyq/文档/ml-cvnets/engine/training_engine.py", line 606, in run
    train_loss, train_ckpt_metric = self.train_epoch(epoch)
  File "/home/hyq/文档/ml-cvnets/engine/training_engine.py", line 357, in train_epoch
    avg_loss = train_stats.avg_statistics(
  File "/home/hyq/文档/ml-cvnets/metrics/stats.py", line 148, in avg_statistics
    logger.error(
  File "/home/hyq/文档/ml-cvnets/utils/logger.py", line 46, in error
    traceback.print_stack()
2024-04-16 18:56:22 - LOGS - Training took 00:00:02.11
2024-04-16 18:56:22 - ERROR - total_loss not present in the dictionary. Available keys are: []. Exiting!!!
Command used for training: cvnets-train --common.config-file /home/hyq/下载/pspnet-mobilevitv2-1.0.yaml --common.results-loc segmentation_results
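The `Training epoch 0 with 0 samples` line suggests that no training images were found, which would explain why no `total_loss` was ever recorded. Below is a minimal sketch for checking how many files sit under the dataset root. It assumes an ADE20K-style layout (`images/training`, `images/validation`, `annotations/training`, `annotations/validation`); that layout, the extension list, and the helper names `count_images` and `summarize_dataset` are assumptions for illustration and are not taken from the cvnets code.

```python
from pathlib import Path

# Assumed set of image/mask extensions; adjust to match the actual dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def count_images(directory):
    """Return the number of image files directly inside `directory` (0 if it is missing)."""
    directory = Path(directory)
    if not directory.is_dir():
        return 0
    return sum(
        1
        for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )


def summarize_dataset(root, splits=("training", "validation")):
    """Count images and masks per split under an assumed ADE20K-style layout."""
    root = Path(root)
    summary = {}
    for split in splits:
        summary[split] = {
            "images": count_images(root / "images" / split),
            "annotations": count_images(root / "annotations" / split),
        }
    return summary


if __name__ == "__main__":
    import sys

    # Pass the value of dataset.root_train from the config as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for split, counts in summarize_dataset(root).items():
        print(split, counts)
```

If the training split reports zero images or zero annotations, the folder structure or file extensions probably differ from what the dataset class expects.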
pspnet-mobilevitv2-1.0.yaml:
common:
  run_label: "run_1"
  accum_freq: 1
  accum_after_epoch: -1
  log_freq: 200
  auto_resume: false
  mixed_precision: true
  grad_clip: 10.0
dataset:
  root_train: "/media/hyq/西部数据2TB/ml-cvnets_data/"
  root_val: "/media/hyq/西部数据2TB/ml-cvnets_data/"
  name: "ade20k1"
  category: "segmentation"
  train_batch_size0: 4 # effective batch size is 16 (4 * 4 GPUs)
  val_batch_size0: 4
  eval_batch_size0: 1
  workers: 4
  persistent_workers: false
  pin_memory: false
image_augmentation:
  random_crop:
    enable: true
    seg_class_max_ratio: 0.75
    pad_if_needed: true
    mask_fill: 0 # background idx is 0
  random_horizontal_flip:
    enable: true
  resize:
    enable: true
    size: [512, 512]
    interpolation: "bicubic"
  random_short_size_resize:
    enable: true
    interpolation: "bicubic"
    short_side_min: 256
    short_side_max: 768
    max_img_dim: 1024
  photo_metric_distort:
    enable: true
  random_rotate:
    enable: true
    angle: 10
    mask_fill: 0 # background idx is 0
  random_gaussian_noise:
    enable: true
sampler:
  name: "batch_sampler"
  bs:
    crop_size_width: 512
    crop_size_height: 512
loss:
  category: "segmentation"
  ignore_idx: -1
  segmentation:
    name: "cross_entropy"
    cross_entropy:
      aux_weight: 0.4
optim:
  name: "sgd"
  weight_decay: 1.e-4
  no_decay_bn_filter_bias: true
  sgd:
    momentum: 0.9
scheduler:
  name: "cosine"
  is_iteration_based: false
  max_epochs: 120
  cosine:
    max_lr: 0.02
    min_lr: 0.0002
model:
  segmentation:
    name: "encoder_decoder"
    lr_multiplier: 1
    seg_head: "pspnet"
    output_stride: 8
    use_aux_head: true
    activation:
      name: "relu"
    pspnet:
      psp_dropout: 0.1
      psp_out_channels: 512
      psp_pool_sizes: [ 1, 2, 3, 6 ]
  classification:
    name: "mobilevit_v2"
    mitv2:
      width_multiplier: 1.0
      attn_norm_layer: "layer_norm_2d"
    activation:
      name: "swish"
  normalization:
    name: "sync_batch_norm"
    momentum: 0.1
  activation:
    name: "swish"
    inplace: false
  layer:
    global_pool: "mean"
    conv_init: "kaiming_uniform"
    linear_init: "normal"
ema:
  enable: true
  momentum: 0.0005
stats:
  val: [ "loss", "iou" ]
  train: [ "loss", "grad_norm" ]
  checkpoint_metric: "iou"
  checkpoint_metric_max: true