FileNotFoundError: [Errno 2] No such file or directory: 'datasets/coco_13/trainval/00020596.jpg' #42
Comments
In addition, I wonder why I need the folder datasets/coco_13/trainval/ at all. The data preparation stage did not say that I need to create a folder named trainval.
Hello, I am also using a COCO-format dataset and have not run into the problem of images not being found. Could you check whether your dataset layout is wrong? It should be datasets/coco/(annotations, train2017, val2017). Or perhaps the DATASET entry in the YAML config has not been changed (DATASET: coco). Your weight file also appears to be the wrong one: you need to use repvit instead of sam.
However, I do get an error of the same kind as yours: ValueError: Caught ValueError in DataLoader worker process 0. I also hit a problem with distributed training later on. Have you encountered it? I don't know whether it is a version issue. Thanks for the help!
[2024-12-11 05:38:30 rep_vit_m1_fuse_sa_distill](train.py 186): INFO Start training Batch 0:
In the end I modified line 97 of /training/data/coco_dataset, changing the folder name to train/. Training now runs normally, but I hit ZeroDivisionError: division by zero during the final evaluation.
@gold123fish Sorry, I have not seen the same problem as you. Also, may I ask why you think I used the wrong weight file? Didn't the author say, in the teacher embedding step, to download weights/sam_vit_h_4b8939.pth? Why would I need repvit instead of sam? Thank you.
I see that I had already modified the line 98 you mentioned, but it still reports a problem with distributed training and I still cannot train. As for the weight file, I thought you had finished the Teacher Embeddings step and moved on to (Phase 1) Encoder-Only Knowledge Distillation, which requires repvit. That was my mistake.
Try running without distributed training first, on a single GPU, to see whether it works. That will tell you whether this is an environment problem or a CUDA problem.
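Before comparing single-GPU and distributed runs, a quick report of the local setup can help tell an environment or CUDA problem apart from a data problem. The following is only a minimal sketch and is not part of EdgeSAM: `describe_cuda_environment` is a hypothetical helper name, and it relies only on standard PyTorch attributes.

```python
def describe_cuda_environment():
    """Collect basic facts about the local PyTorch/CUDA setup."""
    try:
        import torch
    except ImportError:
        # PyTorch is not importable in this environment at all.
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


print(describe_cuda_environment())
```

If `cuda_available` is false or `gpu_count` is 0, the problem is likely in the environment rather than in the training code.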
The error shown in the title occurs when I run the code for the teacher embedding preparation step. I use the COCO dataset and have created the data preparation folders for annotations and images as instructed. Is there a problem with my setup? Thanks for the help!
[2024-12-10 19:32:17 vit_h](save_embedding.py 56): INFO number of params: 637026048
[2024-12-10 19:32:17 vit_h](utils.py 60): INFO ==============> Resuming form weights/sam_vit_h_4b8939.pth....................
[2024-12-10 19:32:18 vit_h](utils.py 75): INFO
[2024-12-10 19:32:19 vit_h](save_embedding.py 69): INFO Start saving embeddings
Traceback (most recent call last):
  File "training/save_embedding.py", line 238, in <module>
    main(config)
  File "training/save_embedding.py", line 79, in main
    save_embeddings_one_epoch(config, model, data_loader_train, epoch)
  File "/home/work/miniforge3/envs/edgesam/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "training/save_embedding.py", line 99, in save_embeddings_one_epoch
    for idx, ((samples, _), (keys, seeds)) in enumerate(data_loader):
  File "/home/work/miniforge3/envs/edgesam/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 634, in __next__
    data = self._next_data()
  File "/home/work/miniforge3/envs/edgesam/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/home/work/miniforge3/envs/edgesam/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/work/miniforge3/envs/edgesam/lib/python3.8/site-packages/torch/_utils.py", line 644, in reraise
    raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/work/miniforge3/envs/edgesam/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/work/miniforge3/envs/edgesam/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/work/miniforge3/envs/edgesam/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/work/EdgeSAM/training/data/augmentation/dataset_wrapper.py", line 31, in __getitem__
    return self.__getitem_for_write(index)
  File "/home/work/EdgeSAM/training/data/augmentation/dataset_wrapper.py", line 39, in __getitem_for_write
    item = self.dataset[index]
  File "/home/work/EdgeSAM/training/data/coco_dataset.py", line 98, in __getitem__
    img = Image.open(img_path).convert('RGB')
  File "/home/work/miniforge3/envs/edgesam/lib/python3.8/site-packages/PIL/Image.py", line 3431, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/home/work/EdgeSAM/datasets/coco_13/trainval/00020596.jpg'
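
The traceback above shows the loader joining an image file name onto a fixed folder (here datasets/coco_13/trainval/) and failing on the first file that is not there. One way to catch this before starting a long run is to check every file name in the annotation file against the image folder. This is a minimal sketch, assuming a standard COCO-style JSON with an "images" list whose entries carry "file_name". The helper `find_missing_images` and the annotation path in the usage comment are hypothetical, not part of EdgeSAM.

```python
import json
from pathlib import Path


def find_missing_images(annotation_file, image_dir):
    """Return file names listed in a COCO-style annotation file
    that do not exist as files under image_dir."""
    with open(annotation_file, "r", encoding="utf-8") as f:
        coco = json.load(f)
    image_dir = Path(image_dir)
    return [
        entry["file_name"]
        for entry in coco.get("images", [])
        if not (image_dir / entry["file_name"]).is_file()
    ]


# Example usage (the annotation path is a placeholder; adjust to your layout):
# missing = find_missing_images(
#     "datasets/coco_13/annotations/your_annotations.json",
#     "datasets/coco_13/trainval",
# )
# print(len(missing), "missing; first few:", missing[:5])
```

If every file is reported missing, the folder name the loader expects probably differs from the one on disk. If only a few are missing, the annotation file and the image set are likely out of sync.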