How to resolve the camelyon17 dataset issue #151
Comments
Hi, unfortunately I'm an English speaker, but it looks like you're having issues using Camelyon17 because it is not downloaded? I would run this script, with line 304 uncommented, to download the dataset: https://github.com/facebookresearch/DomainBed/blob/main/domainbed/scripts/download.py#L304 |
Hello, I have already downloaded the dataset through this link, and the path is /root/wangxy/AlignClip/DomainBed/domainbed/data/camelyon17_v1.0/. However, I keep getting an error saying that camelyon17 cannot be found. I'm not sure whether my file naming is correct, but I successfully ran other datasets such as PACS and OfficeHome. Do I need to handle the Camelyon17 dataset separately? |
Could you post here the command you use to run main.py, and the directory you run it from? To me it looks like args.root is set to |
main.zip |
Can you run with |
Hello, it's running now, but there's a problem. The camelyon17_v1.0 dataset contains raw dataset patches, and the patches are classified in the format patient_00X_node_X, which is necessary for it to work. However, I have already divided the dataset into hospital0, hospital1, hospital2, hospital3, hospital4, and now it's giving an error. Could you please explain why this is happening? |
Could you post the stack trace so I can have more information? |
Hi @yuu-Wang, it's pretty hard to understand what exactly is going on here without more details. In order to help you, I will need a minimal reproducible example including:
However, taking a quick look, I don't think the images need to be split into directories based on their hospital source. |
Hi, DomainBed/domainbed/scripts/download.py Line 304 in dad3ca3
2. Since I noticed that the WILDS environment in the file (DomainBed/domainbed/datasets.py Line 348 in dad3ca3
3. I used this (https://github.com/thuml/CLIPood/blob/bc0d8745e8b0d97b0873bd8ed8589793abd1c1a7/engine.py#L53) (https://github.com/thuml/CLIPood/blob/bc0d8745e8b0d97b0873bd8ed8589793abd1c1a7/converter_domainbed.py#L18) to divide the dataset into training, validation, and test sets. |
Steps 2 and 3 are not needed; the code takes care of this internally. Could you re-download with step 1, skip steps 2 and 3, and try again? If this does not work, could you please send a few lines of code of how exactly you are loading the dataset in your training code? |
Sure, I downloaded it directly according to step one and then ran main.py |
I see that you're running main.py from another repository, CLIPood, and not the DomainBed repository. I'm not very familiar with the CLIPood code. Could you run from an unmodified DomainBed codebase? |
Thank you very much for your patience. It's doable. |
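Hello, if I want to use Camelyon17, how should the dataset directory be structured? Is DomainBed/domainbed/data/camelyon17/ correct? It keeps reporting that the dataset was not found.
Traceback (most recent call last):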
File "/root/wangxy/AlignClip/main.py", line 155, in <module>
main(args)
File "/root/wangxy/AlignClip/main.py", line 44, in main
train_iter, val_loader, test_loaders, train_class_names, template = get_dataset(args)
File "/root/wangxy/AlignClip/engine.py", line 59, in get_dataset
converter_domainbed.get_domainbed_datasets(dataset_name=args.data, root=args.root, targets=args.targets,
File "/root/wangxy/AlignClip/converter_domainbed.py", line 21, in get_domainbed_datasets
datasets = vars(dbdatasets)[dataset_name](root, targets, hparams)
File "/root/wangxy/AlignClip/DomainBed/domainbed/datasets.py", line 347, in __init__
dataset = Camelyon17Dataset(root_dir=root)
File "/root/anaconda3/envs/pytorch_2.0.1/lib/python3.8/site-packages/wilds/datasets/camelyon17_dataset.py", line 64, in __init__
self._data_dir = self.initialize_data_dir(root_dir, download)
File "/root/anaconda3/envs/pytorch_2.0.1/lib/python3.8/site-packages/wilds/datasets/wilds_dataset.py", line 341, in initialize_data_dir
self.download_dataset(data_dir, download)
File "/root/anaconda3/envs/pytorch_2.0.1/lib/python3.8/site-packages/wilds/datasets/wilds_dataset.py", line 368, in download_dataset
raise FileNotFoundError(
FileNotFoundError: The camelyon17 dataset could not be found in DomainBed/domainbed/data/camelyon17_v1.0. Initialize the dataset with download=True to download the dataset. If you are using the example script, run with --download. This might take some time for large datasets.
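The path in the error message above is relative (DomainBed/domainbed/data/camelyon17_v1.0), so it is resolved against whatever directory main.py is launched from. A minimal diagnostic sketch, using only the Python standard library, that shows where such a path actually points and whether a directory exists there. The helper name `describe_dataset_dir` is hypothetical and not part of DomainBed or WILDS; the folder name `camelyon17_v1.0` is taken from the error message.

```python
import os
from pathlib import Path


def describe_dataset_dir(root, dirname="camelyon17_v1.0"):
    """Report where root/dirname resolves to and whether it exists.

    Hypothetical helper for debugging; `root` is the value passed as the
    dataset root, and `dirname` is the folder name from the error message.
    """
    # A relative root is resolved against the current working directory.
    resolved = (Path(root) / dirname).resolve()
    exists = resolved.is_dir()
    # List the top-level entries only when the directory is present.
    entries = sorted(p.name for p in resolved.iterdir()) if exists else []
    return {
        "cwd": os.getcwd(),
        "resolved": str(resolved),
        "exists": exists,
        "entries": entries,
    }


if __name__ == "__main__":
    # Root value as it appears in the error message above.
    print(describe_dataset_dir("DomainBed/domainbed/data"))
```

Running this from the same directory as main.py, with the same root value, shows whether the relative path lands on the downloaded folder. If "exists" is False, passing an absolute root is one thing to try. This sketch only checks that the directory is present; any further checks WILDS applies inside that directory are not covered here.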