Reproducibility of GDN #68
Comments
Would you be willing to share the code with me? I could not get GDN to work with newer versions of torch_geometric.
I tried to reproduce the results on the WADI and SWaT datasets on my machine, but my results are much worse than both the original paper's and the ones you got. If it is convenient, could you please send me a copy of the code prepared according to the ./scripts/readme.md file? Thank you very much. My email address is [email protected].
I ran into the same problem. Did you solve it? The results differ even with the same settings.
I also ran into this problem while working with GDN. I fixed it by adding `os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"` and `torch.use_deterministic_algorithms(True)` to the top of my code. After that, the results were identical across runs with the same random seed.
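For reference, here is a minimal sketch of how that fix can be combined with the usual seed fixing. The helper name and the seed value are just placeholders, and `torch.use_deterministic_algorithms` requires PyTorch >= 1.8.

```python
import os
# Must be set before cuBLAS is initialized so that deterministic matmul kernels are used.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import random
import numpy as np
import torch

def set_deterministic(seed: int = 0):
    # Fix all the relevant RNGs.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Force PyTorch to use deterministic kernels (raises an error if none exists).
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False

set_deterministic(42)
```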
Hello,
I am very impressed by your work and am trying to base my anomaly detection research on it.
The first thing I am trying to do is to reproduce the results for the SWaT dataset given in Table 2.
I followed the exact steps you provided in scripts/readme.md for SWaT preprocessing.
After running process_swat.py, the statistics of the final data are slightly different from those given in Table 1: my processed data contains 5 extra data points.
After creating train.csv, test.csv, and list.txt, I compared the created files with demo data (swat_train_demo.csv, swat_test_demo.csv) given in https://drive.google.com/drive/folders/1_4TlatKh-f7QhstaaY7YTSCs8D4ywbWc?usp=sharing.
However, the first 999 rows of the data didn't match.
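For reference, this is roughly how I compared the files. The local paths are placeholders for wherever process_swat.py wrote its output and where the demo CSVs were downloaded.

```python
import pandas as pd

# Placeholder paths: adjust to your processed output and the downloaded demo file.
mine = pd.read_csv("data/swat/train.csv")
demo = pd.read_csv("swat_train_demo.csv")

print("shapes:", mine.shape, demo.shape)

# Compare only the overlapping rows and columns, since the row counts differ.
common_cols = [c for c in mine.columns if c in demo.columns]
n = min(len(mine), len(demo))

# Exact comparison; np.isclose could be used instead for a tolerance-based check.
diff = mine.loc[:n - 1, common_cols].values != demo.loc[:n - 1, common_cols].values
print("first mismatching row:", diff.any(axis=1).argmax() if diff.any() else "none")
```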
Finally, I ran your code multiple times with the same seed and data to see whether the performance varies between runs. Unfortunately, fixing the seed did not help: the performance varied considerably between runs. (For reference, I used the hyperparameter settings from #4.) I also tried running the code on CPU, but the results are still not reproducible.
(1)
F1 score: 0.8163308589607635
precision: 0.9778963414634146
recall: 0.7007099945385036
(2)
F1 score: 0.7394631639063391
precision: 0.9926402943882244
recall: 0.5892954669579464
(3)
F1 score: 0.8220572640509013
precision: 0.9845020325203252
recall: 0.7054432914618606
(4)
F1 score: 0.8120639690887624
precision: 0.9895370128171593
recall: 0.6886947023484434
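For context, here is how I understand these point-wise metrics to be computed. This is a minimal sklearn-based sketch; whether it matches the repo's own evaluation code exactly is an assumption on my part, and the arrays are hypothetical.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Hypothetical arrays: 0/1 ground-truth attack labels and 0/1 predictions
# obtained by thresholding the anomaly scores.
labels = np.array([0, 0, 1, 1, 1, 0])
preds = np.array([0, 0, 1, 0, 1, 0])

print("precision:", precision_score(labels, preds))
print("recall:   ", recall_score(labels, preds))
print("F1 score: ", f1_score(labels, preds))
```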
How did you evaluate your model for the results reported in the paper? Have you come across this problem before?
The same thing happened for WADI as well.
My questions can be summarized as follows:
1. The data statistics are different.
2. The processed data and the demo data do not match.
3. The code is not reproducible with a fixed seed, for the WADI dataset as well.
4. The results are nowhere near the reported results in the paper.
Has anyone been successful at reproducing the results for SWaT and WADI?