Reproducibility of GDN #68
Comments
Would you be willing to share the code with me? I could not get GDN to work with newer versions of torch_geometric.
I tried to reproduce the results on the WADI and SWaT datasets on my machine, but my results are much worse than both the original paper's and the ones you got. If it is convenient, could you please send me a copy of the code prepared according to the ./scripts/readme.md file? Thank you very much. My email address is [email protected].
I ran into the same problem. Did you solve it? The results differ even with the same settings.
I also ran into this problem while working with GDN. I fixed it by adding `os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"` and `torch.use_deterministic_algorithms(True)` to the top of my code. After that, the results were identical across runs with the same random seed.
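For reference, here is a minimal sketch of how that fix can be combined with the usual seed fixing. The helper name and the seed value are just placeholders, and `torch.use_deterministic_algorithms` requires PyTorch >= 1.8.

```python
import os
# Must be set before cuBLAS is initialized so that deterministic matmul kernels are used.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import random
import numpy as np
import torch

def set_deterministic(seed: int = 0):
    # Fix all the relevant RNGs.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Force PyTorch to use deterministic kernels (raises an error if none exists).
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False

set_deterministic(42)
```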
Hello,
I am very impressed by your work and am trying to base my anomaly detection research on it.
The first thing I am trying to do is to reproduce the results for the SWaT dataset given in Table 2.
I followed the exact steps you provided in scripts/readme.md for SWaT preprocessing.
After running process_swat.py, the statistics of the final data are slightly different from those given in Table 1: my processed data contains 5 extra data points.
After creating train.csv, test.csv, and list.txt, I compared the created files with demo data (swat_train_demo.csv, swat_test_demo.csv) given in https://drive.google.com/drive/folders/1_4TlatKh-f7QhstaaY7YTSCs8D4ywbWc?usp=sharing.
However, the first 999 rows of the data didn't match.
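For reference, this is roughly how I compared the files. The local paths are placeholders for wherever process_swat.py wrote its output and where the demo CSVs were downloaded.

```python
import pandas as pd

# Placeholder paths: adjust to your processed output and the downloaded demo file.
mine = pd.read_csv("data/swat/train.csv")
demo = pd.read_csv("swat_train_demo.csv")

print("shapes:", mine.shape, demo.shape)

# Compare only the overlapping rows and columns, since the row counts differ.
common_cols = [c for c in mine.columns if c in demo.columns]
n = min(len(mine), len(demo))

# Exact comparison; np.isclose could be used instead for a tolerance-based check.
diff = mine.loc[:n - 1, common_cols].values != demo.loc[:n - 1, common_cols].values
print("first mismatching row:", diff.any(axis=1).argmax() if diff.any() else "none")
```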
Finally, I ran your code multiple times with the same seed and data to see whether the performance varies between runs. Unfortunately, fixing the seed did not help: the performance varied considerably between runs. (For reference, I used the hyperparameter settings from #4.) I also tried running the code on CPU, but the results are still not reproducible.
(1)
F1 score: 0.8163308589607635
precision: 0.9778963414634146
recall: 0.7007099945385036
(2)
F1 score: 0.7394631639063391
precision: 0.9926402943882244
recall: 0.5892954669579464
(3)
F1 score: 0.8220572640509013
precision: 0.9845020325203252
recall: 0.7054432914618606
(4)
F1 score: 0.8120639690887624
precision: 0.9895370128171593
recall: 0.6886947023484434
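For context, here is how I understand these point-wise metrics to be computed. This is a minimal sklearn-based sketch; whether it matches the repo's own evaluation code exactly is an assumption on my part, and the arrays are hypothetical.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Hypothetical arrays: 0/1 ground-truth attack labels and 0/1 predictions
# obtained by thresholding the anomaly scores.
labels = np.array([0, 0, 1, 1, 1, 0])
preds = np.array([0, 0, 1, 0, 1, 0])

print("precision:", precision_score(labels, preds))
print("recall:   ", recall_score(labels, preds))
print("F1 score: ", f1_score(labels, preds))
```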
How did you evaluate your model for the results reported in the paper? Have you come across this problem before?
The same thing happened for WADI as well.
My questions can be summarized as follows:
1. The data statistics are different.
2. The processed data and the demo data do not match.
3. The code is not reproducible with a fixed seed, for the WADI dataset as well.
4. The results are nowhere near the reported results in the paper.
Has anyone been successful at reproducing the results for SWaT and WADI?