about experiments hyperparameters #84

Open
adverbial03 opened this issue Nov 27, 2023 · 1 comment

Comments

@adverbial03

Hello, thanks for sharing your excellent work!

I have some specific questions about the hyperparameter choices in the experiments and hope you can answer them:

  1. In GDN.py, the class OutLayer contains code for stacking multiple layers (i.e., layer_num > 1), but when OutLayer is instantiated, layer_num=1 is used. Why is this? Are there experimental results or analyses supporting this parameter choice? (See the sketch right after this list for how I read the stacking logic.)
  2. In class GraphLayer, there is a design for a multi-head attention mechanism (heads > 1), but the chosen configuration uses heads=1. I think multiple heads could help mine richer temporal information. Why wasn't this done, and have you conducted experiments related to this decision? (The generic multi-head sketch at the end of this comment shows what I mean.)
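
For reference on point 1, here is roughly how I read the stacking logic (a simplified sketch with illustrative names and dimensions, not the repo's actual OutLayer code):

```python
import torch.nn as nn

class StackedOutLayer(nn.Module):
    """Simplified sketch of an output MLP whose depth is controlled by layer_num.
    (Illustrative only -- not the actual OutLayer implementation.)"""
    def __init__(self, in_dim, layer_num=1, inter_dim=256):
        super().__init__()
        layers = []
        for i in range(layer_num):
            last = (i == layer_num - 1)
            layer_in = in_dim if i == 0 else inter_dim
            layer_out = 1 if last else inter_dim
            layers.append(nn.Linear(layer_in, layer_out))
            if not last:
                layers.append(nn.ReLU())
        self.mlp = nn.Sequential(*layers)

    def forward(self, x):
        # x: (batch, in_dim) -> (batch, 1), one predicted value per sample
        return self.mlp(x)

# layer_num=1 collapses to a single Linear(in_dim, 1);
# layer_num=2 becomes Linear -> ReLU -> Linear.
```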

I think this is an excellent paper, and I would like to know more about the experimental details and analysis. Is there a version of the paper with an appendix?
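
To clarify what I mean in point 2, here is a generic multi-head example using PyG's GATConv. I know GraphLayer is a custom layer, so this is only meant to illustrate the heads parameter, and the dimensions are made up:

```python
import torch
from torch_geometric.nn import GATConv

# Single-head vs. multi-head attention over the same graph (toy dimensions).
single_head = GATConv(in_channels=64, out_channels=64, heads=1)
multi_head = GATConv(in_channels=64, out_channels=64, heads=4, concat=False)  # heads are averaged

x = torch.randn(27, 64)                      # 27 nodes (e.g. sensors), 64-dim features
edge_index = torch.randint(0, 27, (2, 100))  # random edges, only for shape checking

out_1 = single_head(x, edge_index)  # shape (27, 64), one attention head
out_4 = multi_head(x, edge_index)   # shape (27, 64), average of 4 attention heads
```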

@d-ailin
Owner

d-ailin commented Nov 28, 2023

Thanks for your kind words and interest in our work.

  1. layer_num does not have to be set to 1; other values can be passed in run.sh, and the best choice can vary across datasets. For example, when we tested on SWaT, the performance with layer_num=2 was close to that with layer_num=1, so we chose layer_num=1 for simplicity in that case. In short, the hyperparameter can be chosen based on model performance, such as the reconstruction error on the validation set during training (a small sweep like the sketch after this list is enough).
  2. Yes, I agree that multi-head attention could be better than using a single head. We also tested multi-head attention, but in our cases it did not improve the results much and was close to the single-head result. Still, this could vary across datasets.
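
Concretely, the selection in point 1 can be a small sweep along these lines, where train_and_validate is a placeholder for the usual training loop and the candidate values are only examples:

```python
import itertools

def train_and_validate(layer_num: int, heads: int) -> float:
    # Placeholder: run the normal training procedure with these hyperparameters
    # and return the validation error (e.g. forecasting error on the validation split).
    raise NotImplementedError

candidates = {"layer_num": [1, 2], "heads": [1, 2, 4]}

best_cfg, best_val = None, float("inf")
for layer_num, heads in itertools.product(candidates["layer_num"], candidates["heads"]):
    val_err = train_and_validate(layer_num=layer_num, heads=heads)
    if val_err < best_val:
        best_cfg, best_val = (layer_num, heads), val_err

print(f"best (layer_num, heads): {best_cfg}, validation error: {best_val:.4f}")
```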

We don't have an additional appendix for the paper, but please feel free to ask if there are any other questions. Thanks!
