Pretrained Model #1
Comments
Hi Nicholas! Note that we train a single network for both the policy and the value function, and we train both the feature layers (convolution layers) and the control layers (LSTM + linear layers).
@sparisi, thanks for getting back to me so quickly! I was referring to model parameters (e.g., weights and biases) in my earlier remark, so as to avoid re-training the agent myself from scratch. Do you provide these? To your comment, I will be sure to reinitialize the appropriate parts of the network upon transfer, per your advice and suggestions in the paper, and take only the parameters I need!
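A minimal sketch of "taking only the parameters I need", assuming the checkpoint is a flat mapping from parameter names to values, as a PyTorch `state_dict` is. The key names (`conv.`, `lstm.`, `policy_head.`) and the helper `select_parameters` are hypothetical and not taken from the C-BET code; which parts to keep and which to reinitialize is likewise an assumption here.

```python
from typing import Any, Dict, Tuple


def select_parameters(state: Dict[str, Any], keep_prefixes: Tuple[str, ...]) -> Dict[str, Any]:
    """Return only the entries whose names start with one of keep_prefixes."""
    return {name: value for name, value in state.items() if name.startswith(keep_prefixes)}


# Hypothetical checkpoint contents; real names depend on the released model.
pretrained = {
    "conv.0.weight": [0.1, 0.2],
    "lstm.weight_ih": [0.3],
    "policy_head.weight": [0.4],
}

# Keep the feature and recurrent layers; the dropped head would be reinitialized.
transferred = select_parameters(pretrained, ("conv.", "lstm."))
print(sorted(transferred))
```

With real PyTorch modules, a filtered mapping like this could be passed to `load_state_dict(..., strict=False)`, so the omitted layers keep their fresh initialization.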
Oh, I see. We do not set those parameters; we just let PyTorch initialize them randomly according to its default initializer. All seeds are fixed before we initialize the models in order to reproduce the same results.
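The effect of fixing the seed before initialization can be illustrated with a small sketch. It uses Python's standard `random` module as a stand-in for PyTorch's initializer, and `init_weights` is a hypothetical helper rather than anything from the C-BET code.

```python
import random


def init_weights(seed: int, n: int = 4) -> list:
    """Draw n initial values from a generator seeded before any sampling."""
    rng = random.Random(seed)  # fix the seed first, then initialize
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


run_a = init_weights(seed=1)
run_b = init_weights(seed=1)
run_c = init_weights(seed=2)

print(run_a == run_b)  # same seed, same initial values
print(run_a == run_c)  # different seed, different initial values
```

The same idea carries over to PyTorch, where the seed is set (for example with `torch.manual_seed`) before the model is constructed.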
I do not have access to the same resources as you for pretraining, but would still love to try transferring a pre-trained control system to explore another environment -- I think that may be feasible on my system or a Colab notebook. Is there any chance that you uploaded the model parameters somewhere (i.e., path/to/model.tar)?
Models are available as code release. Let me know if you can run them!
Wow, thanks so much! I just requested access. Did you intend to make them public (must be set in sharing settings)?
Yes, I wanted to make them public, but it seems that with my account I cannot allow access just by sharing a link. For now I gave you access; I will find a better way to share the models later.
Great! I was able to reproduce some numbers from seed 1: I'm curious as to why it's possible to visit a fraction of a state in Habitat, unless I'm missing something, which I probably am :-). Thank you for making these models available to me -- I really appreciate it. Hopefully others will benefit as well -- I can see some fun, exciting transfer applications for the Habitat model in particular!
Great! Those numbers seem the same as the plots in the paper, so it works :) Yes, I think Habitat could transfer well to other environments, whereas MiniGrid is more limited.
One of my favorite components of the C-BET paper was the proposed paradigm shift from tabula-rasa exploration for each task to a system where new environments are explored with the context carried over from a pretrained model. I've found that a practical starting point for similar procedures on other large models (e.g., BERTs, ResNets) is to obtain a copy of the pre-trained model. I'd love to start working with C-BET as well!
I'm very curious as to where I might be able to find the C-BET parameters from your paper. Looking forward to experimenting with this!