Pretrained Model #1

rothn · 2022-01-06T21:20:34Z

One of my favorite components of the C-BET paper was the proposed paradigm shift from tabula-rasa exploration for each task to a system where new environments are explored with the context carried over from a pretrained model. I've found that a practical starting point for similar procedures on other large models (e.g., BERTs, ResNets) is to obtain a copy of the pre-trained model. I'd love to start working with C-BET as well!

I'm very curious as to where I might be able to find the C-BET parameters from your paper. Looking forward to experimenting with this!

sparisi · 2022-01-06T21:58:37Z

Hi Nicholas!
Happy that you liked CBET :)
The parameters for training the policy/value network are all here and here. The ones defined in the slurm script override the default ones defined in the argument file.

Mind that we train one single network for both policy and value function, and we train both the feature layers (convolution layers) and the control layers (LSTM + linear layers).
If you want to use models like BERT and ResNet, it would make sense to replace the convolution layers and train only the control layers.
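
A minimal sketch of that idea in PyTorch, assuming a small stand-in network: the feature layers are frozen and only the control layers (LSTM + linear heads) are passed to the optimizer. `TinyAgent`, its layer sizes, and the learning rate are hypothetical and are not taken from the C-BET code; in practice the `features` module would be swapped for the pretrained encoder.

```python
import torch
import torch.nn as nn


class TinyAgent(nn.Module):
    """Hypothetical stand-in: conv features -> LSTM -> policy and value heads."""

    def __init__(self, num_actions=4):
        super().__init__()
        # Feature layers (replace with a pretrained encoder in practice).
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, stride=2), nn.ReLU(), nn.Flatten()
        )
        # Control layers: one shared LSTM, then policy and value heads.
        self.core = nn.LSTM(input_size=8 * 3 * 3, hidden_size=16, batch_first=True)
        self.policy = nn.Linear(16, num_actions)
        self.value = nn.Linear(16, 1)

    def forward(self, obs):
        # obs has shape (batch, time, 3, 8, 8).
        b, t = obs.shape[:2]
        x = self.features(obs.flatten(0, 1)).view(b, t, -1)
        x, _ = self.core(x)
        return self.policy(x), self.value(x)


agent = TinyAgent()

# Freeze the feature layers so they receive no gradient updates.
for p in agent.features.parameters():
    p.requires_grad = False
agent.features.eval()

# Only the control layers are handed to the optimizer.
trainable = [p for p in agent.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-4)
```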

rothn · 2022-01-06T22:02:37Z

@sparisi, thanks for getting back to me so quickly! In my earlier remark I was referring to the trained model parameters (e.g., weights and biases), so that I can avoid retraining the agent from scratch. Do you provide these?

To your comment, I will be sure to reinitialize the appropriate parts of the network upon transfer, per your advice and suggestions in the paper, and take only the parameters I need!
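
As a rough sketch of what "take only the parameters I need" could look like in PyTorch: filter the pretrained state dict by key prefix and load it non-strictly, so the remaining layers keep their fresh initialization. The `make_net` helper, the layer names, and the choice of which layers to keep are all hypothetical here and do not reflect the actual C-BET checkpoint layout.

```python
import torch
import torch.nn as nn


def make_net(num_actions):
    # Hypothetical two-part network: shared features plus a task-specific head.
    return nn.ModuleDict(
        {"features": nn.Linear(8, 16), "head": nn.Linear(16, num_actions)}
    )


pretrained = make_net(num_actions=4)  # stands in for the released model
target = make_net(num_actions=6)      # new task with a different action space

# Keep only the tensors whose names start with the chosen prefix.
kept = {
    name: tensor
    for name, tensor in pretrained.state_dict().items()
    if name.startswith("features.")
}

# strict=False leaves the layers that were not provided at their fresh init.
result = target.load_state_dict(kept, strict=False)
print(result.missing_keys)
```

The returned `missing_keys` list is a convenient check that only the intended layers were left untouched.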

sparisi · 2022-01-06T22:16:26Z

Oh I see. We do not set those parameters; we just let PyTorch initialize them randomly with its default initializer. All seeds are fixed before we initialize the models so that the results are reproducible.
We used seeds 1, 2, 3, ..., 7.
We did not notice much deviation in performance across runs.
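
For illustration, a minimal sketch of fixing the seed before building a model, so that two runs start from identical weights. `set_seed` and `build_model` are hypothetical helpers, not functions from the C-BET repository.

```python
import random

import torch
import torch.nn as nn


def set_seed(seed):
    # Seed Python's RNG and PyTorch's RNGs before any model is created.
    random.seed(seed)
    torch.manual_seed(seed)


def build_model():
    # Hypothetical stand-in for the policy/value network.
    return nn.Linear(4, 2)


set_seed(1)
model_a = build_model()

set_seed(1)
model_b = build_model()

# With the same seed, the default initializer yields identical parameters.
same_init = all(
    torch.equal(p, q) for p, q in zip(model_a.parameters(), model_b.parameters())
)
```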

rothn · 2022-01-06T22:25:42Z

I do not have access to the same resources as you for pretraining, but would still love to try transferring a pre-trained control system to explore another environment -- I think that may be feasible on my system or a colab notebook. Is there any chance that you uploaded the model parameters somewhere (i.e., path/to/model.tar)?

sparisi · 2022-01-07T04:37:59Z

Models are available as code release. Let me know if you can run them!

rothn · 2022-01-07T04:39:44Z

Wow, thanks so much! I just requested access. Did you intend to make them public (must be set in sharing settings)?

sparisi · 2022-01-07T07:05:47Z

Yes, I wanted to make them public but it seems that with my account I cannot allow access just by sharing a link. For now I gave you access, I will find a better way for sharing the models later.

rothn · 2022-01-07T21:04:39Z

Great! I was able to reproduce some numbers from seed 1:
Habitat Apartment 0 -> Hotel 0: visited_states = 2171.23
MiniGrid multi -> KeyCorridorS3R3: episodic win = 81.00

I'm curious as to why it's possible to visit a fraction of a state in Habitat, unless I'm missing something, which I probably am :-).

Thank you for making these models available to me -- I really appreciate it. Hopefully others will benefit as well. I can see some fun, exciting transfer applications for the Habitat model in particular!

sparisi · 2022-01-07T23:00:10Z

Great! Those numbers seem the same as the plots in the paper, so it works :)
If you got those numbers running the test script, the state count is averaged over 100 episodes. So if your visited states counts are [100, 105, 101, 100] (assuming 4 episodes) your final count will be 101.5.
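
The arithmetic from that example, as a short Python check (the variable names are only illustrative):

```python
from statistics import mean

# Visited-state counts for the 4 example episodes above.
visited_states_per_episode = [100, 105, 101, 100]

# The reported number is the mean over episodes, so it can be fractional.
average_visited = mean(visited_states_per_episode)
print(average_visited)  # 101.5
```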

Yes I think Habitat could transfer well to other environments, whereas MiniGrid is more limited.
