This repository houses an open-source implementation of our paper "Accelerating Interface Adaptation with User-Friendly Priors".
It implements two environments (Treasure and Highway) and four interface optimizers (Bayes, LIMIT, Convexity, Proportionality). The Convexity and Proportionality optimizers can be used with or without the tuning algorithm LIMIT; see Arguments for details.
All dependencies required to run the code in this repository are listed in requirements.txt; call pip install -r requirements.txt to install them.
The command-line arguments accepted by main.py are shown below:
    usage: main.py [-h] [--env {treasure,highway}] [--model {bayes,limit,ours-c,ours-p,ours-pc}]
                   [--dof DOF] [--lr LR] [--batch-size BATCH_SIZE] [--epochs EPOCHS]
                   [--episodes EPISODES] [--prior-weight PRIOR_WEIGHT]

    options:
      -h, --help            show this help message and exit
      --env {treasure,highway}
                            Environment to use.
      --model {bayes,limit,ours-c,ours-p,ours-pc}
                            Interface Optimizer to use.
      --dof DOF             DoF for the treasure environment. Ignored for highway environment.
      --lr LR               Learning Rate for interface and human models
      --batch-size BATCH_SIZE
                            Batch size for interface and human models
      --epochs EPOCHS       Number of epochs to train interface and human
      --episodes EPISODES   Number of episodes to perform interaction
      --prior-weight PRIOR_WEIGHT
                            Weight of prior
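For readers who want to script against this interface, the help text above corresponds roughly to an argparse parser like the sketch below. The flag names, choices, and help strings are taken from the usage message; the argument types and every default value are assumptions made for illustration and may differ from what main.py actually uses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Approximate the main.py command line (types and defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--env", choices=["treasure", "highway"],
                        default="treasure", help="Environment to use.")
    parser.add_argument("--model",
                        choices=["bayes", "limit", "ours-c", "ours-p", "ours-pc"],
                        default="ours-pc", help="Interface Optimizer to use.")
    # All numeric defaults below are placeholders, not values from the repository.
    parser.add_argument("--dof", type=int, default=2,
                        help="DoF for the treasure environment. Ignored for highway environment.")
    parser.add_argument("--lr", type=float, default=1e-3,
                        help="Learning Rate for interface and human models")
    parser.add_argument("--batch-size", type=int, default=32,
                        help="Batch size for interface and human models")
    parser.add_argument("--epochs", type=int, default=10,
                        help="Number of epochs to train interface and human")
    parser.add_argument("--episodes", type=int, default=10,
                        help="Number of episodes to perform interaction")
    parser.add_argument("--prior-weight", type=float, default=1.0,
                        help="Weight of prior")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--env", "highway", "--model", "ours-pc", "--prior-weight", "0.5"]
)
print(args.env, args.model, args.prior_weight)
```

Note that argparse exposes hyphenated flags with underscores, so --batch-size and --prior-weight are read as args.batch_size and args.prior_weight.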
The four interface optimizers can be combined in eight different ways:
- Bayes: follows the Bayesian Optimization process by treating the environment and the user as Gaussian Processes [1]
- LIMIT: uses an information-theoretic model to train the interface in a task-agnostic way [2]
- Convexity: attempts to form a convex signal manifold for all $\theta \in \Theta$
- Proportionality: attempts to form proportional signals about the space $\Theta$
Convexity and Proportionality are combined to form the model "Ours-PC".
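One way to picture how the --model choices relate to the two priors is the sketch below. The mapping from "ours-c" and "ours-p" to Convexity and Proportionality is inferred from the names alone, and the additive weighting by --prior-weight is an assumption for illustration; neither is confirmed by the repository, and the names MODEL_PRIORS and total_loss are hypothetical.

```python
from typing import Dict, FrozenSet

# Hypothetical lookup: which priors each --model choice is assumed to enable.
MODEL_PRIORS: Dict[str, FrozenSet[str]] = {
    "bayes": frozenset(),
    "limit": frozenset(),
    "ours-c": frozenset({"convexity"}),
    "ours-p": frozenset({"proportionality"}),
    "ours-pc": frozenset({"convexity", "proportionality"}),
}


def total_loss(base_loss: float, prior_losses: Dict[str, float],
               model: str, prior_weight: float) -> float:
    """Add the active prior terms to a base loss, scaled by prior_weight.

    This additive form is an assumed illustration, not the paper's objective.
    """
    active = MODEL_PRIORS[model]
    penalty = sum(prior_losses[name] for name in sorted(active))
    return base_loss + prior_weight * penalty


# Toy numbers chosen only to make the arithmetic easy to follow.
example_priors = {"convexity": 0.5, "proportionality": 0.25}
print(total_loss(1.0, example_priors, "ours-pc", prior_weight=2.0))
```

In this sketch a model with no active priors returns the base loss unchanged, whatever the weight.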
Note that the weight of the prior in training can be adjusted using the
"--prior-weight" flag, setting this value to
In the Treasure environment, a simulated agent is attempting to navigate to a
hidden goal position
The environment has dynamics:
To launch this environment, call python main.py --env treasure --dof <dim>,
where <dim> specifies the dimension of the environment.
In the Highway environment, a simulated human is attempting to pass an autonomous vehicle. The
autonomous vehicle's policy is parameterized by hidden information
The environment has dynamics
To launch this environment, call python main.py --env highway.
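To run several configurations in a row, the launch commands for both environments can be assembled programmatically. The sketch below only builds the argument lists and does not execute anything; the helper name launch_command and the DoF values 2, 3, and 4 are illustrative assumptions, while the flags themselves come from the usage message.

```python
from typing import List, Optional


def launch_command(env: str, dof: Optional[int] = None,
                   model: Optional[str] = None) -> List[str]:
    """Build the argv list for one run of main.py without executing it."""
    cmd = ["python", "main.py", "--env", env]
    # --dof only applies to the treasure environment, so omit it otherwise.
    if env == "treasure" and dof is not None:
        cmd += ["--dof", str(dof)]
    if model is not None:
        cmd += ["--model", model]
    return cmd


# Illustrative sweep: three treasure dimensions plus one highway run.
commands = [launch_command("treasure", dof=d) for d in (2, 3, 4)]
commands.append(launch_command("highway"))

for cmd in commands:
    print(" ".join(cmd))
```

Each list can then be passed to subprocess.run(cmd, check=True) from the repository root to start the run.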
[1]: Schulz, Eric, Maarten Speekenbrink, and Andreas Krause. "A tutorial on Gaussian process regression: Modelling, exploring, and exploiting functions." Journal of Mathematical Psychology 85 (2018): 1-16.
[2]: Christie, Benjamin A., and Dylan P. Losey. "LIMIT: Learning interfaces to maximize information transfer." arXiv preprint arXiv:2304.08539 (2023).