Update README v2
elisaalboni committed Nov 3, 2023
1 parent 398f39f commit 9b88948
Showing 1 changed file with 24 additions and 3 deletions.
27 changes: 24 additions & 3 deletions README.md
@@ -1,6 +1,6 @@
# CACTO: Continuous Actor-Critic algorithm with Trajectory Optimization

- ***main*** implements CACTO with state = *[x,t]*. Inputs: test-n (default: 0), system-id (default:'-'), seed (default: None), recover-training-flag (default: False), nb-cpus (default: 15), and w-S (default: 0).
**Files**:
- ***main*** implements CACTO with state = *[x,t]*. Inputs: test-n, system-id, seed, recover-training-flag, nb-cpus, and w-S.
- ***TO*** implements the TO problem of the selected *system* whose end effector has to reach a target state while avoiding an obstacle and ensuring low control effort. The TO problem is modelled in *CasADi* and solved with *ipopt*.
- ***RL*** implements the actor-critic RL problem of the selected *system* whose end effector has to reach a target state while avoiding an obstacle and ensuring low control effort. It creates the state trajectory and controls to initialize TO.
- ***NeuralNetwork*** contains the functions to create the NN-models and to compute the quantities needed to update them.
@@ -12,4 +12,25 @@
- ***system_conf*** configures the training for the selected *system*.
- ***urdf*** contains *system* URDF file (double integrator and manipulator).

*Systems*: single integrator (system-id: single_integrator), double integrator (system-id: double_integrator), car (system-id: 'car'), car_park (system-id: 'car_park'), and 3 DOF planar manipulator (system-id: manipulator)
**Systems**:
single integrator (system-id: single_integrator), double integrator (system-id: double_integrator), car (system-id: car), car_park (system-id: car_park), and 3-DOF planar manipulator (system-id: manipulator)

**Inputs**:
| Argument Name | Type | Default | Choices | Help |
|-------------------------|--------|---------|------------------------------------------------------------------------------------------------------|-------------------------------------|
| `--test-n` | int | 0 | | Test number |
| `--seed` | int | 0 | | Random and tf.random seed |
| `--system-id` | str | 'single_integrator' | single_integrator, double_integrator, car, car_park, manipulator, ur5 | System-id (single_integrator, double_integrator, car, car_park, manipulator, ur5) |
| `--recover-training-flag` | bool | False | True, False | Flag to recover training |
| `--nb-cpus` | int | 2 | | Number of TO problems solved in parallel |
| `--w-S` | float | 0 | | Sobolev training - weight of the value related error |
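
The table above can be read as an argument-parser definition. The following is a minimal sketch, assuming `main.py` relies on Python's standard `argparse` module; the parser in the repository may be written differently, and the names `build_parser`, `str_to_bool`, and `SYSTEM_IDS` are hypothetical and used only for illustration.

```python
import argparse

# Choices copied from the table above.
SYSTEM_IDS = ["single_integrator", "double_integrator", "car",
              "car_park", "manipulator", "ur5"]


def str_to_bool(value):
    """Hypothetical helper: turn 'True'/'False' strings into booleans."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser():
    """Build a parser whose arguments and defaults mirror the table (assumed layout)."""
    parser = argparse.ArgumentParser(description="CACTO training (illustrative sketch)")
    parser.add_argument("--test-n", type=int, default=0, help="Test number")
    parser.add_argument("--seed", type=int, default=0, help="Random and tf.random seed")
    parser.add_argument("--system-id", type=str, default="single_integrator",
                        choices=SYSTEM_IDS, help="System-id")
    parser.add_argument("--recover-training-flag", type=str_to_bool, default=False,
                        help="Flag to recover training")
    parser.add_argument("--nb-cpus", type=int, default=2,
                        help="Number of TO problems solved in parallel")
    parser.add_argument("--w-S", type=float, default=0.0,
                        help="Sobolev training - weight of the value related error")
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(
    ["--system-id=single_integrator", "--seed=0", "--nb-cpus=15", "--w-S=1e-2", "--test-n=0"]
)
print(args.system_id, args.seed, args.nb_cpus, args.w_S, args.test_n)
```

Note that `argparse` converts dashes in option names to underscores, so `--nb-cpus` is read back as `args.nb_cpus` and `--w-S` as `args.w_S`.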


Example of usage:

```python3 main.py --system-id='single_integrator' --seed=0 --nb-cpus=15 --w-S=1e-2 --test-n=0```
- The "single_integrator" system is selected;
- All the seeds are set to 0;
- 15 TO problems are solved in parallel (if enough resources are available);
- The weight of the value-error is set to 1e-2 (the weight of the value-gradient-error is set to 1). Note that w-S=0 corresponds to the standard CACTO algorithm (without Sobolev-Learning);
- The information about the test and the results are stored in the folder N_try_0.
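
To repeat the run above for several seeds, the command line can be assembled programmatically. This is a sketch only: `build_command` and `results_folder` are hypothetical helpers that are not part of the repository, and the `N_try_<test-n>` folder pattern is an assumption generalized from the single `N_try_0` example.

```python
import shlex


def build_command(system_id, seed=0, nb_cpus=2, w_S=0.0, test_n=0):
    """Hypothetical helper: assemble the argument list for one training run."""
    return [
        "python3", "main.py",
        f"--system-id={system_id}",
        f"--seed={seed}",
        f"--nb-cpus={nb_cpus}",
        f"--w-S={w_S}",
        f"--test-n={test_n}",
    ]


def results_folder(test_n):
    """Assumed naming pattern, inferred from the N_try_0 example above."""
    return f"N_try_{test_n}"


# One command per seed; each seed gets its own test number (illustrative choice).
commands = [
    build_command("single_integrator", seed=s, nb_cpus=15, w_S=1e-2, test_n=s)
    for s in range(3)
]
for cmd in commands:
    print(shlex.join(cmd), "->", results_folder(int(cmd[-1].split("=")[1])))
```

Each argument list can then be handed to `subprocess.run(cmd, check=True)` to launch the runs one after another.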
