From 9b889486104e76cc2784d3b954de69c561985cac Mon Sep 17 00:00:00 2001 From: elisaalboni Date: Fri, 3 Nov 2023 21:34:19 +0100 Subject: [PATCH] Update README v2 --- README.md | 27 ++++++++++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 84c5ef0..09a3ce2 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # CACTO: Continuous Actor-Critic algorithm with Trajectory Optimization - -- ***main*** implements CACTO with state = *[x,t]*. Inputs: test-n (default: 0), system-id (default:'-'), seed (default: None), recover-training-flag (default: False), nb-cpus (default: 15), and w-S (default: 0). +**Files**: +- ***main*** implements CACTO with state = *[x,t]*. Inputs: test-n, system-id, seed, recover-training-flag, nb-cpus, and w-S. - ***TO*** implements the TO problem of the selected *system* whose end effector has to reach a target state while avoiding an obstacle and ensuring low control effort. The TO problem is modelled in *CasADi* and solved with *ipopt*. - ***RL*** implements the acotr-critic RL problem of the selected *system* whose end effector has to reach a target state while avoiding an obstacle and ensuring low control. It creates the state trajectory and controls to initialize TO. - ***NeuralNetwork*** contains the functions to create the NN-models and to compute the quantities needed to update them. @@ -12,4 +12,25 @@ - ***system_conf*** configures the training for the selected *system*. - ***urdf*** contains *system* URDF file (double integrator and manipulator). -*Systems*: single integrator (system-id: single_integrator), double integrator (system-id: double_integrator), car (system-id: 'car'), car_park (system-id: 'car_park'), and 3 DOF planar manipulator (system-id: manipulator) +**Systems**: +single integrator (system-id: single_integrator), double integrator (system-id: double_integrator), car (system-id: 'car'), car_park (system-id: 'car_park'), and 3 DOF planar manipulator (system-id: manipulator) + +**Inputs**: +| Argument Name | Type | Default | Choices | Help | +|-------------------------|--------|---------|------------------------------------------------------------------------------------------------------|-------------------------------------| +| `--test-n` | int | 0 | | Test number | +| `--seed` | int | 0 | | Random and tf.random seed | +| `--system-id` | str | 'single_integrator' | single_integrator, double_integrator, car, car_park, manipulator, ur5 | System-id (single_integrator, double_integrator, car, car_park, manipulator, ur5) | +| `--recover-training-flag` | bool | False | True, False | Flag to recover training | +| `--nb-cpus` | int | 2 | | Number of TO problems solved in parallel | +| `--w-S` | float | 0 | | Sobolev training - weight of the value related error | + + +Example of usage: + +```python3 main.py --system-id='single_integrator' --seed=0 --nb-cpus=15 --w-S=1e-2 --test-n=0``` +- The "single_integrator" system is selected; +- All the seeds are set to 0; +- 15 TO problems are solved in parallel (if enough resources are available); +- The weight of the value-error is set to 1e-2 (the value-gradient-error is set to 1). Note that w-S=0 corresponds to the standard CACTO algorithm (without Sobolev-Learning); +- The information about the test and the results are stored in the folder N_try_0. \ No newline at end of file