diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..33ed6fa
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,12 @@
+*.npy
+*.npz
+*.pkl
+*.json
+*.png
+*.zip
+generated/*
+nn_models/*
+data/*
+.idea/
+**/c_generated_code/
+**/__pycache__/
\ No newline at end of file
diff --git a/README.md b/README.md
index 7f6e19e..39a63df 100644
--- a/README.md
+++ b/README.md
@@ -5,11 +5,6 @@ The algorithms use Optimal Control Problems (OCPs) to generate training data and
 ## Algorithms
 
-### Active Learning (AL)
-
-The Active Learning algorithm, referred to as "AL" solves OCPs for system's initial states to verify from which of these states it is possible to stop (reach the zero-velocity set).
-AL leverages then Active Learning techniques to iteratively select batches of new states to be tested to maximize the resulting NN classifier accuracy.
-
 ### Viability-Boundary Optimal Control (VBOC)
 
 The Viability-Boundary Optimal Control algorithm, referred to as "VBOC," utilizes OCPs to directly compute states that lie exactly on the boundary of the viability set and uses an NN regressor to approximate the set.
@@ -19,40 +14,38 @@ The Viability-Boundary Optimal Control algorithm, referred to as "VBOC," utilize
 
 The Hamilton-Jacobi Reachability algorithm, referred to as "HJR," is an adaptation of a reachability algorithm presented in the paper "Recursive Regression with Neural Networks: Approximating the HJI PDE Solution" by V. Rubies-Royo and C. Tomlin. HJR computes the solution of the Hamilton-Jacobi-Isaacs (HJI) Partial Differential Equation (PDE) through recursive regression. NN classifiers are employed to approximate the set.
 
-## Additional Information
-
-For more details, please refer to the paper "VBOC: Learning the Viability Boundary of a Robot Manipulator using Optimal Control" by A. La Rocca, M. Saveriano, and A. Del Prete. The paper provides analysis and comparison of the algorithms mentioned above.
-
-## References
-
-- A. La Rocca, M. Saveriano, A. Del Prete, "VBOC: Learning the Viability Boundary of a Robot Manipulator using Optimal Control," arXiv:2305.07535 [cs.RO], 2023. Available at: http://arxiv.org/abs/2305.07535
-- V. Rubies-Royo and C. Tomlin, "Recursive Regression with Neural Networks: Approximating the HJI PDE Solution," in 5th International Conference on Learning Representations, 2017
-- A. Chakrabarty, C. Danielson, S. D. Cairano, and A. Raghunathan, "Active Learning for Estimating Reachable Sets for Systems With Unknown Dynamics", IEEE Transactions on Cybernetics, 2020
-
-## Dependencies
-
-The algorithms are implemented in Python and they rely on:
-
-- [PyTorch with CUDA](https://pytorch.org/) for the NNs training
-- [ACADOS](https://docs.acados.org/python_interface/) to solve the OCPs
-- [matplotlib](https://pypi.org/project/matplotlib/), [numpy](https://pypi.org/project/numpy/), [scipy](https://pypi.org/project/scipy/), [casadi](https://pypi.org/project/casadi/), [urdf2casadi](https://pypi.org/project/urdf2casadi/) (optional)
-
-## Usage
+### Active Learning (AL)
 
-The main folder contains scripts for the generation of test data and for the comparison of the performance of the algorithms. The subfolders "AL", "VBOC" and "HJR" contain the implementation of the different algorithms for the computation of the Viability Kernels of 1, 2 and 3 DOFs systems.
+The Active Learning algorithm, referred to as "AL," solves OCPs for the system's initial states to verify from which of these states it is possible to stop (reach the zero-velocity set).
+AL then leverages Active Learning techniques to iteratively select batches of new states to be tested, so as to maximize the resulting NN classifier accuracy.
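+
+As an illustration of how a learned set approximation can be queried, the following sketch mirrors the viability check used in the repository's plotting code; `is_viable` is a hypothetical name, and `model`, `mean` and `std` stand for a trained VBOC regressor and its input-normalization constants:
+```
+import numpy as np
+import torch
+
+def is_viable(model, q, dq, mean, std):
+    # The VBOC regressor predicts the largest viable velocity norm for
+    # normalized joint positions and a unit velocity direction; a state
+    # is taken as viable if its velocity norm does not exceed that value.
+    vel_norm = np.linalg.norm(dq)
+    q_norm = (np.asarray(q) - mean) / std
+    dq_dir = np.asarray(dq) / vel_norm if vel_norm > 0 else np.zeros_like(dq)
+    inp = torch.tensor(np.concatenate([q_norm, dq_dir]), dtype=torch.float32)
+    with torch.no_grad():
+        return vel_norm <= model(inp).item()
+```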
-Work in progress: in the "VBOC" folder you can also find other subfolders containing first studies on the application of VBOC to problems with cartesian constraints, to an UR5 robot manipulator, and some first tests on the usage of the learned sets to ensure recursive feasibility with Model Predictive Control.
+## Installation
+- Clone the repository\
+`git clone https://github.com/idra-lab/VBOC.git`
+- Install the requirements\
+`pip install -r requirements.txt`
+- Follow the instructions to install [CasADi](https://web.casadi.org/get/), [Acados](https://docs.acados.org/installation/index.html) and [PyTorch](https://pytorch.org/get-started/locally/).
-To try the algorithms and compare their performance you have to first generate the test data for the selected system (either "pendulum", "doublependulum", "triplependulum") in the main folder:
+## Usage
+Run the script `main.py` inside the `scripts` folder. Consult the help for the available options:
 ```
-python3 _testdata.py
+cd scripts
+python3 main.py --help
 ```
-then compute the set with the three algorithms ("al", "vboc", "hjr") within the relative subfolder:
+For example:
+- Find the states on the boundary of the viability kernel through VBOC
 ```
-python3 _.py
+python3 main.py -v
 ```
-and then compare the resulting set approximations:
+- Use NN regression to learn an approximation of the N-control invariant set
 ```
-python3 _comparison.py
+python3 main.py -t
 ```
+## References
+
+- A. La Rocca, M. Saveriano, A. Del Prete, "VBOC: Learning the Viability Boundary of a Robot Manipulator using Optimal Control", IEEE Robotics and Automation Letters, 2023
+- V. Rubies-Royo and C. Tomlin, "Recursive Regression with Neural Networks: Approximating the HJI PDE Solution", in 5th International Conference on Learning Representations, 2017
+- A. Chakrabarty, C. Danielson, S. D. Cairano, and A. 
Raghunathan, "Active Learning for Estimating Reachable Sets for Systems With Unknown Dynamics", IEEE Transactions on Cybernetics, 2020 + + diff --git a/doublependulum_comparison.py b/TestData/doublependulum_comparison.py similarity index 99% rename from doublependulum_comparison.py rename to TestData/doublependulum_comparison.py index a1bd089..deeed75 100644 --- a/doublependulum_comparison.py +++ b/TestData/doublependulum_comparison.py @@ -28,7 +28,7 @@ model_dir = NeuralNetDIR(4, 300, 1).to(device) criterion_dir = nn.MSELoss() model_dir.load_state_dict(torch.load('VBOC/model_2dof_vboc')) -data_reverse = np.load('VBOC/data_2dof_vboc.npy') +data_reverse = np.load('VBOC/2dof_vboc.npy') mean_dir = torch.load('VBOC/mean_2dof_vboc') std_dir = torch.load('VBOC/std_2dof_vboc') diff --git a/doublependulum_testdata.py b/TestData/doublependulum_testdata.py similarity index 100% rename from doublependulum_testdata.py rename to TestData/doublependulum_testdata.py diff --git a/pendulum_comparison.py b/TestData/pendulum_comparison.py similarity index 100% rename from pendulum_comparison.py rename to TestData/pendulum_comparison.py diff --git a/pendulum_testdata.py b/TestData/pendulum_testdata.py similarity index 100% rename from pendulum_testdata.py rename to TestData/pendulum_testdata.py diff --git a/triplependulum_comparison.py b/TestData/triplependulum_comparison.py similarity index 100% rename from triplependulum_comparison.py rename to TestData/triplependulum_comparison.py diff --git a/triplependulum_testdata.py b/TestData/triplependulum_testdata.py similarity index 100% rename from triplependulum_testdata.py rename to TestData/triplependulum_testdata.py diff --git a/VBOC/Cartesian constraints/doublependulum_class_fixedveldir.py b/VBOC/Cartesian constraints/doublependulum_class_fixedveldir.py deleted file mode 100644 index 80f52e3..0000000 --- a/VBOC/Cartesian constraints/doublependulum_class_fixedveldir.py +++ /dev/null @@ -1,308 +0,0 @@ -from acados_template import AcadosOcp, AcadosOcpSolver, AcadosSim, AcadosSimSolver -import numpy as np -from acados_template import AcadosModel -from casadi import SX, vertcat, cos, sin, sqrt - -class OCPdoublependulum: - def __init__(self): - - # --------------------SET MODEL-------------------- - # ------------------------------------------------- - model_name = "double_pendulum_ode" - - # constants - self.m1 = 0.4 # mass of the first link [kself.g] - self.m2 = 0.4 # mass of the second link [kself.g] - self.g = 9.81 # self.gravity constant [m/s^2] - self.l1 = 0.8 # lenself.gth of the first link [m] - self.l2 = 0.8 # lenself.gth of the second link [m] - - # states - theta1 = SX.sym("theta1") - theta2 = SX.sym("theta2") - dtheta1 = SX.sym("dtheta1") - dtheta2 = SX.sym("dtheta2") - dt = SX.sym('dt') - self.x = vertcat(theta1, theta2, dtheta1, dtheta2, dt) - - # controls - C1 = SX.sym("C1") - C2 = SX.sym("C2") - u = vertcat(C1, C2) - - # parameters - w1 = SX.sym("w1") - w2 = SX.sym("w2") - wt = SX.sym("wt") - p = vertcat(w1, w2, wt) - - # dynamics - f_expl = dt*vertcat( - dtheta1, - dtheta2, - ( - self.l1**2 - * self.l2 - * self.m2 - * dtheta1**2 - * sin(-2 * theta2 + 2 * theta1) - + 2 * C2 * cos(-theta2 + theta1) * self.l1 - + 2 - * ( - self.g * sin(-2 * theta2 + theta1) * self.l1 * self.m2 / 2 - + sin(-theta2 + theta1) * dtheta2**2 * self.l1 * self.l2 * self.m2 - + self.g * self.l1 * (self.m1 + self.m2 / 2) * sin(theta1) - - C1 - ) - * self.l2 - ) - / self.l1**2 - / self.l2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - ( - -self.g - * 
self.l1 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + 2 * theta1) - - self.l1 - * self.l2**2 - * self.m2**2 - * dtheta2**2 - * sin(-2 * theta2 + 2 * theta1) - - 2 - * dtheta1**2 - * self.l1**2 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + theta1) - + 2 * C1 * cos(-theta2 + theta1) * self.l2 * self.m2 - + self.l1 - * (self.m1 + self.m2) - * (sin(theta2) * self.g * self.l2 * self.m2 - 2 * C2) - ) - / self.l2**2 - / self.l1 - / self.m2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - 0. - ) - - self.model = AcadosModel() - - self.model.f_expl_expr = f_expl - self.model.x = self.x - self.model.u = u - self.model.p = p - self.model.name = model_name - # ------------------------------------------------- - - # ---------------------SET OCP--------------------- - # ------------------------------------------------- - self.ocp = AcadosOcp() - - # times - self.N = 100 - self.ocp.solver_options.tf = self.N - self.ocp.dims.N = self.N - - # ocp model - self.ocp.model = self.model - - # set constraints - self.Cmax = 10. - self.thetamax = np.pi / 4 + np.pi - self.thetamin = - np.pi / 4 + np.pi - self.dthetamax = 10. - - # cost - self.ocp.cost.cost_type_0 = 'EXTERNAL' - self.ocp.cost.cost_type = 'EXTERNAL' - - self.ocp.model.cost_expr_ext_cost_0 = w1 * dtheta1 + w2 * dtheta2 + wt * dt - self.ocp.model.cost_expr_ext_cost = wt * dt - self.ocp.parameter_values = np.array([0., 0., 0.]) - - self.ocp.constraints.lbu = np.array([-self.Cmax, -self.Cmax]) - self.ocp.constraints.ubu = np.array([self.Cmax, self.Cmax]) - self.ocp.constraints.idxbu = np.array([0, 1]) - self.ocp.constraints.lbx = np.array( - [self.thetamin, self.thetamin, -self.dthetamax, -self.dthetamax, 0.] - ) - self.ocp.constraints.ubx = np.array( - [self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, 1e-2] - ) - self.ocp.constraints.idxbx = np.array([0, 1, 2, 3, 4]) - self.ocp.constraints.lbx_e = np.array( - [self.thetamin, self.thetamin, -self.dthetamax, -self.dthetamax, 0.] 
- ) - self.ocp.constraints.ubx_e = np.array( - [self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, 1e-2] - ) - self.ocp.constraints.idxbx_e = np.array([0, 1, 2, 3, 4]) - self.ocp.constraints.lbx_0 = np.array([self.thetamin, self.thetamin, -self.dthetamax, -self.dthetamax, 0.]) - self.ocp.constraints.ubx_0 = np.array([self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, 1e-2]) - self.ocp.constraints.idxbx_0 = np.array([0, 1, 2, 3, 4]) - - self.ocp.constraints.C = np.zeros((2,5)) - self.ocp.constraints.D = np.zeros((2,2)) - self.ocp.constraints.lg = np.zeros((2)) - self.ocp.constraints.ug = np.zeros((2)) - - self.radius = self.l2/4 - self.x_c = 0 - self.y_c = - self.l1 - self.l2/2 - - self.model.con_h_expr = (self.l1*sin(theta1) + self.l2*sin(theta2) - self.x_c)**2 + (self.l1*cos(theta1) + self.l2*cos(theta2) - self.y_c)**2 - self.ocp.constraints.lh = np.array([(self.radius)**2]) - self.ocp.constraints.uh = np.array([1e6]) - - # ------------------------------------------------- - - self.ocp.solver_options.nlp_solver_type = "SQP" - self.ocp.solver_options.hessian_approx = 'EXACT' - self.ocp.solver_options.exact_hess_constr = 0 - # self.ocp.solver_options.exact_hess_cost = 0 - self.ocp.solver_options.exact_hess_dyn = 0 - self.ocp.solver_options.nlp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_iter_max = 100 - self.ocp.solver_options.nlp_solver_max_iter = 1000 - self.ocp.solver_options.globalization = "MERIT_BACKTRACKING" - self.ocp.solver_options.alpha_reduction = 0.3 - self.ocp.solver_options.alpha_min = 1e-2 - self.ocp.solver_options.levenberg_marquardt = 1e-5 - -class OCPdoublependulumINIT(OCPdoublependulum): - def __init__(self): - - # inherit initialization - super().__init__() - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, json_file="acados_ocp.json") - - def OCP_solve(self, x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub): - - # Reset current iterate: - self.ocp_solver.reset() - - # Set parameters, guesses and constraints: - for i in range(self.N): - self.ocp_solver.set(i, 'x', x_sol_guess[i]) - self.ocp_solver.set(i, 'u', u_sol_guess[i]) - self.ocp_solver.set(i, 'p', p) - self.ocp_solver.constraints_set(i, 'lbx', q_lb) - self.ocp_solver.constraints_set(i, 'ubx', q_ub) - self.ocp_solver.constraints_set(i, 'lbu', u_lb) - self.ocp_solver.constraints_set(i, 'ubu', u_ub) - self.ocp_solver.constraints_set(i, 'C', np.zeros((2,5))) - self.ocp_solver.constraints_set(i, 'D', np.zeros((2,2))) - self.ocp_solver.constraints_set(i, 'lg', np.zeros((2))) - self.ocp_solver.constraints_set(i, 'ug', np.zeros((2))) - - C = np.zeros((2,5)) - d = np.array([p[:2].tolist()]) - dt = np.transpose(d) - C[:,2:4] = np.identity(2)-np.matmul(dt,d) - self.ocp_solver.constraints_set(0, "C", C, api='new') - - self.ocp_solver.constraints_set(0, "lbx", q_init_lb) - self.ocp_solver.constraints_set(0, "ubx", q_init_ub) - - self.ocp_solver.constraints_set(self.N, "lbx", q_fin_lb) - self.ocp_solver.constraints_set(self.N, "ubx", q_fin_ub) - self.ocp_solver.set(self.N, 'x', x_sol_guess[-1]) - self.ocp_solver.set(self.N, 'p', p) - - # Solve the OCP: - status = self.ocp_solver.solve() - - return status - - -class SYMdoublependulumINIT(OCPdoublependulum): - def __init__(self): - - # inherit initialization - super().__init__() - - # states - theta1 = SX.sym("theta1") - theta2 = SX.sym("theta2") - dtheta1 = SX.sym("dtheta1") - dtheta2 = SX.sym("dtheta2") - self.x = vertcat(theta1, theta2, 
dtheta1, dtheta2) - - # controls - C1 = SX.sym("C1") - C2 = SX.sym("C2") - u = vertcat(C1, C2) - - p = [] - - # dynamics - f_expl = vertcat( - dtheta1, - dtheta2, - ( - self.l1**2 - * self.l2 - * self.m2 - * dtheta1**2 - * sin(-2 * theta2 + 2 * theta1) - + 2 * C2 * cos(-theta2 + theta1) * self.l1 - + 2 - * ( - self.g * sin(-2 * theta2 + theta1) * self.l1 * self.m2 / 2 - + sin(-theta2 + theta1) * dtheta2**2 * self.l1 * self.l2 * self.m2 - + self.g * self.l1 * (self.m1 + self.m2 / 2) * sin(theta1) - - C1 - ) - * self.l2 - ) - / self.l1**2 - / self.l2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - ( - -self.g - * self.l1 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + 2 * theta1) - - self.l1 - * self.l2**2 - * self.m2**2 - * dtheta2**2 - * sin(-2 * theta2 + 2 * theta1) - - 2 - * dtheta1**2 - * self.l1**2 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + theta1) - + 2 * C1 * cos(-theta2 + theta1) * self.l2 * self.m2 - + self.l1 - * (self.m1 + self.m2) - * (sin(theta2) * self.g * self.l2 * self.m2 - 2 * C2) - ) - / self.l2**2 - / self.l1 - / self.m2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - ) - - self.model.f_expl_expr = f_expl - self.model.x = self.x - self.model.p = p - self.model.u = u - - sim = AcadosSim() - sim.model = self.model - sim.solver_options.T = 1e-2 - sim.solver_options.num_stages = 4 - self.acados_integrator = AcadosSimSolver(sim) diff --git a/VBOC/Cartesian constraints/my_nn.py b/VBOC/Cartesian constraints/my_nn.py deleted file mode 100644 index a124b9b..0000000 --- a/VBOC/Cartesian constraints/my_nn.py +++ /dev/null @@ -1,18 +0,0 @@ -import torch.nn as nn - -class NeuralNetRegression(nn.Module): - def __init__(self, input_size, hidden_size, output_size): - super(NeuralNetRegression, self).__init__() - self.linear_relu_stack = nn.Sequential( - nn.Linear(input_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, output_size), - nn.ReLU(), - ) - - def forward(self, x): - out = self.linear_relu_stack(x) - return out - \ No newline at end of file diff --git a/VBOC/Cartesian constraints/plots_2dof.py b/VBOC/Cartesian constraints/plots_2dof.py deleted file mode 100644 index eee7c0c..0000000 --- a/VBOC/Cartesian constraints/plots_2dof.py +++ /dev/null @@ -1,298 +0,0 @@ -import numpy as np -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import warnings -warnings.filterwarnings("ignore") -import torch -import torch.nn as nn -import random -from matplotlib.patches import Circle -from doublependulum_class_fixedveldir import OCPdoublependulumINIT - -def plots_2dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device,ocp): #, model_dir_2, mean_dir_2, std_dir_2 - - # Plot all training data: - plt.figure(figsize=(6, 4)) - plt.scatter(X_save[:,0],X_save[:,1],s=0.1) - plt.xlim([q_min, q_max]) - plt.ylim([q_min, q_max]) - plt.grid(True) - plt.ylabel('$q_2$ (rad)') - plt.xlabel('$q_1$ (rad)') - plt.title("Training dataset positions") - plt.figure(figsize=(6, 4)) - plt.scatter(X_save[:,2],X_save[:,3],s=0.1) - plt.grid(True) - plt.xlim([v_min, v_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_2$ (rad/s)') - plt.xlabel('$\dot{q}_1$ (rad/s)') - plt.title("Training dataset velocities") - - l1 = ocp.l1 - l2 = ocp.l2 - theta1 = np.pi / 4 + np.pi - theta2 = np.pi + np.pi / 8 - - circle = plt.Circle((ocp.x_c, ocp.y_c), ocp.radius, color='b') - fig, ax = plt.subplots(figsize=(4, 4)) - ax.add_patch(circle) - # 
for theta1 in np.linspace(ocp.thetamin, ocp.thetamax, 10): - # for theta2 in np.linspace(ocp.thetamin, ocp.thetamax, 10): - plt.plot(0, 0, marker = 'o', color='k', markersize=10) - plt.plot([0,l1*np.sin(theta1),l1*np.sin(theta1) + l2*np.sin(theta2)], [0,l1*np.cos(theta1),l1*np.cos(theta1) + l2*np.cos(theta2)], marker = 'o', color='k') - plt.plot([0,l1*np.sin(theta1-np.pi/12),l1*np.sin(theta1-np.pi/12) + l2*np.sin(theta2-np.pi/12)], [0,l1*np.cos(theta1-np.pi/12),l1*np.cos(theta1-np.pi/12) + l2*np.cos(theta2-np.pi/12)], marker = 'o', color='gray',alpha=0.2) - plt.xlim([-l1-l2+0.7, l1+l2-0.7]) - plt.ylim([-l1-l2-0.1, 0.1]) - plt.axis('off') - - # Show the resulting set approximation: - with torch.no_grad(): - h = 0.01 - xx, yy = np.meshgrid(np.arange(q_min, q_max, h), np.arange(v_min, v_max, h)) - xrav = xx.ravel() - yrav = yy.ravel() - - plt.figure() - inp = np.c_[ - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - np.zeros(yrav.shape[0]), - yrav, - np.empty(yrav.shape[0]), - ] - for i in range(inp.shape[0]): - vel_norm = norm([inp[i][2],inp[i][3]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][2] = inp[i][2] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][4] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][4] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(X_save.shape[0]): - # if ( - # norm(X_save[i][0] - (q_min + q_max) / 2) < 0.01 - # and norm(X_save[i][2]) < 0.1 - # ): - # xit.append(X_save[i][1]) - # yit.append(X_save[i][3]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_2$ (rad/s)') - plt.xlabel('$q_2$ (rad)') - plt.grid() - plt.title("Set section at $q_1=\pi$ rad and $\dot{q}_1=0$ rad/s") - - plt.figure() - inp = np.c_[ - xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - for i in range(inp.shape[0]): - vel_norm = norm([inp[i][2],inp[i][3]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][2] = inp[i][2] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][4] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][4] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(X_save.shape[0]): - # if ( - # norm(X_save[i][1] - (q_min + q_max) / 2) < 0.01 - # and norm(X_save[i][3]) < 0.1 - # ): - # xit.append(X_save[i][0]) - # yit.append(X_save[i][2]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_1$S (rad/s)') - plt.xlabel('$q_1$ (rad)') - plt.grid() - plt.title("Set section at $q_2=\pi$ rad and $\dot{q}_2=0$ rad/s") - - # Plots: - h = 0.05 - xx, yy = np.meshgrid(np.arange(v_min, v_max+h, h), np.arange(v_min, v_max+h, h)) - xrav = xx.ravel() - yrav = yy.ravel() - - for _ in range(5): - q1ran = q_min + random.random() * (q_max-q_min) - q2ran = q_min + random.random() * 
(q_max-q_min) - # q1ran = np.pi - np.arcsin(0.6) - # q2ran = np.pi + np.arcsin(0.6) - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[ - q1ran * np.ones(xrav.shape[0]), - q2ran * np.ones(xrav.shape[0]), - xrav, - yrav, - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i][2],inp[i][3]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][2] = inp[i][2] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][4] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][4] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(X_save.shape[0]): - if ( - norm(X_save[i][0] - q1ran) < 0.01 - and norm(X_save[i][1] - q2ran) < 0.01 - ): - xit.append(X_save[i][2]) - yit.append(X_save[i][3]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([v_min, v_max]) - plt.ylim([v_min, v_max]) - plt.grid() - plt.title("q1="+str(q1ran)+" q2="+str(q2ran)+" RT") - - # # Plot the results: - # plt.figure() - # inp = np.float32( - # np.c_[ - # q1ran * np.ones(xrav.shape[0]), - # q2ran * np.ones(xrav.shape[0]), - # xrav, - # yrav, - # np.empty(yrav.shape[0]), - # ] - # ) - # for i in range(inp.shape[0]): - # vel_norm = norm([inp[i][2],inp[i][3]]) - # inp[i][0] = (inp[i][0] - mean_dir_2) / std_dir_2 - # inp[i][1] = (inp[i][1] - mean_dir_2) / std_dir_2 - # if vel_norm != 0: - # inp[i][2] = inp[i][2] / vel_norm - # inp[i][3] = inp[i][3] / vel_norm - # inp[i][4] = vel_norm - # out = (model_dir_2(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - # y_pred = np.empty(out.shape) - # for i in range(len(out)): - # if inp[i][4] > out[i]: - # y_pred[i] = 0 - # else: - # y_pred[i] = 1 - # Z = y_pred.reshape(xx.shape) - # plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(X_save.shape[0]): - # if ( - # norm(X_save[i][0] - q1ran) < 0.01 - # and norm(X_save[i][1] - q2ran) < 0.01 - # ): - # xit.append(X_save[i][2]) - # yit.append(X_save[i][3]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - # plt.xlim([v_min, v_max]) - # plt.ylim([v_min, v_max]) - # plt.grid() - # plt.title("q1="+str(q1ran)+" q2="+str(q2ran)+" RT2") - - plt.show() - -# ocp = OCPdoublependulumINIT() -# sim = SYMdoublependulumINIT() - -# # Position, velocity and torque bounds: -# v_max = ocp.dthetamax -# v_min = - ocp.dthetamax -# q_max = ocp.thetamax -# q_min = ocp.thetamin -# tau_max = ocp.Cmax - -# device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # pytorch device - -# input_layers = ocp.ocp.dims.nx - 1 -# hidden_layers = (input_layers - 1) * 100 -# output_layers = 1 -# learning_rate = 1e-3 - -# # Load training data: -# X_save = np.load('data_' + str(2) + 'dof_vboc_' + str(int(v_max)) + '.npy') - -# # Model and optimizer: -# model_dir = NeuralNetRegression(input_layers, hidden_layers, output_layers).to(device) -# criterion_dir = nn.MSELoss() -# optimizer_dir = torch.optim.Adam(model_dir.parameters(), lr=learning_rate) - -# mean_dir = torch.load('mean_' + str(2) + 'dof_vboc_' + str(int(v_max)) + '_' + str(hidden_layers)) -# std_dir = torch.load('std_' + str(2) + 'dof_vboc_' + str(int(v_max)) + '_' + str(hidden_layers)) - -# 
model_dir.load_state_dict(torch.load('model_2dof_vboc_10_300_0.5_0.73790205')) - -# # mean_dir_2 = torch.load('mean_' + str(2) + 'dof_vboc_' + str(int(v_max)) + '_' + str(hidden_layers) + '_nocons') -# # std_dir_2 = torch.load('std_' + str(2) + 'dof_vboc_' + str(int(v_max)) + '_' + str(hidden_layers) + '_nocons') - -# # model_dir_2 = NeuralNetRegression(input_layers, hidden_layers, output_layers).to(device) -# # model_dir_2.load_state_dict(torch.load('model_2dof_vboc_10_300_0.5_2.4007833_nocons')) - -# plots_2dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device) #, model_dir_2, mean_dir_2, std_dir_2 - -# plt.show() \ No newline at end of file diff --git a/VBOC/Cartesian constraints/vboc_multiprocessing.py b/VBOC/Cartesian constraints/vboc_multiprocessing.py deleted file mode 100644 index caf0c00..0000000 --- a/VBOC/Cartesian constraints/vboc_multiprocessing.py +++ /dev/null @@ -1,784 +0,0 @@ -import numpy as np -import random -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import time -# from triplependulum_class_fixedveldir import OCPtriplependulumINIT, SYMtriplependulumINIT -from doublependulum_class_fixedveldir import OCPdoublependulumINIT, SYMdoublependulumINIT -import warnings -warnings.filterwarnings("ignore") -import torch -import torch.nn as nn -from my_nn import NeuralNetRegression -from multiprocessing import Pool -from torch.utils.data import DataLoader -from plots_2dof import plots_2dof -# from plots_3dof import plots_3dof -import torch.nn.utils.prune as prune - -def testing_test(v): - - # Reset the time parameters: - N = N_start - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - # Initialization of the OCP: The OCP is set to find an extreme trajectory. The initial joint positions - # are set to random values, except for the reference joint whose position is set to an extreme value. - # The initial joint velocities are left free. The final velocities are all set to 0. The OCP has to maximise - # the initial velocity norm in a predefined direction. - - # Initial velocity optimization direction: - p = np.zeros((system_sel+1)) - - for l in range(system_sel): - p[l] = random.random() * random.choice([-1, 1]) - - norm_weights = norm(p) - p = p/norm_weights - - # Bounds on the initial state: - q_init_lb = np.full((ocp.ocp.dims.nx), v_min) - q_init_ub = np.full((ocp.ocp.dims.nx), v_max) - q_init_lb[-1] = dt_sym - q_init_ub[-1] = dt_sym - - q_init_oth = np.empty((system_sel,)) - - for l in range(system_sel): - q_init_oth[l] = q_min + random.random() * (q_max-q_min) - - q_init_lb[l] = q_init_oth[l] - q_init_ub[l] = q_init_oth[l] - - # State and input bounds: - q_lb = np.copy(q_init_lb) - q_ub = np.copy(q_init_ub) - - for l in range(system_sel): - q_lb[l] = q_min - q_ub[l] = q_max - - u_lb = np.full((system_sel), -tau_max) - u_ub = np.full((system_sel), tau_max) - - # Bounds on the final state: - q_fin_lb = np.copy(q_lb) - q_fin_ub = np.copy(q_ub) - - for l in range(system_sel): - q_fin_lb[l+system_sel] = 0. - q_fin_ub[l+system_sel] = 0. - - # Guess: - x_sol_guess = np.empty((N, ocp.ocp.dims.nx)) - u_sol_guess = np.zeros((N, ocp.ocp.dims.nu)) - - x_guess = np.zeros((ocp.ocp.dims.nx,)) - - for l in range(system_sel): - x_guess[l] = q_init_oth[l] - - x_guess[-1] = dt_sym - - x_sol_guess[:] = x_guess - - cost_old = 1e6 - all_ok = False - - # Iteratively solve the OCP with an increased number of time steps until the solution does not change. 
- # If the solver fails, try with a slightly different initial condition: - while True: - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - cost_new = ocp.ocp_solver.get_cost() - - if cost_new > cost_old - tol: # the time is sufficient to have achieved an optimal solution - all_ok = True - break - - cost_old = cost_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N+1,ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N+1,ocp.ocp.dims.nu)) - for i in range(N): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N] = ocp.ocp_solver.get(N, "x") - u_sol_guess[N] = np.zeros((ocp.ocp.dims.nu)) - - # Increase the number of time steps: - N = N + 1 - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - else: - break - - if all_ok: - return [ocp.ocp_solver.get(0, "x")] - else: - return None - -def testing(v): - - valid_data = np.ndarray((0, ocp.ocp.dims.nx - 1)) - - # Reset the time parameters: - N = N_start - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - # Initialization of the OCP: The OCP is set to find an extreme trajectory. The initial joint positions - # are set to random values, except for the reference joint whose position is set to an extreme value. - # The initial joint velocities are left free. The final velocities are all set to 0. The OCP has to maximise - # the initial velocity norm in a predefined direction. - - # Selection of the reference joint: - joint_sel = random.choice(range(system_sel)) - - # Selection of the start position of the reference joint: - vel_sel = random.choice([-1, 1]) # -1 to maximise initial vel, + 1 to minimize it - - # Initial velocity optimization direction: - p = np.zeros((system_sel+1)) - - for l in range(system_sel): - if l == joint_sel: - p[l] = random.random() * vel_sel - else: - p[l] = random.random() * random.choice([-1, 1]) - - norm_weights = norm(p) - p = p/norm_weights - - # Bounds on the initial state: - q_init_lb = np.full((ocp.ocp.dims.nx), v_min) - q_init_ub = np.full((ocp.ocp.dims.nx), v_max) - q_init_lb[-1] = dt_sym - q_init_ub[-1] = dt_sym - - for l in range(system_sel): - - if l == joint_sel: - - if vel_sel == -1: - q_init_sel = q_min + eps - q_fin_sel = q_max - eps - else: - q_init_sel = q_max - eps - q_fin_sel = q_min + eps - - q_init_lb[l] = q_init_sel - q_init_ub[l] = q_init_sel - - else: - q_init_oth = q_min + random.random() * (q_max-q_min) - - if q_init_oth > q_max - eps: - q_init_oth = q_init_oth - eps - if q_init_oth < q_min + eps: - q_init_oth = q_init_oth + eps - - q_init_lb[l] = q_init_oth - q_init_ub[l] = q_init_oth - - # State and input bounds: - q_lb = np.copy(q_init_lb) - q_ub = np.copy(q_init_ub) - - for l in range(system_sel): - q_lb[l] = q_min - q_ub[l] = q_max - - u_lb = np.full((system_sel), -tau_max) - u_ub = np.full((system_sel), tau_max) - - # Bounds on the final state: - q_fin_lb = np.copy(q_lb) - q_fin_ub = np.copy(q_ub) - - for l in range(system_sel): - q_fin_lb[l+system_sel] = 0. - q_fin_ub[l+system_sel] = 0. 
- - # Guess: - x_sol_guess = np.empty((N, ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N, ocp.ocp.dims.nu)) - - for i, tau in enumerate(np.linspace(0, 1, N)): - - x_guess = np.copy(q_init_ub) - - for l in range(system_sel): - if l == joint_sel: - x_guess[l] = (1-tau)*q_init_sel + tau*q_fin_sel - x_guess[l+system_sel] = 2*(1-tau)*(q_fin_sel-q_init_sel) - else: - x_guess[l] = q_init_oth - x_guess[l+system_sel] = 0 - - x_sol_guess[i] = x_guess - u_sol_guess[i] = np.zeros((system_sel)) - - cost_old = 1e6 - all_ok = False - - # Iteratively solve the OCP with an increased number of time steps until the solution does not change. - # If the solver fails, try with a slightly different initial condition: - for _ in range(10): - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - cost_new = ocp.ocp_solver.get_cost() - - if cost_new > cost_old - tol: # the time is sufficient to have achieved an optimal solution - all_ok = True - break - - cost_old = cost_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N+1,ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N+1,ocp.ocp.dims.nu)) - for i in range(N): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N] = ocp.ocp_solver.get(N, "x") - u_sol_guess[N] = np.zeros((ocp.ocp.dims.nu)) - - # Increase the number of time steps: - N = N + 1 - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - else: - break - - # else: - - # # Reset the number of steps used in the OCP: - # N = N_start - # ocp.N = N - # ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - # ocp.ocp_solver.update_qp_solver_cond_N(N) - - # # Initial velocity optimization direction (the cost is in the form p[0] * dtheta1 + p[1] * dtheta2 + p[2] * dt): - # for l in range(system_sel): - # p[l] = p[l] + random.random() * random.choice([-1, 1]) * 0.01 - - # norm_weights = norm(p) - # p = p/norm_weights - - # # Initial position of the other joint: - # for j in range(system_sel): - # if j != joint_sel: - - # val = q_init_lb[j] + random.random() * random.choice([-1, 1]) * 0.01 - - # if val > q_max - eps: - # val = val - eps - # if val < q_min + eps: - # val = val + eps - - # q_init_lb[j] = val - # q_init_ub[j] = val - - # # Guess: - # x_sol_guess = np.empty((N, ocp.ocp.dims.nx)) - # u_sol_guess = np.empty((N, ocp.ocp.dims.nu)) - - # for i, tau in enumerate(np.linspace(0, 1, N)): - - # x_guess = np.copy(q_init_ub) - - # for l in range(system_sel): - # if l == joint_sel: - # x_guess[l+system_sel] = 2*(1-tau)*(q_fin_sel-q_init_sel) - # else: - # x_guess[l+system_sel] = 0 - - # x_guess[joint_sel] = (1-tau)*q_init_sel + tau*q_fin_sel - - # x_sol_guess[i] = x_guess - # u_sol_guess[i] = np.zeros((system_sel)) - - # cost_old = 1e6 - - if all_ok: - - # Save the optimal trajectory: - x_sol = np.empty((N+1, ocp.ocp.dims.nx)) - u_sol = np.empty((N, ocp.ocp.dims.nu)) - - for i in range(N): - x_sol[i] = ocp.ocp_solver.get(i, "x") - u_sol[i] = ocp.ocp_solver.get(i, "u") - - x_sol[N] = ocp.ocp_solver.get(N, "x") - - # Generate the unviable sample in the cost direction: - x_sym = np.full((N+1,ocp.ocp.dims.nx - 1), None) - - x_out = np.copy(x_sol[0][:ocp.ocp.dims.nx - 1]) - for l in range(system_sel): - x_out[l+system_sel] = x_out[l+system_sel] - eps * p[l] - - # save the initial state: - 
valid_data = np.append(valid_data, [x_sol[0][:ocp.ocp.dims.nx - 1]], axis = 0) - - # Check if initial velocities lie on a limit: - if any(i > v_max or i < v_min for i in x_out[system_sel:ocp.ocp.dims.nx - 1]): - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[0] = x_out - - # Iterate through the trajectory to verify the location of the states with respect to V: - for f in range(1, N): - - if is_x_at_limit: - - x_out = np.copy(x_sol[f][:ocp.ocp.dims.nx - 1]) - norm_vel = norm(x_out[system_sel:]) - - for l in range(system_sel): - x_out[l+system_sel] = x_out[l+system_sel] + eps * x_out[l+system_sel]/norm_vel - - # If the previous state was on a limit, the current state location cannot be identified using - # the corresponding unviable state but it has to rely on the proximity to the state limits - # (more restrictive): - if any(i > q_max - eps or i < q_min + eps for i in x_sol[f][:system_sel]) or any(i > v_max or i < v_min for i in x_out[system_sel:ocp.ocp.dims.nx - 1]): - is_x_at_limit = True # the state is on dX - - else: - is_x_at_limit = False # the state is either on the interior of V or on dV - - # if the traj detouches from a position limit it usually enters V: - if any(i > q_max - eps or i < q_min + eps for i in x_sol[f-1][:system_sel]): - break - - # Solve an OCP to verify whether the following part of the trajectory is on V or dV. To do so - # the initial joint positions are set to the current ones and the final state is fixed to the - # final state of the trajectory. The initial velocities are left free and maximized in the - # direction of the current joint velocities. - - N_test = N - f - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - # Cost: - norm_weights = norm(x_sol[f][system_sel:ocp.ocp.dims.nx - 1]) - p = np.zeros((system_sel+1)) - for l in range(system_sel): - p[l] = -x_sol[f][l+system_sel]/norm_weights # the cost direction is based on the current velocity direction - - # Bounds on the initial state: - for l in range(system_sel): - q_init_lb[l] = x_sol[f][l] - q_init_ub[l] = x_sol[f][l] - - # Guess: - x_sol_guess = np.empty((N_test+1, ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N_test+1, ocp.ocp.dims.nu)) - for i in range(N_test): - x_sol_guess[i] = x_sol[i+f] - u_sol_guess[i] = u_sol[i+f] - x_sol_guess[N_test] = x_sol[N] - u_sol_guess[N_test] = np.zeros((ocp.ocp.dims.nu)) - - norm_old = norm(x_sol[f][system_sel:ocp.ocp.dims.nx - 1]) # velocity norm of the original solution - norm_bef = 0 - all_ok = False - - for _ in range(5): - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - x0_new = ocp.ocp_solver.get(0, "x") - norm_new = norm(x0_new[system_sel:ocp.ocp.dims.nx - 1]) - - if norm_new < norm_bef + tol: - all_ok = True - break - - norm_bef = norm_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N_test+1,ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N_test+1,ocp.ocp.dims.nu)) - for i in range(N_test): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N_test] = ocp.ocp_solver.get(N_test, "x") - u_sol_guess[N_test] = np.zeros((ocp.ocp.dims.nu)) - - # Increase the number of time steps: - N_test = N_test + 1 - ocp.N = N_test - 
ocp.ocp_solver.set_new_time_steps(np.full((N_test,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - else: - break - - if all_ok: - - # Compare the old and new velocity norms: - if norm_new > norm_old + tol: # the state is inside V - - # Update the optimal solution: - for i in range(N-f): - x_sol[i+f] = ocp.ocp_solver.get(i, "x") - u_sol[i+f] = ocp.ocp_solver.get(i, "u") - - x_out = np.copy(x_sol[f][:ocp.ocp.dims.nx - 1]) - - for l in range(system_sel): - x_out[l+system_sel] = x_out[l+system_sel] + eps * x_out[l+system_sel]/norm_new - - # Check if velocities lie on a limit: - if any(i > v_max or i < v_min for i in x_out[system_sel:ocp.ocp.dims.nx - 1]): - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[f] = x_out - - else: - is_x_at_limit = False # the state is on dV - - # Generate the new corresponding unviable state in the cost direction: - x_out = np.copy(x_sol[f][:ocp.ocp.dims.nx - 1]) - - for l in range(system_sel): - x_out[l+system_sel] = x_out[l+system_sel] - eps * p[l] - - if x_out[joint_sel+system_sel] > v_max: - x_out[joint_sel+system_sel] = v_max - if x_out[joint_sel+system_sel] < v_min: - x_out[joint_sel+system_sel] = v_min - - x_sym[f] = x_out - - else: # we cannot say whether the state is on dV or inside V - - for r in range(f, N): - if any(abs(i) > v_max - eps for i in x_sol[r][system_sel:ocp.ocp.dims.nx - 1]): - - # Save the viable states at velocity limits: - valid_data = np.append(valid_data, [x_sol[f][:ocp.ocp.dims.nx - 1]], axis = 0) - - break - - else: - # If the previous state was not on a limit, the current state location can be identified using - # the corresponding unviable state which can be computed by simulating the system starting from - # the previous unviable state. 
- - # Simulate next unviable state: - u_sym = np.copy(u_sol[f-1]) - sim.acados_integrator.set("u", u_sym) - sim.acados_integrator.set("x", x_sym[f-1]) - # sim.acados_integrator.set("T", dt_sym) - status = sim.acados_integrator.solve() - x_out = sim.acados_integrator.get("x") - x_sym[f] = x_out - - # When the state of the unviable simulated trajectory violates a limit, the corresponding viable state - # of the optimal trajectory is on dX: - if any(i > q_max or i < q_min for i in x_out[:system_sel]) or any(i > v_max or i < v_min for i in x_out[system_sel:ocp.ocp.dims.nx - 1]): - is_x_at_limit = False # the state is on dV - else: - is_x_at_limit = True # the state is on dX - - if all(i < q_max - eps and i > q_min + eps for i in x_sol[f][:system_sel]) and all(abs(i) > tol for i in x_sol[f][system_sel:ocp.ocp.dims.nx - 1]): - - # Save the viable and unviable states: - valid_data = np.append(valid_data, [x_sol[f][:ocp.ocp.dims.nx - 1]], axis = 0) - - return valid_data.tolist() - - else: - return None - -start_time = time.time() - -# Select system: -system_sel = 2 # 2 for 2dof, 3 for 3dof - -# Prune the model: -prune_model = False -prune_amount = 0.5 # percentage of connections to delete - -# Ocp initialization: -if system_sel == 3: - # ocp = OCPtriplependulumINIT() - # sim = SYMtriplependulumINIT() - raise Exception("Sorry, not implemented for now") -elif system_sel == 2: - ocp = OCPdoublependulumINIT() - sim = SYMdoublependulumINIT() -else: - raise Exception("Sorry, the selected system is not recognised") - -# Position, velocity and torque bounds: -v_max = ocp.dthetamax -v_min = - ocp.dthetamax -q_max = ocp.thetamax -q_min = ocp.thetamin -tau_max = ocp.Cmax - -device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # pytorch device - -dt_sym = 1e-2 # time step duration -N_start = 100 # initial number of timesteps -tol = ocp.ocp.solver_options.nlp_solver_tol_stat # OCP cost tolerance -eps = tol*10 # unviable data generation parameter - -# Testing data generation: -cpu_num = 30 -num_prob = 1000 -with Pool(cpu_num) as p: - traj = p.map(testing_test, range(num_prob)) - -X_temp = [i for i in traj if i is not None] -X_test = np.array([i for f in X_temp for i in f]) - -print('Start data generation') - -# Data generation: -cpu_num = 30 -num_prob = 100000 -with Pool(cpu_num) as p: - traj = p.map(testing_test, range(num_prob)) - -# traj, statpos, statneg = zip(*temp) -X_temp = [i for i in traj if i is not None] -print('Data generation completed') - -# Print data generations statistics: -solved=len(X_temp) -print('Solved/tot', len(X_temp)/num_prob) -X_save = np.array([i for f in X_temp for i in f]) -print('Saved/tot', len(X_save)/(solved*100)) - -# Save training data: -np.save('data_' + str(system_sel) + 'dof_vboc_' + str(int(v_max)), np.asarray(X_save)) - -# # Load training data: -# X_save = np.load('data_' + str(system_sel) + 'dof_vboc_' + str(int(v_max)) + '.npy') - -# Pytorch params: -input_layers = ocp.ocp.dims.nx - 1 -hidden_layers = (input_layers - 1) * 100 -output_layers = 1 -learning_rate = 1e-3 - -# Model and optimizer: -model_dir = NeuralNetRegression(input_layers, hidden_layers, output_layers).to(device) -criterion_dir = nn.MSELoss() -optimizer_dir = torch.optim.Adam(model_dir.parameters(), lr=learning_rate) - -# Joint positions mean and variance: -mean_dir, std_dir = torch.mean(torch.tensor(X_save[:,:system_sel].tolist())).to(device).item(), torch.std(torch.tensor(X_save[:,:system_sel].tolist())).to(device).item() -torch.save(mean_dir, 'mean_' + str(system_sel) + 'dof_vboc_' + 
str(int(v_max)) + '_' + str(hidden_layers)) -torch.save(std_dir, 'std_' + str(system_sel) + 'dof_vboc_' + str(int(v_max)) + '_' + str(hidden_layers)) - -# Rewrite data in the form [normalized positions, velocity direction, velocity norm]: -X_train_dir = np.empty((X_save.shape[0],ocp.ocp.dims.nx)) - -for i in range(X_train_dir.shape[0]): - - vel_norm = norm(X_save[i][system_sel:ocp.ocp.dims.nx - 1]) - X_train_dir[i][ocp.ocp.dims.nx - 1] = vel_norm - - for l in range(system_sel): - X_train_dir[i][l] = (X_save[i][l] - mean_dir) / std_dir - X_train_dir[i][l+system_sel] = X_save[i][l+system_sel] / vel_norm - -# model_dir.load_state_dict(torch.load('model_' + str(system_sel) + 'dof_vboc_' + str(int(v_max)) + '_' + str(hidden_layers))) - -beta = 0.95 -n_minibatch = 4096 -B = int(X_save.shape[0]*100/n_minibatch) # number of iterations for 100 epoch -it_max = B * 100 - -training_evol = [] - -print('Start model training') - -it = 1 -val = max(X_train_dir[:,ocp.ocp.dims.nx - 1]) - -# Train the model -while val > 1e-3 and it < it_max: - ind = random.sample(range(len(X_train_dir)), n_minibatch) - - X_iter_tensor = torch.Tensor([X_train_dir[i][:ocp.ocp.dims.nx - 1] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_train_dir[i][ocp.ocp.dims.nx - 1]] for i in ind]).to(device) - - # Forward pass - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - val = beta * val + (1 - beta) * loss.item() - it += 1 - - if it % B == 0: - print(val) - training_evol.append(val) - -print('Model training completed') - -# Show the resulting RMSE on the training data: -outputs = np.empty((len(X_train_dir),1)) -n_minibatch_model = pow(2,15) -with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_train_dir[:,:ocp.ocp.dims.nx - 1]).to(device) - y_iter_tensor = torch.Tensor(X_train_dir[:,ocp.ocp.dims.nx - 1:]).to(device) - my_dataloader = DataLoader(X_iter_tensor,batch_size=n_minibatch_model,shuffle=False) - for (idx, batch) in enumerate(my_dataloader): - if n_minibatch_model*(idx+1) > len(X_train_dir): - outputs[n_minibatch_model*idx:len(X_train_dir)] = model_dir(batch).cpu().numpy() - else: - outputs[n_minibatch_model*idx:n_minibatch_model*(idx+1)] = model_dir(batch).cpu().numpy() - outputs_tensor = torch.Tensor(outputs).to(device) - print('RMSE train data: ', torch.sqrt(criterion_dir(outputs_tensor, y_iter_tensor))) - -X_test_dir = np.empty((X_test.shape[0],ocp.ocp.dims.nx)) -for i in range(X_test_dir.shape[0]): - vel_norm = norm(X_test[i][system_sel:ocp.ocp.dims.nx - 1]) - X_test_dir[i][ocp.ocp.dims.nx - 1] = vel_norm - for l in range(system_sel): - X_test_dir[i][l] = (X_test[i][l] - mean_dir) / std_dir - X_test_dir[i][l+system_sel] = X_test[i][l+system_sel] / vel_norm - -with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_test_dir[:,:ocp.ocp.dims.nx - 1]).to(device) - y_iter_tensor = torch.Tensor(X_test_dir[:,ocp.ocp.dims.nx - 1:]).to(device) - outputs = model_dir(X_iter_tensor) - print('RMSE test data: ', torch.sqrt(criterion_dir(outputs, y_iter_tensor))) - -# Save the model: -torch.save(model_dir.state_dict(), 'model_' + str(system_sel) + 'dof_vboc_' + str(int(v_max)) + '_' + str(hidden_layers)) - -if prune_model: - - parameters_to_prune = ( - (model_dir.linear_relu_stack[0], 'weight'), - (model_dir.linear_relu_stack[2], 'weight'), - (model_dir.linear_relu_stack[4], 'weight'), - (model_dir.linear_relu_stack[0], 'bias'), - (model_dir.linear_relu_stack[2], 'bias'), - 
(model_dir.linear_relu_stack[4], 'bias'), - ) - - prune.global_unstructured( - parameters_to_prune, - pruning_method=prune.L1Unstructured, - amount=prune_amount, - ) - - # prune.l1_unstructured(model_dir.linear_relu_stack[0], name='weight', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[2], name='weight', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[4], name='weight', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[0], name='bias', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[2], name='bias', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[4], name='bias', amount=prune_amount) - - # model_dir.load_state_dict({k: v for k, v in initial_model_params.items() if 'weight' in k or 'bias' in k}, strict=False) - - print('Restart model training after pruning') - - it = 1 - val = max(X_train_dir[:,ocp.ocp.dims.nx - 1]) - - # Train the model - while val > 1e-3 and it < it_max: - ind = random.sample(range(len(X_train_dir)), n_minibatch) - - X_iter_tensor = torch.Tensor([X_train_dir[i][:ocp.ocp.dims.nx - 1] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_train_dir[i][ocp.ocp.dims.nx - 1]] for i in ind]).to(device) - - # Forward pass - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - val = beta * val + (1 - beta) * loss.item() - it += 1 - - if it % B == 0: - print(val) - training_evol.append(val) - - print('Model training completed') - - prune.remove(model_dir.linear_relu_stack[0], 'weight') - prune.remove(model_dir.linear_relu_stack[2], 'weight') - prune.remove(model_dir.linear_relu_stack[4], 'weight') - prune.remove(model_dir.linear_relu_stack[0], 'bias') - prune.remove(model_dir.linear_relu_stack[2], 'bias') - prune.remove(model_dir.linear_relu_stack[4], 'bias') - - # print('----------------------') - # print(list(model_dir.named_parameters())) - # print(list(model_dir.named_buffers())) - - with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_test_dir[:,:ocp.ocp.dims.nx - 1]).to(device) - y_iter_tensor = torch.Tensor(X_test_dir[:,ocp.ocp.dims.nx - 1:]).to(device) - outputs = model_dir(X_iter_tensor) - print('RMSE test data after pruning: ', torch.sqrt(criterion_dir(outputs, y_iter_tensor))) - - with torch.no_grad(): - # Compute safety margin: - outputs = model_dir(X_iter_tensor).cpu().numpy() - safety_margin = np.amax(np.array([(outputs[i] - X_test_dir[i][-1])/X_test_dir[i][-1] for i in range(X_test_dir.shape[0]) if outputs[i] - X_test_dir[i][-1] > 0])) - print(safety_margin) - - # Save the pruned model: - torch.save(model_dir.state_dict(), 'model_' + str(system_sel) + 'dof_vboc_' + str(int(v_max)) + '_' + str(hidden_layers) + '_' + str(prune_amount) + '_' + str(safety_margin)) - -print("Execution time: %s seconds" % (time.time() - start_time + 2399)) - -# Show the training evolution: -plt.figure() -plt.plot(training_evol) - -# Show training data and resulting set approximation: -if system_sel == 3: - # plots_3dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device) - pass -elif system_sel == 2: - plots_2dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device, ocp) - -plt.show() diff --git a/VBOC/Safe MPC/comparison.py b/VBOC/Safe MPC/comparison.py deleted file mode 100644 index c2924ea..0000000 --- a/VBOC/Safe MPC/comparison.py +++ /dev/null @@ -1,64 +0,0 @@ -import numpy 
as np -import matplotlib.pyplot as plt - -# Load data -data_dir = "data/" -data_no = np.load(data_dir + "results_no_constraint.npz") - -use_data = 'receidinghard' # soft_term, soft_traj, hard_term, receidinghard, receidingsoft - -if use_data == 'soft_term': - data_vboc = np.load(data_dir + "results_softterm.npz") -if use_data == 'soft_traj': - data_vboc = np.load(data_dir + "results_softtraj.npz") -if use_data == 'hard_term': - data_vboc = np.load(data_dir + "results_hardterm.npz") -if use_data == 'receidinghard': - data_vboc = np.load(data_dir + "results_receidinghard.npz") -if use_data == 'receidingsoft': - data_vboc = np.load(data_dir + "results_receidingsoft.npz") - -dt = data_no["dt"] -tot_time = data_no["tot_time"] -times_no = data_no["times"] -res_steps_no = data_no["res_steps"] - -times_hard = data_vboc["times"] -res_steps_hard = data_vboc["res_steps_term"] -better = data_vboc["better"] -worse = data_vboc["worse"] -equal = data_vboc["equal"] - -# Plot timing -plt.figure() -plt.plot(np.linspace(0, len(times_no), len(times_no)), times_no, label="naive MPC", color='red') -plt.plot(np.linspace(0, len(times_hard), len(times_hard)), times_hard, label="MPC + VBOC", color='green') -plt.plot(np.linspace(0, len(times_no), len(times_no)), np.ones(len(times_no)) * np.quantile(times_no, 0.9), - label="90% quantile naive MPC", color='fuchsia', linestyle='--') -plt.plot(np.linspace(0, len(times_hard), len(times_hard)), np.ones(len(times_hard)) * np.quantile(times_hard, 0.9), - label="90% quantile MPC + VBOC", color='DarkBlue', linestyle='--') -plt.xlabel('MPC Iteration') -plt.ylabel('Solve time [s]') -plt.legend() -plt.title("Solve time comparison") -plt.savefig(data_dir + 'times_' + use_data + '.png') - -# Barchart that states when the hard terminal constraint is better, equal or worse than the naive MPC -plt.figure() -plt.bar(["Better", "Equal", "Worse"], [better, equal, worse], color=["green", "blue", "red"]) -plt.ylabel("Number") -plt.title("Comparison between MPC + VBOC and naive MPC") -plt.savefig(data_dir + 'numbers_' + use_data + '.png') - -# Directly compare the number of iteration taken by the two MPCs before the first infeasible solution -plt.figure() -total = np.linspace(1, 100, 100) -plt.plot(total, res_steps_no, label="naive MPC", color='red') -plt.plot(total, np.ones(len(total)) * np.mean(res_steps_no), label="mean naive MPC", color='fuchsia', linestyle='--') -plt.plot(total, res_steps_hard, label="MPC + VBOC", color='green') -plt.plot(total, np.ones(len(total)) * np.mean(res_steps_hard), label="mean MPC + VBOC", color='DarkBlue', linestyle='--') -plt.title("Comparison between MPC + VBOC and naive MPC") -plt.xlabel("Number of problems") -plt.ylabel("Number of iterations") -plt.legend() -plt.savefig(data_dir + 'iterations_' + use_data + '.png') diff --git a/VBOC/Safe MPC/hard_terminal_constraints/3dof_sym.py b/VBOC/Safe MPC/hard_terminal_constraints/3dof_sym.py deleted file mode 100644 index 4e0b6ff..0000000 --- a/VBOC/Safe MPC/hard_terminal_constraints/3dof_sym.py +++ /dev/null @@ -1,162 +0,0 @@ -import os -import sys -sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import time -import torch -from triplependulum_class_vboc import OCPtriplependulumHardTerm, SYMtriplependulum -from my_nn import NeuralNetDIR -from multiprocessing import Pool -from scipy.stats import qmc - -import warnings -warnings.filterwarnings("ignore") - -def simulate(p): - - x0 = np.zeros((ocp.ocp.dims.nx,)) - x0[:ocp.ocp.dims.nu] = data[p] - - simX = np.ndarray((tot_steps + 1, ocp.ocp.dims.nx)) - 
simU = np.ndarray((tot_steps, ocp.ocp.dims.nu)) - simX[0] = np.copy(x0) - - times = [None] * tot_steps - - failed_iter = -1 - - # Guess: - x_sol_guess = x_sol_guess_vec[p] - u_sol_guess = u_sol_guess_vec[p] - - for f in range(tot_steps): - - status = ocp.OCP_solve(simX[f], x_sol_guess, u_sol_guess, ocp.thetamax-0.05, joint_vec[f]) - times[f] = ocp.ocp_solver.get_stats('time_tot') - - if status != 0: - - if failed_iter >= N-1 or failed_iter < 0: - break - - failed_iter += 1 - - simU[f] = u_sol_guess[0] - - for i in range(N-1): - x_sol_guess[i] = np.copy(x_sol_guess[i+1]) - u_sol_guess[i] = np.copy(u_sol_guess[i+1]) - - x_sol_guess[N-1] = np.copy(x_sol_guess[N]) - - else: - failed_iter = 0 - - simU[f] = ocp.ocp_solver.get(0, "u") - - for i in range(N-1): - x_sol_guess[i] = ocp.ocp_solver.get(i+1, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i+1, "u") - - x_sol_guess[N-1] = ocp.ocp_solver.get(N, "x") - x_sol_guess[N] = np.copy(x_sol_guess[N-1]) - u_sol_guess[N-1] = np.copy(u_sol_guess[N-2]) - - simU[f] += noise_vec[f] - - sim.acados_integrator.set("u", simU[f]) - sim.acados_integrator.set("x", simX[f]) - status = sim.acados_integrator.solve() - simX[f+1] = sim.acados_integrator.get("x") - - return f, times - -start_time = time.time() - -# Pytorch params: -device = torch.device("cpu") - -model = NeuralNetDIR(6, 500, 1).to(device) -model.load_state_dict(torch.load('../model_3dof_vboc')) -mean = torch.load('../mean_3dof_vboc') -std = torch.load('../std_3dof_vboc') -safety_margin = 2.0 - -cpu_num = 1 -test_num = 100 - -time_step = 5*1e-3 -tot_time = 0.16 -tot_steps = 100 - -regenerate = True - -x_sol_guess_vec = np.load('../x_sol_guess.npy') -u_sol_guess_vec = np.load('../u_sol_guess.npy') -noise_vec = np.load('../noise.npy') -noise_vec = np.load('../selected_joint.npy') - -quant = 10. 
-r = 1 - -while quant > 4*1e-3: - - ocp = OCPtriplependulumHardTerm("SQP_RTI", time_step, tot_time, list(model.parameters()), mean, std, regenerate) - sim = SYMtriplependulum(time_step, tot_time, True) - - N = ocp.ocp.dims.N - - # Generate low-discrepancy unlabeled samples: - sampler = qmc.Halton(d=ocp.ocp.dims.nu, scramble=False) - sample = sampler.random(n=test_num) - l_bounds = ocp.Xmin_limits[:ocp.ocp.dims.nu] - u_bounds = ocp.Xmax_limits[:ocp.ocp.dims.nu] - data = qmc.scale(sample, l_bounds, u_bounds) - - # MPC controller without terminal constraints: - with Pool(cpu_num) as p: - res = p.map(simulate, range(data.shape[0])) - - res_steps_term, stats = zip(*res) - - times = np.array([i for f in stats for i in f if i is not None]) - - quant = np.quantile(times, 0.99) - - print('iter: ', str(r)) - print('tot time: ' + str(tot_time)) - print('99 percent quantile solve time: ' + str(quant)) - print('Mean solve time: ' + str(np.mean(times))) - - tot_time -= time_step - r += 1 - - print(np.array(res_steps_term).astype(int)) - - del ocp - -np.save('res_steps_hardterm.npy', np.array(res_steps_term).astype(int)) - -res_steps = np.load('../no_constraints/res_steps_noconstr.npy') - -better = 0 -equal = 0 -worse = 0 - -for i in range(res_steps.shape[0]): - if res_steps_term[i]-res_steps[i]>0: - better += 1 - elif res_steps_term[i]-res_steps[i]==0: - equal += 1 - else: - worse += 1 - -print('MPC standard vs MPC with hard term constraints') -print('Percentage of initial states in which the MPC+VBOC behaves better: ' + str(better)) -print('Percentage of initial states in which the MPC+VBOC behaves equal: ' + str(equal)) -print('Percentage of initial states in which the MPC+VBOC behaves worse: ' + str(worse)) - -np.savez('../data/results_hardterm.npz', res_steps_term=res_steps_term, - better=better, worse=worse, equal=equal, times=times, - dt=time_step, tot_time=tot_time) \ No newline at end of file diff --git a/VBOC/Safe MPC/my_nn.py b/VBOC/Safe MPC/my_nn.py deleted file mode 100644 index 32740b3..0000000 --- a/VBOC/Safe MPC/my_nn.py +++ /dev/null @@ -1,35 +0,0 @@ -import torch.nn as nn - -# Fully connected neural network -class NeuralNetCLS(nn.Module): - def __init__(self, input_size, hidden_size, output_size): - super(NeuralNetCLS, self).__init__() - self.linear_relu_stack = nn.Sequential( - nn.Linear(input_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, output_size), - # nn.Sigmoid(), - ) - - def forward(self, x): - out = self.linear_relu_stack(x) - return out - -class NeuralNetDIR(nn.Module): - def __init__(self, input_size, hidden_size, output_size): - super(NeuralNetDIR, self).__init__() - self.linear_relu_stack = nn.Sequential( - nn.Linear(input_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, output_size), - nn.ReLU(), - ) - - def forward(self, x): - out = self.linear_relu_stack(x) - return out - diff --git a/VBOC/Safe MPC/no_constraints/3dof_sym.py b/VBOC/Safe MPC/no_constraints/3dof_sym.py deleted file mode 100644 index 7d71d8f..0000000 --- a/VBOC/Safe MPC/no_constraints/3dof_sym.py +++ /dev/null @@ -1,156 +0,0 @@ -import os -import sys -sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import time -from triplependulum_class_vboc import OCPtriplependulumSTD, SYMtriplependulum -from multiprocessing import Pool -from scipy.stats import qmc -import random - -import warnings -warnings.filterwarnings("ignore") - -def simulate(p): - - x0 = 
np.zeros((ocp.ocp.dims.nx,)) - x0[:ocp.ocp.dims.nu] = data[p] - - simX = np.ndarray((tot_steps + 1, ocp.ocp.dims.nx)) - simU = np.ndarray((tot_steps, ocp.ocp.dims.nu)) - simX[0] = np.copy(x0) - - times = [None] * tot_steps - - failed_iter = -1 - - # Guess: - x_sol_guess = x_sol_guess_vec[p] - u_sol_guess = u_sol_guess_vec[p] - - for f in range(tot_steps): - - status = ocp.OCP_solve(simX[f], x_sol_guess, u_sol_guess, ocp.thetamax-0.05, joint_vec[f]) - times[f] = ocp.ocp_solver.get_stats('time_tot') - - if status != 0: - - if failed_iter >= N-1 or failed_iter < 0: - break - - failed_iter += 1 - - simU[f] = u_sol_guess[0] - - for i in range(N-1): - x_sol_guess[i] = np.copy(x_sol_guess[i+1]) - u_sol_guess[i] = np.copy(u_sol_guess[i+1]) - - x_sol_guess[N-1] = np.copy(x_sol_guess[N]) - - else: - failed_iter = 0 - - simU[f] = ocp.ocp_solver.get(0, "u") - - for i in range(N-1): - x_sol_guess[i] = ocp.ocp_solver.get(i+1, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i+1, "u") - - x_sol_guess[N-1] = ocp.ocp_solver.get(N, "x") - x_sol_guess[N] = np.copy(x_sol_guess[N-1]) - u_sol_guess[N-1] = np.copy(u_sol_guess[N-2]) - - simU[f] += noise_vec[f] - - sim.acados_integrator.set("u", simU[f]) - sim.acados_integrator.set("x", simX[f]) - status = sim.acados_integrator.solve() - simX[f+1] = sim.acados_integrator.get("x") - - return f, times - -def init_guess(p): - - x0 = np.zeros((ocp.ocp.dims.nx,)) - x0[:ocp.ocp.dims.nu] = data[p] - - # Guess: - x_sol_guess = np.full((N+1, ocp.ocp.dims.nx), x0) - u_sol_guess = np.zeros((N, ocp.ocp.dims.nu)) - - status = ocp.OCP_solve(x0, x_sol_guess, u_sol_guess, ocp.thetamax-0.05, joint_vec[0]) - - if status == 0: - - for i in range(N): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - - x_sol_guess[N] = ocp.ocp_solver.get(N, "x") - - return x_sol_guess, u_sol_guess - -start_time = time.time() - -cpu_num = 1 -test_num = 100 - -time_step = 5*1e-3 -tot_time = 0.16 -tot_steps = 100 - -ocp = OCPtriplependulumSTD("SQP", time_step, 0.2, True) - -# Generate low-discrepancy unlabeled samples: -sampler = qmc.Halton(d=ocp.ocp.dims.nu, scramble=False) -sample = sampler.random(n=test_num) -l_bounds = ocp.Xmin_limits[:ocp.ocp.dims.nu] -u_bounds = ocp.Xmax_limits[:ocp.ocp.dims.nu] -data = qmc.scale(sample, l_bounds, u_bounds) - -N = ocp.ocp.dims.N - -joint_vec = np.array([random.choice(range(ocp.nu)) for _ in range(tot_steps)]) -np.save('../selected_joint', joint_vec) - -with Pool(30) as p: - res = p.map(init_guess, range(data.shape[0])) - -x_sol_guess_vec, u_sol_guess_vec = zip(*res) - -np.save('../x_sol_guess', np.asarray(x_sol_guess_vec)) -np.save('../u_sol_guess', np.asarray(u_sol_guess_vec)) - -noise_perc = 0 -noise_vec = np.full((ocp.ocp.dims.nu,tot_steps), np.random.normal(0,noise_perc*ocp.Cmax/200,tot_steps)) -noise_vec = noise_vec.reshape((tot_steps,ocp.ocp.dims.nu)) -np.save('../noise', noise_vec) - -del ocp - -ocp = OCPtriplependulumSTD("SQP_RTI", time_step, tot_time, True) -sim = SYMtriplependulum(time_step, tot_time, True) - -N = ocp.ocp.dims.N - -# MPC controller without terminal constraints: -with Pool(cpu_num) as p: - res = p.map(simulate, range(data.shape[0])) - -res_steps, stats = zip(*res) - -times = np.array([i for f in stats for i in f if i is not None]) - -print('99 percent quantile solve time: ' + str(np.quantile(times, 0.99))) -print('Mean solve time: ' + str(np.mean(times))) - -print(np.array(res_steps).astype(int)) -print(np.mean(res_steps)) - -np.save('res_steps_noconstr.npy', np.array(res_steps).astype(int)) - -# Save 
the results in an npz file -np.savez('../data/results_no_constraint.npz', times=times, - dt=time_step, tot_time=tot_time, res_steps=res_steps) diff --git a/VBOC/Safe MPC/parallel/2dof_sym.py b/VBOC/Safe MPC/parallel/2dof_sym.py deleted file mode 100644 index d247b61..0000000 --- a/VBOC/Safe MPC/parallel/2dof_sym.py +++ /dev/null @@ -1,274 +0,0 @@ -import numpy as np -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import time -from doublependulum_class_fixedveldir import OCPdoublependulumINIT, SYMdoublependulumINIT -import warnings -warnings.filterwarnings("ignore") -import torch -import torch.nn as nn -from my_nn import NeuralNetRegression -import math -from multiprocessing import Pool -from scipy.stats import qmc -import random -import scipy.linalg as lin - -def simulate(p): - x0 = np.array([data[p,0], data[p,1], 1e-10, 1e-10]) - - simX = np.ndarray((tot_steps + 1, ocp.ocp.dims.nx)) - simU = np.ndarray((tot_steps, ocp.ocp.dims.nu)) - simX[0] = np.copy(x0) - - failed_iter = 0 - receiding = 0 - - # Guess: - x_sol_guess = np.full((ocp.N+1, ocp.ocp.dims.nx), x0) - u_sol_guess = np.full((ocp.N, ocp.ocp.dims.nu), np.array([ocp.g*ocp.l1*(ocp.m1+ocp.m2)*math.sin(x0[0]),ocp.g*ocp.l2*ocp.m2*math.sin(x0[1])])) - - for f in range(tot_steps): - - receiding_iter = ocp.N-failed_iter-receiding - - Q = np.diag([1e-2+pow(10,receiding_iter/ocp.N*2), 1e-2+pow(10,receiding_iter/ocp.N*2), 1e-4, 1e-4]) - R = np.diag([1e-4, 1e-4]) - - for i in range(ocp.N): - ocp.ocp_solver.cost_set(i, "W", lin.block_diag(Q, R)) - ocp.ocp_solver.cost_set(ocp.N, "W", Q) - - all_ok = False - - for p in reversed(range(receiding_iter, ocp.N+1)): - - for i in range(ocp.N+1): - if i == p: - ocp.ocp_solver.cost_set(i, "Zl", 1e9*np.ones((1,))) - else: - ocp.ocp_solver.cost_set(i, "Zl", np.zeros((1,))) - # ocp.ocp_solver.cost_set(i, "Zl", pow(10, (1-receiding_iter/ocp.N)*6)*np.ones((1,))) - - status = ocp.OCP_solve(simX[f], q_ref, x_sol_guess, u_sol_guess) - - if status == 0: - for l in reversed(range(receiding_iter,ocp.N+1)): - if ocp.nn_decisionfunction(params, mean_dir, std_dir, safety_margin, ocp.ocp_solver.get(l, 'x')) >= 0.: - receiding = ocp.N - l + 1 - break - - simU[f] = ocp.ocp_solver.get(0, "u") - - for i in range(ocp.N-1): - x_sol_guess[i] = ocp.ocp_solver.get(i+1, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i+1, "u") - - x_sol_guess[ocp.N-1] = ocp.ocp_solver.get(ocp.N, "x") - x_sol_guess[ocp.N] = np.copy(x_sol_guess[ocp.N-1]) - u_sol_guess[ocp.N-1] = [ocp.g*ocp.l1*(ocp.m1+ocp.m2)*math.sin(x_sol_guess[ocp.N-1,0]),ocp.g*ocp.l2*ocp.m2*math.sin(x_sol_guess[ocp.N-1,1])] - - all_ok = True - - break - - if not all_ok: - # print("acados returned status " + str(status) + " at iteration " + str(f)) - # print(ocp.ocp_solver.get_residuals()) - - if failed_iter >= ocp.N-1 or f == 0: - break - - failed_iter += 1 - - simU[f] = u_sol_guess[0] - - for i in range(ocp.N-1): - x_sol_guess[i] = np.copy(x_sol_guess[i+1]) - u_sol_guess[i] = np.copy(u_sol_guess[i+1]) - - x_sol_guess[ocp.N-1] = np.copy(x_sol_guess[ocp.N]) - u_sol_guess[ocp.N-1] = [ocp.g*ocp.l1*(ocp.m1+ocp.m2)*math.sin(x_sol_guess[ocp.N-1,0]),ocp.g*ocp.l2*ocp.m2*math.sin(x_sol_guess[ocp.N-1,1])] - - else: - failed_iter = 0 - #receiding = receiding + 1 - - simU[f] = ocp.ocp_solver.get(0, "u") - - for i in range(ocp.N-1): - x_sol_guess[i] = ocp.ocp_solver.get(i+1, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i+1, "u") - - x_sol_guess[ocp.N-1] = ocp.ocp_solver.get(ocp.N, "x") - x_sol_guess[ocp.N] = np.copy(x_sol_guess[ocp.N-1]) - u_sol_guess[ocp.N-1] = 
[ocp.g*ocp.l1*(ocp.m1+ocp.m2)*math.sin(x_sol_guess[ocp.N-1,0]),ocp.g*ocp.l2*ocp.m2*math.sin(x_sol_guess[ocp.N-1,1])] - - noise = np.array([(ocp.thetamax-ocp.thetamin)*noise_intensity*random.uniform(-1, 1), (ocp.thetamax-ocp.thetamin)*2*noise_intensity*random.uniform(-1, 1), ocp.dthetamax*2*noise_intensity*random.uniform(-1, 1), ocp.dthetamax*2*noise_intensity*random.uniform(-1, 1)]) - - sim.acados_integrator.set("u", simU[f]) - sim.acados_integrator.set("x", simX[f]) - status = sim.acados_integrator.solve() - simX[f+1] = sim.acados_integrator.get("x") + noise - simU[f] = u_sol_guess[0] - - return f, simX, simU - -def plots_results(sol_steps): - t = np.linspace(0, 1e-2*sol_steps, sol_steps + 1) - th = np.linspace(0, 1e-2*(tot_steps-1), tot_steps) - - plt.figure(figsize=(8, 5)) - plt.subplot(2, 1, 1) - (line,) = plt.step(t, np.append([simU[0, 0]], simU[:sol_steps, 0]), label='Control trajectory') - plt.ylabel("$u_1$ [Nm]") - plt.hlines(ocp.Cmax, th[0], th[-1], linestyles="dashed", alpha=0.7, label='Control limits') - plt.hlines(-ocp.Cmax, th[0], th[-1], linestyles="dashed", alpha=0.7) - plt.ylim([-1.2 * ocp.Cmax, 1.2 * ocp.Cmax]) - plt.title("Controls") - plt.legend(loc="best") - plt.grid() - plt.subplot(2, 1, 2) - (line,) = plt.step(t, np.append([simU[0, 1]], simU[:sol_steps, 1]), label='Control trajectory') - plt.ylabel("$u_2$ [Nm]") - plt.xlabel("t [s]") - plt.hlines(ocp.Cmax, th[0], th[-1], linestyles="dashed", alpha=0.7, label='Control limits') - plt.hlines(-ocp.Cmax, th[0], th[-1], linestyles="dashed", alpha=0.7) - plt.ylim([-1.2 * ocp.Cmax, 1.2 * ocp.Cmax]) - plt.legend(loc="best") - plt.grid() - plt.savefig('fig_control.pdf') - - plt.figure(figsize=(8, 5)) - plt.subplot(2, 1, 1) - (line,) = plt.plot(t, simX[:sol_steps+1, 0], label='Position trajectory') - plt.hlines(ocp.thetamax, th[0], th[-1], linestyles="dashed", alpha=0.7, label='Position limits') - plt.hlines(ocp.thetamin, th[0], th[-1], linestyles="dashed", alpha=0.7) - plt.hlines(q_ref[0], th[0], th[-1], linestyles="dashed", - alpha=0.7, color='green', label='Position reference') - #plt.ylim([1.2 * mpc.xmin[0], 1.2 * mpc.xmax[0]]) - plt.ylabel("$x_1$ [rad]") - plt.title("Positions") - plt.legend(loc="best") - plt.grid() - plt.subplot(2, 1, 2) - (line,) = plt.plot(t, simX[:sol_steps+1, 1], label='Position trajectory') - plt.hlines(ocp.thetamax, th[0], th[-1], linestyles="dashed", alpha=0.7, label='Position limits') - plt.hlines(ocp.thetamin, th[0], th[-1], linestyles="dashed", alpha=0.7) - plt.hlines(q_ref[1], th[0], th[-1], linestyles="dashed", - alpha=0.7, color='green', label='Position reference') - #plt.ylim([1.2 * mpc.xmin[1], 1.2 * mpc.xmax[1]]) - plt.ylabel("$x_2$ [rad]") - plt.xlabel("t [s]") - plt.legend(loc="best") - plt.grid() - plt.savefig('fig_positions.pdf') - - plt.figure(figsize=(8, 5)) - plt.subplot(2, 1, 1) - (line,) = plt.plot(t, simX[:sol_steps+1, 2], label='Velocity trajectory') - inp = np.empty((sol_steps+1,4)) - for i in range(inp.shape[0]): - vel_norm = norm(simX[i][2:4]) - for l in range(2): - inp[i][l] = (simX[i][l] - mean_dir) / std_dir - inp[i][l+2] = simX[i][l+2] / vel_norm - max_vel = np.multiply(np.reshape(model_dir(torch.from_numpy(inp.astype(np.float32)).to(device)).detach().cpu().numpy()*(100-safety_margin)/100,(sol_steps+1,)),inp[:,2]) - (line,) = plt.plot(t, max_vel, label='Max safe velocity') - plt.hlines(ocp.dthetamax, th[0], th[-1], linestyles="dashed", alpha=0.7, label='Velocity limits') - plt.hlines(-ocp.dthetamax, th[0], th[-1], linestyles="dashed", alpha=0.7) - 
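- # The overlay rescales the NN regressor output: positions are standardized by
- # (q - mean)/std and velocities divided by their norm, so the network receives
- # a unit velocity direction and returns the largest safe speed along it;
- # multiplying the (100 - safety_margin)/100-shrunk output by the direction
- # component gives the per-joint velocity bound plotted against the trajectory.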
#plt.hlines(mpc.ocp.cost.yref[2], th[0], th[-1], linestyles="dashed", alpha=0.7, color='green', label='Velocity reference') - #plt.ylim([1.2 * mpc.xmin[2], 1.2 * mpc.xmax[2]]) - plt.ylabel("$\dot{x}_1$ [rad/s]") - plt.title("Velocities") - plt.legend(loc="best") - plt.grid() - plt.subplot(2, 1, 2) - (line,) = plt.plot(t, simX[:sol_steps+1, 3], label='Velocity trajectory') - inp = np.empty((sol_steps+1,4)) - for i in range(inp.shape[0]): - vel_norm = norm(simX[i][2:4]) - for l in range(2): - inp[i][l] = (simX[i][l] - mean_dir) / std_dir - inp[i][l+2] = simX[i][l+2] / vel_norm - max_vel = np.multiply(np.reshape(model_dir(torch.from_numpy(inp.astype(np.float32)).to(device)).detach().cpu().numpy()*(100-safety_margin)/100,(sol_steps+1,)),inp[:,3]) - (line,) = plt.plot(t, max_vel, label='Max safe velocity') - plt.hlines(ocp.dthetamax, th[0], th[-1], linestyles="dashed", alpha=0.7, label='Velocity limits') - plt.hlines(-ocp.dthetamax, th[0], th[-1], linestyles="dashed", alpha=0.7) - #plt.hlines(mpc.ocp.cost.yref[3], th[0], th[-1], linestyles="dashed", alpha=0.7, color='green', label='Velocity reference') - #plt.ylim([1.2 * mpc.xmin[3], 1.2 * mpc.xmax[3]]) - plt.ylabel("$\dot{x}_2$ [rad/s]") - plt.xlabel("t [s]") - plt.legend(loc="best") - plt.grid() - plt.savefig('fig_velocities.pdf') - -start_time = time.time() - -# Pytorch params: -device = torch.device("cpu") # "cuda" if torch.cuda.is_available() else - -model_dir = NeuralNetRegression(4, 300, 1).to(device) -model_dir.load_state_dict(torch.load('../model_2dof_vboc')) -mean_dir = torch.load('../mean_2dof_vboc') -std_dir = torch.load('../std_2dof_vboc') -safety_margin = 2.0 - -params = list(model_dir.parameters()) - -ocp = OCPdoublependulumINIT(True, params, mean_dir, std_dir, safety_margin) -sim = SYMdoublependulumINIT(True) - -# Generate low-discrepancy unlabeled samples: -sampler = qmc.Halton(d=2, scramble=False) -sample = sampler.random(n=pow(10, 2)) -q_max = ocp.thetamax -q_min = ocp.thetamin -l_bounds = [q_min, q_min] -u_bounds = [q_max, q_max] -data = qmc.scale(sample, l_bounds, u_bounds) - -noise_intensity = 0 #1e-3 - -tot_steps = 100 -q_ref = np.array([(ocp.thetamax+ocp.thetamin)/2, ocp.thetamax - 0.05]) - -cpu_num = 1# 30 -with Pool(cpu_num) as p: - res = p.map(simulate, range(data.shape[0])) - -res_steps_term, temp1, temp2 = zip(*res) - -# simX, simU = temp1[0], temp2[0] - -# plots_results(res_steps_term[0]) - -res_steps = np.load('no_constraints/res_steps_noconstr.npy') - -better = 0 -equal = 0 -worse = 0 - -plt.figure() -for i in range(res_steps.shape[0]): - if res_steps_term[i]-res_steps[i]>0: - plt.plot(data[i,0], data[i,1],color='darkblue',marker="*",markersize=9,zorder=1,linestyle="None") - better += 1 - elif res_steps_term[i]-res_steps[i]==0: - plt.plot(data[i,0], data[i,1],color='darkgrey',marker=".",markersize=7,zorder=1,linestyle="None") - equal += 1 - else: - plt.plot(data[i,0], data[i,1],color='darkred',marker="X",markersize=7,zorder=1,linestyle="None") - worse += 1 -plt.title('MPC standard vs MPC with receiding hard constraints') - -print('MPC standard vs MPC with receiding hard constraints') -print('Percentage of initial states in which the MPC+VBOC behaves better: ' + str(better)) -print('Percentage of initial states in which the MPC+VBOC behaves equal: ' + str(equal)) -print('Percentage of initial states in which the MPC+VBOC behaves worse: ' + str(worse)) - -print(np.array(res_steps_term).astype(int)) - -np.save('res_steps_hardreceidingconstr.npy', np.array(res_steps_term).astype(int)) - -plt.show() diff --git 
a/VBOC/Safe MPC/parallel/doublependulum_class_fixedveldir.py b/VBOC/Safe MPC/parallel/doublependulum_class_fixedveldir.py deleted file mode 100644 index 0eb7c29..0000000 --- a/VBOC/Safe MPC/parallel/doublependulum_class_fixedveldir.py +++ /dev/null @@ -1,277 +0,0 @@ -from acados_template import AcadosOcp, AcadosOcpSolver, AcadosSim, AcadosSimSolver -import numpy as np -from acados_template import AcadosModel -from casadi import SX, vertcat, cos, sin, fmax, norm_2 -import scipy.linalg as lin - -class OCPdoublependulum: - def __init__(self): - - # --------------------SET MODEL-------------------- - # ------------------------------------------------- - model_name = "double_pendulum_ode" - - # constants - self.m1 = 0.4 # mass of the first link [kself.g] - self.m2 = 0.4 # mass of the second link [kself.g] - self.g = 9.81 # self.gravity constant [m/s^2] - self.l1 = 0.8 # lenself.gth of the first link [m] - self.l2 = 0.8 # lenself.gth of the second link [m] - - # states - theta1 = SX.sym("theta1") - theta2 = SX.sym("theta2") - dtheta1 = SX.sym("dtheta1") - dtheta2 = SX.sym("dtheta2") - self.x = vertcat(theta1, theta2, dtheta1, dtheta2) - - # xdot - theta1_dot = SX.sym("theta1_dot") - theta2_dot = SX.sym("theta1_dot") - dtheta1_dot = SX.sym("dtheta2_dot") - dtheta2_dot = SX.sym("dtheta2_dot") - xdot = vertcat(theta1_dot, theta2_dot, dtheta1_dot, dtheta2_dot) - - # controls - C1 = SX.sym("C1") - C2 = SX.sym("C2") - u = vertcat(C1, C2) - - # parameters - p = [] - - # dynamics - f_expl = vertcat( - dtheta1, - dtheta2, - ( - self.l1**2 - * self.l2 - * self.m2 - * dtheta1**2 - * sin(-2 * theta2 + 2 * theta1) - + 2 * C2 * cos(-theta2 + theta1) * self.l1 - + 2 - * ( - self.g * sin(-2 * theta2 + theta1) * self.l1 * self.m2 / 2 - + sin(-theta2 + theta1) * dtheta2**2 * self.l1 * self.l2 * self.m2 - + self.g * self.l1 * (self.m1 + self.m2 / 2) * sin(theta1) - - C1 - ) - * self.l2 - ) - / self.l1**2 - / self.l2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - ( - -self.g - * self.l1 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + 2 * theta1) - - self.l1 - * self.l2**2 - * self.m2**2 - * dtheta2**2 - * sin(-2 * theta2 + 2 * theta1) - - 2 - * dtheta1**2 - * self.l1**2 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + theta1) - + 2 * C1 * cos(-theta2 + theta1) * self.l2 * self.m2 - + self.l1 - * (self.m1 + self.m2) - * (sin(theta2) * self.g * self.l2 * self.m2 - 2 * C2) - ) - / self.l2**2 - / self.l1 - / self.m2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - ) - - self.model = AcadosModel() - - f_impl = xdot - f_expl - - self.model.f_expl_expr = f_expl - self.model.f_impl_expr = f_impl - self.model.x = self.x - self.model.xdot = xdot - self.model.u = u - self.model.p = p - self.model.name = model_name - - -class OCPdoublependulumINIT(OCPdoublependulum): - def __init__(self, regenerate, nn_params, mean, std, safety_margin): - - # inherit initialization - super().__init__() - - # ---------------------SET OCP--------------------- - # ------------------------------------------------- - self.ocp = AcadosOcp() - - # times - self.Tf = 0.01 - self.N = int(1000 * self.Tf) - self.ocp.solver_options.tf = self.Tf - self.ocp.dims.N = self.N - - self.nx = self.model.x.size()[0] - self.nu = self.model.u.size()[0] - self.ny = self.nx + self.nu - self.ny_e = self.nx - - # cost - Q = np.diag([1e4, 1e4, 1e-4, 1e-4]) - R = np.diag([1e-4, 1e-4]) - - self.ocp.cost.W_e = Q - self.ocp.cost.W = lin.block_diag(Q, R) - - self.ocp.cost.cost_type = 
"LINEAR_LS" - self.ocp.cost.cost_type_e = "LINEAR_LS" - - self.ocp.cost.Vx = np.zeros((self.ny, self.nx)) - self.ocp.cost.Vx[: self.nx, :self.nx] = np.eye(self.nx) - self.ocp.cost.Vu = np.zeros((self.ny, self.nu)) - self.ocp.cost.Vu[self.nx:, :self.nu] = np.eye(self.nu) - self.ocp.cost.Vx_e = np.eye(self.nx) - - # reference - self.ocp.cost.yref = np.array([np.pi, np.pi, 0., 0., 0., 0.]) - self.ocp.cost.yref_e = np.array([np.pi, np.pi, 0., 0.]) - - # set constraints - self.Cmax = 10. - self.thetamax = np.pi / 4 + np.pi - self.thetamin = - np.pi / 4 + np.pi - self.dthetamax = 10. - - Cmax_limits = np.array([self.Cmax, self.Cmax]) - Xmax_limits = np.array([self.thetamax, self.thetamax, self.dthetamax, self.dthetamax]) - Xmin_limits = np.array([self.thetamin, self.thetamin, -self.dthetamax, -self.dthetamax]) - - self.ocp.constraints.lbu = -Cmax_limits - self.ocp.constraints.ubu = Cmax_limits - self.ocp.constraints.idxbu = np.array([0, 1]) - self.ocp.constraints.lbx = Xmin_limits - self.ocp.constraints.ubx = Xmax_limits - self.ocp.constraints.idxbx = np.array([0, 1, 2, 3]) - - self.ocp.constraints.lbx_e = Xmin_limits - self.ocp.constraints.ubx_e = Xmax_limits - self.ocp.constraints.idxbx_e = np.array([0, 1, 2, 3]) - - self.ocp.constraints.lbx_0 = Xmin_limits - self.ocp.constraints.ubx_0 = Xmax_limits - self.ocp.constraints.idxbx_0 = np.array([0, 1, 2, 3]) - - # nonlinear constraints - self.model.con_h_expr_e = self.nn_decisionfunction(nn_params, mean, std, safety_margin, self.x) - - self.ocp.constraints.lh_e = np.array([0.]) - self.ocp.constraints.uh_e = np.array([1e6]) - - self.ocp.constraints.idxsh_e = np.array([0]) - - self.ocp.cost.zl_e = np.zeros((1,)) - self.ocp.cost.zu_e = np.zeros((1,)) - self.ocp.cost.Zu_e = np.zeros((1,)) - self.ocp.cost.Zl_e = np.zeros((1,)) - - self.model.con_h_expr = self.model.con_h_expr_e - - self.ocp.constraints.lh = np.array([0.]) - self.ocp.constraints.uh = np.array([1e6]) - - self.ocp.constraints.idxsh = np.array([0]) - - self.ocp.cost.zl = np.zeros((1,)) - self.ocp.cost.zu = np.zeros((1,)) - self.ocp.cost.Zu = np.zeros((1,)) - self.ocp.cost.Zl = np.zeros((1,)) - - # ------------------------------------------------- - - # self.ocp.solver_options.nlp_solver_type = "SQP" - self.ocp.solver_options.qp_solver = "FULL_CONDENSING_HPIPM" - self.ocp.solver_options.tol = 1e-2 - self.ocp.solver_options.qp_solver_iter_max = 100 - self.ocp.solver_options.nlp_solver_max_iter = 1000 - self.ocp.solver_options.globalization = "MERIT_BACKTRACKING" - self.ocp.solver_options.alpha_reduction = 0.3 - self.ocp.solver_options.alpha_min = 1e-2 - self.ocp.solver_options.levenberg_marquardt = 1. 
- - # ocp model - self.ocp.model = self.model - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, build=regenerate) - - def OCP_solve(self, x0, q_ref, x_sol_guess, u_sol_guess): - - # Reset current iterate: - self.ocp_solver.reset() - - self.ocp_solver.constraints_set(0, "lbx", x0) - self.ocp_solver.constraints_set(0, "ubx", x0) - - # Set parameters, guesses and constraints: - for i in range(self.N): - self.ocp_solver.set(i, 'x', x_sol_guess[i]) - self.ocp_solver.set(i, 'u', u_sol_guess[i]) - self.ocp_solver.cost_set(i, 'y_ref', np.array([q_ref[0],q_ref[1],0.,0.,0.,0.])) - - self.ocp_solver.set(self.N, 'x', x_sol_guess[self.N]) - self.ocp_solver.cost_set(self.N, 'y_ref', np.array([q_ref[0],q_ref[1],0.,0.])) - - # Solve the OCP: - status = self.ocp_solver.solve() - - return status - - def nn_decisionfunction(self, params, mean, std, safety_margin, x): - - vel_norm = fmax(norm_2(x[2:]), 1e-3) - - mean = vertcat(mean,mean,0.,0.) - std = vertcat(std,std,vel_norm,vel_norm) - - out = (x - mean) / std - it = 0 - - for param in params: - - param = SX(param.tolist()) - - if it % 2 == 0: - out = param @ out - else: - out = param + out - - if it == 1 or it == 3: - out = fmax(0., out) - - it += 1 - - return out*(100-safety_margin)/100 - vel_norm - - -class SYMdoublependulumINIT(OCPdoublependulum): - def __init__(self, regenerate): - - # inherit initialization - super().__init__() - - sim = AcadosSim() - sim.model = self.model - sim.solver_options.T = 1e-3 - sim.solver_options.num_stages = 4 - self.acados_integrator = AcadosSimSolver(sim, build=regenerate) diff --git a/VBOC/Safe MPC/parallel/my_nn.py b/VBOC/Safe MPC/parallel/my_nn.py deleted file mode 100644 index a124b9b..0000000 --- a/VBOC/Safe MPC/parallel/my_nn.py +++ /dev/null @@ -1,18 +0,0 @@ -import torch.nn as nn - -class NeuralNetRegression(nn.Module): - def __init__(self, input_size, hidden_size, output_size): - super(NeuralNetRegression, self).__init__() - self.linear_relu_stack = nn.Sequential( - nn.Linear(input_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, output_size), - nn.ReLU(), - ) - - def forward(self, x): - out = self.linear_relu_stack(x) - return out - \ No newline at end of file diff --git a/VBOC/Safe MPC/receiding_hard_constraints/3dof_sym.py b/VBOC/Safe MPC/receiding_hard_constraints/3dof_sym.py deleted file mode 100644 index cb259e6..0000000 --- a/VBOC/Safe MPC/receiding_hard_constraints/3dof_sym.py +++ /dev/null @@ -1,184 +0,0 @@ -import os -import sys -sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import time -import torch -from triplependulum_class_vboc import OCPtriplependulumHardTraj, SYMtriplependulum -from my_nn import NeuralNetDIR -from multiprocessing import Pool -import scipy.linalg as lin -from scipy.stats import qmc - -import warnings -warnings.filterwarnings("ignore") - -def simulate(p): - - x0 = np.zeros((ocp.ocp.dims.nx,)) - x0[:ocp.ocp.dims.nu] = data[p] - - simX = np.ndarray((tot_steps + 1, ocp.ocp.dims.nx)) - simU = np.ndarray((tot_steps, ocp.ocp.dims.nu)) - simX[0] = np.copy(x0) - - times = [None] * tot_steps - - failed_iter = -1 - - # Guess: - x_sol_guess = x_sol_guess_vec[p] - u_sol_guess = u_sol_guess_vec[p] - - for f in range(tot_steps): - - receiding = 0 - - if failed_iter == 0 and f > 0: - for i in range(1,N+1): - if ocp.nn_decisionfunction(params, mean, std, ocp.ocp_solver.get(i, 'x')) >= 0.: - receiding = N - i + 1 - - receiding_iter = N-failed_iter-receiding - Q = np.diag([1e-4+pow(10,receiding_iter/N*4), 1e-4, 
1e-4, 1e-4, 1e-4, 1e-4]) - R = np.diag([1e-4, 1e-4, 1e-4]) - - for i in range(N): - ocp.ocp_solver.cost_set(i, "W", lin.block_diag(Q, R)) - - ocp.ocp_solver.cost_set(N, "W", Q) - - # for i in range(N+1): - # if i == receiding_iter: - # ocp.ocp_solver.cost_set(i, "Zl", 1e8*np.ones((1,))) - # elif i == N: - # ocp.ocp_solver.cost_set(i, "Zl", pow(10, (1-receiding_iter/N)*4)*np.ones((1,))) - # else: - # ocp.ocp_solver.cost_set(i, "Zl", np.zeros((1,))) - - for i in range(N+1): - if i == receiding_iter: - ocp.ocp_solver.constraints_set(i, "lh", np.array([0.])) - else: - ocp.ocp_solver.constraints_set(i, "lh", np.array([-1e6])) - - status = ocp.OCP_solve(simX[f], x_sol_guess, u_sol_guess, ocp.thetamax-0.05, joint_vec[f]) - times[f] = ocp.ocp_solver.get_stats('time_tot') - - if status != 0: - - if failed_iter >= N-1 or failed_iter < 0: - break - - failed_iter += 1 - - simU[f] = u_sol_guess[0] - - for i in range(N-1): - x_sol_guess[i] = np.copy(x_sol_guess[i+1]) - u_sol_guess[i] = np.copy(u_sol_guess[i+1]) - - x_sol_guess[N-1] = np.copy(x_sol_guess[N]) - - else: - failed_iter = 0 - - simU[f] = ocp.ocp_solver.get(0, "u") - - for i in range(N-1): - x_sol_guess[i] = ocp.ocp_solver.get(i+1, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i+1, "u") - - x_sol_guess[N-1] = ocp.ocp_solver.get(N, "x") - x_sol_guess[N] = np.copy(x_sol_guess[N-1]) - u_sol_guess[N-1] = np.copy(u_sol_guess[N-2]) - - simU[f] += noise_vec[f] - - sim.acados_integrator.set("u", simU[f]) - sim.acados_integrator.set("x", simX[f]) - status = sim.acados_integrator.solve() - simX[f+1] = sim.acados_integrator.get("x") - - return f, times - -start_time = time.time() - -# Pytorch params: -device = torch.device("cpu") - -model = NeuralNetDIR(6, 500, 1).to(device) -model.load_state_dict(torch.load('../model_3dof_vboc')) -mean = torch.load('../mean_3dof_vboc') -std = torch.load('../std_3dof_vboc') -safety_margin = 2.0 - -cpu_num = 1 -test_num = 100 - -time_step = 5*1e-3 -tot_time = 0.16 -tot_steps = 100 - -regenerate = True - -x_sol_guess_vec = np.load('../x_sol_guess.npy') -u_sol_guess_vec = np.load('../u_sol_guess.npy') -noise_vec = np.load('../noise.npy') -noise_vec = np.load('../selected_joint.npy') - -params = list(model.parameters()) - -ocp = OCPtriplependulumHardTraj("SQP_RTI", time_step, tot_time, params, mean, std, regenerate) -sim = SYMtriplependulum(time_step, tot_time, regenerate) - -# Generate low-discrepancy unlabeled samples: -sampler = qmc.Halton(d=ocp.ocp.dims.nu, scramble=False) -sample = sampler.random(n=test_num) -l_bounds = ocp.Xmin_limits[:ocp.ocp.dims.nu] -u_bounds = ocp.Xmax_limits[:ocp.ocp.dims.nu] -data = qmc.scale(sample, l_bounds, u_bounds) - -N = ocp.ocp.dims.N - -# MPC controller without terminal constraints: -with Pool(cpu_num) as p: - res = p.map(simulate, range(data.shape[0])) - -res_steps_traj, stats = zip(*res) - -times = np.array([i for f in stats for i in f if i is not None]) - -quant = np.quantile(times, 0.99) - -print('tot time: ' + str(tot_time)) -print('99 percent quantile solve time: ' + str(quant)) -print('Mean solve time: ' + str(np.mean(times))) - -print(np.array(res_steps_traj).astype(int)) - -np.save('res_steps_receiding.npy', np.array(res_steps_traj).astype(int)) - -res_steps = np.load('../no_constraints/res_steps_noconstr.npy') - -better = 0 -equal = 0 -worse = 0 - -for i in range(res_steps.shape[0]): - if res_steps_traj[i]-res_steps[i]>0: - better += 1 - elif res_steps_traj[i]-res_steps[i]==0: - equal += 1 - else: - worse += 1 - -print('MPC standard vs MPC with receiding hard constraints') 
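
The loop above moves a single hard copy of the learned constraint along the horizon: `lh` is set to 0 only at the receding node, picked from the previous prediction's last stage that still satisfies `nn_decisionfunction`, and relaxed to -1e6 everywhere else. A minimal sketch of that update, assuming `solver` is an `AcadosOcpSolver` with the constraint `h(x) >= lh` declared on every node as in `OCPtriplependulumHardTraj`:

```
import numpy as np

def set_hard_node(solver, N, node):
    """Enforce h(x_i) >= 0 only at `node`; relax the constraint elsewhere."""
    for i in range(N + 1):
        lh = np.array([0.]) if i == node else np.array([-1e6])
        solver.constraints_set(i, "lh", lh)
```

Hardening one node at a time keeps the NLP feasible while still anchoring the trajectory in the learned viable set.
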
-print('Percentage of initial states in which the MPC+VBOC behaves better: ' + str(better)) -print('Percentage of initial states in which the MPC+VBOC behaves equal: ' + str(equal)) -print('Percentage of initial states in which the MPC+VBOC behaves worse: ' + str(worse)) - -np.savez('../data/results_receidinghard.npz', res_steps_term=res_steps_traj, - better=better, worse=worse, equal=equal, times=times, - dt=time_step, tot_time=tot_time) \ No newline at end of file diff --git a/VBOC/Safe MPC/receiding_soft_constraints/3dof_sym.py b/VBOC/Safe MPC/receiding_soft_constraints/3dof_sym.py deleted file mode 100644 index 34ac4c6..0000000 --- a/VBOC/Safe MPC/receiding_soft_constraints/3dof_sym.py +++ /dev/null @@ -1,178 +0,0 @@ -import os -import sys -sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import time -import torch -from triplependulum_class_vboc import OCPtriplependulumSoftTraj, SYMtriplependulum -from my_nn import NeuralNetDIR -from multiprocessing import Pool -import scipy.linalg as lin -from scipy.stats import qmc - -import warnings -warnings.filterwarnings("ignore") - -def simulate(p): - - x0 = np.zeros((ocp.ocp.dims.nx,)) - x0[:ocp.ocp.dims.nu] = data[p] - - simX = np.ndarray((tot_steps + 1, ocp.ocp.dims.nx)) - simU = np.ndarray((tot_steps, ocp.ocp.dims.nu)) - simX[0] = np.copy(x0) - - times = [None] * tot_steps - - failed_iter = -1 - - # Guess: - x_sol_guess = x_sol_guess_vec[p] - u_sol_guess = u_sol_guess_vec[p] - - for f in range(tot_steps): - - receiding = 0 - - if failed_iter == 0 and f > 0: - for i in range(1,N+1): - if ocp.nn_decisionfunction_conservative(params, mean, std, safety_margin, ocp.ocp_solver.get(i, 'x')) >= 0.: - receiding = N - i + 1 - - receiding_iter = N-failed_iter-receiding - Q = np.diag([1e-4+pow(10,receiding_iter/N*4), 1e-4, 1e-4, 1e-4, 1e-4, 1e-4]) - R = np.diag([1e-4, 1e-4, 1e-4]) - - for i in range(N): - ocp.ocp_solver.cost_set(i, "W", lin.block_diag(Q, R)) - - ocp.ocp_solver.cost_set(N, "W", Q) - - for i in range(N+1): - if i == receiding_iter: - ocp.ocp_solver.cost_set(i, "Zl", 1e8*np.ones((1,))) - elif i == N and receiding_iter != N: - ocp.ocp_solver.cost_set(i, "Zl", pow(10, (1-receiding_iter/N)*4)*np.ones((1,))) - else: - ocp.ocp_solver.cost_set(i, "Zl", np.zeros((1,))) - - status = ocp.OCP_solve(simX[f], x_sol_guess, u_sol_guess, ocp.thetamax-0.05, joint_vec[f]) - times[f] = ocp.ocp_solver.get_stats('time_tot') - - if status != 0: - - if failed_iter >= N-1 or failed_iter < 0: - break - - failed_iter += 1 - - simU[f] = u_sol_guess[0] - - for i in range(N-1): - x_sol_guess[i] = np.copy(x_sol_guess[i+1]) - u_sol_guess[i] = np.copy(u_sol_guess[i+1]) - - x_sol_guess[N-1] = np.copy(x_sol_guess[N]) - - else: - failed_iter = 0 - - simU[f] = ocp.ocp_solver.get(0, "u") - - for i in range(N-1): - x_sol_guess[i] = ocp.ocp_solver.get(i+1, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i+1, "u") - - x_sol_guess[N-1] = ocp.ocp_solver.get(N, "x") - x_sol_guess[N] = np.copy(x_sol_guess[N-1]) - u_sol_guess[N-1] = np.copy(u_sol_guess[N-2]) - - simU[f] += noise_vec[f] - - sim.acados_integrator.set("u", simU[f]) - sim.acados_integrator.set("x", simX[f]) - status = sim.acados_integrator.solve() - simX[f+1] = sim.acados_integrator.get("x") - - return f, times - -start_time = time.time() - -# Pytorch params: -device = torch.device("cpu") - -model = NeuralNetDIR(6, 500, 1).to(device) -model.load_state_dict(torch.load('../model_3dof_vboc')) -mean = torch.load('../mean_3dof_vboc') -std = torch.load('../std_3dof_vboc') -safety_margin = 2.0 - -cpu_num = 1 -test_num 
= 100 - -time_step = 5*1e-3 -tot_time = 0.16 -tot_steps = 100 - -regenerate = True - -x_sol_guess_vec = np.load('../x_sol_guess.npy') -u_sol_guess_vec = np.load('../u_sol_guess.npy') -noise_vec = np.load('../noise.npy') -noise_vec = np.load('../selected_joint.npy') - -params = list(model.parameters()) - -ocp = OCPtriplependulumSoftTraj("SQP_RTI", time_step, tot_time, params, mean, std, safety_margin, regenerate) -sim = SYMtriplependulum(time_step, tot_time, regenerate) - -# Generate low-discrepancy unlabeled samples: -sampler = qmc.Halton(d=ocp.ocp.dims.nu, scramble=False) -sample = sampler.random(n=test_num) -l_bounds = ocp.Xmin_limits[:ocp.ocp.dims.nu] -u_bounds = ocp.Xmax_limits[:ocp.ocp.dims.nu] -data = qmc.scale(sample, l_bounds, u_bounds) - -N = ocp.ocp.dims.N - -# MPC controller without terminal constraints: -with Pool(cpu_num) as p: - res = p.map(simulate, range(data.shape[0])) - -res_steps_traj, stats = zip(*res) - -times = np.array([i for f in stats for i in f if i is not None]) - -quant = np.quantile(times, 0.99) - -print('tot time: ' + str(tot_time)) -print('99 percent quantile solve time: ' + str(quant)) -print('Mean solve time: ' + str(np.mean(times))) - -print(np.array(res_steps_traj).astype(int)) - -np.save('res_steps_receiding.npy', np.array(res_steps_traj).astype(int)) - -res_steps = np.load('../no_constraints/res_steps_noconstr.npy') - -better = 0 -equal = 0 -worse = 0 - -for i in range(res_steps.shape[0]): - if res_steps_traj[i]-res_steps[i]>0: - better += 1 - elif res_steps_traj[i]-res_steps[i]==0: - equal += 1 - else: - worse += 1 - -print('MPC standard vs MPC with receiding hard constraints') -print('Percentage of initial states in which the MPC+VBOC behaves better: ' + str(better)) -print('Percentage of initial states in which the MPC+VBOC behaves equal: ' + str(equal)) -print('Percentage of initial states in which the MPC+VBOC behaves worse: ' + str(worse)) - -np.savez('../data/results_receidingsoft.npz', res_steps_term=res_steps_traj, - better=better, worse=worse, equal=equal, times=times, - dt=time_step, tot_time=tot_time) \ No newline at end of file diff --git a/VBOC/Safe MPC/soft_terminal_constraints/3dof_sym.py b/VBOC/Safe MPC/soft_terminal_constraints/3dof_sym.py deleted file mode 100644 index 29ac97b..0000000 --- a/VBOC/Safe MPC/soft_terminal_constraints/3dof_sym.py +++ /dev/null @@ -1,164 +0,0 @@ -import os -import sys -sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import time -import torch -from triplependulum_class_vboc import OCPtriplependulumSoftTerm, SYMtriplependulum -from my_nn import NeuralNetDIR -from multiprocessing import Pool -from scipy.stats import qmc - -import warnings -warnings.filterwarnings("ignore") - -def simulate(p): - - x0 = np.zeros((ocp.ocp.dims.nx,)) - x0[:ocp.ocp.dims.nu] = data[p] - - simX = np.ndarray((tot_steps + 1, ocp.ocp.dims.nx)) - simU = np.ndarray((tot_steps, ocp.ocp.dims.nu)) - simX[0] = np.copy(x0) - - times = [None] * tot_steps - - failed_iter = -1 - - # Guess: - x_sol_guess = x_sol_guess_vec[p] - u_sol_guess = u_sol_guess_vec[p] - - for f in range(tot_steps): - - status = ocp.OCP_solve(simX[f], x_sol_guess, u_sol_guess, ocp.thetamax-0.05, joint_vec[f]) - times[f] = ocp.ocp_solver.get_stats('time_tot') - - if status != 0: - - if failed_iter >= N-1 or failed_iter < 0: - break - - failed_iter += 1 - - simU[f] = u_sol_guess[0] - - for i in range(N-1): - x_sol_guess[i] = np.copy(x_sol_guess[i+1]) - u_sol_guess[i] = np.copy(u_sol_guess[i+1]) - - x_sol_guess[N-1] = np.copy(x_sol_guess[N]) - - else: - 
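- # Success: recycle the optimal solution as the next warm start, shifting
- # states and controls one stage forward and duplicating the last entries,
- # since no information is available beyond the horizon.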
failed_iter = 0 - - simU[f] = ocp.ocp_solver.get(0, "u") - - for i in range(N-1): - x_sol_guess[i] = ocp.ocp_solver.get(i+1, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i+1, "u") - - x_sol_guess[N-1] = ocp.ocp_solver.get(N, "x") - x_sol_guess[N] = np.copy(x_sol_guess[N-1]) - u_sol_guess[N-1] = np.copy(u_sol_guess[N-2]) - - simU[f] += noise_vec[f] - - sim.acados_integrator.set("u", simU[f]) - sim.acados_integrator.set("x", simX[f]) - status = sim.acados_integrator.solve() - simX[f+1] = sim.acados_integrator.get("x") - - return f, times - -start_time = time.time() - -# Pytorch params: -device = torch.device("cpu") - -model = NeuralNetDIR(6, 500, 1).to(device) -model.load_state_dict(torch.load('../model_3dof_vboc')) -mean = torch.load('../mean_3dof_vboc') -std = torch.load('../std_3dof_vboc') -safety_margin = 2.0 - -cpu_num = 1 -test_num = 100 - -time_step = 5*1e-3 -tot_time = 0.16 -tot_steps = 100 - -regenerate = True - -x_sol_guess_vec = np.load('../x_sol_guess.npy') -u_sol_guess_vec = np.load('../u_sol_guess.npy') -noise_vec = np.load('../noise.npy') -noise_vec = np.load('../selected_joint.npy') - -quant = 10. -r = 1 - -while quant > 4*1e-3: - - ocp = OCPtriplependulumSoftTerm("SQP_RTI", time_step, tot_time, list(model.parameters()), mean, std, safety_margin, regenerate) - sim = SYMtriplependulum(time_step, tot_time, True) - - N = ocp.ocp.dims.N - - # Generate low-discrepancy unlabeled samples: - sampler = qmc.Halton(d=ocp.ocp.dims.nu, scramble=False) - sample = sampler.random(n=test_num) - l_bounds = ocp.Xmin_limits[:ocp.ocp.dims.nu] - u_bounds = ocp.Xmax_limits[:ocp.ocp.dims.nu] - data = qmc.scale(sample, l_bounds, u_bounds) - - ocp.ocp_solver.cost_set(N, "Zl", 1e4*np.ones((1,))) - - # MPC controller without terminal constraints: - with Pool(cpu_num) as p: - res = p.map(simulate, range(data.shape[0])) - - res_steps_term, stats = zip(*res) - - times = np.array([i for f in stats for i in f if i is not None]) - - quant = np.quantile(times, 0.9) - - print('iter: ', str(r)) - print('tot time: ' + str(tot_time)) - print('99 percent quantile solve time: ' + str(quant)) - print('Mean solve time: ' + str(np.mean(times))) - - tot_time -= time_step - r += 1 - - print(np.array(res_steps_term).astype(int)) - - del ocp - -np.save('res_steps_hardterm.npy', np.array(res_steps_term).astype(int)) - -res_steps = np.load('../no_constraints/res_steps_noconstr.npy') - -better = 0 -equal = 0 -worse = 0 - -for i in range(res_steps.shape[0]): - if res_steps_term[i]-res_steps[i]>0: - better += 1 - elif res_steps_term[i]-res_steps[i]==0: - equal += 1 - else: - worse += 1 - -print('MPC standard vs MPC with hard term constraints') -print('Percentage of initial states in which the MPC+VBOC behaves better: ' + str(better)) -print('Percentage of initial states in which the MPC+VBOC behaves equal: ' + str(equal)) -print('Percentage of initial states in which the MPC+VBOC behaves worse: ' + str(worse)) - -np.savez('../data/results_softterm.npz', res_steps_term=res_steps_term, - better=better, worse=worse, equal=equal, times=times, - dt=time_step, tot_time=tot_time) \ No newline at end of file diff --git a/VBOC/Safe MPC/soft_traj_constraints/3dof_sym.py b/VBOC/Safe MPC/soft_traj_constraints/3dof_sym.py deleted file mode 100644 index 0de2db9..0000000 --- a/VBOC/Safe MPC/soft_traj_constraints/3dof_sym.py +++ /dev/null @@ -1,166 +0,0 @@ -import os -import sys -sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import time -import torch -from triplependulum_class_vboc import OCPtriplependulumSoftTraj, 
SYMtriplependulum -from my_nn import NeuralNetDIR -from multiprocessing import Pool -from scipy.stats import qmc - -import warnings -warnings.filterwarnings("ignore") - -def simulate(p): - - x0 = np.zeros((ocp.ocp.dims.nx,)) - x0[:ocp.ocp.dims.nu] = data[p] - - simX = np.ndarray((tot_steps + 1, ocp.ocp.dims.nx)) - simU = np.ndarray((tot_steps, ocp.ocp.dims.nu)) - simX[0] = np.copy(x0) - - times = [None] * tot_steps - - failed_iter = -1 - - # Guess: - x_sol_guess = x_sol_guess_vec[p] - u_sol_guess = u_sol_guess_vec[p] - - for f in range(tot_steps): - - status = ocp.OCP_solve(simX[f], x_sol_guess, u_sol_guess, ocp.thetamax-0.05, joint_vec[f]) - times[f] = ocp.ocp_solver.get_stats('time_tot') - - if status != 0: - - if failed_iter >= N-1 or failed_iter < 0: - break - - failed_iter += 1 - - simU[f] = u_sol_guess[0] - - for i in range(N-1): - x_sol_guess[i] = np.copy(x_sol_guess[i+1]) - u_sol_guess[i] = np.copy(u_sol_guess[i+1]) - - x_sol_guess[N-1] = np.copy(x_sol_guess[N]) - - else: - failed_iter = 0 - - simU[f] = ocp.ocp_solver.get(0, "u") - - for i in range(N-1): - x_sol_guess[i] = ocp.ocp_solver.get(i+1, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i+1, "u") - - x_sol_guess[N-1] = ocp.ocp_solver.get(N, "x") - x_sol_guess[N] = np.copy(x_sol_guess[N-1]) - u_sol_guess[N-1] = np.copy(u_sol_guess[N-2]) - - simU[f] += noise_vec[f] - - sim.acados_integrator.set("u", simU[f]) - sim.acados_integrator.set("x", simX[f]) - status = sim.acados_integrator.solve() - simX[f+1] = sim.acados_integrator.get("x") - - return f, times - -start_time = time.time() - -# Pytorch params: -device = torch.device("cpu") - -model = NeuralNetDIR(6, 500, 1).to(device) -model.load_state_dict(torch.load('../model_3dof_vboc')) -mean = torch.load('../mean_3dof_vboc') -std = torch.load('../std_3dof_vboc') -safety_margin = 2.0 - -cpu_num = 1 -test_num = 100 - -time_step = 5*1e-3 -tot_time = 0.16 -tot_steps = 100 - -regenerate = True - -x_sol_guess_vec = np.load('../x_sol_guess.npy') -u_sol_guess_vec = np.load('../u_sol_guess.npy') -noise_vec = np.load('../noise.npy') -noise_vec = np.load('../selected_joint.npy') - -quant = 10. 
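
In this soft-trajectory variant the learned constraint is declared with slacks on all shooting nodes, and the loop below activates them with a uniform weight. In acados a lower-bound slack `s >= 0` enters the cost as `zl*s + Zl*s**2`, so `Zl = 1e4` makes violations expensive without risking infeasibility. A sketch of the activation, assuming `solver` exposes the constraint softened via `idxsh` on nodes 1..N:

```
import numpy as np

def soften_trajectory(solver, N, weight=1e4):
    """Quadratic slack penalty on the NN constraint at every node but the first."""
    for i in range(1, N + 1):
        solver.cost_set(i, "Zl", weight * np.ones((1,)))
```
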
-r = 1 - -while quant > 4*1e-3: - - ocp = OCPtriplependulumSoftTraj("SQP_RTI", time_step, tot_time, list(model.parameters()), mean, std, safety_margin, regenerate) - sim = SYMtriplependulum(time_step, tot_time, regenerate) - - # Generate low-discrepancy unlabeled samples: - sampler = qmc.Halton(d=ocp.ocp.dims.nu, scramble=False) - sample = sampler.random(n=test_num) - l_bounds = ocp.Xmin_limits[:ocp.ocp.dims.nu] - u_bounds = ocp.Xmax_limits[:ocp.ocp.dims.nu] - data = qmc.scale(sample, l_bounds, u_bounds) - - N = ocp.ocp.dims.N - - for i in range(1,N): - ocp.ocp_solver.cost_set(i, "Zl", 1e4*np.ones((1,))) - ocp.ocp_solver.cost_set(N, "Zl", 1e4*np.ones((1,))) - - # MPC controller without terminal constraints: - with Pool(cpu_num) as p: - res = p.map(simulate, range(data.shape[0])) - - res_steps_traj, stats = zip(*res) - - times = np.array([i for f in stats for i in f if i is not None]) - - quant = np.quantile(times, 0.99) - - print('iter: ', str(r)) - print('tot time: ' + str(tot_time)) - print('99 percent quantile solve time: ' + str(quant)) - print('Mean solve time: ' + str(np.mean(times))) - - tot_time -= time_step - r += 1 - - del ocp - -print(np.array(res_steps_traj).astype(int)) - -np.save('res_steps_softtraj.npy', np.array(res_steps_traj).astype(int)) - -res_steps = np.load('../no_constraints/res_steps_noconstr.npy') - -better = 0 -equal = 0 -worse = 0 - -for i in range(res_steps.shape[0]): - if res_steps_traj[i]-res_steps[i]>0: - better += 1 - elif res_steps_traj[i]-res_steps[i]==0: - equal += 1 - else: - worse += 1 - -print('MPC standard vs MPC with soft traj constraints') -print('Percentage of initial states in which the MPC+VBOC behaves better: ' + str(better)) -print('Percentage of initial states in which the MPC+VBOC behaves equal: ' + str(equal)) -print('Percentage of initial states in which the MPC+VBOC behaves worse: ' + str(worse)) - -np.savez('../data/results_softtraj.npz', res_steps_term=res_steps_traj, - better=better, worse=worse, equal=equal, times=times, - dt=time_step, tot_time=tot_time) \ No newline at end of file diff --git a/VBOC/Safe MPC/triplependulum_class_vboc.py b/VBOC/Safe MPC/triplependulum_class_vboc.py deleted file mode 100644 index b9ab6d9..0000000 --- a/VBOC/Safe MPC/triplependulum_class_vboc.py +++ /dev/null @@ -1,418 +0,0 @@ -from acados_template import AcadosOcp, AcadosOcpSolver, AcadosSim, AcadosSimSolver -import numpy as np -from acados_template import AcadosModel -from casadi import SX, vertcat, cos, sin, fmax, norm_2 -import scipy.linalg as lin - - -class MODELtriplependulum: - def __init__(self, time_step, tot_time): - - model_name = "triple_pendulum_ode" - - # constants - self.m1 = 0.4 # mass of the first link [kself.g] - self.m2 = 0.4 # mass of the second link [kself.g] - self.m3 = 0.4 # mass of the third link [kself.g] - self.g = 9.81 # self.gravity constant [m/s^2] - self.l1 = 0.8 # lenself.gth of the first link [m] - self.l2 = 0.8 # lenself.gth of the second link [m] - self.l3 = 0.8 # lenself.gth of the second link [m] - - self.time_step = time_step - self.tot_time = tot_time - - # states - theta1 = SX.sym("theta1") - theta2 = SX.sym("theta2") - theta3 = SX.sym("theta3") - dtheta1 = SX.sym("dtheta1") - dtheta2 = SX.sym("dtheta2") - dtheta3 = SX.sym("dtheta3") - self.x = vertcat(theta1, theta2, theta3, dtheta1, dtheta2, dtheta3) - - # xdot - theta1_dot = SX.sym("theta1_dot") - theta2_dot = SX.sym("theta2_dot") - theta3_dot = SX.sym("theta3_dot") - dtheta1_dot = SX.sym("dtheta1_dot") - dtheta2_dot = SX.sym("dtheta2_dot") - dtheta3_dot = 
SX.sym("dtheta3_dot") - xdot = vertcat(theta1_dot, theta2_dot, theta3_dot, dtheta1_dot, dtheta2_dot, dtheta3_dot) - - # controls - C1 = SX.sym("C1") - C2 = SX.sym("C2") - C3 = SX.sym("C3") - u = vertcat(C1, C2, C3) - - # parameters - p = [] - - # dynamics - f_expl = vertcat( - dtheta1, - dtheta2, - dtheta3, - (-self.g * self.l1 * self.l2 * self.l3 * self.m1 * self.m3 * sin(-2 * theta3 + 2 * theta2 + theta1) - self.g * self.l1 * self.l2 * self.l3 * self.m1 * self.m3 * sin(2 * theta3 - 2 * theta2 + theta1) + 2 * C1 * self.l2 * self.l3 * self.m3 * cos(-2 * theta3 + 2 * theta2) + 2 * dtheta1 ** 2 * self.l1 ** 2 * self.l2 * self.l3 * self.m2 * (self.m2 + self.m3) * sin(-2 * theta2 + 2 * theta1) - 2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) * cos(-2 * theta2 + theta1 + theta3) - 2 * C2 * self.l1 * self.l3 * self.m3 * cos(-2 * theta3 + theta2 + theta1) + 2 * self.l1 * self.l2 * self.l3 ** 2 * self.m2 * self.m3 * dtheta3 ** 2 * sin(-2 * theta2 + theta1 + theta3) + 2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) * - cos(theta1 - theta3) + 2 * (C2 * self.l1 * (self.m3 + 2 * self.m2) * cos(-theta2 + theta1) + (self.g * self.l1 * self.m2 * (self.m2 + self.m3) * sin(-2 * theta2 + theta1) + 2 * dtheta2 ** 2 * self.l1 * self.l2 * self.m2 * (self.m2 + self.m3) * sin(-theta2 + theta1) + self.m3 * dtheta3 ** 2 * sin(theta1 - theta3) * self.l1 * self.l3 * self.m2 + self.g * self.l1 * (self.m2 ** 2 + (self.m3 + 2 * self.m1) * self.m2 + self.m1 * self.m3) * sin(theta1) - C1 * (self.m3 + 2 * self.m2)) * self.l2) * self.l3) / self.l1 ** 2 / self.l3 / (self.m2 * (self.m2 + self.m3) * cos(-2 * theta2 + 2 * theta1) + self.m1 * self.m3 * cos(-2 * theta3 + 2 * theta2) - self.m2 ** 2 + (-self.m3 - 2 * self.m1) * self.m2 - self.m1 * self.m3) / self.l2 / 2, - (-2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) * cos(2 * theta1 - theta3 - theta2) - 2 * self.l1 * self.l2 * self.l3 ** 2 * self.m2 * self.m3 * dtheta3 ** 2 * sin(2 * theta1 - theta3 - theta2) + self.g * self.l1 * self.l2 * self.l3 * self.m1 * self.m3 * sin(theta2 + 2 * theta1 - 2 * theta3) - self.g * self.l1 * self.l3 * ((self.m1 + 2 * self.m2) * self.m3 + 2 * self.m2 * (self.m1 + self.m2)) * self.l2 * sin(-theta2 + 2 * theta1) - 2 * dtheta2 ** 2 * self.l1 * self.l2 ** 2 * self.l3 * self.m2 * (self.m2 + self.m3) * sin(-2 * theta2 + 2 * theta1) + 2 * C2 * self.l1 * self.l3 * self.m3 * cos(-2 * theta3 + 2 * theta1) + 2 * self.l1 * self.l2 ** 2 * self.l3 * self.m1 * self.m3 * dtheta2 ** 2 * sin(-2 * theta3 + 2 * theta2) - 2 * C1 * self.l2 * self.l3 * self.m3 * cos(-2 * theta3 + theta2 + theta1) + 2 * self.l1 ** 2 * self.l2 * self.l3 * self.m1 * self.m3 * dtheta1 ** 2 * sin(-2 * theta3 + - theta2 + theta1) - 2 * self.l1 ** 2 * self.l3 * dtheta1 ** 2 * ((self.m1 + 2 * self.m2) * self.m3 + 2 * self.m2 * (self.m1 + self.m2)) * self.l2 * sin(-theta2 + theta1) + 2 * C3 * self.l1 * self.l2 * (self.m3 + 2 * self.m1 + self.m2) * cos(-theta3 + theta2) + (2 * C1 * self.l2 * (self.m3 + 2 * self.m2) * cos(-theta2 + theta1) + self.l1 * (4 * dtheta3 ** 2 * self.m3 * self.l3 * (self.m1 + self.m2 / 2) * self.l2 * sin(-theta3 + theta2) + self.g * self.m3 * self.l2 * self.m1 * sin(-2 * theta3 + theta2) + self.g * ((self.m1 + 2 * self.m2) * self.m3 + 2 * self.m2 * (self.m1 + self.m2)) * self.l2 * sin(theta2) - 2 * C2 * (self.m3 + 2 * self.m1 + 2 * self.m2))) * self.l3) / (self.m2 * (self.m2 + self.m3) * cos(-2 * theta2 + 2 * theta1) + self.m1 * self.m3 * cos(-2 * theta3 + 2 * theta2) + (-self.m1 - self.m2) * self.m3 - 2 * self.m1 * self.m2 - self.m2 ** 2) / self.l1 / self.l3 / 
self.l2 ** 2 / 2, - (-2 * self.m3 * C2 * self.l1 * self.l3 * (self.m2 + self.m3) * cos(2 * theta1 - theta3 - theta2) + self.g * self.m3 * self.l1 * self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(2 * theta1 + theta3 - 2 * theta2) + 2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) ** 2 * cos(-2 * theta2 + 2 * theta1) - self.g * self.m3 * self.l1 * self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(2 * theta1 - theta3) - self.g * self.m3 * self.l1 * self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(-theta3 + 2 * theta2) - 2 * self.l1 * self.l2 * self.l3 ** 2 * self.m1 * self.m3 ** 2 * dtheta3 ** 2 * sin(-2 * theta3 + 2 * theta2) - 2 * C1 * self.l2 * self.l3 * self.m3 * (self.m2 + self.m3) * cos(-2 * theta2 + theta1 + theta3) + 2 * self.m3 * dtheta1 ** 2 * self.l1 ** 2 * - self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(-2 * theta2 + theta1 + theta3) + 2 * self.m3 * C2 * self.l1 * self.l3 * (self.m3 + 2 * self.m1 + self.m2) * cos(-theta3 + theta2) + (self.m2 + self.m3) * (2 * C1 * self.l3 * self.m3 * cos(theta1 - theta3) + self.l1 * (-2 * self.m3 * dtheta1 ** 2 * self.l1 * self.l3 * self.m1 * sin(theta1 - theta3) - 4 * self.m3 * dtheta2 ** 2 * sin(-theta3 + theta2) * self.l2 * self.l3 * self.m1 + self.g * self.m3 * sin(theta3) * self.l3 * self.m1 - 2 * C3 * (self.m3 + 2 * self.m1 + self.m2))) * self.l2) / self.m3 / (self.m2 * (self.m2 + self.m3) * cos(-2 * theta2 + 2 * theta1) + self.m1 * self.m3 * cos(-2 * theta3 + 2 * theta2) + (-self.m1 - self.m2) * self.m3 - 2 * self.m1 * self.m2 - self.m2 ** 2) / self.l1 / self.l3 ** 2 / self.l2 / 2, - ) - - self.model = AcadosModel() - - f_impl = xdot - f_expl - - self.model.f_expl_expr = f_expl - self.model.f_impl_expr = f_impl - self.model.x = self.x - self.model.xdot = xdot - self.model.u = u - self.model.p = p - self.model.name = model_name - - -class SYMtriplependulum(MODELtriplependulum): - def __init__(self, time_step, tot_time, regenerate): - - # inherit initialization - super().__init__(time_step, tot_time) - - sim = AcadosSim() - sim.model = self.model - sim.solver_options.T = self.time_step - sim.solver_options.num_stages = 4 - self.acados_integrator = AcadosSimSolver(sim, build=regenerate) - - -class OCPtriplependulum(MODELtriplependulum): - def __init__(self, nlp_solver_type, time_step, tot_time): - - # inherit initialization - super().__init__(time_step, tot_time) - - self.ocp = AcadosOcp() - - # times - self.ocp.solver_options.tf = self.tot_time - self.ocp.dims.N = int(self.tot_time/self.time_step) - - self.nx = self.model.x.size()[0] - self.nu = self.model.u.size()[0] - self.ny = self.nx + self.nu - self.ny_e = self.nx - - # cost - self.Q = np.diag([1e-4, 1e-4, 1e-4, 1e-4, 1e-4, 1e-4]) - self.R = np.diag([1e-4, 1e-4, 1e-4]) - - self.ocp.cost.W_e = self.Q - self.ocp.cost.W = lin.block_diag(self.Q, self.R) - - self.ocp.cost.cost_type = "LINEAR_LS" - self.ocp.cost.cost_type_e = "LINEAR_LS" - - self.ocp.cost.Vx = np.zeros((self.ny, self.nx)) - self.ocp.cost.Vx[: self.nx, :self.nx] = np.eye(self.nx) - self.ocp.cost.Vu = np.zeros((self.ny, self.nu)) - self.ocp.cost.Vu[self.nx:, :self.nu] = np.eye(self.nu) - self.ocp.cost.Vx_e = np.eye(self.nx) - - # set constraints - self.Cmax = 10. - self.thetamax = np.pi / 4 + np.pi - self.thetamin = - np.pi / 4 + np.pi - self.dthetamax = 10. 
- - self.yref = np.array([np.pi,np.pi,np.pi,0.,0.,0.,0.,0.,0.]) - - # reference - self.ocp.cost.yref = self.yref - self.ocp.cost.yref_e = self.yref[:self.nx] - - self.Cmax_limits = np.array([self.Cmax, self.Cmax, self.Cmax]) - self.Cmin_limits = np.array([-self.Cmax, -self.Cmax, -self.Cmax]) - self.Xmax_limits = np.array([self.thetamax, self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, self.dthetamax]) - self.Xmin_limits = np.array([self.thetamin, self.thetamin, self.thetamin, -self.dthetamax, -self.dthetamax, -self.dthetamax]) - - self.ocp.constraints.lbu = self.Cmin_limits - self.ocp.constraints.ubu = self.Cmax_limits - self.ocp.constraints.idxbu = np.array([0, 1, 2]) - self.ocp.constraints.lbx = self.Xmin_limits - self.ocp.constraints.ubx = self.Xmax_limits - self.ocp.constraints.idxbx = np.array([0, 1, 2, 3, 4, 5]) - - self.ocp.constraints.lbx_e = self.Xmin_limits - self.ocp.constraints.ubx_e = self.Xmax_limits - self.ocp.constraints.idxbx_e = np.array([0, 1, 2, 3, 4, 5]) - - self.ocp.constraints.lbx_0 = self.Xmin_limits - self.ocp.constraints.ubx_0 = self.Xmax_limits - self.ocp.constraints.idxbx_0 = np.array([0, 1, 2, 3, 4, 5]) - - # options - self.ocp.solver_options.nlp_solver_type = nlp_solver_type - self.ocp.solver_options.qp_solver_iter_max = 100 - self.ocp.solver_options.nlp_solver_max_iter = 1000 - self.ocp.solver_options.globalization = "MERIT_BACKTRACKING" - self.ocp.solver_options.alpha_reduction = 0.3 - self.ocp.solver_options.alpha_min = 1e-2 - self.ocp.solver_options.levenberg_marquardt = 1e-2 - - def OCP_solve(self, x0, x_sol_guess, u_sol_guess, ref, joint): - - # Reset current iterate: - self.ocp_solver.reset() - - self.ocp_solver.constraints_set(0, "lbx", x0) - self.ocp_solver.constraints_set(0, "ubx", x0) - - yref = self.yref - yref[joint] = ref - Q = self.Q - Q[joint] = 1e4 - W = lin.block_diag(Q, self.R) - - # Set parameters, guesses and constraints: - for i in range(self.ocp.dims.N): - self.ocp_solver.set(i, 'x', x_sol_guess[i]) - self.ocp_solver.set(i, 'u', u_sol_guess[i]) - self.ocp_solver.cost_set(i, 'yref', yref, api='new') - self.ocp_solver.cost_set(i, 'W', W, api='new') - - self.ocp_solver.set(self.ocp.dims.N, 'x', x_sol_guess[self.ocp.dims.N]) - self.ocp_solver.cost_set(self.ocp.dims.N, 'yref', yref[:self.ocp.dims.nx], api='new') - self.ocp_solver.cost_set(self.ocp.dims.N, 'W', Q, api='new') - - # Solve the OCP: - status = self.ocp_solver.solve() - - return status - - -class OCPtriplependulumSTD(OCPtriplependulum): - def __init__(self, nlp_solver_type, time_step, tot_time, regenerate): - - # inherit initialization - super().__init__(nlp_solver_type, time_step, tot_time) - - # ocp model - self.ocp.model = self.model - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, build=regenerate) - - -class OCPtriplependulumHardTerm(OCPtriplependulum): - def __init__(self, nlp_solver_type, time_step, tot_time, nn_params, mean, std, regenerate): - - # inherit initialization - super().__init__(nlp_solver_type, time_step, tot_time) - - # nonlinear constraints - self.model.con_h_expr_e = self.nn_decisionfunction(nn_params, mean, std, self.x) - - self.ocp.constraints.lh_e = np.array([0.]) - self.ocp.constraints.uh_e = np.array([1e6]) - - # ocp model - self.ocp.model = self.model - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, build=regenerate) - - def nn_decisionfunction(self, params, mean, std, x): - - vel_norm = fmax(norm_2(x[2:]), 1e-3) - - mean = vertcat(mean,mean,mean,0.,0.,0.) 
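- # Training-time normalization is replayed symbolically: positions via
- # (q - mean)/std, velocities via their norm, so the NN sees a unit direction
- # and h(x) = NN(z) - ||v|| is nonnegative exactly when the speed is safe.
- # Caution: for this 6-state model the velocity block is x[3:]; norm_2(x[2:])
- # also includes theta3 and looks inherited from the 2-DOF class.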
- std = vertcat(std,std,std,vel_norm,vel_norm,vel_norm) - - out = (x - mean) / std - it = 0 - - for param in params: - - param = SX(param.tolist()) - - if it % 2 == 0: - out = param @ out - else: - out = param + out - - if it == 1 or it == 3: - out = fmax(0., out) - - it += 1 - - return out - vel_norm - - -class OCPtriplependulumHardTraj(OCPtriplependulum): - def __init__(self, nlp_solver_type, time_step, tot_time, nn_params, mean, std, regenerate): - - # inherit initialization - super().__init__(nlp_solver_type, time_step, tot_time) - - # nonlinear constraints - self.model.con_h_expr_e = self.nn_decisionfunction(nn_params, mean, std, self.x) - - self.ocp.constraints.lh_e = np.array([0.]) - self.ocp.constraints.uh_e = np.array([1e6]) - - self.model.con_h_expr = self.model.con_h_expr_e - - self.ocp.constraints.lh = np.array([0.]) - self.ocp.constraints.uh = np.array([1e6]) - - # ocp model - self.ocp.model = self.model - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, build=regenerate) - - def nn_decisionfunction(self, params, mean, std, x): - - vel_norm = fmax(norm_2(x[2:]), 1e-3) - - mean = vertcat(mean,mean,mean,0.,0.,0.) - std = vertcat(std,std,std,vel_norm,vel_norm,vel_norm) - - out = (x - mean) / std - it = 0 - - for param in params: - - param = SX(param.tolist()) - - if it % 2 == 0: - out = param @ out - else: - out = param + out - - if it == 1 or it == 3: - out = fmax(0., out) - - it += 1 - - return out - vel_norm - - -class OCPtriplependulumSoftTerm(OCPtriplependulum): - def __init__(self, nlp_solver_type, time_step, tot_time, nn_params, mean, std, safety_margin, regenerate): - - # inherit initialization - super().__init__(nlp_solver_type, time_step, tot_time) - - # nonlinear constraints - self.model.con_h_expr_e = self.nn_decisionfunction_conservative(nn_params, mean, std, safety_margin, self.x) - - self.ocp.constraints.lh_e = np.array([0.]) - self.ocp.constraints.uh_e = np.array([1e6]) - - self.ocp.constraints.idxsh_e = np.array([0]) - - self.ocp.cost.zl_e = np.zeros((1,)) - self.ocp.cost.zu_e = np.zeros((1,)) - self.ocp.cost.Zu_e = np.zeros((1,)) - self.ocp.cost.Zl_e = np.zeros((1,)) - - # ocp model - self.ocp.model = self.model - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, build=regenerate) - - def nn_decisionfunction_conservative(self, params, mean, std, safety_margin, x): - - vel_norm = fmax(norm_2(x[2:]), 1e-3) - - mean = vertcat(mean,mean,mean,0.,0.,0.) 
- std = vertcat(std,std,std,vel_norm,vel_norm,vel_norm) - - out = (x - mean) / std - it = 0 - - for param in params: - - param = SX(param.tolist()) - - if it % 2 == 0: - out = param @ out - else: - out = param + out - - if it == 1 or it == 3: - out = fmax(0., out) - - it += 1 - - return out*(100-safety_margin)/100 - vel_norm - - -class OCPtriplependulumSoftTraj(OCPtriplependulum): - def __init__(self, nlp_solver_type, time_step, tot_time, nn_params, mean, std, safety_margin, regenerate): - - # inherit initialization - super().__init__(nlp_solver_type, time_step, tot_time) - - # nonlinear constraints - self.model.con_h_expr_e = self.nn_decisionfunction_conservative(nn_params, mean, std, safety_margin, self.x) - - self.ocp.constraints.lh_e = np.array([0.]) - self.ocp.constraints.uh_e = np.array([1e6]) - - self.ocp.constraints.idxsh_e = np.array([0]) - - self.ocp.cost.zl_e = np.zeros((1,)) - self.ocp.cost.zu_e = np.zeros((1,)) - self.ocp.cost.Zu_e = np.zeros((1,)) - self.ocp.cost.Zl_e = np.zeros((1,)) - - self.model.con_h_expr = self.model.con_h_expr_e - - self.ocp.constraints.lh = np.array([0.]) - self.ocp.constraints.uh = np.array([1e6]) - - self.ocp.constraints.idxsh = np.array([0]) - - self.ocp.cost.zl = np.zeros((1,)) - self.ocp.cost.zu = np.zeros((1,)) - self.ocp.cost.Zu = np.zeros((1,)) - self.ocp.cost.Zl = np.zeros((1,)) - - # ocp model - self.ocp.model = self.model - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, build=regenerate) - - def nn_decisionfunction_conservative(self, params, mean, std, safety_margin, x): - - vel_norm = fmax(norm_2(x[2:]), 1e-3) - - mean = vertcat(mean,mean,mean,0.,0.,0.) - std = vertcat(std,std,std,vel_norm,vel_norm,vel_norm) - - out = (x - mean) / std - it = 0 - - for param in params: - - param = SX(param.tolist()) - - if it % 2 == 0: - out = param @ out - else: - out = param + out - - if it == 1 or it == 3: - out = fmax(0., out) - - it += 1 - - return out*(100-safety_margin)/100 - vel_norm - \ No newline at end of file diff --git a/VBOC/UR5/my_nn.py b/VBOC/UR5/my_nn.py deleted file mode 100644 index 3535ca8..0000000 --- a/VBOC/UR5/my_nn.py +++ /dev/null @@ -1,55 +0,0 @@ -import torch.nn as nn - -# Fully connected neural network -class NeuralNetCLS(nn.Module): - def __init__(self, input_size, hidden_size, output_size): - super(NeuralNetCLS, self).__init__() - self.linear_relu_stack = nn.Sequential( - nn.Linear(input_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, output_size), - # nn.Sigmoid(), - ) - - def forward(self, x): - out = self.linear_relu_stack(x) - return out - -class NeuralNetDIR(nn.Module): - def __init__(self, input_size, hidden_size, output_size): - super(NeuralNetDIR, self).__init__() - self.linear_relu_stack = nn.Sequential( - nn.Linear(input_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, output_size), - nn.ReLU(), - ) - - def forward(self, x): - out = self.linear_relu_stack(x) - return out - -class NeuralNetDIRDeep(nn.Module): - def __init__(self, input_size, hidden_size, output_size): - super(NeuralNetDIRDeep, self).__init__() - self.linear_relu_stack = nn.Sequential( - nn.Linear(input_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, hidden_size), - nn.ReLU(), - nn.Linear(hidden_size, output_size), - nn.ReLU(), - ) - - def forward(self, x): - out = self.linear_relu_stack(x) - return out - diff 
--git a/VBOC/UR5/plots_2dof.py b/VBOC/UR5/plots_2dof.py deleted file mode 100644 index e84bd3a..0000000 --- a/VBOC/UR5/plots_2dof.py +++ /dev/null @@ -1,121 +0,0 @@ -import numpy as np -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import warnings -warnings.filterwarnings("ignore") -import torch - -def plots_2dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device): - - # Plot all training data: - plt.figure() - plt.scatter(X_save[:,0],X_save[:,1],s=0.1) - plt.legend(loc="best", shadow=False, scatterpoints=1) - plt.title("OCP dataset positions") - plt.figure() - plt.scatter(X_save[:,2],X_save[:,3],s=0.1) - plt.legend(loc="best", shadow=False, scatterpoints=1) - plt.title("OCP dataset velocities") - - # Show the resulting set approximation: - with torch.no_grad(): - h = 0.01 - xx, yy = np.meshgrid(np.arange(q_min, q_max, h), np.arange(v_min, v_max, h)) - xrav = xx.ravel() - yrav = yy.ravel() - - plt.figure() - inp = np.c_[ - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - np.zeros(yrav.shape[0]), - yrav, - np.empty(yrav.shape[0]), - ] - for i in range(inp.shape[0]): - vel_norm = norm([inp[i][2],inp[i][3]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][2] = inp[i][2] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][4] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][4] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(X_save.shape[0]): - if ( - norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 - and norm(X_save[i][2]) < 0.1 - ): - xit.append(X_save[i][1]) - yit.append(X_save[i][3]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_2$') - plt.xlabel('$q_2$') - plt.grid() - plt.title("Classifier section") - - plt.figure() - inp = np.c_[ - xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - for i in range(inp.shape[0]): - vel_norm = norm([inp[i][2],inp[i][3]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][2] = inp[i][2] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][4] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][4] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(X_save.shape[0]): - if ( - norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 - and norm(X_save[i][3]) < 0.1 - ): - xit.append(X_save[i][0]) - yit.append(X_save[i][2]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_1$') - plt.xlabel('$q_1$') - plt.grid() - plt.title("Classifier section") - - plt.show() diff --git a/VBOC/UR5/plots_3dof.py b/VBOC/UR5/plots_3dof.py deleted file mode 100644 index c8aacb2..0000000 --- a/VBOC/UR5/plots_3dof.py +++ /dev/null @@ -1,210 +0,0 @@ -import numpy as np -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import warnings 
-warnings.filterwarnings("ignore") -import torch -import random - -def plots_3dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device): - - # Plot all training data: - plt.figure() - ax = plt.axes(projection ="3d") - ax.scatter3D(X_save[:,0],X_save[:,1],X_save[:,2]) - plt.title("OCP dataset positions") - plt.figure() - ax = plt.axes(projection ="3d") - ax.scatter3D(X_save[:,3],X_save[:,4],X_save[:,5]) - plt.title("OCP dataset velocities") - - # Show the resulting set approximation: - with torch.no_grad(): - h = 0.01 - xx, yy = np.meshgrid(np.arange(q_min, q_max, h), np.arange(v_min, v_max, h)) - xrav = xx.ravel() - yrav = yy.ravel() - - plt.figure() - inp = np.float32( - np.c_[(q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - yrav, - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,3],inp[i,4],inp[i,5]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:6].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][6] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(len(X_save)): - if ( - norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 and - norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 - and norm(X_save[i][3]) < 0.1 - and norm(X_save[i][4]) < 0.1 - ): - xit.append(X_save[i][2]) - yit.append(X_save[i][5]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_3$') - plt.xlabel('$q_3$') - plt.grid() - plt.title("Classifier section") - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[(q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - np.zeros(yrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,3],inp[i,4],inp[i,5]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:6].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][6] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(len(X_save)): - if ( - norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 and - norm(X_save[i][2] - (q_min + q_max) / 2) < 0.1 - and norm(X_save[i][3]) < 0.1 - and norm(X_save[i][5]) < 0.1 - ): - xit.append(X_save[i][1]) - yit.append(X_save[i][4]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_2$') - plt.xlabel('$q_2$') - plt.grid() - plt.title("Classifier section") - - # Plot the results: - plt.figure() - inp = 
np.float32( - np.c_[xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,3],inp[i,4],inp[i,5]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:6].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][6] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(len(X_save)): - if ( - norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 and - norm(X_save[i][2] - (q_min + q_max) / 2) < 0.1 - and norm(X_save[i][4]) < 0.1 - and norm(X_save[i][5]) < 0.1 - ): - xit.append(X_save[i][0]) - yit.append(X_save[i][3]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_1$') - plt.xlabel('$q_1$') - plt.grid() - plt.title("Classifier section") - - # for _ in range(10): - # q1ran = q_min + random.random() * (q_max-q_min) - # q2ran = q_min + random.random() * (q_max-q_min) - # q3ran = q_min + random.random() * (q_max-q_min) - - # plt.figure() - # ax = plt.axes(projection ="3d") - # xit = np.zeros((1,6)) - # for i in range(len(X_save)): - # if ( - # norm(X_save[i][1] - q1ran) < 0.1 and - # norm(X_save[i][2] - q2ran) < 0.1 and - # norm(X_save[i][3] - q3ran) < 0.1 - # ): - # xit = np.append(xit,[X_save[i]], axis=0) - # ax.scatter3D(xit[:,3],xit[:,4],xit[:,5]) - # plt.title("q1="+str(q1ran)+" q2="+str(q2ran)+" q3="+str(q3ran)) - - plt.show() diff --git a/VBOC/UR5/plots_4dof.py b/VBOC/UR5/plots_4dof.py deleted file mode 100644 index cf37d3b..0000000 --- a/VBOC/UR5/plots_4dof.py +++ /dev/null @@ -1,262 +0,0 @@ -import numpy as np -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import warnings -warnings.filterwarnings("ignore") -import torch -import random - -def plots_4dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device): - - # Show the resulting set approximation: - with torch.no_grad(): - h = 0.01 - xx, yy = np.meshgrid(np.arange(q_min, q_max, h), np.arange(v_min, v_max, h)) - xrav = xx.ravel() - yrav = yy.ravel() - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[(q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - yrav, - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,4],inp[i,5],inp[i,6],inp[i,7]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - inp[i][3] = (inp[i][3] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = inp[i][6] / vel_norm - inp[i][7] = inp[i][7] / vel_norm - inp[i][8] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:8].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i 
in range(len(out)): - if inp[i][8] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(len(X_save)): - # if ( - # norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][2] - (q_min + q_max) / 2) < 0.1 - # and norm(X_save[i][4]) < 0.1 - # and norm(X_save[i][5]) < 0.1 - # and norm(X_save[i][6]) < 0.1 - # ): - # xit.append(X_save[i][3]) - # yit.append(X_save[i][7]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_4$') - plt.xlabel('$q_4$') - plt.grid() - plt.title("Classifier section") - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[(q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,4],inp[i,5],inp[i,6],inp[i,7]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - inp[i][3] = (inp[i][3] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = inp[i][6] / vel_norm - inp[i][7] = inp[i][7] / vel_norm - inp[i][8] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:8].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][8] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(len(X_save)): - # if ( - # norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][3] - (q_min + q_max) / 2) < 0.1 - # and norm(X_save[i][4]) < 0.1 - # and norm(X_save[i][5]) < 0.1 - # and norm(X_save[i][7]) < 0.1 - # ): - # xit.append(X_save[i][2]) - # yit.append(X_save[i][6]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_3$') - plt.xlabel('$q_3$') - plt.grid() - plt.title("Classifier section") - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[(q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - np.zeros(yrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,4],inp[i,5],inp[i,6],inp[i,7]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - inp[i][3] = (inp[i][3] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = inp[i][6] / vel_norm - inp[i][7] = inp[i][7] / vel_norm - inp[i][8] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:8].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][8] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - 
plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(len(X_save)): - # if ( - # norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][2] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][3] - (q_min + q_max) / 2) < 0.1 - # and norm(X_save[i][4]) < 0.1 - # and norm(X_save[i][6]) < 0.1 - # and norm(X_save[i][7]) < 0.1 - # ): - # xit.append(X_save[i][1]) - # yit.append(X_save[i][5]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_2$') - plt.xlabel('$q_2$') - plt.grid() - plt.title("Classifier section") - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,4],inp[i,5],inp[i,6],inp[i,7]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - inp[i][3] = (inp[i][3] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = inp[i][6] / vel_norm - inp[i][7] = inp[i][7] / vel_norm - inp[i][8] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:8].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][8] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(len(X_save)): - # if ( - # norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][2] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][3] - (q_min + q_max) / 2) < 0.1 - # and norm(X_save[i][5]) < 0.1 - # and norm(X_save[i][6]) < 0.1 - # and norm(X_save[i][7]) < 0.1 - # ): - # xit.append(X_save[i][0]) - # yit.append(X_save[i][4]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_1$') - plt.xlabel('$q_1$') - plt.grid() - plt.title("Classifier section") - - plt.show() diff --git a/VBOC/UR5/ur5.urdf b/VBOC/UR5/ur5.urdf deleted file mode 100644 index 7059cdf..0000000 --- a/VBOC/UR5/ur5.urdf +++ /dev/null @@ -1,893 +0,0 @@
[893 deleted lines of URDF XML omitted: the markup was stripped during extraction and only stray values survive. The file described the UR5 links and joints, six SimpleTransmission/EffortJointInterface transmissions, Gazebo plugins for a ZED stereo camera (left/right raw and rectified image topics, depth and point-cloud output) and an IMU, and the gazebo_ros_control plugin.]
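All of the deleted plots_2dof/plots_3dof/plots_4dof utilities above evaluate the same VBOC membership test: joint positions are normalized with mean_dir/std_dir, the velocity is split into a unit direction and its norm, and a grid point is classified viable when its velocity norm does not exceed the boundary norm the regressor predicts for that position/direction pair. A hypothetical stand-alone helper capturing this test (assuming a NeuralNetDIR-style regressor as in the deleted my_nn.py):
```
import numpy as np
import torch

def is_viable(q, v, model, mean, std, device="cpu"):
    """Return True if the state (q, v) is classified inside the viability set."""
    vel_norm = np.linalg.norm(v)
    if vel_norm == 0.0:
        return True                          # zero velocity: inside by convention here
    q_n = (q - mean) / std                   # normalized joint positions
    d = v / vel_norm                         # unit velocity direction
    inp = np.concatenate([q_n, d]).astype(np.float32)
    with torch.no_grad():
        boundary = model(torch.from_numpy(inp).to(device)).item()
    return vel_norm <= boundary              # inside iff below the predicted boundary norm
```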
diff --git a/VBOC/UR5/ur5_old.urdf b/VBOC/UR5/ur5_old.urdf deleted file mode 100644 index c03a279..0000000 --- a/VBOC/UR5/ur5_old.urdf +++ /dev/null @@ -1,453 +0,0 @@
[453 deleted lines of URDF XML omitted: the markup was stripped during extraction and nothing beyond stray values survives. The file held an older UR5 description whose six transmissions used transmission_interface/SimpleTransmission with PositionJointInterface.]
\ No newline at end of file diff --git a/VBOC/UR5/ur5reduced_class_fixedveldir.py b/VBOC/UR5/ur5reduced_class_fixedveldir.py deleted file mode 100644 index 1694345..0000000 --- a/VBOC/UR5/ur5reduced_class_fixedveldir.py +++ /dev/null @@ -1,203 +0,0 @@ -from acados_template import AcadosOcp, AcadosOcpSolver, AcadosSim, AcadosSimSolver -import numpy as np -from acados_template import AcadosModel -from casadi import SX, vertcat, dot -import urdf2casadi.urdfparser as u2c -import os - - -class OCPUR5: - def __init__(self): - - # --------------------SET MODEL-------------------- - # ------------------------------------------------- - model_name = "ur5_model" - - self.gravity = [0, 0, -9.81] - self.root = "base_link" - self.tip = "tool0" - - self.ur5 = u2c.URDFparser() - path_to_urdf = os.path.dirname( - os.path.abspath(__file__)) + '/ur5.urdf' - self.ur5.from_file(path_to_urdf) - self.n_joints = self.ur5.get_n_joints(self.root, self.tip) - - # states - q = SX.sym("qs", self.n_joints) - qdot = SX.sym("qsdot", self.n_joints) - q_dot = SX.sym("qs_dot", self.n_joints) - qdot_dot = SX.sym("qsdot_dot", self.n_joints) - self.x = vertcat(q, qdot) - xdot = vertcat(q_dot, qdot_dot) - - # controls - self.u = SX.sym("C", self.n_joints) - - # parameters - p = SX.sym("w", self.n_joints) - - # dynamics - func = self.ur5.get_forward_dynamics_aba(self.root, self.tip, gravity=self.gravity) - f_expl = vertcat(qdot, func(q, qdot, self.u)) - f_impl = xdot - f_expl - - self.inv_dyn = self.ur5.get_inverse_dynamics_rnea(self.root, self.tip, gravity=self.gravity) - - self.model = AcadosModel() - self.model.f_impl_expr = f_impl - self.model.f_expl_expr = f_expl - self.model.x = self.x - self.model.xdot = xdot - self.model.u = self.u - self.model.p = p - self.model.name = model_name - # ------------------------------------------------- - - # ---------------------SET OCP--------------------- - # ------------------------------------------------- - self.ocp = AcadosOcp() - - # times - self.Tf = 1.
- self.ocp.solver_options.tf = self.Tf # prediction horizon - self.N = int(100 * self.Tf) - self.ocp.dims.N = self.N - - # ocp model - self.ocp.model = self.model - - # cost - self.ocp.cost.cost_type_0 = 'EXTERNAL' - - self.ocp.model.cost_expr_ext_cost_0 = dot(p,qdot) - self.ocp.parameter_values = np.zeros((self.n_joints)) - - # set constraints - u_limits = np.array([100., 80., 60., 1., 0.8, 0.6]) # np.array([150., 150., 150., 28., 28., 28.]) - x_limits = np.array([3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.]) # np.array([3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14]) - - self.Cmax = u_limits[:self.n_joints] - self.Cmin = - self.Cmax - self.xmax = np.concatenate((x_limits[:self.n_joints],x_limits[self.n_joints*2:self.n_joints*3])) - self.xmin = - self.xmax - - self.xmax[1] = 0. - - self.ocp.constraints.lbu = np.copy(self.Cmin) - self.ocp.constraints.ubu = np.copy(self.Cmax) - self.ocp.constraints.idxbu = np.array(range(self.n_joints)) - self.ocp.constraints.lbx = np.copy(self.xmin) - self.ocp.constraints.ubx = np.copy(self.xmax) - self.ocp.constraints.idxbx = np.array(range(self.n_joints*2)) - self.ocp.constraints.lbx_e = np.copy(self.xmin) - self.ocp.constraints.ubx_e = np.copy(self.xmax) - self.ocp.constraints.idxbx_e = np.array(range(self.n_joints*2)) - self.ocp.constraints.lbx_0 = np.copy(self.xmin) - self.ocp.constraints.ubx_0 = np.copy(self.xmax) - self.ocp.constraints.idxbx_0 = np.array(range(self.n_joints*2)) - - self.ocp.constraints.C = np.zeros((self.n_joints,self.n_joints*2)) - self.ocp.constraints.D = np.zeros((self.n_joints,self.n_joints)) - self.ocp.constraints.lg = np.zeros((self.n_joints)) - self.ocp.constraints.ug = np.zeros((self.n_joints)) - - # Cartesian constraints: - # radius = 0.1 - # center = [0.2,0.2,0.2] - # self.kine = self.get_kinematics(q) - # self.model.con_h_expr = (self.kine[0]-center[0])**2 + (self.kine[1]-center[1])**2 + (self.kine[2]-center[2])**2 - radius**2 - # self.ocp.constraints.lh = np.array([0.]) - # self.ocp.constraints.uh = np.array([1e6]) - - # options - self.ocp.solver_options.nlp_solver_type = "SQP" - self.ocp.solver_options.hessian_approx = "EXACT" - self.ocp.solver_options.exact_hess_constr = 0 - # self.ocp.solver_options.exact_hess_cost = 0 - self.ocp.solver_options.exact_hess_dyn = 0 - self.ocp.solver_options.nlp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_iter_max = 100 - self.ocp.solver_options.nlp_solver_max_iter = 1000 - self.ocp.solver_options.globalization = "MERIT_BACKTRACKING" - self.ocp.solver_options.alpha_reduction = 0.3 - self.ocp.solver_options.alpha_min = 1e-2 - self.ocp.solver_options.levenberg_marquardt = 1e-2 - - # ------------------------------------------------- - - def get_inverse_dynamics(self, q, qdot): - - return self.inv_dyn(q, qdot, np.zeros((self.n_joints,))).toarray().reshape(self.n_joints,) - - def get_kinematics(self, q): - - dual_quaternion = self.ur5.get_forward_kinematics(self.root, self.tip)["dual_quaternion_fk"](q) - return 2*dual_quaternion[4:7]*dual_quaternion[:3] - - -class OCPUR5INIT(OCPUR5): - def __init__(self): - - # inherit initialization - super().__init__() - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, json_file="acados_ocp.json", build=True) - - def OCP_solve(self, x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub): - - # Reset current iterate: - self.ocp_solver.reset() - - # Set parameters, guesses and constraints: - for i in range(self.N): - 
self.ocp_solver.set(i, 'x', x_sol_guess[i]) - self.ocp_solver.set(i, 'u', u_sol_guess[i]) - self.ocp_solver.set(i, 'p', p) - self.ocp_solver.constraints_set(i, 'lbx', q_lb) - self.ocp_solver.constraints_set(i, 'ubx', q_ub) - self.ocp_solver.constraints_set(i, 'lbu', u_lb) - self.ocp_solver.constraints_set(i, 'ubu', u_ub) - self.ocp_solver.constraints_set(i, 'C', np.zeros((self.n_joints,self.n_joints*2))) - self.ocp_solver.constraints_set(i, 'D', np.zeros((self.n_joints,self.n_joints))) - self.ocp_solver.constraints_set(i, 'lg', np.zeros((self.n_joints))) - self.ocp_solver.constraints_set(i, 'ug', np.zeros((self.n_joints))) - - C = np.zeros((self.n_joints,self.n_joints*2)) - d = np.array([p.tolist()]) - dt = np.transpose(d) - C[:,self.n_joints:] = np.identity(self.n_joints)-np.matmul(dt,d) - self.ocp_solver.constraints_set(0, "C", C, api='new') - - self.ocp_solver.constraints_set(0, "lbx", q_init_lb) - self.ocp_solver.constraints_set(0, "ubx", q_init_ub) - - self.ocp_solver.constraints_set(self.N, "lbx", q_fin_lb) - self.ocp_solver.constraints_set(self.N, "ubx", q_fin_ub) - self.ocp_solver.set(self.N, 'x', x_sol_guess[-1]) - self.ocp_solver.set(self.N, 'p', p) - - # Solve the OCP: - status = self.ocp_solver.solve() - - # print(status) - # print(self.ocp_solver.get_stats('residuals')) - - return status - - -class SYMUR5INIT(OCPUR5): - def __init__(self): - - # inherit initialization - super().__init__() - - self.model.p = [] - - sim = AcadosSim() - sim.model = self.model - sim.solver_options.T = 1e-2 - sim.solver_options.num_stages = 4 - self.acados_integrator = AcadosSimSolver(sim) diff --git a/VBOC/UR5/vboc_multiprocessing_ur5.py b/VBOC/UR5/vboc_multiprocessing_ur5.py deleted file mode 100644 index e837936..0000000 --- a/VBOC/UR5/vboc_multiprocessing_ur5.py +++ /dev/null @@ -1,639 +0,0 @@ -import numpy as np -import random -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import time -from ur5reduced_class_fixedveldir import OCPUR5INIT, SYMUR5INIT -import warnings -warnings.filterwarnings("ignore") -import torch -import torch.nn as nn -from my_nn import NeuralNetDIRDeep -from multiprocessing import Pool -from torch.utils.data import DataLoader -import torch.nn.utils.prune as prune -from plots_2dof import plots_2dof -from plots_3dof import plots_3dof -from plots_4dof import plots_4dof - -def testing(v): - - valid_data = np.ndarray((0, ocp.ocp.dims.nx)) - - # Reset the time parameters: - N = N_start - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), dt_sym)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - # Selection of the reference joint: - joint_sel = random.choice(range(ocp.ocp.dims.nu)) - - # Selection of the start position of the reference joint: - vel_sel = random.choice([-1, 1]) # -1 to maximise initial vel, + 1 to minimize it - - # Initial velocity optimization direction: - p = np.empty((ocp.ocp.dims.nu,)) - - for l in range(ocp.ocp.dims.nu): - if l == joint_sel: - p[l] = random.random() * vel_sel - else: - p[l] = random.random() * random.choice([-1, 1]) - - norm_weights = norm(p) - p = p/norm_weights - - # Bounds on the initial state: - q_init_lb = np.copy(ocp.xmin) - q_init_ub = np.copy(ocp.xmax) - - q_init_oth = np.empty((ocp.ocp.dims.nu,)) - - for l in range(ocp.ocp.dims.nu): - - if l == joint_sel: - - if vel_sel == -1: - q_init_oth[l] = x_min[l] + eps - q_fin_sel = x_max[l] - eps - else: - q_init_oth[l] = x_max[l] - eps - q_fin_sel = x_min[l] + eps - - q_init_lb[l] = q_init_oth[l] - q_init_ub[l] = q_init_oth[l] - - else: - q_init_oth[l] = 
x_min[l] + random.random() * (x_max[l]-x_min[l]) - - if q_init_oth[l] > x_max[l] - eps: - q_init_oth[l] = q_init_oth[l] - eps - if q_init_oth[l] < x_min[l] + eps: - q_init_oth[l] = q_init_oth[l] + eps - - q_init_lb[l] = q_init_oth[l] - q_init_ub[l] = q_init_oth[l] - - # State and input bounds: - q_lb = np.copy(ocp.xmin) - q_ub = np.copy(ocp.xmax) - u_lb = np.copy(ocp.Cmin) - u_ub = np.copy(ocp.Cmax) - - # Bounds on the final state: - q_fin_lb = np.copy(ocp.xmin) - q_fin_ub = np.copy(ocp.xmax) - - for l in range(ocp.ocp.dims.nu): - q_fin_lb[l+ocp.ocp.dims.nu] = 0. - q_fin_ub[l+ocp.ocp.dims.nu] = 0. - - # Guess: - x_sol_guess = np.empty((N, ocp.ocp.dims.nx)) - u_sol_guess = np.zeros((N, ocp.ocp.dims.nu)) - - for i, tau in enumerate(np.linspace(0, 1, N)): - - x_guess = np.zeros((ocp.ocp.dims.nx,)) - - for l in range(ocp.ocp.dims.nu): - if l == joint_sel: - x_guess[l] = (1-tau)*q_init_oth[l] + tau*q_fin_sel - x_guess[l+ocp.ocp.dims.nu] = 2*(1-tau)*(q_fin_sel-q_init_oth[l]) - else: - x_guess[l] = q_init_oth[l] - - x_sol_guess[i] = x_guess - - cost_old = 1e6 - all_ok = False - - while True: - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - cost_new = ocp.ocp_solver.get_cost() - - if cost_new > cost_old - tol: # the time is sufficient to have achieved an optimal solution - all_ok = True - break - - cost_old = cost_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N+1,ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N+1,ocp.ocp.dims.nu)) - for i in range(N): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N] = ocp.ocp_solver.get(N, "x") - u_sol_guess[N] = np.zeros((ocp.ocp.dims.nu,)) # ocp.get_inverse_dynamics(x_sol_guess[N,:ocp.ocp.dims.nu],x_sol_guess[N,ocp.ocp.dims.nu:]) - - # Increase the number of time steps: - N = N + 1 - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), dt_sym)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - else: - - N = N_start - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), dt_sym)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - for i in range(ocp.ocp.dims.nu): - - if i != joint_sel: - - p[i] = p[i] + random.random() * random.choice([-1, 1]) * 0.01 - q_init = q_init_lb[i] + random.random() * random.choice([-1, 1]) * 0.01 - q_init_lb[i] = q_init - q_init_ub[i] = q_init - - norm_weights = norm(p) - p = p/norm_weights - - # Guess: - x_sol_guess = np.empty((N, ocp.ocp.dims.nx)) - u_sol_guess = np.zeros((N, ocp.ocp.dims.nu)) - - for i, tau in enumerate(np.linspace(0, 1, N)): - - x_guess = np.zeros((ocp.ocp.dims.nx,)) - - for l in range(ocp.ocp.dims.nu): - if l == joint_sel: - x_guess[l] = (1-tau)*q_init_oth[l] + tau*q_fin_sel - x_guess[l+ocp.ocp.dims.nu] = 2*(1-tau)*(q_fin_sel-q_init_oth[l]) - else: - x_guess[l] = q_init_oth[l] - - x_sol_guess[i] = x_guess - - cost_old = 1e6 - - if all_ok: - - # Save the optimal trajectory: - x_sol = np.empty((N+1, ocp.ocp.dims.nx)) - u_sol = np.empty((N, ocp.ocp.dims.nu)) - - for i in range(N): - x_sol[i] = ocp.ocp_solver.get(i, "x") - u_sol[i] = ocp.ocp_solver.get(i, "u") - - x_sol[N] = ocp.ocp_solver.get(N, "x") - - # Generate the unviable sample in the cost direction: - x_sym = np.full((N+1,ocp.ocp.dims.nx), None) - - x_out = np.copy(x_sol[0]) - for l in range(ocp.ocp.dims.nu): - x_out[l+ocp.ocp.dims.nu] = x_out[l+ocp.ocp.dims.nu] - 
eps * p[l] - - # save the initial state: - valid_data = np.append(valid_data, [x_sol[0]], axis = 0) - - # Check if initial velocities lie on a limit: - if any(x_out[i] > x_max[i] or x_out[i] < x_min[i] for i in range(ocp.ocp.dims.nu,ocp.ocp.dims.nx)): - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[0] = x_out - - # Iterate through the trajectory to verify the location of the states with respect to V: - for f in range(1, N): - - if is_x_at_limit: - - # If the previous state was on a limit, the current state location cannot be identified using - # the corresponding unviable state but it has to rely on the proximity to the state limits - # (more restrictive): - if any(x_sol[f][i] > x_max[i] - eps or x_sol[f][i] < x_min[i] + eps for i in range(ocp.ocp.dims.nu)) or any(x_sol[f][i] > x_max[i] - tol or x_sol[f][i] < x_min[i] + tol for i in range(ocp.ocp.dims.nu,ocp.ocp.dims.nx)): - is_x_at_limit = True # the state is on dX - - else: - is_x_at_limit = False # the state is either on the interior of V or on dV - - # if the traj detouches from a position limit it usually enters V: - if any(x_sol[f-1][i] > x_max[i] - eps or x_sol[f-1][i] < x_min[i] + eps for i in range(ocp.ocp.dims.nu)): - break - - # Solve an OCP to verify whether the following part of the trajectory is on V or dV. To do so - # the initial joint positions are set to the current ones and the final state is fixed to the - # final state of the trajectory. The initial velocities are left free and maximized in the - # direction of the current joint velocities. - - N_test = N - f - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), dt_sym)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - # Cost: - norm_weights = norm(x_sol[f][ocp.ocp.dims.nu:]) - p = np.zeros((ocp.ocp.dims.nu)) - for l in range(ocp.ocp.dims.nu): - p[l] = -x_sol[f][l+ocp.ocp.dims.nu]/norm_weights # the cost direction is based on the current velocity direction - - # Bounds on the initial state: - for l in range(ocp.ocp.dims.nu): - q_init_lb[l] = x_sol[f][l] - q_init_ub[l] = x_sol[f][l] - - # Guess: - x_sol_guess = np.empty((N_test+1, ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N_test+1, ocp.ocp.dims.nu)) - for i in range(N_test): - x_sol_guess[i] = x_sol[i+f] - u_sol_guess[i] = u_sol[i+f] - x_sol_guess[N_test] = x_sol[N] - u_sol_guess[N_test] = np.zeros((ocp.ocp.dims.nu)) - - norm_old = norm_weights # velocity norm of the original solution - norm_bef = 0 - all_ok = False - - while True: - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - x0_new = ocp.ocp_solver.get(0, "x") - norm_new = norm(x0_new[ocp.ocp.dims.nu:]) - - if norm_new < norm_bef + tol: - all_ok = True - break - - norm_bef = norm_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N_test+1,ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N_test+1,ocp.ocp.dims.nu)) - for i in range(N_test): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N_test] = ocp.ocp_solver.get(N_test, "x") - u_sol_guess[N_test] = np.zeros((ocp.ocp.dims.nu)) - - # Increase the number of time steps: - N_test = N_test + 1 - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), dt_sym)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - else: - break - - if all_ok: - - # 
Compare the old and new velocity norms: - if norm_new > norm_old + tol: # the state is inside V - - # Update the optimal solution: - for i in range(N-f): - x_sol[i+f] = ocp.ocp_solver.get(i, "x") - u_sol[i+f] = ocp.ocp_solver.get(i, "u") - - x_out = np.copy(x_sol[f]) - - for l in range(ocp.ocp.dims.nu): - x_out[l+ocp.ocp.dims.nu] = x_out[l+ocp.ocp.dims.nu] + eps * p[l] - - # Check if velocities lie on a limit: - if any(x_out[i] > x_max[i] or x_out[i] < x_min[i] for i in range(ocp.ocp.dims.nu,ocp.ocp.dims.nx)): - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[f] = x_out - - else: - is_x_at_limit = False # the state is on dV - - # Generate the new corresponding unviable state in the cost direction: - x_out = np.copy(x_sol[f]) - - for l in range(ocp.ocp.dims.nu): - x_out[l+ocp.ocp.dims.nu] = x_out[l+ocp.ocp.dims.nu] - eps * p[l] - - x_sym[f] = x_out - - else: # we cannot say whether the state is on dV or inside V - - for r in range(f, N): - - if any(x_sol[r][i] > x_max[i] - eps or x_sol[r][i] < x_min[i] + eps for i in range(ocp.ocp.dims.nu,ocp.ocp.dims.nx)): - - # Save the viable states at velocity limits: - valid_data = np.append(valid_data, [x_sol[r]], axis = 0) - - break - - else: - - # Simulate next unviable state: - u_sym = np.copy(u_sol[f-1]) - sim.acados_integrator.set("u", u_sym) - sim.acados_integrator.set("x", x_sym[f-1]) - # sim.acados_integrator.set("T", dt_sym) - status = sim.acados_integrator.solve() - x_out = sim.acados_integrator.get("x") - - # When the state of the unviable simulated trajectory violates a limit, the corresponding viable state - # of the optimal trajectory is on dX: - if any(x_out[i] > x_max[i] or x_out[i] < x_min[i] for i in range(ocp.ocp.dims.nx)): - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[f] = x_out - - if all(x_sol[f][i] < x_max[i] - eps and x_sol[f][i] > x_min[i] + eps for i in range(ocp.ocp.dims.nu)) and all(abs(x_sol[f][i]) > tol for i in range(ocp.ocp.dims.nu,ocp.ocp.dims.nx)): - - # Save the viable and unviable states: - valid_data = np.append(valid_data, [x_sol[f]], axis = 0) - - return valid_data.tolist() - - else: - return None - -def testing_test(v): - - # Reset the time parameters: - N = N_start - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), dt_sym)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - # Initial velocity optimization direction: - p = np.empty((ocp.ocp.dims.nu,)) - - for l in range(ocp.ocp.dims.nu): - p[l] = random.random() * random.choice([-1, 1]) - - norm_weights = norm(p) - p = p/norm_weights - - # Bounds on the initial state: - q_init_lb = np.copy(ocp.xmin) - q_init_ub = np.copy(ocp.xmax) - - q_init_oth = np.empty((ocp.ocp.dims.nu,)) - - for l in range(ocp.ocp.dims.nu): - - q_init_oth[l] = x_min[l] + random.random() * (x_max[l]-x_min[l]) - - q_init_lb[l] = q_init_oth[l] - q_init_ub[l] = q_init_oth[l] - - # State and input bounds: - q_lb = np.copy(ocp.xmin) - q_ub = np.copy(ocp.xmax) - u_lb = np.copy(ocp.Cmin) - u_ub = np.copy(ocp.Cmax) - - # Bounds on the final state: - q_fin_lb = np.copy(ocp.xmin) - q_fin_ub = np.copy(ocp.xmax) - - for l in range(ocp.ocp.dims.nu): - q_fin_lb[l+ocp.ocp.dims.nu] = 0. - q_fin_ub[l+ocp.ocp.dims.nu] = 0.
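A note on the C matrix that OCP_solve installs at node 0 (see the np.identity(self.n_joints)-np.matmul(dt,d) line earlier in this diff): with lg = ug = 0, the general linear constraint applies the projector I - p p^T to the initial velocity, which forces it to be parallel to the unit cost direction p while leaving its magnitude free. A quick NumPy sanity check (illustrative only, not part of the deleted file):
```
import numpy as np

n = 6                                     # number of joints
rng = np.random.default_rng(0)
p = rng.standard_normal(n); p /= np.linalg.norm(p)
P = np.eye(n) - np.outer(p, p)            # projector onto the complement of span(p)

v_par = 3.7 * p                           # any velocity aligned with p
assert np.allclose(P @ v_par, 0.0)        # satisfies the node-0 constraint C x = 0

v_off = v_par + 0.1 * rng.standard_normal(n)
assert not np.allclose(P @ v_off, 0.0)    # a misaligned velocity violates it
```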
- - # Guess: - x_sol_guess = np.empty((N, ocp.ocp.dims.nx)) - u_sol_guess = np.zeros((N, ocp.ocp.dims.nu)) - - x_guess = np.zeros((ocp.ocp.dims.nx,)) - - for l in range(ocp.ocp.dims.nu): - x_guess[l] = q_init_oth[l] - - x_sol_guess[:] = x_guess - - cost_old = 1e6 - all_ok = False - - # Iteratively solve the OCP with an increased number of time steps until the solution does not change. - # If the solver fails, try with a slightly different initial condition: - # for _ in range(10): - while True: - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - cost_new = ocp.ocp_solver.get_cost() - - if cost_new > cost_old - tol: # the time is sufficient to have achieved an optimal solution - all_ok = True - break - - cost_old = cost_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N+1,ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N+1,ocp.ocp.dims.nu)) - for i in range(N): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N] = ocp.ocp_solver.get(N, "x") - u_sol_guess[N] = np.zeros((ocp.ocp.dims.nu,)) # ocp.get_inverse_dynamics(x_sol_guess[N,:ocp.ocp.dims.nu],x_sol_guess[N,ocp.ocp.dims.nu:]) - - # Increase the number of time steps: - N = N + 1 - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), dt_sym)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - else: - break - - if all_ok: - return [ocp.ocp_solver.get(0, "x")] - else: - return None - -start_time = time.time() - -# Ocp initialization: -ocp = OCPUR5INIT() -sim = SYMUR5INIT() - -# State bounds: -x_max = ocp.xmax -x_min = ocp.xmin - -device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # pytorch device - -dt_sym = 1e-2 # time step duration -N_start = 100 # initial number of timesteps -tol = ocp.ocp.solver_options.nlp_solver_tol_stat # OCP cost tolerance -eps = tol*10 # unviable data generation parameter - -# print('Start test data generation') - -# # Testing data generation: -# cpu_num = 30 -# num_prob = 10000 -# with Pool(cpu_num) as p: -# traj = p.map(testing_test, range(num_prob)) - -# X_temp = [i for i in traj if i is not None] -# X_test = np.array([i for f in X_temp for i in f]) - -# # Save training data: -# np.save('data_' + str(ocp.ocp.dims.nu) + 'dof_vboc_test', np.asarray(X_test)) - -X_test = np.load('data_' + str(ocp.ocp.dims.nu) + 'dof_vboc_test.npy') -X_old = np.load('data_' + str(ocp.ocp.dims.nu) + 'dof_vboc_train.npy') - -print('Start data generation') - -time_start = time.time() - -# Data generation: -cpu_num = 30 -num_prob = 1000000 -with Pool(cpu_num) as p: - traj = p.map(testing_test, range(num_prob)) - -print(time.time() - time_start) - -# traj, statpos, statneg = zip(*temp) -X_temp = [i for i in traj if i is not None] -print('Data generation completed') - -# Print data generations statistics: -solved=len(X_temp) -print('Solved/tot', len(X_temp)/num_prob) -X_new = np.array([i for f in X_temp for i in f]) -print('Saved/tot', len(X_new)/(solved*100)) - -X_save = np.concatenate((X_old,X_new)) - -# Save training data: -np.save('data_' + str(ocp.ocp.dims.nu) + 'dof_vboc_train', np.asarray(X_save)) - -# X_save = np.load('data_' + str(ocp.ocp.dims.nu) + 'dof_vboc_train.npy') - -# Pytorch params: -input_layers = ocp.ocp.dims.nx -hidden_layers = 300 -output_layers = 1 -learning_rate = 1e-3 - -# Model and optimizer: -model_dir = 
NeuralNetDIRDeep(input_layers, hidden_layers, output_layers).to(device) -criterion_dir = nn.MSELoss() -optimizer_dir = torch.optim.Adam(model_dir.parameters(), lr=learning_rate) - -# # Joint positions mean and variance: -# mean_dir, std_dir = torch.mean(torch.tensor(X_save[:,:ocp.ocp.dims.nu].tolist())).to(device).item(), torch.std(torch.tensor(X_save[:,:ocp.ocp.dims.nu].tolist())).to(device).item() -# torch.save(mean_dir, 'mean_' + str(ocp.ocp.dims.nu) + 'dof_vboc') -# torch.save(std_dir, 'std_' + str(ocp.ocp.dims.nu) + 'dof_vboc') - -mean_dir = torch.load('mean_' + str(ocp.ocp.dims.nu) + 'dof_vboc') -std_dir = torch.load('std_' + str(ocp.ocp.dims.nu) + 'dof_vboc') -model_dir.load_state_dict(torch.load('model_' + str(ocp.ocp.dims.nu) + 'dof_vboc')) - -# Rewrite data in the form [normalized positions, velocity direction, velocity norm]: -X_train_dir = np.empty((X_save.shape[0],ocp.ocp.dims.nx+1)) - -for i in range(X_train_dir.shape[0]): - vel_norm = norm(X_save[i][ocp.ocp.dims.nu:]) - X_train_dir[i][ocp.ocp.dims.nx] = vel_norm - for l in range(ocp.ocp.dims.nu): - X_train_dir[i][l] = (X_save[i][l] - mean_dir) / std_dir - X_train_dir[i][l+ocp.ocp.dims.nu] = X_save[i][l+ocp.ocp.dims.nu] / vel_norm - -beta = 0.95 -n_minibatch = pow(2,15) -B = int(X_save.shape[0]*100/n_minibatch) # number of iterations for 100 epoch -it_max = B * 10 - -print('Start model training') - -it = 1 -val = max(X_train_dir[:,ocp.ocp.dims.nx]) - -# Train the model -while val > 1e-3 and it < it_max: - ind = random.sample(range(len(X_train_dir)), n_minibatch) - - X_iter_tensor = torch.Tensor([X_train_dir[i][:ocp.ocp.dims.nx] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_train_dir[i][ocp.ocp.dims.nx]] for i in ind]).to(device) - - # Forward pass - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - val = beta * val + (1 - beta) * loss.item() - it += 1 - - if it % B == 0: - print(val) - -print('Model training completed') - -# Show the resulting RMSE on the training data: -outputs = np.empty((len(X_train_dir),1)) -n_minibatch_model = pow(2,16) -with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_train_dir[:,:ocp.ocp.dims.nx]).to(device) - y_iter_tensor = torch.Tensor(X_train_dir[:,ocp.ocp.dims.nx:]).to(device) - my_dataloader = DataLoader(X_iter_tensor,batch_size=n_minibatch_model,shuffle=False) - for (idx, batch) in enumerate(my_dataloader): - if n_minibatch_model*(idx+1) > len(X_train_dir): - outputs[n_minibatch_model*idx:len(X_train_dir)] = model_dir(batch).cpu().numpy() - else: - outputs[n_minibatch_model*idx:n_minibatch_model*(idx+1)] = model_dir(batch).cpu().numpy() - outputs_tensor = torch.Tensor(outputs).to(device) - print('RMSE train data: ', torch.sqrt(criterion_dir(outputs_tensor, y_iter_tensor))) - -# Compute resulting RMSE wrt testing data: -X_test_dir = np.empty((X_test.shape[0],ocp.ocp.dims.nx+1)) -for i in range(X_test_dir.shape[0]): - vel_norm = norm(X_test[i][ocp.ocp.dims.nu:ocp.ocp.dims.nx]) - X_test_dir[i][ocp.ocp.dims.nx] = vel_norm - for l in range(ocp.ocp.dims.nu): - X_test_dir[i][l] = (X_test[i][l] - mean_dir) / std_dir - X_test_dir[i][l+ocp.ocp.dims.nu] = X_test[i][l+ocp.ocp.dims.nu] / vel_norm - -with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_test_dir[:,:ocp.ocp.dims.nx]).to(device) - y_iter_tensor = torch.Tensor(X_test_dir[:,ocp.ocp.dims.nx:]).to(device) - outputs = model_dir(X_iter_tensor) - print('RMSE test data: ', 
torch.sqrt(criterion_dir(outputs, y_iter_tensor))) - -# Save the model: -torch.save(model_dir.state_dict(), 'model_' + str(ocp.ocp.dims.nu) + 'dof_vboc') - -print("Execution time: %s seconds" % (time.time() - time_start + 143651)) - -# Show training data and resulting set approximation: -if ocp.ocp.dims.nu == 3: - plots_3dof(X_save, -3.14, 3.14, -3.15, 3.15, model_dir, mean_dir, std_dir, device) -elif ocp.ocp.dims.nu == 2: - plots_2dof(X_save, -3.14, 3.14, -3.15, 3.15, model_dir, mean_dir, std_dir, device) -elif ocp.ocp.dims.nu == 4: - plots_4dof(X_save, -3.14, 3.14, -3.15, 3.15, model_dir, mean_dir, std_dir, device) - -plt.show() \ No newline at end of file diff --git a/VBOC/doublependulum_class_vboc.py b/VBOC/doublependulum_class_vboc.py deleted file mode 100644 index 650c8e6..0000000 --- a/VBOC/doublependulum_class_vboc.py +++ /dev/null @@ -1,304 +0,0 @@ -from acados_template import AcadosOcp, AcadosOcpSolver, AcadosSim, AcadosSimSolver -import numpy as np -from acados_template import AcadosModel -from casadi import SX, vertcat, cos, sin - -class OCPdoublependulum: - def __init__(self): - - # --------------------SET MODEL-------------------- - # ------------------------------------------------- - model_name = "double_pendulum_ode" - - # constants - self.m1 = 0.4 # mass of the first link [kg] - self.m2 = 0.4 # mass of the second link [kg] - self.g = 9.81 # gravity constant [m/s^2] - self.l1 = 0.8 # length of the first link [m] - self.l2 = 0.8 # length of the second link [m] - - # states - theta1 = SX.sym("theta1") - theta2 = SX.sym("theta2") - dtheta1 = SX.sym("dtheta1") - dtheta2 = SX.sym("dtheta2") - dt = SX.sym('dt') - self.x = vertcat(theta1, theta2, dtheta1, dtheta2, dt) - - # controls - C1 = SX.sym("C1") - C2 = SX.sym("C2") - u = vertcat(C1, C2) - - # parameters - w1 = SX.sym("w1") - w2 = SX.sym("w2") - wt = SX.sym("wt") - p = vertcat(w1, w2, wt) - - # dynamics - f_expl = dt*vertcat( - dtheta1, - dtheta2, - ( - self.l1**2 - * self.l2 - * self.m2 - * dtheta1**2 - * sin(-2 * theta2 + 2 * theta1) - + 2 * C2 * cos(-theta2 + theta1) * self.l1 - + 2 - * ( - self.g * sin(-2 * theta2 + theta1) * self.l1 * self.m2 / 2 - + sin(-theta2 + theta1) * dtheta2**2 * self.l1 * self.l2 * self.m2 - + self.g * self.l1 * (self.m1 + self.m2 / 2) * sin(theta1) - - C1 - ) - * self.l2 - ) - / self.l1**2 - / self.l2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - ( - -self.g - * self.l1 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + 2 * theta1) - - self.l1 - * self.l2**2 - * self.m2**2 - * dtheta2**2 - * sin(-2 * theta2 + 2 * theta1) - - 2 - * dtheta1**2 - * self.l1**2 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + theta1) - + 2 * C1 * cos(-theta2 + theta1) * self.l2 * self.m2 - + self.l1 - * (self.m1 + self.m2) - * (sin(theta2) * self.g * self.l2 * self.m2 - 2 * C2) - ) - / self.l2**2 - / self.l1 - / self.m2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - 0. - ) - - self.model = AcadosModel() - - self.model.f_expl_expr = f_expl - self.model.x = self.x - self.model.u = u - self.model.p = p - self.model.name = model_name - # ------------------------------------------------- - - # ---------------------SET OCP--------------------- - # ------------------------------------------------- - self.ocp = AcadosOcp() - - # times - self.N = 100 - self.ocp.solver_options.tf = self.N - self.ocp.dims.N = self.N - - # ocp model - self.ocp.model = self.model - - # set constraints - self.Cmax = 10.
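The deleted double-pendulum model above uses a free-time trick: the physical step dt is itself a state, the dynamics are scaled as x' = dt * f(x, u), and the horizon is N unit intervals (tf = N), so the OCP can optimize the trajectory duration through the bounds on dt and the wt * dt cost term. A scalar illustration of the time scaling (forward Euler, illustrative only):
```
f = lambda x: -2.0 * x               # stand-in dynamics

x0, dt = 1.0, 1e-2
x_scaled = x0 + 1.0 * (dt * f(x0))   # one unit step of x' = dt * f(x)
x_plain = x0 + dt * f(x0)            # one dt-second step of x' = f(x)
assert x_scaled == x_plain           # the two integrations coincide
```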
- self.thetamax = np.pi / 4 + np.pi - self.thetamin = - np.pi / 4 + np.pi - self.dthetamax = 10. - - # cost - self.ocp.cost.cost_type_0 = 'EXTERNAL' - self.ocp.cost.cost_type = 'EXTERNAL' - - self.ocp.model.cost_expr_ext_cost_0 = w1 * dtheta1 + w2 * dtheta2 + wt * dt - self.ocp.model.cost_expr_ext_cost = wt * dt - self.ocp.parameter_values = np.array([0., 0., 0.]) - - self.ocp.constraints.lbu = np.array([-self.Cmax, -self.Cmax]) - self.ocp.constraints.ubu = np.array([self.Cmax, self.Cmax]) - self.ocp.constraints.idxbu = np.array([0, 1]) - self.ocp.constraints.lbx = np.array( - [self.thetamin, self.thetamin, -self.dthetamax, -self.dthetamax, 0.] - ) - self.ocp.constraints.ubx = np.array( - [self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, 1e-2] - ) - self.ocp.constraints.idxbx = np.array([0, 1, 2, 3, 4]) - self.ocp.constraints.lbx_e = np.array( - [self.thetamin, self.thetamin, -self.dthetamax, -self.dthetamax, 0.] - ) - self.ocp.constraints.ubx_e = np.array( - [self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, 1e-2] - ) - self.ocp.constraints.idxbx_e = np.array([0, 1, 2, 3, 4]) - self.ocp.constraints.lbx_0 = np.array([self.thetamin, self.thetamin, -self.dthetamax, -self.dthetamax, 0.]) - self.ocp.constraints.ubx_0 = np.array([self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, 1e-2]) - self.ocp.constraints.idxbx_0 = np.array([0, 1, 2, 3, 4]) - - self.ocp.constraints.C = np.zeros((2,5)) - self.ocp.constraints.D = np.zeros((2,2)) - self.ocp.constraints.lg = np.zeros((2)) - self.ocp.constraints.ug = np.zeros((2)) - - # self.model.con_h_expr = w2 * dtheta1 - w1 * dtheta2 - # self.ocp.constraints.lh = np.array([-2 * self.dthetamax]) - # self.ocp.constraints.uh = np.array([2 * self.dthetamax]) - - # ------------------------------------------------- - - self.ocp.solver_options.nlp_solver_type = "SQP" - self.ocp.solver_options.hessian_approx = 'EXACT' - self.ocp.solver_options.exact_hess_constr = 0 - # self.ocp.solver_options.exact_hess_cost = 0 - self.ocp.solver_options.exact_hess_dyn = 0 - self.ocp.solver_options.nlp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_iter_max = 100 - self.ocp.solver_options.nlp_solver_max_iter = 1000 - self.ocp.solver_options.globalization = "MERIT_BACKTRACKING" - self.ocp.solver_options.alpha_reduction = 0.3 - self.ocp.solver_options.alpha_min = 1e-2 - self.ocp.solver_options.levenberg_marquardt = 1e-5 - -class OCPdoublependulumINIT(OCPdoublependulum): - def __init__(self): - - # inherit initialization - super().__init__() - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, json_file="acados_ocp.json") - - def OCP_solve(self, x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub): - - # Reset current iterate: - self.ocp_solver.reset() - - # Set parameters, guesses and constraints: - for i in range(self.N): - self.ocp_solver.set(i, 'x', x_sol_guess[i]) - self.ocp_solver.set(i, 'u', u_sol_guess[i]) - self.ocp_solver.set(i, 'p', p) - self.ocp_solver.constraints_set(i, 'lbx', q_lb) - self.ocp_solver.constraints_set(i, 'ubx', q_ub) - self.ocp_solver.constraints_set(i, 'lbu', u_lb) - self.ocp_solver.constraints_set(i, 'ubu', u_ub) - self.ocp_solver.constraints_set(i, 'C', np.zeros((2,5))) - self.ocp_solver.constraints_set(i, 'D', np.zeros((2,2))) - self.ocp_solver.constraints_set(i, 'lg', np.zeros((2))) - self.ocp_solver.constraints_set(i, 'ug', np.zeros((2))) - - C = np.zeros((2,5)) - d = np.array([p[:2].tolist()]) - dt = 
np.transpose(d) - C[:,2:4] = np.identity(2)-np.matmul(dt,d) - self.ocp_solver.constraints_set(0, "C", C, api='new') - - self.ocp_solver.constraints_set(0, "lbx", q_init_lb) - self.ocp_solver.constraints_set(0, "ubx", q_init_ub) - - self.ocp_solver.constraints_set(self.N, "lbx", q_fin_lb) - self.ocp_solver.constraints_set(self.N, "ubx", q_fin_ub) - self.ocp_solver.set(self.N, 'x', x_sol_guess[-1]) - self.ocp_solver.set(self.N, 'p', p) - - # Solve the OCP: - status = self.ocp_solver.solve() - - return status - - -class SYMdoublependulumINIT(OCPdoublependulum): - def __init__(self): - - # inherit initialization - super().__init__() - - # states - theta1 = SX.sym("theta1") - theta2 = SX.sym("theta2") - dtheta1 = SX.sym("dtheta1") - dtheta2 = SX.sym("dtheta2") - self.x = vertcat(theta1, theta2, dtheta1, dtheta2) - - # controls - C1 = SX.sym("C1") - C2 = SX.sym("C2") - u = vertcat(C1, C2) - - p = [] - - # dynamics - f_expl = vertcat( - dtheta1, - dtheta2, - ( - self.l1**2 - * self.l2 - * self.m2 - * dtheta1**2 - * sin(-2 * theta2 + 2 * theta1) - + 2 * C2 * cos(-theta2 + theta1) * self.l1 - + 2 - * ( - self.g * sin(-2 * theta2 + theta1) * self.l1 * self.m2 / 2 - + sin(-theta2 + theta1) * dtheta2**2 * self.l1 * self.l2 * self.m2 - + self.g * self.l1 * (self.m1 + self.m2 / 2) * sin(theta1) - - C1 - ) - * self.l2 - ) - / self.l1**2 - / self.l2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - ( - -self.g - * self.l1 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + 2 * theta1) - - self.l1 - * self.l2**2 - * self.m2**2 - * dtheta2**2 - * sin(-2 * theta2 + 2 * theta1) - - 2 - * dtheta1**2 - * self.l1**2 - * self.l2 - * self.m2 - * (self.m1 + self.m2) - * sin(-theta2 + theta1) - + 2 * C1 * cos(-theta2 + theta1) * self.l2 * self.m2 - + self.l1 - * (self.m1 + self.m2) - * (sin(theta2) * self.g * self.l2 * self.m2 - 2 * C2) - ) - / self.l2**2 - / self.l1 - / self.m2 - / (self.m2 * cos(-2 * theta2 + 2 * theta1) - 2 * self.m1 - self.m2), - ) - - self.model.f_expl_expr = f_expl - self.model.x = self.x - self.model.p = p - self.model.u = u - - sim = AcadosSim() - sim.model = self.model - sim.solver_options.T = 1e-2 - sim.solver_options.num_stages = 4 - self.acados_integrator = AcadosSimSolver(sim) diff --git a/VBOC/doublependulum_vboc.py b/VBOC/doublependulum_vboc.py deleted file mode 100644 index 386b3c6..0000000 --- a/VBOC/doublependulum_vboc.py +++ /dev/null @@ -1,768 +0,0 @@ -import os -import sys -sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import random -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import time -from doublependulum_class_vboc import OCPdoublependulumINIT, SYMdoublependulumINIT -import warnings -warnings.filterwarnings("ignore") -import torch -import torch.nn as nn -from my_nn import NeuralNetDIR -import math -from multiprocessing import Pool - -def data_generation(v): - - valid_data = np.ndarray((0, 4)) - - # Reset the time parameters: - N = ocp.N # number of time steps - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - # Initialization of the OCP: The OCP is set to find an extreme trajectory. The initial joint positions - # are set to random values, except for the reference joint whose position is set to an extreme value. - # The final velocities are all set to 0. The OCP has to maximise - # the initial velocity norm in a predefined direction. 
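The comment block above compresses to a small amount of arithmetic for the cost direction. A hedged sketch of the sampling steps that follow (the helper name is mine, and the sign handling is simplified: in the script, ran1's sign is tied to vel_sel rather than drawn at random):

```
import random
import numpy as np

def sample_cost_direction(joint_sel):
    # Hypothetical helper mirroring the sampling below: a random unit vector
    # (p[0], p[1]) weights the initial joint velocities in the cost
    # p[0]*dtheta1 + p[1]*dtheta2 + p[2]*dt; the reference joint gets ran1.
    while True:
        ran1 = random.choice([-1, 1]) * random.random()
        ran2 = random.choice([-1, 1]) * random.random()
        w = np.linalg.norm([ran1, ran2])
        if w > 0.0:  # guard against the rare degenerate draw
            break
    if joint_sel == 0:
        return np.array([ran1 / w, ran2 / w, 0.0])
    return np.array([ran2 / w, ran1 / w, 0.0])
```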
- - # Selection of the reference joint: - joint_sel = random.choice([0, 1]) # 0 to select first joint as reference, 1 to select second joint - joint_oth = int(1 - joint_sel) - - # Selection of the start position of the reference joint: - vel_sel = random.choice([-1, 1]) # -1 to maximise initial vel, + 1 to minimize it - if vel_sel == -1: - q_init_sel = q_min - q_fin_sel = q_max - else: - q_init_sel = q_max - q_fin_sel = q_min - - # Initial velocity optimization direction (the cost is in the form p[0] * dtheta1 + p[1] * dtheta2 + p[2] * dt): - ran1 = vel_sel * random.random() - ran2 = random.choice([-1, 1]) * random.random() - norm_weights = norm(np.array([ran1, ran2])) - if joint_sel == 0: - p = np.array([ran1/norm_weights, ran2/norm_weights, 0.]) - else: - p = np.array([ran2/norm_weights, ran1/norm_weights, 0.]) - - # Initial position of the other joint: - q_init_oth = q_min + random.random() * (q_max-q_min) - if q_init_oth > q_max - eps: - q_init_oth = q_init_oth - eps - if q_init_oth < q_min + eps: - q_init_oth = q_init_oth + eps - - OCP_store_ic = [vel_sel+1+joint_sel, ran1, ran2, q_init_oth] - - # Bounds on the initial state: - q_init_lb = np.array([q_min, q_min, v_min, v_min, dt_sym]) - q_init_ub = np.array([q_max, q_max, v_max, v_max, dt_sym]) - if q_init_sel == q_min: - q_init_lb[joint_sel] = q_min + eps - q_init_ub[joint_sel] = q_min + eps - else: - q_init_lb[joint_sel] = q_max - eps - q_init_ub[joint_sel] = q_max - eps - q_init_lb[joint_oth] = q_init_oth - q_init_ub[joint_oth] = q_init_oth - - # Guess: - x_sol_guess = np.empty((N, 5)) - u_sol_guess = np.empty((N, 2)) - for i, tau in enumerate(np.linspace(0, 1, N)): - x_guess = np.array([q_init_oth, q_init_oth, 0., 0., dt_sym]) - x_guess[joint_sel] = (1-tau)*q_init_sel + tau*q_fin_sel - x_guess[joint_sel+2] = 2*(1-tau)*(q_fin_sel-q_init_sel) - x_sol_guess[i] = x_guess - u_sol_guess[i] = np.array([ocp.g*ocp.l1*(ocp.m1+ocp.m2)*math.sin(x_guess[0]),ocp.g*ocp.l2*ocp.m2*math.sin(x_guess[1])]) - - cost = 1e6 - all_ok = False - - q_lb = np.array([q_min, q_min, v_min, v_min, dt_sym]) - q_ub = np.array([q_max, q_max, v_max, v_max, dt_sym]) - u_lb = np.array([-tau_max, -tau_max]) - u_ub = np.array([tau_max, tau_max]) - q_fin_lb = np.array([q_min, q_min, 0., 0., dt_sym]) - q_fin_ub = np.array([q_max, q_max, 0., 0., dt_sym]) - - # Iteratively solve the OCP with an increased number of time steps until the solution does not change. 
- # If the solver fails, try with a slightly different initial condition: - for _ in range(10): - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - cost_new = ocp.ocp_solver.get_cost() - - if cost_new > cost - tol: # the time is sufficient to have achieved an optimal solution - all_ok = True - break - - cost = cost_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N+1,5)) - u_sol_guess = np.empty((N+1,2)) - for i in range(N): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N] = ocp.ocp_solver.get(N, "x") - u_sol_guess[N] = np.array([ocp.g*ocp.l1*(ocp.m1+ocp.m2)*math.sin(x_sol_guess[N][0]),ocp.g*ocp.l2*ocp.m2*math.sin(x_sol_guess[N][1])]) - - # Increase the number of time steps: - N = N + 1 - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - else: - - # Reset the number of steps used in the OCP: - N = ocp.N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - # Initial velocity optimization direction (the cost is in the form p[0] * dtheta1 + p[1] * dtheta2 + p[2] * dt): - ran1 = ran1 + random.random() * random.choice([-1, 1]) * 0.01 - ran2 = ran2 + random.random() * random.choice([-1, 1]) * 0.01 - norm_weights = norm(np.array([ran1, ran2])) - if joint_sel == 0: - p = np.array([ran1/norm_weights, ran2/norm_weights, 0.]) - else: - p = np.array([ran2/norm_weights, ran1/norm_weights, 0.]) - - # Initial position of the other joint: - q_init_oth = q_init_oth + random.random() * random.choice([-1, 1]) * 0.01 - if q_init_oth > q_max - eps: - q_init_oth = q_init_oth - eps - if q_init_oth < q_min + eps: - q_init_oth = q_init_oth + eps - - # Bounds on the initial state: - q_init_lb[joint_oth] = q_init_oth - q_init_ub[joint_oth] = q_init_oth - - OCP_store_ic = [vel_sel+1+joint_sel, ran1, ran2, q_init_oth] - - # Guess: - x_sol_guess = np.empty((N, 5)) - u_sol_guess = np.empty((N, 2)) - for i, tau in enumerate(np.linspace(0, 1, N)): - x_guess = np.array([q_init_oth, q_init_oth, 0., 0., dt_sym]) - x_guess[joint_sel] = (1-tau)*q_init_sel + tau*q_fin_sel - x_guess[joint_sel+2] = 2*(1-tau)*(q_fin_sel-q_init_sel) - x_sol_guess[i] = x_guess - u_sol_guess[i] = np.array([ocp.g*ocp.l1*(ocp.m1+ocp.m2)*math.sin(x_guess[0]),ocp.g*ocp.l2*ocp.m2*math.sin(x_guess[1])]) - - cost = 1e6 - - if all_ok: - - # Save the optimal trajectory: - x_sol = np.empty((N+1,5)) - u_sol = np.empty((N,2)) - for i in range(N): - x_sol[i] = ocp.ocp_solver.get(i, "x") - u_sol[i] = ocp.ocp_solver.get(i, "u") - x_sol[N] = ocp.ocp_solver.get(N, "x") - - # # Minimum time OCP: - # if min_time: - # # Reset current iterate: - # ocp.ocp_solver.reset() - - # # Cost: - # p_mintime = np.array([0., 0., 1.]) # p[2] = 1 corresponds to the minimization of time - - # # Set parameters, guesses and constraints: - # for i in range(N): - # ocp.ocp_solver.set(i, 'x', x_sol[i]) - # ocp.ocp_solver.set(i, 'u', u_sol[i]) - # ocp.ocp_solver.set(i, 'p', p_mintime) - # ocp.ocp_solver.constraints_set(i, 'lbx', np.array([q_min, q_min, v_min, v_min, 0.])) - # ocp.ocp_solver.constraints_set(i, 'ubx', np.array([q_max, q_max, v_max, v_max, dt_sym])) - - # # Initial and final states are fixed (except for the time step duration): - # ocp.ocp_solver.constraints_set(N, "lbx", 
np.array([q_min, q_min, 0., 0., 0.])) - # ocp.ocp_solver.constraints_set(N, "ubx", np.array([q_max, q_max, v_max, v_max, dt_sym])) - # ocp.ocp_solver.set(N, 'x', x_sol[N]) - # ocp.ocp_solver.set(N, 'p', p_mintime) - - # ocp.ocp_solver.constraints_set(0, 'lbx', np.array([x_sol[0][0], x_sol[0][1], x_sol[0][2], x_sol[0][3], 0.])) - # ocp.ocp_solver.constraints_set(0, 'ubx', np.array([x_sol[0][0], x_sol[0][1], x_sol[0][2], x_sol[0][3], dt_sym])) - # ocp.ocp_solver.constraints_set(0, "C", np.array([[0., 0., 0., 0., 0.]])) - - # # Solve the OCP: - # status = ocp.ocp_solver.solve() - - # if status == 0: - # # Save the new optimal trajectory: - # for i in range(N): - # x_sol[i] = ocp.ocp_solver.get(i, "x") - # u_sol[i] = ocp.ocp_solver.get(i, "u") - # x_sol[N] = ocp.ocp_solver.get(N, "x") - - # # Save the optimized time step duration: - # dt_sym = ocp.ocp_solver.get(0, "x")[4] - - x_sym = np.full((N+1,4), None) - - # Generate the unviable sample in the cost direction: - x_out = np.copy(x_sol[0][:4]) - x_out[2] = x_out[2] - eps * p[0] - x_out[3] = x_out[3] - eps * p[1] - - # Save the initial state: - valid_data = np.append(valid_data, [[x_sol[0][0], x_sol[0][1], x_sol[0][2], x_sol[0][3]]], axis = 0) - - # Check if initial velocities lie on a limit: - if x_out[2] > v_max or x_out[2] < v_min or x_out[3] > v_max or x_out[3] < v_min: - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[0] = x_out - - # Iterate through the trajectory to verify the location of the states with respect to V: - for f in range(1, N): - - if is_x_at_limit: - - # Generate the unviable sample in the cost direction: - x_out = np.copy(x_sol[f][:4]) - norm_vel = norm(x_out[2:]) - x_out[2] = x_out[2] + eps * x_out[2]/norm_vel - x_out[3] = x_out[3] + eps * x_out[3]/norm_vel - - # If the previous state was on a limit, the current state location cannot be identified using - # the corresponding unviable state and must instead rely on the proximity to the state limits - # (a more restrictive test): - if x_sol[f][0] > q_max - eps or x_sol[f][0] < q_min + eps or x_sol[f][1] > q_max - eps or x_sol[f][1] < q_min + eps or x_out[2] > v_max or x_out[2] < v_min or x_out[3] > v_max or x_out[3] < v_min: - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is either in the interior of V or on dV - - # if the trajectory detaches from a position limit it usually enters V: - if x_sol[f-1][0] > q_max - eps or x_sol[f-1][0] < q_min + eps or x_sol[f-1][1] > q_max - eps or x_sol[f-1][1] < q_min + eps: - break - - # Solve an OCP to verify whether the following part of the trajectory is on V or dV. To do so, - # the initial joint positions are set to the current ones and the final state is fixed to the - # final state of the trajectory. The initial velocities are left free and maximized in the - # direction of the current joint velocities. 
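The verification OCP described in this comment reduces to a single comparison once it is solved. A sketch of the decision rule, with a function name of my own choosing (the semantics follow the surrounding comments and the norm_old/norm_new bookkeeping below):

```
def state_was_interior(norm_old, norm_new, tol):
    # norm_old: velocity norm realized at this state by the original trajectory;
    # norm_new: velocity norm after re-maximizing it from the same positions.
    # Beating the old norm by more than the solver tolerance means the state
    # was strictly inside V; otherwise it is taken to lie on the boundary dV.
    return norm_new > norm_old + tol
```

In the loop below this shows up as `norm_new > norm_old + tol`: when the test fires, the trajectory tail is replaced by the better solution; when it does not, the state is kept as a boundary sample.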
- - # Reset the number of steps used in the OCP: - N_test = N - f - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - # Cost: - norm_weights = norm(np.array([x_sol[f][2], x_sol[f][3]])) - p = np.array([-x_sol[f][2]/norm_weights, -x_sol[f][3]/norm_weights, 0.]) # the cost direction is based on the current velocity direction - - # Bounds on the initial state: - q_init_lb = np.array([x_sol[f][0], x_sol[f][1], v_min, v_min, dt_sym]) - q_init_ub = np.array([x_sol[f][0], x_sol[f][1], v_max, v_max, dt_sym]) - - # Guess: - x_sol_guess = np.empty((N_test+1, 5)) - u_sol_guess = np.empty((N_test+1, 2)) - for i in range(N_test): - x_sol_guess[i] = x_sol[i+f] - u_sol_guess[i] = u_sol[i+f] - x_sol_guess[N_test] = x_sol[N] - u_sol_guess[N_test] = np.array([ocp.g*ocp.l1*(ocp.m1+ocp.m2)*math.sin(x_sol[N][0]),ocp.g*ocp.l2*ocp.m2*math.sin(x_sol[N][1])]) - - norm_old = norm(np.array([x_sol[f][2:4]])) # velocity norm of the original solution - - norm_prev = 0 - all_ok = False - - for _ in range(5): - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - x0_new = ocp.ocp_solver.get(0, "x") - norm_new = norm(np.array([x0_new[2:4]])) - - if norm_new < norm_prev + tol: - all_ok = True - break - - norm_prev = norm_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N_test+1,5)) - u_sol_guess = np.empty((N_test+1,2)) - for i in range(N_test): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N_test] = ocp.ocp_solver.get(N_test, "x") - u_sol_guess[N_test] = np.array([ocp.g*ocp.l1*(ocp.m1+ocp.m2)*math.sin(x_sol_guess[N_test][0]),ocp.g*ocp.l2*ocp.m2*math.sin(x_sol_guess[N_test][1])]) - - # Increase the number of time steps: - N_test = N_test + 1 - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - else: - break - - if all_ok: - - # Compare the old and new velocity norms: - if norm_new > norm_old + tol: # the state was inside V - - # Update the optimal solution: - for i in range(N-f): - x_sol[i+f] = ocp.ocp_solver.get(i, "x") - u_sol[i+f] = ocp.ocp_solver.get(i, "u") - - # Generate the unviable sample in the cost direction: - x_out = np.copy(x_sol[f][:4]) - x_out[2] = x_out[2] + eps * x_out[2]/norm_new - x_out[3] = x_out[3] + eps * x_out[3]/norm_new - - # Check if velocities lie on a limit: - if x_out[2] > v_max or x_out[2] < v_min or x_out[3] > v_max or x_out[3] < v_min: - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[f] = x_out - - else: - - is_x_at_limit = False # the state is on dV - - # Generate the new corresponding unviable state in the cost direction: - x_out = np.copy(x_sol[f][:4]) - x_out[2] = x_out[2] - eps * p[0] - x_out[3] = x_out[3] - eps * p[1] - if x_out[joint_sel+2] > v_max: - x_out[joint_sel+2] = v_max - if x_out[joint_sel+2] < v_min: - x_out[joint_sel+2] = v_min - x_sym[f] = x_out - - else: # we cannot say whether the state is on dV or inside V - - for r in range(f, N): - if abs(x_sol[r][2]) > v_max - eps or abs(x_sol[r][3]) > v_max - eps: - - # Save the viable states at velocity limits: - valid_data = np.append(valid_data, [[x_sol[f][0], x_sol[f][1], x_sol[f][2], x_sol[f][3]]], axis = 0) - - break - - else: - - # If 
the previous state was not on a limit, the current state location can be identified using - # the corresponding unviable state which can be computed by simulating the system starting from - # the previous unviable state. - - # Simulate next unviable state: - u_sym = np.copy(u_sol[f-1]) - sim.acados_integrator.set("u", u_sym) - sim.acados_integrator.set("x", x_sym[f-1]) - status = sim.acados_integrator.solve() - x_out = sim.acados_integrator.get("x") - x_sym[f] = x_out - - # When the state of the unviable simulated trajectory violates a limit, the corresponding viable state - # of the optimal trajectory is on dX: - if x_out[0] > q_max or x_out[0] < q_min or x_out[1] > q_max or x_out[1] < q_min or x_out[2] > v_max or x_out[2] < v_min or x_out[3] > v_max or x_out[3] < v_min: - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - - if x_sol[f][0] < q_max - eps and x_sol[f][0] > q_min + eps and x_sol[f][1] < q_max - eps and x_sol[f][1] > q_min + eps and abs(x_sol[f][2]) > tol and abs(x_sol[f][3]) > tol: - - # Save the viable and unviable states: - valid_data = np.append(valid_data, [[x_sol[f][0], x_sol[f][1], x_sol[f][2], x_sol[f][3]]], axis = 0) - - return valid_data.tolist(), OCP_store_ic, None - - else: - return None, None, OCP_store_ic - -start_time = time.time() - -X_test = np.load('../data2_test.npy') - -# Ocp initialization: -ocp = OCPdoublependulumINIT() -sim = SYMdoublependulumINIT() - -# Position, velocity and torque bounds: -v_max = ocp.dthetamax -v_min = - ocp.dthetamax -q_max = ocp.thetamax -q_min = ocp.thetamin -tau_max = ocp.Cmax - -stop_time = 3600 # total computational time -dt_sym = 1e-2 # time step duration - -device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # pytorch device - -tol = ocp.ocp.solver_options.nlp_solver_tol_stat # OCP cost tolerance -eps = tol*10 # unviable data generation parameter - -iteration = 0 -print('Start data generation, iteration:', iteration) - -# Data generation: -cpu_num = 30 -num_prob = 1000 -with Pool(cpu_num) as p: - temp = p.map(data_generation, range(num_prob)) - -traj, statpos, statneg = zip(*temp) -X_save = [i for i in traj if i is not None] -X_save = np.array([i for f in X_save for i in f]) - -print('Data generation completed') - -# Pytorch params: -input_layers = 4 -hidden_layers = 300 -output_layers = 1 -learning_rate = 1e-3 - -# Model and optimizer: -model_dir = NeuralNetDIR(input_layers, hidden_layers, output_layers).to(device) -criterion_dir = nn.MSELoss() -optimizer_dir = torch.optim.Adam(model_dir.parameters(), lr=learning_rate) - -# Joint positions mean and variance: -mean_dir, std_dir = torch.mean(torch.tensor(X_save[:,:2].tolist())).to(device).item(), torch.std(torch.tensor(X_save[:,:2].tolist())).to(device).item() -torch.save(mean_dir, 'mean_2dof_vboc') -torch.save(std_dir, 'std_2dof_vboc') - -# Rewrite data in the form [normalized positions, velocity direction, velocity norm]: -X_train_dir = np.empty((X_save.shape[0],5)) -for i in range(X_train_dir.shape[0]): - X_train_dir[i][0] = (X_save[i][0] - mean_dir) / std_dir - X_train_dir[i][1] = (X_save[i][1] - mean_dir) / std_dir - vel_norm = norm([X_save[i][2], X_save[i][3]]) - if vel_norm != 0: - X_train_dir[i][2] = X_save[i][2] / vel_norm - X_train_dir[i][3] = X_save[i][3] / vel_norm - X_train_dir[i][4] = vel_norm - -beta = 0.95 -n_minibatch = 4096 -B = int(X_save.shape[0]*100/n_minibatch) # number of iterations for 100 epoch -it_max = B * 10 - -print('Start model training') - -it = 1 -val = max(X_train_dir[:,4]) - -# 
Train the model: -while val > 1e-3 and it < it_max: - ind = random.sample(range(len(X_train_dir)), n_minibatch) - - X_iter_tensor = torch.Tensor([X_train_dir[i][:4] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_train_dir[i][4]] for i in ind]).to(device) - - # Forward pass - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - val = beta * val + (1 - beta) * loss.item() - it += 1 - - if it % B == 0: - print(val) - -print('Model training completed') - -# Store time: -times = np.array([time.time() - start_time]) - -# Store resulting RMSE wrt testing data: -X_test_dir = np.empty((X_test.shape[0],5)) -for i in range(X_test_dir.shape[0]): - X_test_dir[i][0] = (X_test[i][0] - mean_dir) / std_dir - X_test_dir[i][1] = (X_test[i][1] - mean_dir) / std_dir - vel_norm = norm([X_test[i][2],X_test[i][3]]) - if vel_norm != 0: - X_test_dir[i][2] = X_test[i][2] / vel_norm - X_test_dir[i][3] = X_test[i][3] / vel_norm - X_test_dir[i][4] = vel_norm - -with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_test_dir[:,:4]).to(device) - y_iter_tensor = torch.Tensor(X_test_dir[:,4:]).to(device) - rmse = np.array([torch.sqrt(criterion_dir(model_dir(X_iter_tensor), y_iter_tensor)).item()]) - -while time.time() - start_time < stop_time: - - iteration += 1 - print('Start data generation, iteration:', iteration) - - # Data generation: - with Pool(cpu_num) as p: - temp = p.map(data_generation, range(num_prob)) - - traj, statpos, statneg = zip(*temp) - samples = [i for i in traj if i is not None] - - # Add new training data to the dataset: - X_new = np.array([i for f in samples for i in f]) - X_save = np.concatenate((X_save, X_new)) - - print('Data generation completed') - - # Rewrite data in the form [normalized positions, velocity direction, velocity norm]: - X_new_dir = np.empty((X_new.shape[0],5)) - for i in range(X_new_dir.shape[0]): - X_new_dir[i][0] = (X_new[i][0] - mean_dir) / std_dir - X_new_dir[i][1] = (X_new[i][1] - mean_dir) / std_dir - vel_norm = norm([X_new[i][2], X_new[i][3]]) - if vel_norm != 0: - X_new_dir[i][2] = X_new[i][2] / vel_norm - X_new_dir[i][3] = X_new[i][3] / vel_norm - X_new_dir[i][4] = vel_norm - X_train_dir = np.concatenate((X_train_dir, X_new_dir)) - - print('Start model training') - - it = 1 - val = max(X_train_dir[:,4]) - - # Train the model - while val > 1e-3 and it < it_max: - # ind = random.sample(range(len(X_train_dir)), n_minibatch) - - ind = random.sample(range(len(X_train_dir) - len(X_new_dir)), int(n_minibatch / 2)) - ind.extend( - random.sample( - range(len(X_train_dir) - len(X_new_dir), len(X_train_dir)), - int(n_minibatch / 2), - ) - ) - - X_iter_tensor = torch.Tensor([X_train_dir[i][:4] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_train_dir[i][4]] for i in ind]).to(device) - - # Forward pass - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - val = beta * val + (1 - beta) * loss.item() - it += 1 - - if it % B == 0: - print(val) - - print('Model training completed') - - # Store time: - times = np.append(times, [time.time() - start_time]) - - # Store resulting RMSE wrt testing data: - with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_test_dir[:,:4]).to(device) - y_iter_tensor = torch.Tensor(X_test_dir[:,4:]).to(device) - rmse = np.append(rmse, 
[torch.sqrt(criterion_dir(model_dir(X_iter_tensor), y_iter_tensor)).item()]) - -# Show total time and resulting RMSE: -print('Iterations completed with total time:', time.time() - start_time) -print('RMSE test data: ', torch.sqrt(criterion_dir(model_dir(X_iter_tensor), y_iter_tensor)).item()) - -# Store times and RMSE: -np.save('times_2dof_vboc.npy', np.asarray(times)) -np.save('rmse_2dof_vboc.npy', np.asarray(rmse)) - -# Show the RMSE evolution over time: -plt.figure() -plt.plot(times, rmse) - -# Save the training data and the model: -np.save('data_2dof_vboc.npy', np.asarray(X_save)) -torch.save(model_dir.state_dict(), 'model_2dof_vboc.npy') - -# Show the resulting set approximation: -with torch.no_grad(): - h = 0.01 - xx, yy = np.meshgrid(np.arange(q_min, q_max+h, h), np.arange(v_min, v_max+h, h)) - xrav = xx.ravel() - yrav = yy.ravel() - - # Plot the results: - plt.figure() - inp = np.c_[ - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - np.zeros(yrav.shape[0]), - yrav, - np.empty(yrav.shape[0]), - ] - for i in range(inp.shape[0]): - vel_norm = norm([inp[i][2],inp[i][3]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][2] = inp[i][2] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][4] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][4] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(X_test.shape[0]): - # if ( - # norm(X_test[i][0] - (q_min + q_max) / 2) < 0.01 - # and norm(X_test[i][2]) < 0.1 - # ): - # xit.append(X_test[i][1]) - # yit.append(X_test[i][3]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_2$') - plt.xlabel('$q_2$') - plt.grid() - plt.title("Classifier section") - - plt.figure() - inp = np.c_[ - xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - for i in range(inp.shape[0]): - vel_norm = norm([inp[i][2],inp[i][3]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][2] = inp[i][2] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][4] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][4] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(X_test.shape[0]): - # if ( - # norm(X_test[i][1] - (q_min + q_max) / 2) < 0.01 - # and norm(X_test[i][3]) < 0.1 - # ): - # xit.append(X_test[i][0]) - # yit.append(X_test[i][2]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_1$') - plt.xlabel('$q_1$') - plt.grid() - plt.title("Classifier section") - - # for l in range(10): - # q1ran = q_min + random.random() * (q_max-q_min) - # q2ran = q_min + random.random() * (q_max-q_min) - - # # Plot the results: - # plt.figure() - # inp = np.float32( - # np.c_[ - # q1ran * np.ones(xrav.shape[0]), - # q2ran * np.ones(xrav.shape[0]), - # xrav, - # yrav, - # 
np.empty(yrav.shape[0]), - # ] - # ) - # for i in range(inp.shape[0]): - # vel_norm = norm([inp[i][2],inp[i][3]]) - # inp[i][0] = (inp[i][0] - mean_dir) / std_dir - # inp[i][1] = (inp[i][1] - mean_dir) / std_dir - # if vel_norm != 0: - # inp[i][2] = inp[i][2] / vel_norm - # inp[i][3] = inp[i][3] / vel_norm - # inp[i][4] = vel_norm - # out = (model_dir(torch.Tensor(inp[:,:4]).to(device))).cpu().numpy() - # y_pred = np.empty(out.shape) - # for i in range(len(out)): - # if inp[i][4] > out[i]: - # y_pred[i] = 0 - # else: - # y_pred[i] = 1 - # Z = y_pred.reshape(yy.shape) - # plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(X_save.shape[0]): - # if ( - # norm(X_save[i][0] - q1ran) < 0.01 - # and norm(X_save[i][1] - q2ran) < 0.01 - # ): - # xit.append(X_save[i][2]) - # yit.append(X_save[i][3]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - # plt.xlim([v_min, v_max]) - # plt.ylim([v_min, v_max]) - # plt.grid() - # plt.title("q1="+str(q1ran)+" q2="+str(q2ran)) - -plt.show() diff --git a/VBOC/pendulum_class_vboc.py b/VBOC/pendulum_class_vboc.py deleted file mode 100644 index dfb83f3..0000000 --- a/VBOC/pendulum_class_vboc.py +++ /dev/null @@ -1,130 +0,0 @@ -from acados_template import AcadosOcp, AcadosOcpSolver -import numpy as np -from acados_template import AcadosModel -from casadi import SX, vertcat, sin - -class OCPpendulum: - def __init__(self): - - # --------------------SET MODEL-------------------- - # ------------------------------------------------- - model_name = "pendulum_ode" - - # constants - self.m = 0.5 # mass of the ball [kg] - self.g = 9.81 # gravity constant [m/s^2] - self.d = 0.3 # length of the rod [m] - self.b = 0.01 # damping - - # states - theta = SX.sym("theta") - dtheta = SX.sym("dtheta") - dt = SX.sym('dt') - self.x = vertcat(theta, dtheta, dt) - - # controls - F = SX.sym("F") - u = vertcat(F) - - # parameters - w1 = SX.sym("w1") - wt = SX.sym("wt") - p = vertcat(w1, wt) - - # dynamics - f_expl = dt*vertcat( - dtheta, - (self.m * self.g * self.d * sin(theta) + F - self.b * dtheta) - / (self.d * self.d * self.m), 0., - ) - - self.model = AcadosModel() - - self.model.f_expl_expr = f_expl - self.model.x = self.x - self.model.u = u - self.model.p = p - self.model.name = model_name - # ------------------------------------------------- - - # ---------------------SET OCP--------------------- - # ------------------------------------------------- - self.ocp = AcadosOcp() - - # times - Tf = 50 - self.N = Tf - self.ocp.solver_options.tf = Tf - self.ocp.dims.N = self.N - - # ocp model - self.ocp.model = self.model - - # constraints - self.Fmax = 3 - self.thetamax = np.pi / 4 + np.pi - self.thetamin = - np.pi / 4 + np.pi - self.dthetamax = 10.0 - - # cost - self.ocp.cost.cost_type_0 = 'EXTERNAL' - self.ocp.cost.cost_type = 'EXTERNAL' - - self.ocp.model.cost_expr_ext_cost_0 = w1 * dtheta + wt * dt - self.ocp.model.cost_expr_ext_cost = wt * dt - self.ocp.parameter_values = np.array([0., 1.]) - - self.ocp.constraints.lbu = np.array([-self.Fmax]) - self.ocp.constraints.ubu = np.array([+self.Fmax]) - self.ocp.constraints.idxbu = np.array([0]) - self.ocp.constraints.lbx = np.array([self.thetamin, -self.dthetamax, 0.]) - self.ocp.constraints.ubx = np.array([self.thetamax, self.dthetamax, 1e-2]) - self.ocp.constraints.idxbx = np.array([0, 1, 2]) - self.ocp.constraints.lbx_e = np.array([self.thetamin, -self.dthetamax, 0.]) - self.ocp.constraints.ubx_e = np.array([self.thetamax, self.dthetamax, 1e-2]) - 
self.ocp.constraints.idxbx_e = np.array([0, 1, 2]) - self.ocp.constraints.lbx_0 = np.array([self.thetamin, -self.dthetamax, 0.]) - self.ocp.constraints.ubx_0 = np.array([self.thetamax, self.dthetamax, 1e-2]) - self.ocp.constraints.idxbx_0 = np.array([0, 1, 2]) - - # options - self.ocp.solver_options.nlp_solver_type = "SQP" - self.ocp.solver_options.hessian_approx = "EXACT" - self.ocp.solver_options.exact_hess_constr = 0 - # self.ocp.solver_options.exact_hess_cost = 0 - self.ocp.solver_options.exact_hess_dyn = 0 - self.ocp.solver_options.nlp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_iter_max = 100 - self.ocp.solver_options.nlp_solver_max_iter = 1000 - self.ocp.solver_options.globalization = "MERIT_BACKTRACKING" - self.ocp.solver_options.alpha_reduction = 0.3 - self.ocp.solver_options.alpha_min = 1e-2 - self.ocp.solver_options.levenberg_marquardt = 1e-5 - - self.ocp_solver = AcadosOcpSolver(self.ocp, json_file="acados_ocp.json") - - def OCP_solve(self, x_sol_guess, u_sol_guess, cost_dir, q_lb, q_ub, q_init, q_fin): - - # Reset solver: - self.ocp_solver.reset() - - # Constraints and guess: - for i in range(self.N): - self.ocp_solver.set(i, "x", np.array(x_sol_guess[i])) - self.ocp_solver.set(i, "u", np.array(u_sol_guess[i])) - self.ocp_solver.set(i, 'p', np.array([cost_dir, 1.])) - self.ocp_solver.constraints_set(i, "lbx", q_lb) - self.ocp_solver.constraints_set(i, "ubx", q_ub) - - self.ocp_solver.constraints_set(0, "lbx", np.array([q_init, -self.dthetamax, 0.])) - self.ocp_solver.constraints_set(0, "ubx", np.array([q_init, self.dthetamax, 1e-2])) - self.ocp_solver.constraints_set(self.N, "lbx", np.array([q_fin, 0., 0.])) - self.ocp_solver.constraints_set(self.N, "ubx", np.array([q_fin, 0., 1e-2])) - self.ocp_solver.set(self.N, "x", np.array(x_sol_guess[self.N])) - self.ocp_solver.set(self.N, 'p', np.array([cost_dir, 1.])) - - # Solve the OCP: - status = self.ocp_solver.solve() - - return status diff --git a/VBOC/pendulum_vboc.py b/VBOC/pendulum_vboc.py deleted file mode 100644 index a49e54c..0000000 --- a/VBOC/pendulum_vboc.py +++ /dev/null @@ -1,350 +0,0 @@ -import os -import sys -sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import matplotlib.pyplot as plt -import time -from pendulum_class_vboc import OCPpendulum -import warnings -import random -import torch -import torch.nn as nn -from my_nn import NeuralNetDIR - -warnings.filterwarnings("ignore") - -if __name__ == "__main__": - - start_time = time.time() - - N_start = 50 - - # Ocp initialization: - ocp = OCPpendulum() - - # Position and velocity bounds: - v_max = ocp.dthetamax - v_min = - ocp.dthetamax - q_max = ocp.thetamax - q_min = ocp.thetamin - - # Pytorch params: - device = torch.device("cuda" if torch.cuda.is_available() else "cpu") - - input_layers = 2 - hidden_layers = 100 - output_layers = 1 - learning_rate = 1e-3 - - # Model and optimizer: - model_dir = NeuralNetDIR(input_layers, hidden_layers, output_layers).to(device) - criterion_dir = nn.MSELoss() - optimizer_dir = torch.optim.Adam(model_dir.parameters(), lr=learning_rate) - - # Unviable data generation param: - eps = 1e-3 - - # Initialize training dataset: - X_save = np.empty((0,2)) - - print('Start data generation') - - # Data generation (simplified version): - for v_sel in [v_min, v_max]: - - # Reset the time parameters: - N = N_start - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - dt = 1e-2 # time step duration - - # 
Constraints and cost parameters: - if v_sel == v_min: - q_init = q_max - q_fin = q_min - - lb = np.array([q_min, v_min, 0.]) - ub = np.array([q_max, 0., 1e-2]) - - cost_dir = 1. - else: - q_init = q_min - q_fin = q_max - - lb = np.array([q_min, 0., 0.]) - ub = np.array([q_max, v_max, 1e-2]) - - cost_dir = -1. - - # Initial guess: - x_guess = np.empty((N+1, 3)) - u_guess = np.zeros((N, 1)) - q_guess = np.linspace(q_init, q_fin, N+1, endpoint=True) - for i in range(N+1): - x_guess[i] = [q_guess[i], v_sel, dt] - - norm_old = v_max - - while True: - - # Solve the OCP: - status = ocp.OCP_solve(x_guess, u_guess, cost_dir, lb, ub, q_init, q_fin) - - # Print statistics: - # ocp.ocp_solver.print_statistics() - - if status == 0: - - # Extract initial optimal state: - x0 = ocp.ocp_solver.get(0, "x") - - # New velocity norm: - norm_new = np.linalg.norm(x0[1]) - - # Compare the current norm with the previous one: - if norm_new > norm_old + 1e-4: # solve a new OCP - - norm_old = norm_new - - # Update the guess with the current solution: - x_guess = np.empty((N+1,3)) - u_guess = np.empty((N+1,1)) - for i in range(N): - x_guess[i] = ocp.ocp_solver.get(i, "x") - u_guess[i] = ocp.ocp_solver.get(i, "u") - x_guess[N] = ocp.ocp_solver.get(N, "x") - u_guess[N] = np.array([0.]) - - # Increase the number of time steps: - N = N + 1 - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - else: - # Extract the optimal trajectory: - x_sol = np.empty((N+1,3)) - u_sol = np.empty((N,1)) - for i in range(N): - x_sol[i] = ocp.ocp_solver.get(i, "x") - u_sol[i] = ocp.ocp_solver.get(i, "u") - x_sol[N] = ocp.ocp_solver.get(N, "x") - - dt = x_sol[0][2] - - break - - else: - raise Exception("Sorry, the solver failed") - - # Save the initial optimal state: - x0 = x_sol[0][:2] - X_save = np.append(X_save, [x0], axis = 0) - - # Generate the corresponding unviable state: - x_out = np.copy(x0) - x_out[1] = x_out[1] - eps * cost_dir - - # Check if the optimal state is at a velocity limit: - if x_out[1] > v_max or x_out[1] < v_min: - x_at_limit = True - else: - x_at_limit = False - - # Iterate through the trajectory to verify the location of the states with respect to V (simplified version): - for f in range(1, N): - - # Check if the optimal state is at a velocity limit: - if x_at_limit: # the previous state was on dX -> if also the current state is on dX then save the state, - # alternatively solve a new OCP to verify if it is on dV or inside V - - v_out = x_sol[f][1] - eps * cost_dir - if v_out > v_max or v_out < v_min: - - # Save the optimal state: - X_save = np.append(X_save, [x_sol[f][:2]], axis = 0) - - else: - - # Original velocity norm: - norm_old = np.linalg.norm(x_sol[f][1]) - - # Reset the time settings: - N_test = N - f - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - # Solve the OCP (the return value must be stored, otherwise the stale - # status from the previous solve would be checked below): - status = ocp.OCP_solve(x_sol, u_sol, cost_dir, lb, ub, x_sol[f][0], q_fin) - - if status == 0: - # Extract the new optimal state: - x0_new = ocp.ocp_solver.get(0, "x") - - # New velocity norm: - norm_new = np.linalg.norm(x0_new[1]) - - # Compare the current norm with the previous one: - if norm_new > norm_old + 1e-4: - - # Update the solution: - for i in range(N-f): - x_sol[i+f] = ocp.ocp_solver.get(i, "x") - u_sol[i+f] = ocp.ocp_solver.get(i, "u") - - # Generate the corresponding unviable state: - x_out = np.copy(x0_new[:2]) - x_out[1] = x_out[1] - eps * cost_dir - - # Check if the optimal state is at a velocity limit: - if 
x_out[1] > v_max or x_out[1] < v_min: - x_at_limit = True - else: - x_at_limit = False - - else: - x_at_limit = False - - # Save the initial optimal state: - X_save = np.append(X_save, [x_sol[f][:2]], axis = 0) - - else: - raise Exception("Sorry, the solver failed") - - else: # the previous state was on dV -> save the current state - - if abs(x_sol[f][0] - q_fin) <= 1e-3: - break - - # Save valid state: - X_save = np.append(X_save, [x_sol[f][:2]], axis = 0) - - print('Data generation completed') - - # Joint positions mean and variance: - mean_dir, std_dir = torch.mean(torch.tensor(X_save[:,:1].tolist())).to(device).item(), torch.std(torch.tensor(X_save[:,:1].tolist())).to(device).item() - - # Rewrite data in the form [normalized position, velocity direction, velocity norm]: - X_save_dir = np.empty((X_save.shape[0],3)) - for i in range(X_save_dir.shape[0]): - X_save_dir[i][0] = (X_save[i][0] - mean_dir) / std_dir - vel_norm = abs(X_save[i][1]) - if vel_norm != 0: - X_save_dir[i][1] = X_save[i][1] / vel_norm - X_save_dir[i][2] = vel_norm - - # Save data: - np.save('data_1dof_vboc_10.npy', np.asarray(X_save)) - - # Training parameters: - beta = 0.8 # moving average param - n_minibatch = 64 # batch size - B = int(X_save.shape[0]*100/n_minibatch) # number of iterations for 100 epoch - it_max = B * 100 # max number of iterations - - print('Start model training') - - it = 0 - val = max(X_save_dir[:,2]) - training_evol = [] - - # Train the model - while val > 1e-4 and it < it_max: - - # Extract batch of data: - ind = random.sample(range(len(X_save_dir)), n_minibatch) - X_iter_tensor = torch.Tensor([X_save_dir[i][:2] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_save_dir[i][2]] for i in ind]).to(device) - - # Forward pass: - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize: - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - # Moving average of training errors: - val = beta * val + (1 - beta) * loss.item() - - it += 1 - - if it % B == 0: - print(val) - training_evol.append(val) - - print('Model training completed') - - print("Execution time: %s seconds" % (time.time() - start_time)) - - # Show the training evolution: - plt.figure() - plt.plot(training_evol) - plt.ylabel('Training error moving average') - plt.xlabel('Epochs') - plt.title("Training evolution") - - # Save the model: - torch.save(model_dir.state_dict(), 'model_1dof_vboc_10') - torch.save(mean_dir, 'mean_1dof_vboc_10') - torch.save(std_dir, 'std_1dof_vboc_10') - - with torch.no_grad(): - # Show the resulting RMSE on the training data: - X_iter_tensor = torch.Tensor(X_save_dir[:,:2]).to(device) - y_iter_tensor = torch.Tensor(X_save_dir[:,2:]).to(device) - outputs = model_dir(X_iter_tensor) - print('RMSE training data: ', torch.sqrt(criterion_dir(outputs, y_iter_tensor))) - - # Show the data and the resulting set approximation: - f,ax=plt.subplots(figsize=(6, 4)) - - # plt.plot([q_min, q_max], [v_min, v_min], color='white', linestyle='--', alpha=0.7, linewidth=1) - # plt.plot([q_min, q_max], [v_max, v_max], color='white', linestyle='--', alpha=0.7, linewidth=1) - # plt.plot([q_max, q_max], [v_min, v_max], color='white', linestyle='--', alpha=0.7, linewidth=1) - # plt.plot([q_min, q_min], [v_min, v_max], color='white', linestyle='--', alpha=0.7, linewidth=1) - - ax.plot( - X_save[:int(X_save.shape[0]/2),0], X_save[:int(X_save.shape[0]/2),1], "ko", markersize=3 - ) - ax.plot( - X_save[int(X_save.shape[0]/2):,0], 
X_save[int(X_save.shape[0]/2):,1], "bo", markersize=3 - ) - h = 0.005 - xx, yy = np.meshgrid(np.arange(q_min-(q_max-q_min)/30, q_max+(q_max-q_min)/30, h), np.arange(v_min-(v_max-v_min)/30, v_max+(v_max-v_min)/30, h)) - inp = np.c_[xx.ravel(), yy.ravel(), yy.ravel()] - for i in range(inp.shape[0]): - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - vel_norm = abs(inp[i][1]) - if vel_norm != 0: - inp[i][1] = inp[i][1] / vel_norm - inp[i][2] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:2].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][2] > out[i] or inp[i][0]*std_dir+mean_dir > q_max or inp[i][0]*std_dir+mean_dir < q_min or inp[i][1] * inp[i][2] > v_max - 0.01: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - ax.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - # plt.xlim([q_min-(q_max-q_min)/30, q_max+(q_max-q_min)/30]) - # plt.ylim([v_min-(v_max-v_min)/30, v_max+(v_max-v_min)/30]) - plt.ylabel('$\dot{q}$ (rad/s)') - plt.xlabel('$q$ (rad)') - import matplotlib.ticker as tck - from fractions import Fraction - def pi_formatter(x, pos): - return f"${Fraction((x/np.pi)).limit_denominator(max_denominator=12)}\pi$" - ax.xaxis.set_major_formatter(tck.FuncFormatter(pi_formatter)) - ax.xaxis.set_major_locator(tck.MultipleLocator(base=np.pi/8)) - plt.grid() - plt.title("Viability kernel approximation") - - plt.show() diff --git a/VBOC/plots_2dof.py b/VBOC/plots_2dof.py deleted file mode 100644 index 6a61529..0000000 --- a/VBOC/plots_2dof.py +++ /dev/null @@ -1,121 +0,0 @@ -import numpy as np -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import warnings -warnings.filterwarnings("ignore") -import torch - -def plots_2dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device): - - # Plot all training data: - plt.figure() - plt.scatter(X_save[:,0],X_save[:,1],s=0.1) - plt.legend(loc="best", shadow=False, scatterpoints=1) - plt.title("OCP dataset positions") - plt.figure() - plt.scatter(X_save[:,2],X_save[:,3],s=0.1) - plt.legend(loc="best", shadow=False, scatterpoints=1) - plt.title("OCP dataset velocities") - - # Show the resulting set approximation: - with torch.no_grad(): - h = 0.01 - xx, yy = np.meshgrid(np.arange(q_min, q_max, h), np.arange(v_min, v_max, h)) - xrav = xx.ravel() - yrav = yy.ravel() - - plt.figure() - inp = np.c_[ - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - np.zeros(yrav.shape[0]), - yrav, - np.empty(yrav.shape[0]), - ] - for i in range(inp.shape[0]): - vel_norm = norm([inp[i][2],inp[i][3]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][2] = inp[i][2] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][4] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][4] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(X_save.shape[0]): - if ( - norm(X_save[i][0] - (q_min + q_max) / 2) < 0.01 - and norm(X_save[i][2]) < 0.1 - ): - xit.append(X_save[i][1]) - yit.append(X_save[i][3]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_2$') - plt.xlabel('$q_2$') - plt.grid() - 
plt.title("Classifier section") - - plt.figure() - inp = np.c_[ - xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - for i in range(inp.shape[0]): - vel_norm = norm([inp[i][2],inp[i][3]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][2] = inp[i][2] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][4] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:4].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][4] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(X_save.shape[0]): - if ( - norm(X_save[i][1] - (q_min + q_max) / 2) < 0.01 - and norm(X_save[i][3]) < 0.1 - ): - xit.append(X_save[i][0]) - yit.append(X_save[i][2]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_1$') - plt.xlabel('$q_1$') - plt.grid() - plt.title("Classifier section") - - plt.show() diff --git a/VBOC/plots_3dof.py b/VBOC/plots_3dof.py deleted file mode 100644 index 6f669ec..0000000 --- a/VBOC/plots_3dof.py +++ /dev/null @@ -1,210 +0,0 @@ -import numpy as np -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import warnings -warnings.filterwarnings("ignore") -import torch -import random - -def plots_3dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device): - - # Plot all training data: - plt.figure() - ax = plt.axes(projection ="3d") - ax.scatter3D(X_save[:,0],X_save[:,1],X_save[:,2]) - plt.title("OCP dataset positions") - plt.figure() - ax = plt.axes(projection ="3d") - ax.scatter3D(X_save[:,3],X_save[:,4],X_save[:,5]) - plt.title("OCP dataset velocities") - - # Show the resulting set approximation: - with torch.no_grad(): - h = 0.01 - xx, yy = np.meshgrid(np.arange(q_min, q_max, h), np.arange(v_min, v_max, h)) - xrav = xx.ravel() - yrav = yy.ravel() - - plt.figure() - inp = np.float32( - np.c_[(q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - yrav, - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,3],inp[i,4],inp[i,5]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:6].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][6] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(len(X_save)): - if ( - norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 and - norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 - and norm(X_save[i][3]) < 0.1 - and norm(X_save[i][4]) < 0.1 - ): - xit.append(X_save[i][2]) - yit.append(X_save[i][5]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_3$') - plt.xlabel('$q_3$') - plt.grid() - plt.title("Classifier section") - - # Plot the 
results: - plt.figure() - inp = np.float32( - np.c_[(q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - np.zeros(yrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,3],inp[i,4],inp[i,5]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:6].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][6] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(len(X_save)): - if ( - norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 and - norm(X_save[i][2] - (q_min + q_max) / 2) < 0.1 - and norm(X_save[i][3]) < 0.1 - and norm(X_save[i][5]) < 0.1 - ): - xit.append(X_save[i][1]) - yit.append(X_save[i][4]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_2$') - plt.xlabel('$q_2$') - plt.grid() - plt.title("Classifier section") - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,3],inp[i,4],inp[i,5]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:6].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][6] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - xit = [] - yit = [] - for i in range(len(X_save)): - if ( - norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 and - norm(X_save[i][2] - (q_min + q_max) / 2) < 0.1 - and norm(X_save[i][4]) < 0.1 - and norm(X_save[i][5]) < 0.1 - ): - xit.append(X_save[i][0]) - yit.append(X_save[i][3]) - plt.plot( - xit, - yit, - "ko", - markersize=2 - ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_1$') - plt.xlabel('$q_1$') - plt.grid() - plt.title("Classifier section") - - for _ in range(10): - q1ran = q_min + random.random() * (q_max-q_min) - q2ran = q_min + random.random() * (q_max-q_min) - q3ran = q_min + random.random() * (q_max-q_min) - - plt.figure() - ax = plt.axes(projection ="3d") - xit = np.zeros((1,6)) - for i in range(len(X_save)): - if ( - norm(X_save[i][1] - q1ran) < 0.1 and - norm(X_save[i][2] - q2ran) < 0.1 and - norm(X_save[i][3] - q3ran) < 0.1 - ): - xit = np.append(xit,[X_save[i]], axis=0) - ax.scatter3D(xit[:,3],xit[:,4],xit[:,5]) - plt.title("q1="+str(q1ran)+" q2="+str(q2ran)+" q3="+str(q3ran)) - - plt.show() diff --git a/VBOC/triplependulum_class_vboc.py b/VBOC/triplependulum_class_vboc.py deleted file mode 100644 index 4d223ed..0000000 --- 
a/VBOC/triplependulum_class_vboc.py +++ /dev/null @@ -1,239 +0,0 @@ -from acados_template import AcadosOcp, AcadosOcpSolver, AcadosSim, AcadosSimSolver -import numpy as np -from acados_template import AcadosModel -from casadi import SX, vertcat, cos, sin - - -class OCPtriplependulum: - def __init__(self): - - # --------------------SET MODEL-------------------- - # ------------------------------------------------- - model_name = "triple_pendulum_ode" - - # constants - self.m1 = 0.4 # mass of the first link [kg] - self.m2 = 0.4 # mass of the second link [kg] - self.m3 = 0.4 # mass of the third link [kg] - self.g = 9.81 # gravity constant [m/s^2] - self.l1 = 0.8 # length of the first link [m] - self.l2 = 0.8 # length of the second link [m] - self.l3 = 0.8 # length of the third link [m] - - # states - theta1 = SX.sym("theta1") - theta2 = SX.sym("theta2") - theta3 = SX.sym("theta3") - dtheta1 = SX.sym("dtheta1") - dtheta2 = SX.sym("dtheta2") - dtheta3 = SX.sym("dtheta3") - dt = SX.sym('dt') - self.x = vertcat(theta1, theta2, theta3, dtheta1, dtheta2, dtheta3, dt) - - # controls - C1 = SX.sym("C1") - C2 = SX.sym("C2") - C3 = SX.sym("C3") - u = vertcat(C1, C2, C3) - - # parameters - w1 = SX.sym("w1") - w2 = SX.sym("w2") - w3 = SX.sym("w3") - wt = SX.sym("wt") - p = vertcat(w1, w2, w3, wt) - - # dynamics - f_expl = dt*vertcat( - dtheta1, - dtheta2, - dtheta3, - (-self.g * self.l1 * self.l2 * self.l3 * self.m1 * self.m3 * sin(-2 * theta3 + 2 * theta2 + theta1) - self.g * self.l1 * self.l2 * self.l3 * self.m1 * self.m3 * sin(2 * theta3 - 2 * theta2 + theta1) + 2 * C1 * self.l2 * self.l3 * self.m3 * cos(-2 * theta3 + 2 * theta2) + 2 * dtheta1 ** 2 * self.l1 ** 2 * self.l2 * self.l3 * self.m2 * (self.m2 + self.m3) * sin(-2 * theta2 + 2 * theta1) - 2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) * cos(-2 * theta2 + theta1 + theta3) - 2 * C2 * self.l1 * self.l3 * self.m3 * cos(-2 * theta3 + theta2 + theta1) + 2 * self.l1 * self.l2 * self.l3 ** 2 * self.m2 * self.m3 * dtheta3 ** 2 * sin(-2 * theta2 + theta1 + theta3) + 2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) * - cos(theta1 - theta3) + 2 * (C2 * self.l1 * (self.m3 + 2 * self.m2) * cos(-theta2 + theta1) + (self.g * self.l1 * self.m2 * (self.m2 + self.m3) * sin(-2 * theta2 + theta1) + 2 * dtheta2 ** 2 * self.l1 * self.l2 * self.m2 * (self.m2 + self.m3) * sin(-theta2 + theta1) + self.m3 * dtheta3 ** 2 * sin(theta1 - theta3) * self.l1 * self.l3 * self.m2 + self.g * self.l1 * (self.m2 ** 2 + (self.m3 + 2 * self.m1) * self.m2 + self.m1 * self.m3) * sin(theta1) - C1 * (self.m3 + 2 * self.m2)) * self.l2) * self.l3) / self.l1 ** 2 / self.l3 / (self.m2 * (self.m2 + self.m3) * cos(-2 * theta2 + 2 * theta1) + self.m1 * self.m3 * cos(-2 * theta3 + 2 * theta2) - self.m2 ** 2 + (-self.m3 - 2 * self.m1) * self.m2 - self.m1 * self.m3) / self.l2 / 2, - (-2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) * cos(2 * theta1 - theta3 - theta2) - 2 * self.l1 * self.l2 * self.l3 ** 2 * self.m2 * self.m3 * dtheta3 ** 2 * sin(2 * theta1 - theta3 - theta2) + self.g * self.l1 * self.l2 * self.l3 * self.m1 * self.m3 * sin(theta2 + 2 * theta1 - 2 * theta3) - self.g * self.l1 * self.l3 * ((self.m1 + 2 * self.m2) * self.m3 + 2 * self.m2 * (self.m1 + self.m2)) * self.l2 * sin(-theta2 + 2 * theta1) - 2 * dtheta2 ** 2 * self.l1 * self.l2 ** 2 * self.l3 * self.m2 * (self.m2 + self.m3) * sin(-2 * theta2 + 2 * theta1) + 2 * C2 * self.l1 * self.l3 * self.m3 * cos(-2 * theta3 + 2 * theta1) + 2 * self.l1 * self.l2 ** 2 * self.l3 * self.m1 * self.m3 * 
dtheta2 ** 2 * sin(-2 * theta3 + 2 * theta2) - 2 * C1 * self.l2 * self.l3 * self.m3 * cos(-2 * theta3 + theta2 + theta1) + 2 * self.l1 ** 2 * self.l2 * self.l3 * self.m1 * self.m3 * dtheta1 ** 2 * sin(-2 * theta3 + - theta2 + theta1) - 2 * self.l1 ** 2 * self.l3 * dtheta1 ** 2 * ((self.m1 + 2 * self.m2) * self.m3 + 2 * self.m2 * (self.m1 + self.m2)) * self.l2 * sin(-theta2 + theta1) + 2 * C3 * self.l1 * self.l2 * (self.m3 + 2 * self.m1 + self.m2) * cos(-theta3 + theta2) + (2 * C1 * self.l2 * (self.m3 + 2 * self.m2) * cos(-theta2 + theta1) + self.l1 * (4 * dtheta3 ** 2 * self.m3 * self.l3 * (self.m1 + self.m2 / 2) * self.l2 * sin(-theta3 + theta2) + self.g * self.m3 * self.l2 * self.m1 * sin(-2 * theta3 + theta2) + self.g * ((self.m1 + 2 * self.m2) * self.m3 + 2 * self.m2 * (self.m1 + self.m2)) * self.l2 * sin(theta2) - 2 * C2 * (self.m3 + 2 * self.m1 + 2 * self.m2))) * self.l3) / (self.m2 * (self.m2 + self.m3) * cos(-2 * theta2 + 2 * theta1) + self.m1 * self.m3 * cos(-2 * theta3 + 2 * theta2) + (-self.m1 - self.m2) * self.m3 - 2 * self.m1 * self.m2 - self.m2 ** 2) / self.l1 / self.l3 / self.l2 ** 2 / 2, - (-2 * self.m3 * C2 * self.l1 * self.l3 * (self.m2 + self.m3) * cos(2 * theta1 - theta3 - theta2) + self.g * self.m3 * self.l1 * self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(2 * theta1 + theta3 - 2 * theta2) + 2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) ** 2 * cos(-2 * theta2 + 2 * theta1) - self.g * self.m3 * self.l1 * self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(2 * theta1 - theta3) - self.g * self.m3 * self.l1 * self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(-theta3 + 2 * theta2) - 2 * self.l1 * self.l2 * self.l3 ** 2 * self.m1 * self.m3 ** 2 * dtheta3 ** 2 * sin(-2 * theta3 + 2 * theta2) - 2 * C1 * self.l2 * self.l3 * self.m3 * (self.m2 + self.m3) * cos(-2 * theta2 + theta1 + theta3) + 2 * self.m3 * dtheta1 ** 2 * self.l1 ** 2 * - self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(-2 * theta2 + theta1 + theta3) + 2 * self.m3 * C2 * self.l1 * self.l3 * (self.m3 + 2 * self.m1 + self.m2) * cos(-theta3 + theta2) + (self.m2 + self.m3) * (2 * C1 * self.l3 * self.m3 * cos(theta1 - theta3) + self.l1 * (-2 * self.m3 * dtheta1 ** 2 * self.l1 * self.l3 * self.m1 * sin(theta1 - theta3) - 4 * self.m3 * dtheta2 ** 2 * sin(-theta3 + theta2) * self.l2 * self.l3 * self.m1 + self.g * self.m3 * sin(theta3) * self.l3 * self.m1 - 2 * C3 * (self.m3 + 2 * self.m1 + self.m2))) * self.l2) / self.m3 / (self.m2 * (self.m2 + self.m3) * cos(-2 * theta2 + 2 * theta1) + self.m1 * self.m3 * cos(-2 * theta3 + 2 * theta2) + (-self.m1 - self.m2) * self.m3 - 2 * self.m1 * self.m2 - self.m2 ** 2) / self.l1 / self.l3 ** 2 / self.l2 / 2, - 0., - ) - - self.model = AcadosModel() - - self.model.f_expl_expr = f_expl - self.model.x = self.x - self.model.u = u - self.model.p = p - self.model.name = model_name - # ------------------------------------------------- - - # ---------------------SET OCP--------------------- - # ------------------------------------------------- - self.ocp = AcadosOcp() - - # times - self.N = 100 - self.ocp.solver_options.tf = self.N - self.ocp.dims.N = self.N - - # ocp model - self.ocp.model = self.model - - # cost - self.ocp.cost.cost_type_0 = 'EXTERNAL' - self.ocp.cost.cost_type = 'EXTERNAL' - - self.ocp.model.cost_expr_ext_cost_0 = w1 * dtheta1 + w2 * dtheta2 + w3 * dtheta3 + wt * dt - self.ocp.model.cost_expr_ext_cost = wt * dt - self.ocp.parameter_values = np.array([0., 0., 0., 0.]) - - # set constraints - self.Cmax = 10. 
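Aside: the `dt*vertcat(...)` dynamics above, together with `tf = self.N` and unit-length shooting intervals, implement the standard free-final-time transformation: the physical time step rides along as a constant extra state, so the horizon duration `N*dt` becomes a decision variable. A minimal single-joint sketch of the same augmentation (illustrative only, not from the repository):

```python
from casadi import SX, vertcat

# Free-final-time trick on a toy double integrator q'' = u:
q, dq, dt = SX.sym("q"), SX.sym("dq"), SX.sym("dt")
u = SX.sym("u")
x_aug = vertcat(q, dq, dt)        # dt is an extra state with zero derivative
f_expl = dt * vertcat(dq, u, 0.)  # scaled time: dx/dtau = dt * f(x, u), tau in [0, N]
```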
- self.thetamax = np.pi / 4 + np.pi - self.thetamin = - np.pi / 4 + np.pi - self.dthetamax = 10. - - self.ocp.constraints.lbu = np.array([-self.Cmax, -self.Cmax, -self.Cmax]) - self.ocp.constraints.ubu = np.array([self.Cmax, self.Cmax, self.Cmax]) - self.ocp.constraints.idxbu = np.array([0, 1, 2]) - self.ocp.constraints.lbx = np.array( - [self.thetamin, self.thetamin, self.thetamin, - - self.dthetamax, -self.dthetamax, -self.dthetamax, 0.] - ) - self.ocp.constraints.ubx = np.array( - [self.thetamax, self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, self.dthetamax, 1e-2] - ) - self.ocp.constraints.idxbx = np.array([0, 1, 2, 3, 4, 5, 6]) - self.ocp.constraints.lbx_e = np.array( - [self.thetamin, self.thetamin, self.thetamin, - - self.dthetamax, -self.dthetamax, -self.dthetamax, 0.] - ) - self.ocp.constraints.ubx_e = np.array( - [self.thetamax, self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, self.dthetamax, 0.] - ) - self.ocp.constraints.idxbx_e = np.array([0, 1, 2, 3, 4, 5, 6]) - self.ocp.constraints.lbx_0 = np.array( - [self.thetamin, self.thetamin, self.thetamin, - - self.dthetamax, -self.dthetamax, -self.dthetamax, 0.] - ) - self.ocp.constraints.ubx_0 = np.array( - [self.thetamax, self.thetamax, self.thetamax, self.dthetamax, self.dthetamax, self.dthetamax, 0.] - ) - self.ocp.constraints.idxbx_0 = np.array([0, 1, 2, 3, 4, 5, 6]) - - self.ocp.constraints.C = np.zeros((3,7)) - self.ocp.constraints.D = np.zeros((3,3)) - self.ocp.constraints.lg = np.zeros((3)) - self.ocp.constraints.ug = np.zeros((3)) - - # options - self.ocp.solver_options.nlp_solver_type = "SQP" - self.ocp.solver_options.hessian_approx = 'EXACT' - self.ocp.solver_options.exact_hess_constr = 0 - # self.ocp.solver_options.exact_hess_cost = 0 - self.ocp.solver_options.exact_hess_dyn = 0 - self.ocp.solver_options.nlp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_tol_stat = 1e-3 - self.ocp.solver_options.qp_solver_iter_max = 100 - self.ocp.solver_options.nlp_solver_max_iter = 1000 - self.ocp.solver_options.globalization = "MERIT_BACKTRACKING" - self.ocp.solver_options.alpha_reduction = 0.3 - self.ocp.solver_options.alpha_min = 1e-2 - self.ocp.solver_options.levenberg_marquardt = 1e-5 - - # ------------------------------------------------- - - -class OCPtriplependulumINIT(OCPtriplependulum): - def __init__(self): - - # inherit initialization - super().__init__() - - # solver - self.ocp_solver = AcadosOcpSolver(self.ocp, json_file="acados_ocp.json") - - def OCP_solve(self, x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub): - - # Reset current iterate: - self.ocp_solver.reset() - - # Set parameters, guesses and constraints: - for i in range(self.N): - self.ocp_solver.set(i, 'x', x_sol_guess[i]) - self.ocp_solver.set(i, 'u', u_sol_guess[i]) - self.ocp_solver.set(i, 'p', p) - self.ocp_solver.constraints_set(i, 'lbx', q_lb) - self.ocp_solver.constraints_set(i, 'ubx', q_ub) - self.ocp_solver.constraints_set(i, 'lbu', u_lb) - self.ocp_solver.constraints_set(i, 'ubu', u_ub) - self.ocp_solver.constraints_set(i, 'C', np.zeros((3,7))) - self.ocp_solver.constraints_set(i, 'D', np.zeros((3,3))) - self.ocp_solver.constraints_set(i, 'lg', np.zeros((3))) - self.ocp_solver.constraints_set(i, 'ug', np.zeros((3))) - - C = np.zeros((3,7)) - d = np.array([p[:3].tolist()]) - dt = np.transpose(d) - C[:,3:6] = np.identity(3)-np.matmul(dt,d) - self.ocp_solver.constraints_set(0, "C", C, api='new') - - self.ocp_solver.constraints_set(0, "lbx", q_init_lb) - 
self.ocp_solver.constraints_set(0, "ubx", q_init_ub) - - self.ocp_solver.constraints_set(self.N, "lbx", q_fin_lb) - self.ocp_solver.constraints_set(self.N, "ubx", q_fin_ub) - self.ocp_solver.set(self.N, 'x', x_sol_guess[-1]) - self.ocp_solver.set(self.N, 'p', p) - - # Solve the OCP: - status = self.ocp_solver.solve() - - return status - - -class SYMtriplependulumINIT(OCPtriplependulum): - def __init__(self): - - # inherit initialization - super().__init__() - - # states - theta1 = SX.sym("theta1") - theta2 = SX.sym("theta2") - theta3 = SX.sym("theta3") - dtheta1 = SX.sym("dtheta1") - dtheta2 = SX.sym("dtheta2") - dtheta3 = SX.sym("dtheta3") - self.x = vertcat(theta1, theta2, theta3, dtheta1, dtheta2, dtheta3) - - # controls - C1 = SX.sym("C1") - C2 = SX.sym("C2") - C3 = SX.sym("C3") - u = vertcat(C1, C2, C3) - - p = [] - - # dynamics - f_expl = vertcat( - dtheta1, - dtheta2, - dtheta3, - (-self.g * self.l1 * self.l2 * self.l3 * self.m1 * self.m3 * sin(-2 * theta3 + 2 * theta2 + theta1) - self.g * self.l1 * self.l2 * self.l3 * self.m1 * self.m3 * sin(2 * theta3 - 2 * theta2 + theta1) + 2 * C1 * self.l2 * self.l3 * self.m3 * cos(-2 * theta3 + 2 * theta2) + 2 * dtheta1 ** 2 * self.l1 ** 2 * self.l2 * self.l3 * self.m2 * (self.m2 + self.m3) * sin(-2 * theta2 + 2 * theta1) - 2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) * cos(-2 * theta2 + theta1 + theta3) - 2 * C2 * self.l1 * self.l3 * self.m3 * cos(-2 * theta3 + theta2 + theta1) + 2 * self.l1 * self.l2 * self.l3 ** 2 * self.m2 * self.m3 * dtheta3 ** 2 * sin(-2 * theta2 + theta1 + theta3) + 2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) * - cos(theta1 - theta3) + 2 * (C2 * self.l1 * (self.m3 + 2 * self.m2) * cos(-theta2 + theta1) + (self.g * self.l1 * self.m2 * (self.m2 + self.m3) * sin(-2 * theta2 + theta1) + 2 * dtheta2 ** 2 * self.l1 * self.l2 * self.m2 * (self.m2 + self.m3) * sin(-theta2 + theta1) + self.m3 * dtheta3 ** 2 * sin(theta1 - theta3) * self.l1 * self.l3 * self.m2 + self.g * self.l1 * (self.m2 ** 2 + (self.m3 + 2 * self.m1) * self.m2 + self.m1 * self.m3) * sin(theta1) - C1 * (self.m3 + 2 * self.m2)) * self.l2) * self.l3) / self.l1 ** 2 / self.l3 / (self.m2 * (self.m2 + self.m3) * cos(-2 * theta2 + 2 * theta1) + self.m1 * self.m3 * cos(-2 * theta3 + 2 * theta2) - self.m2 ** 2 + (-self.m3 - 2 * self.m1) * self.m2 - self.m1 * self.m3) / self.l2 / 2, - (-2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) * cos(2 * theta1 - theta3 - theta2) - 2 * self.l1 * self.l2 * self.l3 ** 2 * self.m2 * self.m3 * dtheta3 ** 2 * sin(2 * theta1 - theta3 - theta2) + self.g * self.l1 * self.l2 * self.l3 * self.m1 * self.m3 * sin(theta2 + 2 * theta1 - 2 * theta3) - self.g * self.l1 * self.l3 * ((self.m1 + 2 * self.m2) * self.m3 + 2 * self.m2 * (self.m1 + self.m2)) * self.l2 * sin(-theta2 + 2 * theta1) - 2 * dtheta2 ** 2 * self.l1 * self.l2 ** 2 * self.l3 * self.m2 * (self.m2 + self.m3) * sin(-2 * theta2 + 2 * theta1) + 2 * C2 * self.l1 * self.l3 * self.m3 * cos(-2 * theta3 + 2 * theta1) + 2 * self.l1 * self.l2 ** 2 * self.l3 * self.m1 * self.m3 * dtheta2 ** 2 * sin(-2 * theta3 + 2 * theta2) - 2 * C1 * self.l2 * self.l3 * self.m3 * cos(-2 * theta3 + theta2 + theta1) + 2 * self.l1 ** 2 * self.l2 * self.l3 * self.m1 * self.m3 * dtheta1 ** 2 * sin(-2 * theta3 + - theta2 + theta1) - 2 * self.l1 ** 2 * self.l3 * dtheta1 ** 2 * ((self.m1 + 2 * self.m2) * self.m3 + 2 * self.m2 * (self.m1 + self.m2)) * self.l2 * sin(-theta2 + theta1) + 2 * C3 * self.l1 * self.l2 * (self.m3 + 2 * self.m1 + self.m2) * cos(-theta3 + theta2) + (2 * C1 * self.l2 * (self.m3 + 2 * 
self.m2) * cos(-theta2 + theta1) + self.l1 * (4 * dtheta3 ** 2 * self.m3 * self.l3 * (self.m1 + self.m2 / 2) * self.l2 * sin(-theta3 + theta2) + self.g * self.m3 * self.l2 * self.m1 * sin(-2 * theta3 + theta2) + self.g * ((self.m1 + 2 * self.m2) * self.m3 + 2 * self.m2 * (self.m1 + self.m2)) * self.l2 * sin(theta2) - 2 * C2 * (self.m3 + 2 * self.m1 + 2 * self.m2))) * self.l3) / (self.m2 * (self.m2 + self.m3) * cos(-2 * theta2 + 2 * theta1) + self.m1 * self.m3 * cos(-2 * theta3 + 2 * theta2) + (-self.m1 - self.m2) * self.m3 - 2 * self.m1 * self.m2 - self.m2 ** 2) / self.l1 / self.l3 / self.l2 ** 2 / 2, - (-2 * self.m3 * C2 * self.l1 * self.l3 * (self.m2 + self.m3) * cos(2 * theta1 - theta3 - theta2) + self.g * self.m3 * self.l1 * self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(2 * theta1 + theta3 - 2 * theta2) + 2 * C3 * self.l1 * self.l2 * (self.m2 + self.m3) ** 2 * cos(-2 * theta2 + 2 * theta1) - self.g * self.m3 * self.l1 * self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(2 * theta1 - theta3) - self.g * self.m3 * self.l1 * self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(-theta3 + 2 * theta2) - 2 * self.l1 * self.l2 * self.l3 ** 2 * self.m1 * self.m3 ** 2 * dtheta3 ** 2 * sin(-2 * theta3 + 2 * theta2) - 2 * C1 * self.l2 * self.l3 * self.m3 * (self.m2 + self.m3) * cos(-2 * theta2 + theta1 + theta3) + 2 * self.m3 * dtheta1 ** 2 * self.l1 ** 2 * - self.l2 * self.l3 * self.m1 * (self.m2 + self.m3) * sin(-2 * theta2 + theta1 + theta3) + 2 * self.m3 * C2 * self.l1 * self.l3 * (self.m3 + 2 * self.m1 + self.m2) * cos(-theta3 + theta2) + (self.m2 + self.m3) * (2 * C1 * self.l3 * self.m3 * cos(theta1 - theta3) + self.l1 * (-2 * self.m3 * dtheta1 ** 2 * self.l1 * self.l3 * self.m1 * sin(theta1 - theta3) - 4 * self.m3 * dtheta2 ** 2 * sin(-theta3 + theta2) * self.l2 * self.l3 * self.m1 + self.g * self.m3 * sin(theta3) * self.l3 * self.m1 - 2 * C3 * (self.m3 + 2 * self.m1 + self.m2))) * self.l2) / self.m3 / (self.m2 * (self.m2 + self.m3) * cos(-2 * theta2 + 2 * theta1) + self.m1 * self.m3 * cos(-2 * theta3 + 2 * theta2) + (-self.m1 - self.m2) * self.m3 - 2 * self.m1 * self.m2 - self.m2 ** 2) / self.l1 / self.l3 ** 2 / self.l2 / 2, - ) - - self.model.f_expl_expr = f_expl - self.model.x = self.x - self.model.p = p - self.model.u = u - - sim = AcadosSim() - sim.model = self.model - sim.solver_options.T = 1e-2 - sim.solver_options.num_stages = 4 - self.acados_integrator = AcadosSimSolver(sim) diff --git a/VBOC/triplependulum_vboc.py b/VBOC/triplependulum_vboc.py deleted file mode 100644 index 60cc710..0000000 --- a/VBOC/triplependulum_vboc.py +++ /dev/null @@ -1,757 +0,0 @@ -import os -import sys -sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import random -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import time -from triplependulum_class_vboc import OCPtriplependulumINIT, SYMtriplependulumINIT -import warnings -warnings.filterwarnings("ignore") -import torch -import torch.nn as nn -from my_nn import NeuralNetDIR -import math -from multiprocessing import Pool - -def data_generation(v): - valid_data = np.ndarray((0, 6)) - - # Reset the time parameters: - N = ocp.N # number of time steps - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - # Initialization of the OCP: The OCP is set to find an extreme trajectory. The initial joint positions - # are set to random values, except for the reference joint whose position is set to an extreme value. - # The initial joint velocities are left free. 
The final velocities are all set to 0. The OCP has to maximise - # the initial velocity norm in a predefined direction. - - # Selection of the reference joint: - joint_sel = random.choice([0, 1, 2]) # 0, 1 or 2 to select the first, second or third joint as the reference - - # Selection of the start position of the reference joint: - vel_sel = random.choice([-1, 1]) # -1 to maximise the initial velocity, +1 to minimise it - if vel_sel == -1: - q_init_sel = q_min - q_fin_sel = q_max - else: - q_init_sel = q_max - q_fin_sel = q_min - - # Initial velocity optimization direction (the cost is in the form p[0] * dtheta1 + p[1] * dtheta2 + p[2] * dtheta3 + p[3] * dt): - ran1 = vel_sel * random.random() - ran2 = random.choice([-1, 1]) * random.random() - ran3 = random.choice([-1, 1]) * random.random() - norm_weights = norm(np.array([ran1, ran2, ran3])) - if joint_sel == 0: - p = np.array([ran1/norm_weights, ran2/norm_weights, ran3/norm_weights, 0.]) - elif joint_sel == 1: - p = np.array([ran2/norm_weights, ran1/norm_weights, ran3/norm_weights, 0.]) - else: - p = np.array([ran2/norm_weights, ran3/norm_weights, ran1/norm_weights, 0.]) - - # Initial position of the other joints: - q_init1 = q_min + random.random() * (q_max-q_min) - if q_init1 > q_max - eps: - q_init1 = q_init1 - eps - if q_init1 < q_min + eps: - q_init1 = q_init1 + eps - - q_init2 = q_min + random.random() * (q_max-q_min) - if q_init2 > q_max - eps: - q_init2 = q_init2 - eps - if q_init2 < q_min + eps: - q_init2 = q_init2 + eps - - q_init3 = q_min + random.random() * (q_max-q_min) - if q_init3 > q_max - eps: - q_init3 = q_init3 - eps - if q_init3 < q_min + eps: - q_init3 = q_init3 + eps - - # Bounds on the initial state: - q_init_lb = np.array([q_init1, q_init2, q_init3, v_min, v_min, v_min, dt_sym]) - q_init_ub = np.array([q_init1, q_init2, q_init3, v_max, v_max, v_max, dt_sym]) - if q_init_sel == q_min: - q_init_lb[joint_sel] = q_min + eps - q_init_ub[joint_sel] = q_min + eps - else: - q_init_lb[joint_sel] = q_max - eps - q_init_ub[joint_sel] = q_max - eps - - # Guess: - x_sol_guess = np.empty((N, 7)) - u_sol_guess = np.empty((N, 3)) - for i, tau in enumerate(np.linspace(0, 1, N)): - x_guess = np.array([q_init1, q_init2, q_init3, 0., 0., 0., dt_sym]) - x_guess[joint_sel] = (1-tau)*q_init_sel + tau*q_fin_sel - x_guess[joint_sel+3] = 2*(1-tau)*(q_fin_sel-q_init_sel) - x_sol_guess[i] = x_guess - u_sol_guess[i] = np.array([0.,0.,0.]) - - cost = 1e6 - all_ok = False - - q_lb = np.array([q_min, q_min, q_min, v_min, v_min, v_min, dt_sym]) - q_ub = np.array([q_max, q_max, q_max, v_max, v_max, v_max, dt_sym]) - u_lb = np.array([-tau_max, -tau_max, -tau_max]) - u_ub = np.array([tau_max, tau_max, tau_max]) - q_fin_lb = np.array([q_min, q_min, q_min, 0., 0., 0., dt_sym]) - q_fin_ub = np.array([q_max, q_max, q_max, 0., 0., 0., dt_sym]) - - # Iteratively solve the OCP with an increased number of time steps until the solution does not change.
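Before the refinement loop spelled out by the surrounding comments, one remark on the guess built above: the velocity profile is the derivative of a rest-to-stop parabola,

$$\dot q(\tau) = 2\,(1-\tau)\,(q_{\mathrm{fin}} - q_{\mathrm{init}}) = \frac{d}{d\tau}\Big[q_{\mathrm{init}} + (q_{\mathrm{fin}} - q_{\mathrm{init}})\,(2\tau - \tau^2)\Big], \qquad \tau \in [0, 1],$$

which leaves $q_{\mathrm{init}}$ at full speed and comes to rest exactly at $q_{\mathrm{fin}}$, matching the zero final velocities in `q_fin_lb`/`q_fin_ub`. The linear position guess is not exactly consistent with it, but the pair is close enough to serve as a warm start.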
- # If the solver fails, try with a slightly different initial condition: - for _ in range(10): - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - cost_new = ocp.ocp_solver.get_cost() - - if cost_new > cost - tol: # the time is sufficient to have achieved an optimal solution - all_ok = True - break - - cost = cost_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N+1,7)) - u_sol_guess = np.empty((N+1,3)) - for i in range(N): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N] = ocp.ocp_solver.get(N, "x") - u_sol_guess[N] = np.array([0.,0.,0.]) - - # Increase the number of time steps: - N = N + 1 - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - else: - - # Reset the number of steps used in the OCP: - N = ocp.N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - # Initial velocity optimization direction (the cost is in the form p[0] * dtheta1 + p[1] * dtheta2 + p[2] * dtheta3 + p[3] * dt): - ran1 = p[0] + random.random() * random.choice([-1, 1]) * 0.01 - ran2 = p[1] + random.random() * random.choice([-1, 1]) * 0.01 - ran3 = p[2] + random.random() * random.choice([-1, 1]) * 0.01 - norm_weights = norm(np.array([ran1, ran2, ran3])) - p = np.array([ran1/norm_weights, ran2/norm_weights, ran3/norm_weights, 0]) - - # Perturb the initial positions of the other joints: - dev = random.random() * random.choice([-1, 1]) * 0.01 - for j in range(3): - if j != joint_sel: - val = q_init_lb[j] + dev - if val > q_max - eps: - val = val - eps - if val < q_min + eps: - val = val + eps - q_init_lb[j] = val - q_init_ub[j] = val - - # Guess: - x_sol_guess = np.empty((N, 7)) - u_sol_guess = np.empty((N, 3)) - for i, tau in enumerate(np.linspace(0, 1, N)): - x_guess = np.array([q_init_lb[0], q_init_lb[1], q_init_lb[2], 0., 0., 0., dt_sym]) - x_guess[joint_sel] = (1-tau)*q_init_sel + tau*q_fin_sel - x_guess[joint_sel+3] = 2*(1-tau)*(q_fin_sel-q_init_sel) - x_sol_guess[i] = x_guess - u_sol_guess[i] = np.array([0., 0., 0.]) - - cost = 1e6 - - if all_ok: - - # Save the optimal trajectory: - x_sol = np.empty((N+1,7)) - u_sol = np.empty((N,3)) - for i in range(N): - x_sol[i] = ocp.ocp_solver.get(i, "x") - u_sol[i] = ocp.ocp_solver.get(i, "u") - x_sol[N] = ocp.ocp_solver.get(N, "x") - - # Generate the unviable sample in the cost direction: - x_sym = np.full((N+1,6), None) - - x_out = np.copy(x_sol[0][:6]) - x_out[3] = x_out[3] - eps * p[0] - x_out[4] = x_out[4] - eps * p[1] - x_out[5] = x_out[5] - eps * p[2] - - # Save the initial state: - valid_data = np.append(valid_data, [[x_sol[0][0], x_sol[0][1], x_sol[0][2], x_sol[0][3], x_sol[0][4], x_sol[0][5]]], axis = 0) - - # Check if initial velocities lie on a limit: - if x_out[4] > v_max or x_out[4] < v_min or x_out[3] > v_max or x_out[3] < v_min or x_out[5] > v_max or x_out[5] < v_min: - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[0] = x_out - - # Iterate through the trajectory to verify the location of the states with respect to V: - for f in range(1, N): - - if is_x_at_limit: - - x_out = np.copy(x_sol[f][:6]) - norm_vel = norm(x_out[3:]) - x_out[3] = x_out[3] + eps * x_out[3]/norm_vel - x_out[4] = x_out[4] + eps * x_out[4]/norm_vel - x_out[5] =
x_out[5] + eps * x_out[5]/norm_vel - - # If the previous state was on a limit, the current state location cannot be identified using - # the corresponding unviable state but it has to rely on the proximity to the state limits - # (more restrictive): - if x_sol[f][0] > q_max - eps or x_sol[f][0] < q_min + eps or x_sol[f][1] > q_max - eps or x_sol[f][1] < q_min + eps or x_sol[f][2] > q_max - eps or x_sol[f][2] < q_min + eps or x_out[3] > v_max or x_out[3] < v_min or x_out[4] > v_max or x_out[4] < v_min or x_out[5] > v_max or x_out[5] < v_min: - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is either in the interior of V or on dV - - # if the trajectory detaches from a position limit it usually enters V: - if x_sol[f-1][0] > q_max - eps or x_sol[f-1][0] < q_min + eps or x_sol[f-1][1] > q_max - eps or x_sol[f-1][1] < q_min + eps or x_sol[f-1][2] > q_max - eps or x_sol[f-1][2] < q_min + eps: - break - - # Solve an OCP to verify whether the following part of the trajectory is on V or dV. To do so - # the initial joint positions are set to the current ones and the final state is fixed to the - # final state of the trajectory. The initial velocities are left free and maximized in the - # direction of the current joint velocities. - - N_test = N - f - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - # Cost: - norm_weights = norm(np.array([x_sol[f][3], x_sol[f][4], x_sol[f][5]])) - p = np.array([-x_sol[f][3]/norm_weights, -x_sol[f][4]/norm_weights, -x_sol[f][5]/norm_weights, 0.]) # the cost direction is based on the current velocity direction - - # Bounds on the initial state: - q_init_lb = np.array([x_sol[f][0], x_sol[f][1], x_sol[f][2], v_min, v_min, v_min, dt_sym]) - q_init_ub = np.array([x_sol[f][0], x_sol[f][1], x_sol[f][2], v_max, v_max, v_max, dt_sym]) - - # Guess: - x_sol_guess = np.empty((N_test+1, 7)) - u_sol_guess = np.empty((N_test+1, 3)) - for i in range(N_test): - x_sol_guess[i] = x_sol[i+f] - u_sol_guess[i] = u_sol[i+f] - u_g = np.array([0.,0.,0.]) - x_sol_guess[N_test] = x_sol[N] - u_sol_guess[N_test] = u_g - - norm_old = norm(np.array([x_sol[f][3:6]])) # velocity norm of the original solution - norm_bef = 0 - all_ok = False - - for _ in range(5): - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - x0_new = ocp.ocp_solver.get(0, "x") - norm_new = norm(np.array([x0_new[3:6]])) - - if norm_new < norm_bef + tol: - all_ok = True - break - - norm_bef = norm_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N_test+1,7)) - u_sol_guess = np.empty((N_test+1,3)) - for i in range(N_test): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N_test] = ocp.ocp_solver.get(N_test, "x") - u_sol_guess[N_test] = np.array([0.,0.,0.]) - - # Increase the number of time steps: - N_test = N_test + 1 - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - else: - break - - if all_ok: - - # Compare the old and new velocity norms: - if norm_new > norm_old + tol: # the state is inside V - - # Update the optimal solution: - for i in range(N-f): - x_sol[i+f] = ocp.ocp_solver.get(i, "x") - u_sol[i+f] = ocp.ocp_solver.get(i, "u") - - x_out =
np.copy(x_sol[f][:6]) - x_out[4] = x_out[4] + eps * x_out[4]/norm_new - x_out[3] = x_out[3] + eps * x_out[3]/norm_new - x_out[5] = x_out[5] + eps * x_out[5]/norm_new - - # Check if velocities lie on a limit: - if x_out[4] > v_max or x_out[4] < v_min or x_out[3] > v_max or x_out[3] < v_min or x_out[5] > v_max or x_out[5] < v_min: - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[f] = x_out - - else: - is_x_at_limit = False # the state is on dV - # xv_state[f] = 0 - - # Generate the new corresponding unviable state in the cost direction: - x_out = np.copy(x_sol[f][:6]) - x_out[3] = x_out[3] - eps * p[0] - x_out[4] = x_out[4] - eps * p[1] - x_out[5] = x_out[5] - eps * p[2] - if x_out[joint_sel+3] > v_max: - x_out[joint_sel+3] = v_max - if x_out[joint_sel+3] < v_min: - x_out[joint_sel+3] = v_min - x_sym[f] = x_out - - else: # we cannot say whether the state is on dV or inside V - - for r in range(f, N): - if abs(x_sol[r][4]) > v_max - eps or abs(x_sol[r][3]) > v_max - eps or abs(x_sol[r][5]) > v_max - eps: - - # Save the viable states at velocity limits: - valid_data = np.append(valid_data, [[x_sol[f][0], x_sol[f][1], x_sol[f][2], x_sol[f][3], x_sol[f][4], x_sol[f][5]]], axis = 0) - - break - - else: - # If the previous state was not on a limit, the current state location can be identified using - # the corresponding unviable state which can be computed by simulating the system starting from - # the previous unviable state. - - # Simulate next unviable state: - u_sym = np.copy(u_sol[f-1]) - sim.acados_integrator.set("u", u_sym) - sim.acados_integrator.set("x", x_sym[f-1]) - sim.acados_integrator.set("T", dt_sym) - status = sim.acados_integrator.solve() - x_out = sim.acados_integrator.get("x") - x_sym[f] = x_out - - # When the state of the unviable simulated trajectory violates a limit, the corresponding viable state - # of the optimal trajectory is on dX: - if x_out[0] > q_max or x_out[0] < q_min or x_out[1] > q_max or x_out[1] < q_min or x_out[2] > q_max or x_out[2] < q_min or x_out[3] > v_max or x_out[3] < v_min or x_out[4] > v_max or x_out[4] < v_min or x_out[5] > v_max or x_out[5] < v_min: - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - - if x_sol[f][0] < q_max - eps and x_sol[f][0] > q_min + eps and x_sol[f][1] < q_max - eps and x_sol[f][1] > q_min + eps and x_sol[f][2] < q_max - eps and x_sol[f][2] > q_min + eps and abs(x_sol[f][3]) > tol and abs(x_sol[f][4]) > tol and abs(x_sol[f][5]) > tol: - - # Save the viable and unviable states: - valid_data = np.append(valid_data, [[x_sol[f][0], x_sol[f][1], x_sol[f][2], x_sol[f][3], x_sol[f][4], x_sol[f][5]]], axis = 0) - - return valid_data.tolist() - - else: - return None - -start_time = time.time() - -X_test = np.load('../data3_test.npy') - -# Ocp initialization: -ocp = OCPtriplependulumINIT() -sim = SYMtriplependulumINIT() - -# Position, velocity and torque bounds: -v_max = ocp.dthetamax -v_min = - ocp.dthetamax -q_max = ocp.thetamax -q_min = ocp.thetamin -tau_max = ocp.Cmax - -stop_time = 21600 # total computational time -dt_sym = 1e-2 # time step duration - -device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # pytorch device - -tol = ocp.ocp.solver_options.nlp_solver_tol_stat # OCP cost tolerance -eps = tol*10 # unviable data generation parameter - -iteration = 0 -print('Start data generation, iteration:', iteration) - -# Data generation: -cpu_num = 30 -num_prob = 1000 -with Pool(cpu_num) as p: - traj = 
p.map(data_generation, range(num_prob)) - -X_temp = [i for i in traj if i is not None] -X_save = np.array([i for f in X_temp for i in f]) - -print('Data generation completed') - -# Pytorch params: -input_layers = 6 -hidden_layers = 500 -output_layers = 1 -learning_rate = 1e-3 - -# Model and optimizer: -model_dir = NeuralNetDIR(input_layers, hidden_layers, output_layers).to(device) -criterion_dir = nn.MSELoss() -optimizer_dir = torch.optim.Adam(model_dir.parameters(), lr=learning_rate) - -# Joint positions mean and variance: -mean_dir, std_dir = torch.mean(torch.tensor(X_save[:,:3].tolist())).to(device).item(), torch.std(torch.tensor(X_save[:,:3].tolist())).to(device).item() -torch.save(mean_dir, 'mean_3dof_vboc') -torch.save(std_dir, 'std_3dof_vboc') - -# Rewrite data in the form [normalized positions, velocity direction, velocity norm]: -X_train_dir = np.empty((X_save.shape[0],7)) -for i in range(X_train_dir.shape[0]): - X_train_dir[i][0] = (X_save[i][0] - mean_dir) / std_dir - X_train_dir[i][1] = (X_save[i][1] - mean_dir) / std_dir - X_train_dir[i][2] = (X_save[i][2] - mean_dir) / std_dir - vel_norm = norm([X_save[i][3], X_save[i][4], X_save[i][5]]) - if vel_norm != 0: - X_train_dir[i][3] = X_save[i][3] / vel_norm - X_train_dir[i][4] = X_save[i][4] / vel_norm - X_train_dir[i][5] = X_save[i][5] / vel_norm - X_train_dir[i][6] = vel_norm - -beta = 0.95 -n_minibatch = 4096 -B = int(X_save.shape[0]*100/n_minibatch) # number of iterations for 100 epoch -it_max = B * 10 - -print('Start model training') - -it = 1 -val = max(X_train_dir[:,6]) - -# Train the model -while val > 1e-3 and it < it_max: - ind = random.sample(range(len(X_train_dir)), n_minibatch) - - X_iter_tensor = torch.Tensor([X_train_dir[i][:6] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_train_dir[i][6]] for i in ind]).to(device) - - # Forward pass - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - val = beta * val + (1 - beta) * loss.item() - it += 1 - - if it % B == 0: - print(val) - -print('Model training completed') - -# Store time: -times = np.array([time.time() - start_time]) - -# Store resulting RMSE wrt testing data: -X_test_dir = np.empty((X_test.shape[0],7)) -for i in range(X_test_dir.shape[0]): - X_test_dir[i][0] = (X_test[i][0] - mean_dir) / std_dir - X_test_dir[i][1] = (X_test[i][1] - mean_dir) / std_dir - X_test_dir[i][2] = (X_test[i][2] - mean_dir) / std_dir - vel_norm = norm([X_test[i][3],X_test[i][4],X_test[i][5]]) - if vel_norm != 0: - X_test_dir[i][3] = X_test[i][3] / vel_norm - X_test_dir[i][4] = X_test[i][4] / vel_norm - X_test_dir[i][5] = X_test[i][5] / vel_norm - X_test_dir[i][6] = vel_norm - -with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_test_dir[:,:6]).to(device) - y_iter_tensor = torch.Tensor(X_test_dir[:,6:]).to(device) - rmse = np.array([torch.sqrt(criterion_dir(model_dir(X_iter_tensor), y_iter_tensor)).item()]) - -while time.time() - start_time < stop_time: - - iteration += 1 - print('Start data generation, iteration:', iteration) - - # Data generation: - with Pool(cpu_num) as p: - traj = p.map(data_generation, range(num_prob)) - - X_temp = [i for i in traj if i is not None] - - # Add new training data to the dataset: - X_new = np.array([i for f in X_temp for i in f]) - X_save = np.concatenate((X_save, X_new)) - - print('Data generation completed') - - # Rewrite data in the form [normalized positions, velocity direction, velocity norm]: - 
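For concreteness, the rewrite performed by the loop below maps one sample to seven entries; a sketch with made-up numbers (within the position bounds of this model):

```python
# One sample [q1, q2, q3, dq1, dq2, dq3] -> [normalized q, velocity direction, velocity norm]:
q, dq = np.array([3.0, 3.2, 3.1]), np.array([3.0, 0.0, 4.0])
vel_norm = norm(dq)                                             # 5.0
row = np.r_[(q - mean_dir) / std_dir, dq / vel_norm, vel_norm]  # 7 entries
```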
X_new_dir = np.empty((X_save.shape[0],7)) - for i in range(X_new_dir.shape[0]): - X_new_dir[i][0] = (X_save[i][0] - mean_dir) / std_dir - X_new_dir[i][1] = (X_save[i][1] - mean_dir) / std_dir - X_new_dir[i][2] = (X_save[i][2] - mean_dir) / std_dir - vel_norm = norm([X_save[i][3], X_save[i][4], X_save[i][5]]) - if vel_norm != 0: - X_new_dir[i][3] = X_save[i][3] / vel_norm - X_new_dir[i][4] = X_save[i][4] / vel_norm - X_new_dir[i][5] = X_save[i][5] / vel_norm - X_new_dir[i][6] = vel_norm - X_train_dir = np.concatenate((X_train_dir, X_new_dir)) - - print('Start model training') - - it = 1 - val = max(X_train_dir[:,6]) - - # Train the model - while val > 1e-3 and it < it_max: - # ind = random.sample(range(len(X_train_dir)), n_minibatch) - - ind = random.sample(range(len(X_train_dir) - len(X_new_dir)), int(n_minibatch / 2)) - ind.extend( - random.sample( - range(len(X_train_dir) - len(X_new_dir), len(X_train_dir)), - int(n_minibatch / 2), - ) - ) - - X_iter_tensor = torch.Tensor([X_train_dir[i][:6] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_train_dir[i][6]] for i in ind]).to(device) - - # Forward pass - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - val = beta * val + (1 - beta) * loss.item() - it += 1 - - if it % B == 0: - print(val) - - print('Model training completed') - - # Store time: - times = np.append(times, [time.time() - start_time]) - - # Store resulting RMSE wrt testing data: - with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_test_dir[:,:6]).to(device) - y_iter_tensor = torch.Tensor(X_test_dir[:,6:]).to(device) - rmse = np.append(rmse, [torch.sqrt(criterion_dir(model_dir(X_iter_tensor), y_iter_tensor)).item()]) - -# Show total time and resulting RMSE: -print('Iterations completed with total time:', time.time() - start_time) -print('RMSE test data: ', torch.sqrt(criterion_dir(model_dir(X_iter_tensor), y_iter_tensor)).item()) - -# Store times and RMSE: -np.save('times_3dof_vboc.npy', np.asarray(times)) -np.save('rmse_3dof_vboc.npy', np.asarray(rmse)) - -# Show the RMSE evolution over time: -plt.figure() -plt.plot(times, rmse) - -# Save the training data and the model: -np.save('data_3dof_vboc.npy', np.asarray(X_save)) -torch.save(model_dir.state_dict(), 'model_3dof_vboc') - -with torch.no_grad(): - h = 0.01 - xx, yy = np.meshgrid(np.arange(q_min, q_max, h), np.arange(v_min, v_max, h)) - xrav = xx.ravel() - yrav = yy.ravel() - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[(q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - yrav, - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,3],inp[i,4],inp[i,5]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:6].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][6] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(len(X_save)): - # if ( - # 
norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 - # and norm(X_save[i][3]) < 0.1 - # and norm(X_save[i][4]) < 0.1 - # ): - # xit.append(X_save[i][2]) - # yit.append(X_save[i][5]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_3$') - plt.xlabel('$q_3$') - plt.grid() - plt.title("Classifier section") - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[(q_min + q_max) / 2 * np.ones(xrav.shape[0]), - xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - np.zeros(yrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,3],inp[i,4],inp[i,5]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:6].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][6] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(len(X_save)): - # if ( - # norm(X_save[i][0] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][2] - (q_min + q_max) / 2) < 0.1 - # and norm(X_save[i][3]) < 0.1 - # and norm(X_save[i][5]) < 0.1 - # ): - # xit.append(X_save[i][1]) - # yit.append(X_save[i][4]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_2$') - plt.xlabel('$q_2$') - plt.grid() - plt.title("Classifier section") - - # Plot the results: - plt.figure() - inp = np.float32( - np.c_[xrav, - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - (q_min + q_max) / 2 * np.ones(xrav.shape[0]), - yrav, - np.zeros(yrav.shape[0]), - np.zeros(yrav.shape[0]), - np.empty(yrav.shape[0]), - ] - ) - for i in range(inp.shape[0]): - vel_norm = norm([inp[i,3],inp[i,4],inp[i,5]]) - inp[i][0] = (inp[i][0] - mean_dir) / std_dir - inp[i][1] = (inp[i][1] - mean_dir) / std_dir - inp[i][2] = (inp[i][2] - mean_dir) / std_dir - if vel_norm != 0: - inp[i][4] = inp[i][4] / vel_norm - inp[i][3] = inp[i][3] / vel_norm - inp[i][5] = inp[i][5] / vel_norm - inp[i][6] = vel_norm - out = (model_dir(torch.from_numpy(inp[:,:6].astype(np.float32)).to(device))).cpu().numpy() - y_pred = np.empty(out.shape) - for i in range(len(out)): - if inp[i][6] > out[i]: - y_pred[i] = 0 - else: - y_pred[i] = 1 - Z = y_pred.reshape(xx.shape) - plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8) - # xit = [] - # yit = [] - # for i in range(len(X_save)): - # if ( - # norm(X_save[i][1] - (q_min + q_max) / 2) < 0.1 and - # norm(X_save[i][2] - (q_min + q_max) / 2) < 0.1 - # and norm(X_save[i][4]) < 0.1 - # and norm(X_save[i][5]) < 0.1 - # ): - # xit.append(X_save[i][0]) - # yit.append(X_save[i][3]) - # plt.plot( - # xit, - # yit, - # "ko", - # markersize=2 - # ) - plt.xlim([q_min, q_max]) - plt.ylim([v_min, v_max]) - plt.ylabel('$\dot{q}_1$') - plt.xlabel('$q_1$') - plt.grid() - plt.title("Classifier section") - -plt.show() diff --git a/VBOC/vboc.py b/VBOC/vboc.py deleted file mode 100644 index a71bd0a..0000000 --- a/VBOC/vboc.py +++ /dev/null @@ -1,660 +0,0 @@ -import os -import sys 
-sys.path.insert(1, os.getcwd() + '/..') - -import numpy as np -import cProfile -import pstats -import random -import matplotlib.pyplot as plt -from numpy.linalg import norm as norm -import time -from triplependulum_class_vboc import OCPtriplependulumINIT, SYMtriplependulumINIT -from doublependulum_class_vboc import OCPdoublependulumINIT, SYMdoublependulumINIT -import warnings -warnings.filterwarnings("ignore") -import torch -import torch.nn as nn -from my_nn import NeuralNetDIR -from multiprocessing import Pool -from torch.utils.data import DataLoader -from plots_2dof import plots_2dof -from plots_3dof import plots_3dof -import torch.nn.utils.prune as prune - -def data_generation(v): - - valid_data = np.ndarray((0, ocp.ocp.dims.nx - 1)) - - # Reset the time parameters: - N = N_start - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - # Initialization of the OCP: The OCP is set to find an extreme trajectory. The initial joint positions - # are set to random values, except for the reference joint whose position is set to an extreme value. - # The initial joint velocities are left free. The final velocities are all set to 0. The OCP has to maximise - # the initial velocity norm in a predefined direction. - - # Selection of the reference joint: - joint_sel = random.choice(range(system_sel)) - - # Selection of the start position of the reference joint: - vel_sel = random.choice([-1, 1]) # -1 to maximise initial vel, + 1 to minimize it - - # Initial velocity optimization direction: - p = np.zeros((system_sel+1)) - - for l in range(system_sel): - if l == joint_sel: - p[l] = random.random() * vel_sel - else: - p[l] = random.random() * random.choice([-1, 1]) - - norm_weights = norm(p) - p = p/norm_weights - - # Bounds on the initial state: - q_init_lb = np.full((ocp.ocp.dims.nx), v_min) - q_init_ub = np.full((ocp.ocp.dims.nx), v_max) - q_init_lb[-1] = dt_sym - q_init_ub[-1] = dt_sym - - for l in range(system_sel): - - if l == joint_sel: - - if vel_sel == -1: - q_init_sel = q_min + eps - q_fin_sel = q_max - eps - else: - q_init_sel = q_max - eps - q_fin_sel = q_min + eps - - q_init_lb[l] = q_init_sel - q_init_ub[l] = q_init_sel - - else: - q_init_oth = q_min + random.random() * (q_max-q_min) - - if q_init_oth > q_max - eps: - q_init_oth = q_init_oth - eps - if q_init_oth < q_min + eps: - q_init_oth = q_init_oth + eps - - q_init_lb[l] = q_init_oth - q_init_ub[l] = q_init_oth - - # State and input bounds: - q_lb = np.copy(q_init_lb) - q_ub = np.copy(q_init_ub) - - for l in range(system_sel): - q_lb[l] = q_min - q_ub[l] = q_max - - u_lb = np.full((system_sel), -tau_max) - u_ub = np.full((system_sel), tau_max) - - # Bounds on the final state: - q_fin_lb = np.copy(q_lb) - q_fin_ub = np.copy(q_ub) - - for l in range(system_sel): - q_fin_lb[l+system_sel] = 0. - q_fin_ub[l+system_sel] = 0. - - # Guess: - x_sol_guess = np.empty((N, ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N, ocp.ocp.dims.nu)) - - for i, tau in enumerate(np.linspace(0, 1, N)): - - x_guess = np.copy(q_init_ub) - - for l in range(system_sel): - if l == joint_sel: - x_guess[l] = (1-tau)*q_init_sel + tau*q_fin_sel - x_guess[l+system_sel] = 2*(1-tau)*(q_fin_sel-q_init_sel) - else: - x_guess[l] = q_init_oth - x_guess[l+system_sel] = 0 - - x_sol_guess[i] = x_guess - u_sol_guess[i] = np.zeros((system_sel)) - - cost_old = 1e6 - all_ok = False - - # Iteratively solve the OCP with an increased number of time steps until the solution does not change. 
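Note: as in the 3-DOF script, the shooting intervals have unit length in scaled time while the physical step is the last state component, fixed to `dt_sym` at the initial node; the physical duration of a returned trajectory can therefore be recovered as in this sketch (illustrative):

```python
# Physical horizon of a solved trajectory: N unit intervals in scaled time,
# each worth the physical step stored in the last state component.
x0 = ocp.ocp_solver.get(0, "x")
T_phys = ocp.N * x0[-1]   # equals N * dt_sym with the bounds used here
```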
- # If the solver fails, try with a slightly different initial condition: - for _ in range(10): - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - cost_new = ocp.ocp_solver.get_cost() - - if cost_new > cost_old - tol: # the time is sufficient to have achieved an optimal solution - all_ok = True - break - - cost_old = cost_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N+1,ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N+1,ocp.ocp.dims.nu)) - for i in range(N): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N] = ocp.ocp_solver.get(N, "x") - u_sol_guess[N] = np.zeros((ocp.ocp.dims.nu)) - - # Increase the number of time steps: - N = N + 1 - ocp.N = N - ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N) - - else: - break - - # else: - - # # Reset the number of steps used in the OCP: - # N = N_start - # ocp.N = N - # ocp.ocp_solver.set_new_time_steps(np.full((N,), 1.)) - # ocp.ocp_solver.update_qp_solver_cond_N(N) - - # # Initial velocity optimization direction (the cost is in the form p[0] * dtheta1 + p[1] * dtheta2 + p[2] * dt): - # for l in range(system_sel): - # p[l] = p[l] + random.random() * random.choice([-1, 1]) * 0.01 - - # norm_weights = norm(p) - # p = p/norm_weights - - # # Initial position of the other joint: - # for j in range(system_sel): - # if j != joint_sel: - - # val = q_init_lb[j] + random.random() * random.choice([-1, 1]) * 0.01 - - # if val > q_max - eps: - # val = val - eps - # if val < q_min + eps: - # val = val + eps - - # q_init_lb[j] = val - # q_init_ub[j] = val - - # # Guess: - # x_sol_guess = np.empty((N, ocp.ocp.dims.nx)) - # u_sol_guess = np.empty((N, ocp.ocp.dims.nu)) - - # for i, tau in enumerate(np.linspace(0, 1, N)): - - # x_guess = np.copy(q_init_ub) - - # for l in range(system_sel): - # if l == joint_sel: - # x_guess[l+system_sel] = 2*(1-tau)*(q_fin_sel-q_init_sel) - # else: - # x_guess[l+system_sel] = 0 - - # x_guess[joint_sel] = (1-tau)*q_init_sel + tau*q_fin_sel - - # x_sol_guess[i] = x_guess - # u_sol_guess[i] = np.zeros((system_sel)) - - # cost_old = 1e6 - - if all_ok: - - # Save the optimal trajectory: - x_sol = np.empty((N+1, ocp.ocp.dims.nx)) - u_sol = np.empty((N, ocp.ocp.dims.nu)) - - for i in range(N): - x_sol[i] = ocp.ocp_solver.get(i, "x") - u_sol[i] = ocp.ocp_solver.get(i, "u") - - x_sol[N] = ocp.ocp_solver.get(N, "x") - - # Generate the unviable sample in the cost direction: - x_sym = np.full((N+1,ocp.ocp.dims.nx - 1), None) - - x_out = np.copy(x_sol[0][:ocp.ocp.dims.nx - 1]) - for l in range(system_sel): - x_out[l+system_sel] = x_out[l+system_sel] - eps * p[l] - - # save the initial state: - valid_data = np.append(valid_data, [x_sol[0][:ocp.ocp.dims.nx - 1]], axis = 0) - - # Check if initial velocities lie on a limit: - if any(i > v_max or i < v_min for i in x_out[system_sel:ocp.ocp.dims.nx - 1]): - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[0] = x_out - - # Iterate through the trajectory to verify the location of the states with respect to V: - for f in range(1, N): - - if is_x_at_limit: - - # If the previous state was on a limit, the current state location cannot be identified using - # the corresponding unviable state but it has to rely on the 
proximity to the state limits - # (more restrictive): - if any(i > q_max - eps or i < q_min + eps for i in x_sol[f][:system_sel]) or any(i > v_max - tol or i < v_min + tol for i in x_sol[f][system_sel:ocp.ocp.dims.nx - 1]): - is_x_at_limit = True # the state is on dX - - else: - is_x_at_limit = False # the state is either in the interior of V or on dV - - # if the trajectory detaches from a position limit it usually enters V: - if any(i > q_max - eps or i < q_min + eps for i in x_sol[f-1][:system_sel]): - break - - # Solve an OCP to verify whether the following part of the trajectory is on V or dV. To do so - # the initial joint positions are set to the current ones and the final state is fixed to the - # final state of the trajectory. The initial velocities are left free and maximized in the - # direction of the current joint velocities. - - N_test = N - f - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - # Cost: - norm_weights = norm(x_sol[f][system_sel:ocp.ocp.dims.nx - 1]) - p = np.zeros((system_sel+1)) - for l in range(system_sel): - p[l] = -x_sol[f][l+system_sel]/norm_weights # the cost direction is based on the current velocity direction - - # Bounds on the initial state: - for l in range(system_sel): - q_init_lb[l] = x_sol[f][l] - q_init_ub[l] = x_sol[f][l] - - # Guess: - x_sol_guess = np.empty((N_test+1, ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N_test+1, ocp.ocp.dims.nu)) - for i in range(N_test): - x_sol_guess[i] = x_sol[i+f] - u_sol_guess[i] = u_sol[i+f] - x_sol_guess[N_test] = x_sol[N] - u_sol_guess[N_test] = np.zeros((ocp.ocp.dims.nu)) - - norm_old = norm_weights # velocity norm of the original solution - norm_bef = 0 - all_ok = False - - for _ in range(5): - - # Solve the OCP: - status = ocp.OCP_solve(x_sol_guess, u_sol_guess, p, q_lb, q_ub, u_lb, u_ub, q_init_lb, q_init_ub, q_fin_lb, q_fin_ub) - - if status == 0: # the solver has found a solution - - # Compare the current cost with the previous one: - x0_new = ocp.ocp_solver.get(0, "x") - norm_new = norm(x0_new[system_sel:ocp.ocp.dims.nx - 1]) - - if norm_new < norm_bef + tol: - all_ok = True - break - - norm_bef = norm_new - - # Update the guess with the current solution: - x_sol_guess = np.empty((N_test+1,ocp.ocp.dims.nx)) - u_sol_guess = np.empty((N_test+1,ocp.ocp.dims.nu)) - for i in range(N_test): - x_sol_guess[i] = ocp.ocp_solver.get(i, "x") - u_sol_guess[i] = ocp.ocp_solver.get(i, "u") - x_sol_guess[N_test] = ocp.ocp_solver.get(N_test, "x") - u_sol_guess[N_test] = np.zeros((ocp.ocp.dims.nu)) - - # Increase the number of time steps: - N_test = N_test + 1 - ocp.N = N_test - ocp.ocp_solver.set_new_time_steps(np.full((N_test,), 1.)) - ocp.ocp_solver.update_qp_solver_cond_N(N_test) - - else: - break - - if all_ok: - - # Compare the old and new velocity norms: - if norm_new > norm_old + tol: # the state is inside V - - # Update the optimal solution: - for i in range(N-f): - x_sol[i+f] = ocp.ocp_solver.get(i, "x") - u_sol[i+f] = ocp.ocp_solver.get(i, "u") - - x_out = np.copy(x_sol[f][:ocp.ocp.dims.nx - 1]) - - for l in range(system_sel): - x_out[l+system_sel] = x_out[l+system_sel] + eps * x_out[l+system_sel]/norm_new - - # Check if velocities lie on a limit: - if any(i > v_max or i < v_min for i in x_out[system_sel:ocp.ocp.dims.nx - 1]): - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - x_sym[f] = x_out - - else: - is_x_at_limit = False # the state is on dV - - # Generate the new corresponding
unviable state in the cost direction: - x_out = np.copy(x_sol[f][:ocp.ocp.dims.nx - 1]) - - for l in range(system_sel): - x_out[l+system_sel] = x_out[l+system_sel] - eps * p[l] - - x_sym[f] = x_out - - else: # we cannot say whether the state is on dV or inside V - - for r in range(f, N): - if any(abs(i) > v_max - eps for i in x_sol[r][system_sel:ocp.ocp.dims.nx - 1]): - - # Save the viable states at velocity limits: - valid_data = np.append(valid_data, [x_sol[r][:ocp.ocp.dims.nx - 1]], axis = 0) - - break - - else: - # If the previous state was not on a limit, the current state location can be identified using - # the corresponding unviable state which can be computed by simulating the system starting from - # the previous unviable state. - - # Simulate next unviable state: - u_sym = np.copy(u_sol[f-1]) - sim.acados_integrator.set("u", u_sym) - sim.acados_integrator.set("x", x_sym[f-1]) - # sim.acados_integrator.set("T", dt_sym) - status = sim.acados_integrator.solve() - x_out = sim.acados_integrator.get("x") - x_sym[f] = x_out - - # When the state of the unviable simulated trajectory violates a limit, the corresponding viable state - # of the optimal trajectory is on dX: - if any(i > q_max or i < q_min for i in x_out[:system_sel]) or any(i > v_max or i < v_min for i in x_out[system_sel:ocp.ocp.dims.nx - 1]): - is_x_at_limit = True # the state is on dX - else: - is_x_at_limit = False # the state is on dV - - if all(i < q_max - eps and i > q_min + eps for i in x_sol[f][:system_sel]) and all(abs(i) > tol for i in x_sol[f][system_sel:ocp.ocp.dims.nx - 1]): - - # Save the viable and unviable states: - valid_data = np.append(valid_data, [x_sol[f][:ocp.ocp.dims.nx - 1]], axis = 0) - - return valid_data.tolist() - - else: - return None - -start_time = time.time() - -# Select system: -system_sel = 3 # 2 for double pendulum, 3 for triple pendulum - -# Prune the model: -prune_model = False -prune_amount = 0.5 # percentage of connections to delete - -# Ocp initialization: -if system_sel == 3: - ocp = OCPtriplependulumINIT() - sim = SYMtriplependulumINIT() -elif system_sel == 2: - ocp = OCPdoublependulumINIT() - sim = SYMdoublependulumINIT() -else: - raise Exception("Sorry, the selected system is not recognised") - -X_test = np.load('../data' + str(system_sel) + '_test.npy') - -# Position, velocity and torque bounds: -v_max = ocp.dthetamax -v_min = - ocp.dthetamax -q_max = ocp.thetamax -q_min = ocp.thetamin -tau_max = ocp.Cmax - -device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # pytorch device - -dt_sym = 1e-2 # time step duration -N_start = 100 # initial number of timesteps -tol = ocp.ocp.solver_options.nlp_solver_tol_stat # OCP cost tolerance -eps = tol*10 # unviable data generation parameter - -print('Start data generation') - -# Data generation: -cpu_num = 30 -num_prob = 100 -with Pool(cpu_num) as p: - traj = p.map(data_generation, range(num_prob)) - -# traj, statpos, statneg = zip(*temp) -X_temp = [i for i in traj if i is not None] -print('Data generation completed') - -# Print data generations statistics: -solved=len(X_temp) -print('Solved/tot', len(X_temp)/num_prob) -X_save = np.array([i for f in X_temp for i in f]) -print('Saved/tot', len(X_save)/(solved*100)) - -# Save training data: -np.save('data_' + str(system_sel) + 'dof_vboc', np.asarray(X_save)) - -# # Load training data: -# X_save = np.load('data_' + str(system_sel) + 'dof_vboc' + '.npy') - -# Pytorch params: -input_layers = ocp.ocp.dims.nx - 1 -hidden_layers = (input_layers - 1) * 100 -output_layers = 1 
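Once trained, the regressor doubles as a set-membership test: a state is classified viable when its velocity norm does not exceed the maximum predicted along its velocity direction (this mirrors the `y_pred` logic of the plotting scripts). A sketch, with the helper name and preprocessing only illustrative:

```python
def is_viable(x):
    # x = [q, dq]; compare ||dq|| against the learned boundary for direction dq/||dq||
    q, dq = np.asarray(x[:system_sel]), np.asarray(x[system_sel:])
    gamma = norm(dq)
    if gamma == 0:
        return True  # zero velocity is viable by construction
    inp = np.r_[(q - mean_dir) / std_dir, dq / gamma].astype(np.float32)
    with torch.no_grad():
        gamma_max = model_dir(torch.from_numpy(inp).to(device)).item()
    return gamma <= gamma_max
```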
-learning_rate = 1e-3 - -# Model and optimizer: -model_dir = NeuralNetDIR(input_layers, hidden_layers, output_layers).to(device) -criterion_dir = nn.MSELoss() -optimizer_dir = torch.optim.Adam(model_dir.parameters(), lr=learning_rate) - -# model_dir.load_state_dict(torch.load('model_' + system_sel + 'dof_vboc')) - -# Joint positions mean and variance: -mean_dir, std_dir = torch.mean(torch.tensor(X_save[:,:system_sel].tolist())).to(device).item(), torch.std(torch.tensor(X_save[:,:system_sel].tolist())).to(device).item() -torch.save(mean_dir, 'mean_' + str(system_sel) + 'dof_vboc') -torch.save(std_dir, 'std_' + str(system_sel) + 'dof_vboc') - -# Rewrite data in the form [normalized positions, velocity direction, velocity norm]: -X_train_dir = np.empty((X_save.shape[0],ocp.ocp.dims.nx)) - -for i in range(X_train_dir.shape[0]): - - vel_norm = norm(X_save[i][system_sel:ocp.ocp.dims.nx - 1]) - X_train_dir[i][ocp.ocp.dims.nx - 1] = vel_norm - - for l in range(system_sel): - X_train_dir[i][l] = (X_save[i][l] - mean_dir) / std_dir - X_train_dir[i][l+system_sel] = X_save[i][l+system_sel] / vel_norm - -beta = 0.95 -n_minibatch = 4096 -B = int(X_save.shape[0]*100/n_minibatch) # number of iterations for 100 epoch -it_max = B * 10 - -training_evol = [] - -print('Start model training') - -it = 1 -val = max(X_train_dir[:,ocp.ocp.dims.nx - 1]) - -# Train the model -while val > 1e-3 and it < it_max: - ind = random.sample(range(len(X_train_dir)), n_minibatch) - - X_iter_tensor = torch.Tensor([X_train_dir[i][:ocp.ocp.dims.nx - 1] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_train_dir[i][ocp.ocp.dims.nx - 1]] for i in ind]).to(device) - - # Forward pass - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - val = beta * val + (1 - beta) * loss.item() - it += 1 - - if it % B == 0: - print(val) - training_evol.append(val) - -print('Model training completed') - -# Show the resulting RMSE on the training data: -outputs = np.empty((len(X_train_dir),1)) -n_minibatch_model = pow(2,15) -with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_train_dir[:,:ocp.ocp.dims.nx - 1]).to(device) - y_iter_tensor = torch.Tensor(X_train_dir[:,ocp.ocp.dims.nx - 1:]).to(device) - my_dataloader = DataLoader(X_iter_tensor,batch_size=n_minibatch_model,shuffle=False) - for (idx, batch) in enumerate(my_dataloader): - if n_minibatch_model*(idx+1) > len(X_train_dir): - outputs[n_minibatch_model*idx:len(X_train_dir)] = model_dir(batch).cpu().numpy() - else: - outputs[n_minibatch_model*idx:n_minibatch_model*(idx+1)] = model_dir(batch).cpu().numpy() - outputs_tensor = torch.Tensor(outputs).to(device) - print('RMSE train data: ', torch.sqrt(criterion_dir(outputs_tensor, y_iter_tensor))) - -# Compute resulting RMSE wrt testing data: -X_test_dir = np.empty((X_test.shape[0],ocp.ocp.dims.nx)) -for i in range(X_test_dir.shape[0]): - vel_norm = norm(X_test[i][system_sel:ocp.ocp.dims.nx - 1]) - X_test_dir[i][ocp.ocp.dims.nx - 1] = vel_norm - for l in range(system_sel): - X_test_dir[i][l] = (X_test[i][l] - mean_dir) / std_dir - X_test_dir[i][l+system_sel] = X_test[i][l+system_sel] / vel_norm - -with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_test_dir[:,:ocp.ocp.dims.nx - 1]).to(device) - y_iter_tensor = torch.Tensor(X_test_dir[:,ocp.ocp.dims.nx - 1:]).to(device) - outputs = model_dir(X_iter_tensor) - print('RMSE test data: ', torch.sqrt(criterion_dir(outputs, y_iter_tensor))) - -# Save the model: 
-torch.save(model_dir.state_dict(), 'model_' + str(system_sel) + 'dof_vboc') - -if prune_model: - - parameters_to_prune = ( - (model_dir.linear_relu_stack[0], 'weight'), - (model_dir.linear_relu_stack[2], 'weight'), - (model_dir.linear_relu_stack[4], 'weight'), - (model_dir.linear_relu_stack[0], 'bias'), - (model_dir.linear_relu_stack[2], 'bias'), - (model_dir.linear_relu_stack[4], 'bias'), - ) - - prune.global_unstructured( - parameters_to_prune, - pruning_method=prune.L1Unstructured, - amount=prune_amount, - ) - - # prune.l1_unstructured(model_dir.linear_relu_stack[0], name='weight', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[2], name='weight', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[4], name='weight', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[0], name='bias', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[2], name='bias', amount=prune_amount) - # prune.l1_unstructured(model_dir.linear_relu_stack[4], name='bias', amount=prune_amount) - - # model_dir.load_state_dict({k: v for k, v in initial_model_params.items() if 'weight' in k or 'bias' in k}, strict=False) - - print('Restart model training after pruning') - - it = 1 - val = max(X_train_dir[:,ocp.ocp.dims.nx - 1]) - - # Train the model - while val > 1e-3 and it < it_max: - ind = random.sample(range(len(X_train_dir)), n_minibatch) - - X_iter_tensor = torch.Tensor([X_train_dir[i][:ocp.ocp.dims.nx - 1] for i in ind]).to(device) - y_iter_tensor = torch.Tensor([[X_train_dir[i][ocp.ocp.dims.nx - 1]] for i in ind]).to(device) - - # Forward pass - outputs = model_dir(X_iter_tensor) - loss = criterion_dir(outputs, y_iter_tensor) - - # Backward and optimize - loss.backward() - optimizer_dir.step() - optimizer_dir.zero_grad() - - val = beta * val + (1 - beta) * loss.item() - it += 1 - - if it % B == 0: - print(val) - training_evol.append(val) - - print('Model training completed') - - prune.remove(model_dir.linear_relu_stack[0], 'weight') - prune.remove(model_dir.linear_relu_stack[2], 'weight') - prune.remove(model_dir.linear_relu_stack[4], 'weight') - prune.remove(model_dir.linear_relu_stack[0], 'bias') - prune.remove(model_dir.linear_relu_stack[2], 'bias') - prune.remove(model_dir.linear_relu_stack[4], 'bias') - - # print('----------------------') - # print(list(model_dir.named_parameters())) - # print(list(model_dir.named_buffers())) - - with torch.no_grad(): - X_iter_tensor = torch.Tensor(X_test_dir[:,:ocp.ocp.dims.nx - 1]).to(device) - y_iter_tensor = torch.Tensor(X_test_dir[:,ocp.ocp.dims.nx - 1:]).to(device) - outputs = model_dir(X_iter_tensor) - print('RMSE test data after pruning: ', torch.sqrt(criterion_dir(outputs, y_iter_tensor))) - print('Pruning percentage: ', prune_amount * 100) - -with torch.no_grad(): - # Compute safety margin: - outputs = model_dir(X_iter_tensor).cpu().numpy() - safety_margin = np.amax(np.array([(outputs[i] - X_test_dir[i][-1])/X_test_dir[i][-1] for i in range(X_test_dir.shape[0]) if outputs[i] - X_test_dir[i][-1] > 0])) - print('Maximum error wrt test data', safety_margin) - -# Save the pruned model: -torch.save(model_dir.state_dict(), 'model_' + str(system_sel) + 'dof_vboc') - -print("Execution time: %s seconds" % (time.time() - start_time)) - -# Show the training evolution: -plt.figure() -plt.plot(training_evol) - -# Show training data and resulting set approximation: -if system_sel == 3: - plots_3dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device) -elif 
system_sel == 2: - plots_2dof(X_save, q_min, q_max, v_min, v_max, model_dir, mean_dir, std_dir, device) - -plt.show() \ No newline at end of file diff --git a/config/controller.yaml b/config/controller.yaml new file mode 100644 index 0000000..6a3a655 --- /dev/null +++ b/config/controller.yaml @@ -0,0 +1,20 @@ +T: 1 # horizon in seconds +prob_num: 5000 # number of OCP to create training dataset +test_num: 500 # " " " " " test dataset + +n_steps: 200 # number of time steps in each test +cpu_num: 24 +regenerate: true # regenerate data + +solver_mode: 'BALANCE' # HPIPM mode, can be --> 'BALANCE', 'ROBUST', 'SPEED', 'SPEED_ABS' +nlp_max_iter: 1000 +qp_max_iter: 100 +qp_tol_stat: 1e-3 # QP solver tolerance on stationary residuals +nlp_tol_stat: 1e-3 # NLP solver tolerance on stationary residuals +alpha: 2 +alpha_reduction: 0.3 +alpha_min: 1e-2 +levenberg_marquardt: 1e-5 + +conv_tol: 1e-3 # convergence tolerance for the task +cost_tol: 1e-3 # Tolerance for the cost \ No newline at end of file diff --git a/config/models/triple_pendulum.yaml b/config/models/triple_pendulum.yaml new file mode 100644 index 0000000..8bd7585 --- /dev/null +++ b/config/models/triple_pendulum.yaml @@ -0,0 +1,23 @@ +# Gravity acceleration +g: 9.81 +# Link lengths +l1: 0.8 +l2: 0.8 +l3: 0.8 +# Masses of the links +m1: 0.4 +m2: 0.4 +m3: 0.4 +# Damping +b: 0.01 +# Joint position bounds (defined as fraction of pi) +q_min: -0.25 +q_max: 0.25 +# Joint velocity bounds +dq_min: -10 +dq_max: 10 +# Joint torque bounds +u_min: -10 +u_max: 10 +# State bounds tolerance +state_tol: 1e-2 \ No newline at end of file diff --git a/config/simulator.yaml b/config/simulator.yaml new file mode 100644 index 0000000..ca2b229 --- /dev/null +++ b/config/simulator.yaml @@ -0,0 +1,3 @@ +dt: 1e-2 # Time step +integrator_type: 'ERK' # Explicit Runge-Kutta +num_stages: 4 # RK4 \ No newline at end of file diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000..92ba462 --- /dev/null +++ b/requirements.txt @@ -0,0 +1,56 @@ +cmake==3.27.7 +cmeel==0.53.3 +cmeel-assimp==5.3.1 +cmeel-boost==1.83.0 +cmeel-console-bridge==1.0.2.2 +cmeel-octomap==1.9.8.2 +cmeel-qhull==8.0.2.1 +cmeel-tinyxml==2.6.2.3 +cmeel-urdfdom==3.1.1.1 +contourpy==1.2.0 +cycler==0.12.1 +Cython==3.0.5 +distro==1.8.0 +eigenpy==3.2.0 +filelock==3.13.1 +fonttools==4.45.0 +fsspec==2023.10.0 +future-fstrings==1.2.0 +hpp-fcl==2.4.0 +importlib-resources==6.1.1 +Jinja2==3.1.2 +kiwisolver==1.4.5 +MarkupSafe==2.1.3 +matplotlib==3.8.2 +mpmath==1.3.0 +networkx==3.2.1 +ninja==1.11.1.1 +numpy==1.26.2 +nvidia-cublas-cu12==12.1.3.1 +nvidia-cuda-cupti-cu12==12.1.105 +nvidia-cuda-nvrtc-cu12==12.1.105 +nvidia-cuda-runtime-cu12==12.1.105 +nvidia-cudnn-cu12==8.9.2.26 +nvidia-cufft-cu12==11.0.2.54 +nvidia-curand-cu12==10.3.2.106 +nvidia-cusolver-cu12==11.4.5.107 +nvidia-cusparse-cu12==12.1.0.106 +nvidia-nccl-cu12==2.18.1 +nvidia-nvjitlink-cu12==12.3.52 +nvidia-nvtx-cu12==12.1.105 +packaging==23.2 +Pillow==10.1.0 +pkg_resources==0.0.0 +pyparsing==3.1.1 +python-dateutil==2.8.2 +PyYAML==6.0.1 +scikit-build==0.17.6 +scipy==1.11.3 +six==1.16.0 +sympy==1.12 +tomli==2.0.1 +torch==2.1.0 +tqdm==4.66.1 +triton==2.1.0 +typing_extensions==4.8.0 +zipp==3.17.0 diff --git a/scripts/main.py b/scripts/main.py new file mode 100644 index 0000000..fdf0fc2 --- /dev/null +++ b/scripts/main.py @@ -0,0 +1,234 @@ +import os +import time +import pickle +import random +import numpy as np +import matplotlib.pyplot as plt +import torch +from multiprocessing import Pool +import vboc.model as models +from 
vboc.parser import Parameters, parse_args +from vboc.abstract import SimDynamics +from vboc.controller import ViabilityController +from vboc.learning import NeuralNetwork, RegressionNN, plot_viability_kernel + + +def computeDataOnBorder(N_guess, uniform_flag=0): + valid_data = np.empty((0, model.nx)) + controller.resetHorizon(N_guess) + + # Randomize the initial joint positions and the cost direction + d = np.array([random.uniform(-1, 1) for _ in range(model.nv)]) + q = np.array([ + random.uniform(params.q_min + model.eps, params.q_max - model.eps) + for _ in range(model.nq)] + ) + + # Set the initial guess + x_guess = np.zeros((N_guess, model.nx)) + u_guess = np.zeros((N_guess, model.nu)) + x_guess[:, :model.nq] = np.full((N_guess, model.nq), q) + + if not uniform_flag: + # Dense sampling near the joint position bounds + vel_dir = random.choice([-1, 1]) + q_init = params.q_min + model.eps if vel_dir == -1 else params.q_max - model.eps + q_fin = params.q_max - model.eps if vel_dir == -1 else params.q_min + model.eps + i = random.choice(range(model.nq)) + d[i] = random.random() * vel_dir + q[i] = q_init + x_guess[:, i] = np.linspace(q_init, q_fin, N_guess) + + d /= np.linalg.norm(d) + controller.setGuess(x_guess, u_guess) + + # Solve the OCP + x_star, u_star, N = controller.solveVBOC(q, d, N_guess, n=N_increment, repeat=50) + if x_star is None: + return None + if uniform_flag: + return x_star[0] + # Add a final node + x_star = np.vstack([x_star, x_star[-1]]) + # Save the initial state as valid data + valid_data = np.vstack([valid_data, x_star[0]]) + + # Generate an unviable sample in the cost direction + x_tilde = np.full((N + 1, model.nx), None) + x_tilde[0] = np.copy(x_star[0]) + # Acados solves a minimization problem, while we want to maximize the velocity norm: + # d points in the opposite direction w.r.t. the optimizer, hence the minus sign + x_tilde[0, model.nq:] -= model.eps * d + + x_limit = model.checkVelocityBounds(x_tilde[0, model.nq:]) + + # Iterate along the trajectory to verify the viability of the solution + for j in range(1, N): + if x_limit: + if not model.checkStateBounds(x_star[j]): + + if model.checkPositionBounds(x_star[j - 1, :model.nq]): + break + + gamma = np.linalg.norm(x_star[j, model.nq:]) + d = -x_star[j, model.nq:] + d /= np.linalg.norm(d) + controller.resetHorizon(N - j) + controller.setGuess(x_star[j:N], u_star[j:N]) + x_new, u_new, _ = controller.solveVBOC(x_star[j, :model.nq], d, N - j, n=N_increment, repeat=5) + if x_new is not None: + x0 = controller.ocp_solver.get(0, 'x') + gamma_new = np.linalg.norm(x0[model.nq:]) + if gamma_new > gamma + controller.tol: + # Update the optimal trajectory + x_star[j:N], u_star[j:N] = x_new[:N - j], u_new[:N - j] + + # Create the unviable state + x_tilde[j] = np.copy(x_star[j]) + x_tilde[j, model.nq:] += model.eps * x_tilde[j, model.nq:] / gamma_new + + # Check whether the unviable state lies on the velocity bounds + x_limit = model.checkVelocityBounds(x_tilde[j, model.nq:]) + + else: + x_limit = False + x_tilde[j] = np.copy(x_star[j]) + x_tilde[j, model.nq:] -= model.eps * d + else: + for k in range(j, N): + if model.checkVelocityBounds(x_star[k, model.nq:]): + valid_data = np.vstack([valid_data, x_star[k]]) + break + else: + x_tilde[j] = simulator.simulate(x_tilde[j - 1], u_star[j - 1]) + x_limit = model.checkStateBounds(x_tilde[j]) + if not model.checkStateBounds(x_star[j]): + valid_data = np.vstack([valid_data, x_star[j]]) + return valid_data + + +if __name__ == 
'__main__': + + start_time = time.time() + + args = parse_args() + # Define the available systems + available_systems = { + 'pendulum': 'PendulumModel', + 'double_pendulum': 'DoublePendulumModel', + 'triple_pendulum': 'TriplePendulumModel' + } + if args['system'] not in available_systems: + raise ValueError('System not available. Available: ' + str(list(available_systems.keys()))) + params = Parameters('triple_pendulum') # contains the parameters for all pendulum models + model = getattr(models, available_systems[args['system']])(params) + simulator = SimDynamics(model) + controller = ViabilityController(simulator) + + # Check if data folder exists, if not create it + if not os.path.exists(params.DATA_DIR): + os.makedirs(params.DATA_DIR) + if not os.path.exists(params.NN_DIR): + os.makedirs(params.NN_DIR) + + N = 100 + N_increment = 1 + horizon = args['horizon'] + if horizon < 1: + raise ValueError('The horizon must be greater than 0') + if horizon < N: + N = horizon + N_increment = 0 + + # DATA GENERATION + if args['vboc']: + print('Start data generation') + with Pool(params.cpu_num) as p: + # inputs --> (horizon, flag to compute only the initial state) + # uniform sampling of the initial state --> uniform = 1 + res = p.starmap(computeDataOnBorder, [(N, 0)] * params.prob_num) + + # SEQUENTIAL + # res = [] + # for i in range(params.prob_num): + # res.append(computeDataOnBorder(N, 0)) + + x_data = [i for i in res if i is not None] + x_save = np.vstack(x_data) + + solved = len(x_data) + print('Saved/tot: %.3f' % (len(x_save) / (solved * N))) + print('Total number of points: %d' % len(x_save)) + np.save(params.DATA_DIR + str(model.nq) + 'dof_vboc', np.asarray(x_save)) + + # TRAINING + if args['training']: + # Load the data + x_train = np.load(params.DATA_DIR + str(model.nq) + 'dof_vboc.npy') + beta = args['beta'] + batch_size = args['batch_size'] + device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') + nn_model = NeuralNetwork(model.nx, (model.nx - 1) * 100, 1).to(device) + loss_fn = torch.nn.MSELoss() + optimizer = torch.optim.Adam(nn_model.parameters(), lr=args['learning_rate']) + regressor = RegressionNN(nn_model, loss_fn, optimizer, beta, batch_size) + + # Compute outputs and inputs + y_train = np.linalg.norm(x_train[:, model.nq:], axis=1).reshape(len(x_train), 1) + mean = np.mean(x_train[:, :model.nq]) + std = np.std(x_train[:, :model.nq]) + x_train[:, :model.nq] = (x_train[:, :model.nq] - mean) / std + x_train[:, model.nq:] /= y_train + + print('Start training\n') + evolution = regressor.training(x_train, y_train) + print('Training completed\n') + + print('Evaluate the model') + _, rmse_train = regressor.testing(x_train, y_train) + print('RMSE on Training data: %.5f' % rmse_train) + + # Compute the test data + print('Generation of testing data (only x0*)') + with Pool(params.cpu_num) as p: + data = p.starmap(computeDataOnBorder, [(N, 1)] * params.test_num) + + x_test = np.array([i for i in data if i is not None]) + y_test = np.linalg.norm(x_test[:, model.nq:], axis=1).reshape(len(x_test), 1) + x_test[:, :model.nq] = (x_test[:, :model.nq] - mean) / std + x_test[:, model.nq:] /= y_test + out_test, rmse_test = regressor.testing(x_test, y_test) + print('RMSE on Test data: %.5f' % rmse_test) + + # Safety margin + safety_margin = np.amax((out_test - y_test) / y_test) + print(f'Maximum error wrt test data: {safety_margin:.5f}') + + # Save the model + torch.save({'mean': mean, 'std': std, 'model': nn_model.state_dict()}, + params.NN_DIR + 'model_' + str(model.nq) + 'dof.pt') + 
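+ # Note (descriptive comment): the position mean/std are saved together with the weights, so the plotting branch below can reload and normalize inputs without recomputing statistics from the dataset.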
+ print('\nPlot the loss evolution') + plt.figure() + plt.plot(evolution) + plt.xlabel('Epochs') + plt.ylabel('Loss') + plt.title(f'Training evolution, horizon {N}') + plt.savefig(params.DATA_DIR + f'evolution_{N}.png') + + # PLOT THE VIABILITY KERNEL + if args['plot']: + dataset = np.load(params.DATA_DIR + str(model.nq) + 'dof_vboc.npy') + nn_data = torch.load(params.NN_DIR + 'model_' + str(model.nq) + 'dof.pt') + nn_model = NeuralNetwork(model.nx, (model.nx - 1) * 100, 1) + nn_model.load_state_dict(nn_data['model']) + plot_viability_kernel(model.nq, params, nn_model, nn_data['mean'], nn_data['std'], dataset, N) + plt.show() + + elapsed_time = time.time() - start_time + hours = int(elapsed_time // 3600) + minutes = int((elapsed_time % 3600) // 60) + seconds = int(elapsed_time % 60) + print(f'Elapsed time: {hours}:{minutes:02d}:{seconds:02d}') \ No newline at end of file diff --git a/src/vboc/__init__.py b/src/vboc/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/src/vboc/abstract.py b/src/vboc/abstract.py new file mode 100644 index 0000000..bcf7941 --- /dev/null +++ b/src/vboc/abstract.py @@ -0,0 +1,178 @@ +import re +import numpy as np +from casadi import MX +from acados_template import AcadosModel, AcadosSim, AcadosSimSolver, AcadosOcp, AcadosOcpSolver + + +class AbstractModel: + def __init__(self, params): + self.params = params + self.amodel = AcadosModel() + # Dummy dynamics (placeholder, overwritten by addDynamicsModel in each subclass) + self.amodel.name = "double_integrator" + self.x = MX.sym("x") + self.x_dot = MX.sym("x_dot") + self.u = MX.sym("u") + self.f_expl = self.u + self.p = MX.sym("p") + self.addDynamicsModel(params) + self.amodel.x = self.x + self.amodel.xdot = self.x_dot + self.amodel.u = self.u + self.amodel.f_expl_expr = params.dt * self.f_expl + self.amodel.p = self.p + + self.nx = self.amodel.x.size()[0] + self.nu = self.amodel.u.size()[0] + self.ny = self.nx + self.nu + self.nq = int(self.nx / 2) + self.nv = self.nx - self.nq + + # Joint limits + self.u_min = -params.u_max * np.ones(self.nu) + self.u_max = params.u_max * np.ones(self.nu) + self.x_min = np.hstack([params.q_min * np.ones(self.nq), -params.dq_max * np.ones(self.nq)]) + self.x_max = np.hstack([params.q_max * np.ones(self.nq), params.dq_max * np.ones(self.nq)]) + self.eps = params.state_tol + + def addDynamicsModel(self, params): + pass + + def insideStateConstraints(self, x): + return np.all(np.logical_and(x >= self.x_min + self.eps, x <= self.x_max - self.eps)) + + def insideControlConstraints(self, u): + return np.all(np.logical_and(u >= self.u_min, u <= self.u_max)) + + def insideRunningConstraints(self, x, u): + return self.insideStateConstraints(x) and self.insideControlConstraints(u) + + def checkPositionBounds(self, q): + return np.logical_or(np.any(q < self.x_min[:self.nq] + self.eps), np.any(q > self.x_max[:self.nq] - self.eps)) + + def checkVelocityBounds(self, v): + return np.logical_or(np.any(v < self.x_min[self.nq:] + self.eps), np.any(v > self.x_max[self.nq:] - self.eps)) + + def checkStateBounds(self, x): + return np.logical_or(np.any(x < self.x_min + self.eps), np.any(x > self.x_max - self.eps)) + + +class SimDynamics: + def __init__(self, model): + self.model = model + self.params = model.params + sim = AcadosSim() + sim.model = model.amodel + # T should be dt, but the dynamics are already multiplied by dt; + # this speeds up extending the OCP horizon + sim.solver_options.T = 1. 
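+ # Descriptive note: since f_expl is pre-multiplied by dt in AbstractModel, one integrator step of unit "time" advances the true dynamics by dt.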
+ sim.solver_options.integrator_type = self.params.integrator_type + sim.solver_options.num_stages = self.params.num_stages + sim.parameter_values = np.zeros(model.nv) + gen_name = self.params.GEN_DIR + '/sim_' + sim.model.name + sim.code_export_directory = gen_name + self.integrator = AcadosSimSolver(sim, build=self.params.regenerate, json_file=gen_name + '.json') + + def simulate(self, x, u): + self.integrator.set("x", x) + self.integrator.set("u", u) + self.integrator.solve() + x_next = self.integrator.get("x") + return x_next + + def checkDynamicsConstraints(self, x, u): + # Rollout the control sequence + n = np.shape(u)[0] + x_sim = np.zeros((n + 1, self.model.nx)) + x_sim[0] = np.copy(x[0]) + for i in range(n): + x_sim[i + 1] = self.simulate(x_sim[i], u[i]) + # Check if the rollout state trajectory is almost equal to the optimal one + return np.linalg.norm(x - x_sim) < self.params.state_tol * np.sqrt(n + 1) + + +class AbstractController: + def __init__(self, simulator): + self.ocp_name = "".join(re.findall('[A-Z][^A-Z]*', self.__class__.__name__)[:-1]).lower() + self.simulator = simulator + self.params = simulator.params + self.model = simulator.model + + self.N = int(self.params.T / self.params.dt) + self.ocp = AcadosOcp() + + # Dimensions (tf equals N because the dynamics are pre-scaled by dt) + self.ocp.solver_options.tf = self.N + self.ocp.dims.N = self.N + + # Model + self.ocp.model = self.model.amodel + + # Cost + self.addCost() + + # Constraints + self.ocp.constraints.lbx_0 = self.model.x_min + self.ocp.constraints.ubx_0 = self.model.x_max + self.ocp.constraints.idxbx_0 = np.arange(self.model.nx) + + self.ocp.constraints.lbu = self.model.u_min + self.ocp.constraints.ubu = self.model.u_max + self.ocp.constraints.idxbu = np.arange(self.model.nu) + self.ocp.constraints.lbx = self.model.x_min + self.ocp.constraints.ubx = self.model.x_max + self.ocp.constraints.idxbx = np.arange(self.model.nx) + + self.ocp.constraints.lbx_e = self.model.x_min + self.ocp.constraints.ubx_e = self.model.x_max + self.ocp.constraints.idxbx_e = np.arange(self.model.nx) + # Additional constraints + self.addConstraint() + + # Solver options + self.ocp.solver_options.hessian_approx = "EXACT" + self.ocp.solver_options.exact_hess_constr = 0 + self.ocp.solver_options.exact_hess_dyn = 0 + self.ocp.solver_options.nlp_solver_type = self.params.solver_type + self.ocp.solver_options.hpipm_mode = self.params.solver_mode + self.ocp.solver_options.nlp_solver_max_iter = self.params.nlp_max_iter + self.ocp.solver_options.qp_solver_iter_max = self.params.qp_max_iter + self.ocp.solver_options.globalization = self.params.globalization + self.ocp.solver_options.qp_solver_tol_stat = self.params.qp_tol_stat + self.ocp.solver_options.nlp_solver_tol_stat = self.params.nlp_tol_stat + # IMPORTANT + # nlp_solver_tol_stat is the tolerance on the stationarity condition; + # Asia used a looser value than the default (1e-6) to obtain a higher number of solutions, + # so a better trade-off between 1e-3 and 1e-6 may be worth exploring + ######################### + self.ocp.solver_options.alpha_reduction = self.params.alpha_reduction + self.ocp.solver_options.alpha_min = self.params.alpha_min + self.ocp.solver_options.levenberg_marquardt = self.params.levenberg_marquardt + + # Generate OCP solver + gen_name = self.params.GEN_DIR + 'ocp_' + self.ocp_name + '_' + self.model.amodel.name + self.ocp.code_export_directory = gen_name + self.ocp_solver = AcadosOcpSolver(self.ocp, json_file=gen_name + '.json', build=self.params.regenerate) + + # Storage + self.x_guess = np.zeros((self.N, self.model.nx)) + self.u_guess = 
np.zeros((self.N, self.model.nu)) + self.tol = self.params.cost_tol + + def addCost(self): + pass + + def addConstraint(self): + pass + + def setGuess(self, x_guess, u_guess): + self.x_guess = x_guess + self.u_guess = u_guess + + def getGuess(self): + return np.copy(self.x_guess), np.copy(self.u_guess) + + def resetHorizon(self, N): + self.N = N + self.ocp_solver.set_new_time_steps(np.full(N, 1.)) + self.ocp_solver.update_qp_solver_cond_N(N) diff --git a/src/vboc/controller.py b/src/vboc/controller.py new file mode 100644 index 0000000..1e9bccc --- /dev/null +++ b/src/vboc/controller.py @@ -0,0 +1,88 @@ +import numpy as np +from casadi import dot +from .abstract import AbstractController + + +class ViabilityController(AbstractController): + def __init__(self, simulator): + super().__init__(simulator) + self.C = np.zeros((self.model.nv, self.model.nx)) + + def addCost(self): + # Maximize initial velocity + self.ocp.cost.cost_type_0 = 'EXTERNAL' + self.ocp.model.cost_expr_ext_cost_0 = dot(self.model.p, self.model.x[self.model.nq:]) + self.ocp.parameter_values = np.zeros(self.model.nv) + + def addConstraint(self): + q_fin_lb = np.hstack([self.model.x_min[:self.model.nq], np.zeros(self.model.nv)]) + q_fin_ub = np.hstack([self.model.x_max[:self.model.nq], np.zeros(self.model.nv)]) + + self.ocp.constraints.lbx_e = q_fin_lb + self.ocp.constraints.ubx_e = q_fin_ub + self.ocp.constraints.idxbx_e = np.arange(self.model.nx) + + self.ocp.constraints.C = np.zeros((self.model.nv, self.model.nx)) + self.ocp.constraints.D = np.zeros((self.model.nv, self.model.nu)) + self.ocp.constraints.lg = np.zeros((self.model.nv,)) + self.ocp.constraints.ug = np.zeros((self.model.nv,)) + + def solve(self, q_init, d): + self.ocp_solver.reset() + for i in range(self.N): + self.ocp_solver.set(i, 'x', self.x_guess[i]) + self.ocp_solver.set(i, 'u', self.u_guess[i]) + self.ocp_solver.set(i, 'p', d) + self.ocp_solver.set(self.N, 'x', self.x_guess[-1]) + self.ocp_solver.set(self.N, 'p', d) + + # Set the initial constraint + d_arr = np.array([d.tolist()]) + self.C[:, self.model.nq:] = np.eye(self.model.nv) - np.matmul(d_arr.T, d_arr) + self.ocp_solver.constraints_set(0, "C", self.C, api='new') + + # Set initial bounds -> x0_pos = q_init, x0_vel free; (final bounds already set) + q_init_lb = np.hstack([q_init, self.model.x_min[self.model.nq:]]) + q_init_ub = np.hstack([q_init, self.model.x_max[self.model.nq:]]) + self.ocp_solver.constraints_set(0, "lbx", q_init_lb) + self.ocp_solver.constraints_set(0, "ubx", q_init_ub) + + # Solve the OCP + return self.ocp_solver.solve() + + def solveVBOC(self, q, d, N_start, n=1, repeat=10): + N = N_start + gamma = 0 + x_sol, u_sol = None, None + if n == 0: + # N-BRS --> constant horizon N, no need to repeat the process until convergence + repeat = 1 + for _ in range(repeat): + # Solve the OCP + status = self.solve(q, d) + if status == 0: + # Compare the current cost with the previous one: + x0 = self.ocp_solver.get(0, "x") + gamma_new = np.linalg.norm(x0[self.model.nq:]) + + if gamma_new < gamma + self.tol: + break + gamma = gamma_new + + # Rollout the solution + x_sol = np.empty((N + n, self.model.nx)) + u_sol = np.empty((N + n, self.model.nu)) # last control is not used + for i in range(N): + x_sol[i] = self.ocp_solver.get(i, 'x') + u_sol[i] = self.ocp_solver.get(i, 'u') + x_sol[N:] = self.ocp_solver.get(N, 'x') + u_sol[N:] = np.zeros((n, self.model.nu)) + + # Reset the initial guess with the previous solution + self.setGuess(x_sol, u_sol) + # Increase the horizon + N += n + 
self.resetHorizon(N) + else: + return None, None, None + return x_sol, u_sol, N diff --git a/src/vboc/learning.py b/src/vboc/learning.py new file mode 100644 index 0000000..754b3a8 --- /dev/null +++ b/src/vboc/learning.py @@ -0,0 +1,132 @@ +import random +import numpy as np +import torch +import torch.nn as nn +from torch.utils.data import DataLoader +import matplotlib.pyplot as plt + + +class NeuralNetwork(nn.Module): + """ A simple feedforward neural network with three linear layers (two hidden). """ + def __init__(self, input_size, hidden_size, output_size, activation=nn.ReLU()): + super().__init__() + self.linear_relu_stack = nn.Sequential( + nn.Linear(input_size, hidden_size), + activation, + nn.Linear(hidden_size, hidden_size), + activation, + nn.Linear(hidden_size, output_size), + activation, + ) + + def forward(self, x): + out = self.linear_relu_stack(x) + return out + + +class RegressionNN: + """ Class that handles the training and testing of a neural network. """ + def __init__(self, model, loss_fn, optimizer, beta, batch_size): + self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') + self.model = model + self.loss_fn = loss_fn + self.optimizer = optimizer + self.beta = beta + self.batch_size = batch_size + + def training(self, x_train, y_train): + """ Train the network until the filtered loss converges or max_iter is reached. """ + t = 1 + n = len(x_train) + val = np.amax(y_train) + b = int(n * 100 / self.batch_size) # number of iterations for 100 epochs + max_iter = b * 10 + evolution = [] + self.model.train() + while val > 1e-3 and t < max_iter: + indexes = random.sample(range(n), self.batch_size) + + x_tensor = torch.Tensor(x_train[indexes]).to(self.device) + y_tensor = torch.Tensor(y_train[indexes]).to(self.device) + + # Forward pass: compute predicted y by passing x to the model + y_pred = self.model(x_tensor) + + # Compute the loss + loss = self.loss_fn(y_pred, y_tensor) + + # Backward and optimize + loss.backward() + self.optimizer.step() + self.optimizer.zero_grad() + + # Exponential moving average of the loss (low-pass filter with coefficient beta) + val = self.beta * val + (1 - self.beta) * loss.item() + t += 1 + if t % b == 0: + print(f'Iteration {t}, loss = {val}') + evolution.append(val) + return evolution + + def testing(self, x_test, y_test): + """ Compute the RMSE w.r.t. training or test data. 
""" + loader = DataLoader(torch.Tensor(x_test).to(self.device), batch_size=self.batch_size, shuffle=False) + self.model.eval() + y_pred = np.empty((len(x_test), 1)) + with torch.no_grad(): + for i, x in enumerate(loader): + if (i + 1) * self.batch_size > len(x_test): + y_pred[i * self.batch_size:] = self.model(x).cpu().numpy() + else: + y_pred[i * self.batch_size:(i+1) * self.batch_size] = self.model(x).cpu().numpy() + return y_pred, np.sqrt(np.mean((y_pred - y_test)**2)) + + +def plot_viability_kernel(nq, params, model, mean, std, dataset, horizon=100, grid=1e-2): + # Create the grid + q, v = np.meshgrid(np.arange(params.q_min, params.q_max, grid), + np.arange(params.dq_min, params.dq_max, grid)) + q_rav, v_rav = q.ravel(), v.ravel() + n = len(q_rav) + + with torch.no_grad(): + for i in range(nq): + plt.figure() + + x = np.zeros((n, nq * 2)) + x[:, :nq] = (params.q_max + params.q_min) / 2 * np.ones((n, nq)) + x[:, i] = q_rav + x[:, nq + i] = v_rav + + # Compute velocity norm + y = np.linalg.norm(x[:, nq:], axis=1) + + # Normalize position + x[:, :nq] = (x[:, :nq] - mean) / std + # Velocity direction + x[:, nq:] /= y.reshape(len(y), 1) + + # Predict + y_pred = model(torch.from_numpy(x.astype(np.float32))).cpu().numpy() + out = np.array([0 if y[j] > y_pred[j] else 1 for j in range(n)]) + z = out.reshape(q.shape) + plt.contourf(q, v, z, cmap='coolwarm', alpha=0.8) + + # Plot of the viable samples + x_viable = np.empty((0, 2)) + rm_idx = [i, nq + i] + idx = np.isin(np.arange(2 * nq), rm_idx, invert=True) + for x_star in dataset: + # Remove the part relative to the considerd joint + x_static = x_star[idx] + if np.linalg.norm(x_static[:nq -1] - np.ones(nq - 1) * (params.q_max + params.q_min) / 2) < 0.01 \ + and np.linalg.norm(x_static[nq - 1:]) < 0.1: + x_viable = np.vstack([x_viable, np.array([x_star[i], x_star[nq + i]])]) + plt.plot(x_viable[:, 0], x_viable[:, 1], 'go', markersize=2) + + plt.xlim([params.q_min, params.q_max]) + plt.ylim([params.dq_min, params.dq_max]) + plt.xlabel('q_' + str(i + 1)) + plt.ylabel('dq_' + str(i + 1)) + plt.grid() + plt.title(f"Classifier section joint {i + 1}, horizon {horizon}") + plt.savefig(params.DATA_DIR + f'{i + 1}dof_{horizon}_BRS.png') diff --git a/src/vboc/model.py b/src/vboc/model.py new file mode 100644 index 0000000..3dc955d --- /dev/null +++ b/src/vboc/model.py @@ -0,0 +1,209 @@ +from casadi import MX, vertcat, cos, sin +from .abstract import AbstractModel + + +class PendulumModel(AbstractModel): + """ Define the pendulum dynamics model. """ + def __init__(self, params): + super().__init__(params) + + def addDynamicsModel(self, params): + self.amodel.name = "pendulum" + + self.x = MX.sym("x", 2) + self.x_dot = MX.sym("x_dot", 2) + self.u = MX.sym("u", 1) + self.p = MX.sym("p", 1) + + # Dynamics + self.f_expl = vertcat( + self.x[1], + (params.m1 * params.g * params.l1 * sin(self.x[0]) + self.u - params.b * self.x[1]) + / (params.l1 * params.l1 * params.m1) + ) + + +class DoublePendulumModel(AbstractModel): + """ Define the double pendulum dynamics model. 
""" + def __init__(self, params): + super().__init__(params) + + def addDynamicsModel(self, params): + self.amodel.name = "double_pendulum" + + self.x = MX.sym("x", 4) + self.x_dot = MX.sym("x_dot", 4) + self.u = MX.sym("u", 2) + self.p = MX.sym("p", 2) + + # Dynamics + self.f_expl = vertcat( + self.x[2], + self.x[3], + ( + params.l1 ** 2 + * params.l2 + * params.m2 + * self.x[2] ** 2 + * sin(-2 * self.x[1] + 2 * self.x[0]) + + 2 * self.u[1] * cos(-self.x[1] + self.x[0]) * params.l1 + + 2 + * ( + params.g * sin(-2 * self.x[1] + self.x[0]) * params.l1 * params.m2 / 2 + + sin(-self.x[1] + self.x[0]) * self.x[3] ** 2 * params.l1 * params.l2 * params.m2 + + params.g * params.l1 * (params.m1 + params.m2 / 2) * sin(self.x[0]) + - self.u[0] + ) + * params.l2 + ) + / params.l1 ** 2 + / params.l2 + / (params.m2 * cos(-2 * self.x[1] + 2 * self.x[0]) - 2 * params.m1 - params.m2), + ( + -params.g + * params.l1 + * params.l2 + * params.m2 + * (params.m1 + params.m2) + * sin(-self.x[1] + 2 * self.x[0]) + - params.l1 + * params.l2 ** 2 + * params.m2 ** 2 + * self.x[3] ** 2 + * sin(-2 * self.x[1] + 2 * self.x[0]) + - 2 + * self.x[2] ** 2 + * params.l1 ** 2 + * params.l2 + * params.m2 + * (params.m1 + params.m2) + * sin(-self.x[1] + self.x[0]) + + 2 * self.u[0] * cos(-self.x[1] + self.x[0]) * params.l2 * params.m2 + + params.l1 + * (params.m1 + params.m2) + * (sin(self.x[1]) * params.g * params.l2 * params.m2 - 2 * self.u[1]) + ) + / params.l2 ** 2 + / params.l1 + / params.m2 + / (params.m2 * cos(-2 * self.x[1] + 2 * self.x[0]) - 2 * params.m1 - params.m2) + ) + + +class TriplePendulumModel(AbstractModel): + """ Define the triple pendulum dynamics model. """ + def __init__(self, params): + super().__init__(params) + + def addDynamicsModel(self, params): + self.amodel.name = "triple_pendulum" + + self.x = MX.sym("x", 6) + self.x_dot = MX.sym("x_dot", 6) + self.u = MX.sym("u", 3) + self.p = MX.sym("p", 3) + + # dynamics + self.f_expl = vertcat( + self.x[3], + self.x[4], + self.x[5], + (-params.g * params.l1 * params.l2 * params.l2 * params.m1 * params.m3 * sin( + -2 * self.x[2] + 2 * self.x[1] + self.x[ + 0]) - params.g * params.l1 * params.l2 * params.l2 * params.m1 * params.m3 * sin( + 2 * self.x[2] - 2 * self.x[1] + self.x[0]) + 2 * self.u[0] * params.l2 * params.l2 * params.m3 * cos( + -2 * self.x[2] + 2 * self.x[1]) + 2 * self.x[ + 3] ** 2 * params.l1 ** 2 * params.l2 * params.l2 * params.m2 * ( + params.m2 + params.m3) * sin(-2 * self.x[1] + 2 * self.x[0]) - 2 * self.u[ + 2] * params.l1 * params.l2 * ( + params.m2 + params.m3) * cos( + -2 * self.x[1] + self.x[0] + self.x[2]) - 2 * self.u[1] * params.l1 * params.l2 * params.m3 * cos( + -2 * self.x[2] + self.x[1] + self.x[ + 0]) + 2 * params.l1 * params.l2 * params.l2 ** 2 * params.m2 * params.m3 * self.x[5] ** 2 * sin( + -2 * self.x[1] + self.x[0] + self.x[2]) + 2 * self.u[2] * params.l1 * params.l2 * ( + params.m2 + params.m3) * + cos(self.x[0] - self.x[2]) + 2 * ( + self.u[1] * params.l1 * (params.m3 + 2 * params.m2) * cos(-self.x[1] + self.x[0]) + ( + params.g * params.l1 * params.m2 * (params.m2 + params.m3) * sin( + -2 * self.x[1] + self.x[0]) + 2 * self.x[4] ** 2 * params.l1 * params.l2 * params.m2 * ( + params.m2 + params.m3) * sin(-self.x[1] + self.x[0]) + params.m3 * self.x[ + 5] ** 2 * sin( + self.x[0] - self.x[2]) * params.l1 * params.l2 * params.m2 + params.g * params.l1 * ( + params.m2 ** 2 + ( + params.m3 + 2 * params.m1) * params.m2 + params.m1 * params.m3) * sin( + self.x[0]) - self.u[0] * ( + params.m3 + 2 * params.m2)) * params.l2) * 
params.l2) / params.l1 ** 2 / params.l2 / ( + params.m2 * (params.m2 + params.m3) * cos( + -2 * self.x[1] + 2 * self.x[0]) + params.m1 * params.m3 * cos( + -2 * self.x[2] + 2 * self.x[1]) - params.m2 ** 2 + ( + -params.m3 - 2 * params.m1) * params.m2 - params.m1 * params.m3) / params.l2 / 2, + (-2 * self.u[2] * params.l1 * params.l2 * (params.m2 + params.m3) * cos( + 2 * self.x[0] - self.x[2] - self.x[ + 1]) - 2 * params.l1 * params.l2 * params.l2 ** 2 * params.m2 * params.m3 * self.x[5] ** 2 * sin( + 2 * self.x[0] - self.x[2] - self.x[ + 1]) + params.g * params.l1 * params.l2 * params.l2 * params.m1 * params.m3 * sin( + self.x[1] + 2 * self.x[0] - 2 * self.x[2]) - params.g * params.l1 * params.l2 * ( + (params.m1 + 2 * params.m2) * params.m3 + 2 * params.m2 * ( + params.m1 + params.m2)) * params.l2 * sin( + -self.x[1] + 2 * self.x[0]) - 2 * self.x[ + 4] ** 2 * params.l1 * params.l2 ** 2 * params.l2 * params.m2 * ( + params.m2 + params.m3) * sin( + -2 * self.x[1] + 2 * self.x[0]) + 2 * self.u[1] * params.l1 * params.l2 * params.m3 * cos( + -2 * self.x[2] + 2 * self.x[0]) + 2 * params.l1 * params.l2 ** 2 * params.l2 * params.m1 * params.m3 * + self.x[4] ** 2 * sin( + -2 * self.x[2] + 2 * self.x[1]) - 2 * self.u[0] * params.l2 * params.l2 * params.m3 * cos( + -2 * self.x[2] + self.x[1] + self.x[ + 0]) + 2 * params.l1 ** 2 * params.l2 * params.l2 * params.m1 * params.m3 * self.x[ + 3] ** 2 * sin( + -2 * self.x[2] + + self.x[1] + self.x[0]) - 2 * params.l1 ** 2 * params.l2 * self.x[3] ** 2 * ( + (params.m1 + 2 * params.m2) * params.m3 + 2 * params.m2 * ( + params.m1 + params.m2)) * params.l2 * sin( + -self.x[1] + self.x[0]) + 2 * self.u[2] * params.l1 * params.l2 * ( + params.m3 + 2 * params.m1 + params.m2) * cos( + -self.x[2] + self.x[1]) + (2 * self.u[0] * params.l2 * (params.m3 + 2 * params.m2) * cos( + -self.x[1] + self.x[0]) + params.l1 * ( + 4 * self.x[5] ** 2 * params.m3 * params.l2 * ( + params.m1 + params.m2 / 2) * params.l2 * sin( + -self.x[2] + self.x[ + 1]) + params.g * params.m3 * params.l2 * params.m1 * sin( + -2 * self.x[2] + self.x[1]) + params.g * ( + ( + params.m1 + 2 * params.m2) * params.m3 + 2 * params.m2 * ( + params.m1 + params.m2)) * params.l2 * sin( + self.x[1]) - 2 * self.u[1] * ( + params.m3 + 2 * params.m1 + 2 * params.m2))) * params.l2) / ( + params.m2 * (params.m2 + params.m3) * cos( + -2 * self.x[1] + 2 * self.x[0]) + params.m1 * params.m3 * cos( + -2 * self.x[2] + 2 * self.x[1]) + ( + -params.m1 - params.m2) * params.m3 - 2 * params.m1 * params.m2 - params.m2 ** 2) / params.l1 / params.l2 / params.l2 ** 2 / 2, + (-2 * params.m3 * self.u[1] * params.l1 * params.l2 * (params.m2 + params.m3) * cos( + 2 * self.x[0] - self.x[2] - self.x[ + 1]) + params.g * params.m3 * params.l1 * params.l2 * params.l2 * params.m1 * ( + params.m2 + params.m3) * sin(2 * self.x[0] + self.x[2] - 2 * self.x[1]) + 2 * self.u[ + 2] * params.l1 * params.l2 * ( + params.m2 + params.m3) ** 2 * cos( + -2 * self.x[1] + 2 * self.x[ + 0]) - params.g * params.m3 * params.l1 * params.l2 * params.l2 * params.m1 * ( + params.m2 + params.m3) * sin( + 2 * self.x[0] - self.x[2]) - params.g * params.m3 * params.l1 * params.l2 * params.l2 * params.m1 * ( + params.m2 + params.m3) * sin( + -self.x[2] + 2 * self.x[1]) - 2 * params.l1 * params.l2 * params.l2 ** 2 * params.m1 * params.m3 ** 2 * + self.x[5] ** 2 * sin( + -2 * self.x[2] + 2 * self.x[1]) - 2 * self.u[0] * params.l2 * params.l2 * params.m3 * ( + params.m2 + params.m3) * cos( + -2 * self.x[1] + self.x[0] + self.x[2]) + 2 * params.m3 * self.x[3] ** 2 
* params.l1 ** 2 * + params.l2 * params.l2 * params.m1 * (params.m2 + params.m3) * sin( + -2 * self.x[1] + self.x[0] + self.x[2]) + 2 * params.m3 * self.u[1] * params.l1 * params.l2 * ( + params.m3 + 2 * params.m1 + params.m2) * cos(-self.x[2] + self.x[1]) + (params.m2 + params.m3) * ( + 2 * self.u[0] * params.l2 * params.m3 * cos(self.x[0] - self.x[2]) + params.l1 * ( + -2 * params.m3 * self.x[3] ** 2 * params.l1 * params.l2 * params.m1 * sin( + self.x[0] - self.x[2]) - 4 * params.m3 * self.x[4] ** 2 * sin( + -self.x[2] + self.x[1]) * params.l2 * params.l2 * params.m1 + params.g * params.m3 * sin( + self.x[2]) * params.l2 * params.m1 - 2 * self.u[2] * ( + params.m3 + 2 * params.m1 + params.m2))) * params.l2) / params.m3 / ( + params.m2 * (params.m2 + params.m3) * cos( + -2 * self.x[1] + 2 * self.x[0]) + params.m1 * params.m3 * cos( + -2 * self.x[2] + 2 * self.x[1]) + ( + -params.m1 - params.m2) * params.m3 - 2 * params.m1 * params.m2 - params.m2 ** 2) / params.l1 / params.l2 ** 2 / params.l2 / 2, + ) diff --git a/src/vboc/parser.py b/src/vboc/parser.py new file mode 100644 index 0000000..adc2859 --- /dev/null +++ b/src/vboc/parser.py @@ -0,0 +1,82 @@ +import os +import yaml +import numpy as np +import argparse + + +def parse_args(): + parser = argparse.ArgumentParser() + parser.add_argument('-s', '--system', type=str, default='double_pendulum', + help='Systems to test. Available: pendulum, double_pendulum, triple_pendulum') + parser.add_argument('-v', '--vboc', action='store_true', + help='Compute data on border of the viability kernel') + parser.add_argument('--horizon', type=int, default=100, + help='Horizon of the optimal control problem') + parser.add_argument('-t', '--training', action='store_true', + help='Train the neural network model that approximates the viability kernel') + parser.add_argument('-p', '--plot', action='store_true', + help='Plot the approximated viability kernel') + parser.add_argument('-m', '--memmo', action='store_true', + help='Learn a policy that drive to equilibrium using VBOC data') + parser.add_argument('--learning_rate', type=float, default=1e-3, + help='Learning rate for the neural network') + parser.add_argument('--batch_size', type=int, default=4096, + help='Batch size for the neural network') + parser.add_argument('--beta', type=float, default=0.95, + help='Low-pass filter for the loss function') + return vars(parser.parse_args()) + + +class Parameters: + def __init__(self, m_name): + # Define all the useful paths + self.PKG_DIR = os.path.dirname(os.path.abspath(__file__)) + self.ROOT_DIR = os.path.join(self.PKG_DIR, '../..') + self.CONF_DIR = os.path.join(self.ROOT_DIR, 'config/') + self.DATA_DIR = os.path.join(self.ROOT_DIR, 'data/') + self.GEN_DIR = os.path.join(self.ROOT_DIR, 'generated/') + self.NN_DIR = os.path.join(self.ROOT_DIR, 'nn_models/' + m_name + '/') + + model = yaml.load(open(self.CONF_DIR + 'models/' + m_name + '.yaml'), Loader=yaml.FullLoader) + self.g = float(model['g']) + self.l1 = float(model['l1']) + self.l2 = float(model['l2']) + self.l3 = float(model['l3']) + self.m1 = float(model['m1']) + self.m2 = float(model['m2']) + self.m3 = float(model['m3']) + self.b = float(model['b']) + self.q_min = (1 + float(model['q_min'])) * np.pi + self.q_max = (1 + float(model['q_max'])) * np.pi + self.dq_min = float(model['dq_min']) + self.dq_max = float(model['dq_max']) + self.u_min = float(model['u_min']) + self.u_max = float(model['u_max']) + self.state_tol = float(model['state_tol']) + + simulator = yaml.load(open(self.CONF_DIR + 
'simulator.yaml'), Loader=yaml.FullLoader) + self.dt = float(simulator['dt']) + self.integrator_type = simulator['integrator_type'] + self.num_stages = int(simulator['num_stages']) + + controller = yaml.load(open(self.CONF_DIR + 'controller.yaml'), Loader=yaml.FullLoader) + self.T = float(controller['T']) + self.prob_num = int(controller['prob_num']) + self.test_num = int(controller['test_num']) + self.n_steps = int(controller['n_steps']) + self.cpu_num = int(controller['cpu_num']) + self.regenerate = bool(controller['regenerate']) + + self.solver_type = 'SQP' + self.solver_mode = controller['solver_mode'] + self.nlp_max_iter = int(controller['nlp_max_iter']) + self.qp_max_iter = int(controller['qp_max_iter']) + self.qp_tol_stat = float(controller['qp_tol_stat']) + self.nlp_tol_stat = float(controller['nlp_tol_stat']) + self.alpha_reduction = float(controller['alpha_reduction']) + self.alpha_min = float(controller['alpha_min']) + self.levenberg_marquardt = float(controller['levenberg_marquardt']) + self.alpha = int(controller['alpha']) + self.conv_tol = float(controller['conv_tol']) + self.cost_tol = float(controller['cost_tol']) + self.globalization = 'MERIT_BACKTRACKING'
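 + + # Usage sketch (illustrative only, mirroring how scripts/main.py consumes this module): + # args = parse_args() + # params = Parameters('triple_pendulum') # one config file covers all pendulum models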