Practice#
In this section, you will be tasked with solving a control problem from start to finish.
Feel free to proceed as you wish. You could use a mathematical model or learn a model from data and then attempt to control it.
We will be using the Pendulum environment from gymnasium.
The system consists of a pendulum attached at one end to a fixed point, with the other end free. The pendulum starts in a random position, and we can apply torque to rotate the free end.
As seen below, the pendulum is represented in red and the joint is represented in black.
env = create_pendulum_environment()
result = simulate_environment(env)
show_video(result.frames, fps=env.metadata["render_fps"])
The environment allows the use of the following control (action):

| Index | Action | Unit | Min | Max |
|---|---|---|---|---|
| 0 | apply torque to the actuated joint | torque (N m) | -2 | 2 |
and the following measurements (observation):

| Index | Observation | Min | Max |
|---|---|---|---|
| 0 | \(\cos(\theta)\) | \(-1\) | \(1\) |
| 1 | \(\sin(\theta)\) | \(-1\) | \(1\) |
| 2 | \(\dot{\theta}\) | \(-8\) | \(8\) |
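Since the angle \(\theta\) never appears directly in the observation, it can be useful to recover it from the \((\cos\theta, \sin\theta)\) pair, e.g. for plotting or analysis. A small helper (ours, not part of the environment):

```python
import numpy as np

def angle_from_observation(obs):
    """Recover theta in (-pi, pi] from a (cos(theta), sin(theta), theta_dot) observation."""
    cos_theta, sin_theta = obs[0], obs[1]
    # arctan2 handles all four quadrants, unlike arccos or arcsin alone
    return np.arctan2(sin_theta, cos_theta)

obs = np.array([np.cos(0.3), np.sin(0.3), 0.0])
theta = angle_from_observation(obs)
```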
First Goal#
The first goal is to apply torques on the actuated joint to swing the pendulum into an upright position and keep it there.
Second Goal#
The second goal is to apply torques on the actuated joint to make the pendulum rotate as fast as possible.
Exercise 1#
Exercise 8 (Pendulum Model)
Use one of the previously seen methods to learn a model of the system from data.
Hint
If you would like to use a mathematical model of the system either for the control or just as help when learning a model, then please refer to the following page for the equations.
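For orientation, the dynamics can also be sketched in a few lines. The update rule and parameter values below follow our reading of the gymnasium Pendulum-v1 source (g = 10, m = 1, l = 1, dt = 0.05, semi-implicit Euler, angular velocity clipped to ±8, θ = 0 upright); treat this as a reference sketch, not the authoritative model:

```python
import numpy as np

# Approximate re-implementation of one Pendulum-v1 step
# (parameter values are the gymnasium defaults).
G, M, L, DT, MAX_SPEED, MAX_TORQUE = 10.0, 1.0, 1.0, 0.05, 8.0, 2.0

def pendulum_step(theta, theta_dot, u):
    """Semi-implicit Euler step of the pendulum dynamics."""
    u = np.clip(u, -MAX_TORQUE, MAX_TORQUE)
    # theta_ddot = 3g/(2l) sin(theta) + 3/(m l^2) u
    theta_ddot = 3 * G / (2 * L) * np.sin(theta) + 3.0 / (M * L**2) * u
    theta_dot = np.clip(theta_dot + theta_ddot * DT, -MAX_SPEED, MAX_SPEED)
    theta = theta + theta_dot * DT
    return theta, theta_dot
```

Note that hanging at rest (θ = π, no torque) is an equilibrium, while the upright position (θ = 0) is unstable, which is why a swing-up strategy is needed.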
Solution to Exercise 8 (Pendulum Model)
Data
Show code cell source
env = create_pendulum_environment(max_steps=1000)

observations = dict()
actions = dict()
frames = dict()

controllers = {
    "Sinusoid": SineController(env, np.asarray([0.5 * env.max_torque]), frequency=1),
    "Schroeder Sweep": SchroederSweepController(
        env,
        n_time_steps=1000,
        n_harmonics=10,
        frequency=10,
    ),
    "PRBS": PRBSController(np.asarray([env.max_torque])),
}

for controller_name, controller in controllers.items():
    result = simulate_environment(env, controller=controller)
    observations[controller_name] = result.observations
    actions[controller_name] = result.actions
    frames[controller_name] = result.frames
training_controller_name = "PRBS"
testing_controller_name = "Sinusoid"
X_train = observations[training_controller_name][:-1].copy()
U_train = actions[training_controller_name].copy()
t_train = np.arange(0, len(X_train)) * env.dt
X_val = observations[testing_controller_name][:-1].copy()
U_val = actions[testing_controller_name].copy()
t_val = np.arange(0, len(X_val)) * env.dt
SINDYc
Show code cell source
optimizer = ps.STLSQ(threshold=0.9)
feature_library = ps.PolynomialLibrary(degree=2)
sindy_model = ps.SINDy(optimizer=optimizer, feature_library=feature_library)
sindy_model.fit(X_train, u=U_train, t=t_train)
SINDy(differentiation_method=FiniteDifference(),
      feature_library=PolynomialLibrary(),
      feature_names=['x0', 'x1', 'x2', 'u0'], optimizer=STLSQ(threshold=0.9))
(x0)' = -0.981 x1 x2
(x1)' = 0.983 x0 x2
(x2)' = 14.766 x1 + 1.438 u0
Model score: 0.990928
X_sindy = sindy_model.simulate(X_val[0], t_val, u=U_val)
X_sindy = np.vstack([X_val[0][np.newaxis, :], X_sindy])
The results look good, at least up to a certain number of steps, which should be sufficient for our purpose. We could also use hyper-parameter optimization to find the best model; however, we would have to be careful about overfitting.
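For intuition, the STLSQ optimizer used above alternates least-squares fits with hard thresholding of small coefficients. A minimal numpy sketch of that idea (ours, not the pysindy implementation):

```python
import numpy as np

def stlsq(Theta, dXdt, threshold, n_iter=10):
    """Sequentially thresholded least squares: the core idea behind STLSQ."""
    Xi, *_ = np.linalg.lstsq(Theta, dXdt, rcond=None)
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold           # coefficients to prune
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):           # refit the surviving terms
            big = ~small[:, k]
            if big.any():
                Xi[big, k], *_ = np.linalg.lstsq(Theta[:, big], dXdt[:, k], rcond=None)
    return Xi

# Toy check: recover dx/dt = -2x from noiseless data with a [x, x^2] library.
x = np.linspace(-1, 1, 50)[:, None]
Theta = np.hstack([x, x**2])
Xi = stlsq(Theta, -2 * x, threshold=0.5)
```

The thresholding is what produces the sparse, interpretable models printed above: terms whose coefficients fall below the threshold are removed entirely.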
EDMDc
Let’s also fit a model with EDMDc (a DMD variant that accounts for control inputs) so that we can compare the two methods and select the better model.
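At its core, (E)DMD with control lifts the states through the observables and then fits a linear model \(z_{k+1} \approx A z_k + B u_k\) by least squares. A minimal numpy sketch of that regression step (pykoopman adds the lifting and projection on top of this):

```python
import numpy as np

def fit_linear_with_control(Z, U):
    """Least-squares fit of z_{k+1} = A z_k + B u_k from snapshot data.

    Z: (n_steps, n_lifted) lifted states; U: (n_steps - 1, n_inputs) inputs.
    """
    Z0, Z1 = Z[:-1], Z[1:]
    Omega = np.hstack([Z0, U])                        # stacked regressors [z_k, u_k]
    G, *_ = np.linalg.lstsq(Omega, Z1, rcond=None)    # solves Omega @ G ~ Z1
    n = Z.shape[1]
    return G[:n].T, G[n:].T                           # A, B

# Toy check: recover a known linear system from input-driven data.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
Z = np.zeros((200, 2))
U = rng.normal(size=(199, 1))
for k in range(199):
    Z[k + 1] = A_true @ Z[k] + B_true @ U[k]
A, B = fit_linear_with_control(Z, U)
```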
Show code cell source
regressor = pk.regression.EDMDc()
observables = pk.observables.Polynomial(degree=2)
dmd_model = pk.Koopman(observables=observables, regressor=regressor)
dmd_model.fit(X_train, u=U_train, dt=env.dt)
Koopman(observables=Polynomial(), regressor=EDMDc())
X_dmd = dmd_model.simulate(X_val[0], U_val, n_steps=X_val.shape[0] - 1)
X_dmd = np.vstack([X_val[0][np.newaxis, :], X_dmd])
Model score: 5.936378
The results do not look good enough. We could again use hyper-parameter optimization to search for a better model; however, we would have to be careful about overfitting.
We will use the SINDYc model for the second exercise.
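A model-agnostic way to back this comparison is to measure the prediction error of each simulated trajectory against the validation data, for example the per-state RMSE (the helper below is ours, not part of either library):

```python
import numpy as np

def trajectory_rmse(X_pred, X_true):
    """Root-mean-square error, per state, between a simulated and a measured trajectory."""
    n = min(len(X_pred), len(X_true))  # guard against off-by-one trajectory lengths
    return np.sqrt(np.mean((X_pred[:n] - X_true[:n]) ** 2, axis=0))

# Hypothetical usage with the trajectories computed above:
# print("SINDYc:", trajectory_rmse(X_sindy, X_val))
# print("EDMDc: ", trajectory_rmse(X_dmd, X_val))
```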
Exercise 2#
Exercise 9 (Pendulum Control)
Use the learned model and synthesize a controller to achieve the goals described above.
Solution to Exercise 9 (Pendulum Control)
For this exercise, we will use the SINDYc model along with an MPC controller to achieve our control objectives.
To use the SINDYc model with do-mpc, we first have to convert it to a CasADi model.
Model
mpc_model = build_sindy_model(sindy_model)
To make sure that our model is correct, we simulate the system using it
simulator = Simulator(mpc_model)

params_simulator = {
    "integration_tool": "idas",
    "abstol": 1e-8,
    "reltol": 1e-8,
    "t_step": env.dt,
}
simulator.set_param(**params_simulator)
simulator.setup()
Show code cell source
%%capture
x0 = X_val[0]
simulator.reset_history()
simulator.x0 = x0
for u in U_val:
    simulator.make_step(u.reshape((-1, 1)))
Controller
setpoint = np.array([1.0, 0.0, 0.0])
cost = casadi.norm_2(mpc_model.x.cat - setpoint) - 100 * mpc_model.x["x0"]
terminal_cost = cost
stage_cost = cost
print(f"Stage Cost = {stage_cost}")
print(f"Terminal Cost = {terminal_cost}")
Stage Cost = (sqrt(((sq((x0-1))+sq(x1))+sq(x2)))-(100*x0))
Terminal Cost = (sqrt(((sq((x0-1))+sq(x1))+sq(x2)))-(100*x0))
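As a sanity check on the cost design: the Euclidean distance to the upright setpoint is dominated by the \(-100 \, x_0\) term, which strongly rewards \(x_0 = \cos\theta \approx 1\), i.e. the upright position. A quick numeric check of the same expression (a numpy stand-in for the CasADi cost, written by us):

```python
import numpy as np

setpoint = np.array([1.0, 0.0, 0.0])

def stage_cost(x):
    """Numpy version of the MPC cost: ||x - setpoint||_2 - 100 * cos(theta)."""
    return np.linalg.norm(x - setpoint) - 100 * x[0]

upright = np.array([1.0, 0.0, 0.0])   # theta = 0
hanging = np.array([-1.0, 0.0, 0.0])  # theta = pi
```

Evaluating the cost at the two extremes confirms that the upright state is much cheaper than the hanging state, so the optimizer is pushed toward the swing-up.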
u_limits = {"u0": np.array([-2, 2])}
u_penalty = {"u0": 0.00}
x_limits = {"x0": np.array([-1, 1]), "x1": np.array([-1, 1]), "x2": np.array([-8, 8])}
mpc_controller = build_mpc_controller(
    model=mpc_model,
    t_step=env.dt,
    n_horizon=50,
    stage_cost=stage_cost,
    terminal_cost=terminal_cost,
    x_limits=x_limits,
    u_penalty=u_penalty,
    u_limits=u_limits,
)
Simulation
Show code cell source
%%capture
mpc_controller.reset_history()
simulator.reset_history()
x = np.zeros((3, 1))
# random angle
theta0 = np.random.uniform(low=-np.pi, high=np.pi)
# cosine and sine
x[0] = np.cos(theta0)
x[1] = np.sin(theta0)
# angular velocity
x[2] = np.random.uniform(low=-8, high=8)
simulator.x0 = x
mpc_controller.x0 = x
mpc_controller.set_initial_guess()
for k in range(100):
    u = mpc_controller.make_step(x)
    x = simulator.make_step(u)
Environment
class MPCController:
    def __init__(self, mpc: MPC) -> None:
        self.mpc = mpc
        self.mpc.reset_history()
        x0 = np.zeros((3, 1))
        # random angle
        theta0 = np.random.uniform(low=-np.pi, high=np.pi)
        # cosine and sine
        x0[0] = np.cos(theta0)
        x0[1] = np.sin(theta0)
        # angular velocity
        x0[2] = np.random.uniform(low=-8, high=8)
        self.mpc.x0 = x0
        self.mpc.set_initial_guess()

    def act(self, observation: NDArray) -> NDArray:
        return self.mpc.make_step(observation.reshape(-1, 1)).ravel()
%%capture
controller = MPCController(mpc_controller)
results = simulate_environment(env, max_steps=100, controller=controller)