A practical guide to controls engineering for Robotics and AI, building up to an implementation of Model Predictive Control.
Control engineering for Robotics and AI
Introduction to control engineering:
Control engineering is the field of engineering concerned with "controlling" systems. But what does this actually mean?
The dictionary definition of control is to "determine the behaviour or supervise the running of". In control engineering I think it is more apt to change the definition to "determine the behaviour and supervise the running of". Think of an engineering problem and how you might apply this concept.
This might include making an aeroplane fly. The first step is to determine the aeroplane's behaviour. If an extraterrestrial being has never seen an aeroplane before and is just told to "make it fly", the result will probably not be good. You might end up with the being deconstructing it and throwing the individual pieces, disintegrating it with a laser gun and spreading the dust in front of a fan, or something entirely more absurd.
We humans know that the plane has certain mechanics, or rules for its motion. We know that if we make it move fast enough, it will start to generate lift on its wings, and elevate into the sky. This is an example of a plane's behaviour. Once we have determined this, we can focus on controlling the flight. Without having determined this behaviour, we will not be able to use a logical method to control the flight. We might get lucky with trial and error, but that is not what engineers do. Without determining this relationship between horizontal speed and vertical lift, we might spend entire lifetimes driving our planes at 10mph and wondering when we will fly. (If you look up the history of the aeroplane, you will see why taking the "determining behaviour approach" through mechanics first might save a lot of time and lives)
The next step is supervising the running of the system. In the case of an aeroplane, we want to make sure that when the pilot sets a speed and direction, the plane maintains that speed and direction.
Sometimes we will encounter disturbances, such as turbulence. A good control system will be robust to disturbances, meaning that it can quickly and smoothly recover from or reject them.
Control Systems
A control system drives some quantity of interest to a desired value. That might mean controlling the speed of a car to a desired setpoint, keeping the orientation of a satellite pointed at the Earth, and so on.
Take the example of a car's cruise control. How can we make a cruise control that keeps the car's speed at our desired setpoint?
First let us define our system. The car's speed, what we want to control, is our output. This is the main value we care about.
To change the output directly, we have to control an input (or set of inputs). In our case, this will be the engine's throttle. Open it more, and more fuel or air enters the engine, making the engine work harder and eventually driving the car to move faster. Close the throttle, and less fuel and air get to mix and combust, and we end up with a slower speed.
So in this case our input might be the "openness" of our throttle.
Then we have to consider the current state of the system. What is our actual speed right now? The car's speed therefore is one of our states. But in order to see this state, we need to either measure or estimate this. We use a sensor (or observer), or a set of sensors for this. In our case, this is the car's speedometer.
You might notice that I mentioned speed being our output and a state. This is because the output is defined as a linear combination of our states and inputs. The states are the core values of the system, and our output tends to be the states we care to control, or some combination of the states we care to control.
Finally, we need a controller. This controller will be what decides how much to change our input, to reach our desired output. In our case, the car computer will decide how much to open or close the engine throttle to maintain our desired speed.
The controller works by finding out how far away from the setpoint our current output is. Let us call the setpoint our reference from now on. There is a lot of nomenclature in control, sorry. It's best to get used to using these words now before you end up all confused in a conference of control engineers discussing MPCs and PIDs and SSE.
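The feedback idea described above can be sketched in a few lines of Python. Everything here is illustrative: the gain, the clamp limits, and the one-line stand-in for the car's dynamics are all made-up numbers, not a real cruise controller.

```python
# Illustrative cruise-control loop. The gain and the "car dynamics" line are
# invented for demonstration; a real controller is far more involved.
def controller(reference, measured_speed, gain=0.01):
    """Decide a throttle change from how far the output is from the reference."""
    error = reference - measured_speed   # how far we are from the setpoint
    return gain * error                  # open the throttle more when too slow

throttle = 0.2      # current "openness" of the throttle (0 to 1)
speed = 20.0        # measured speed from the speedometer (m/s)
reference = 30.0    # desired setpoint (m/s)

for _ in range(300):
    throttle += controller(reference, speed)
    throttle = min(max(throttle, 0.0), 1.0)   # a throttle has physical limits
    # Crude stand-in for the car: thrust proportional to throttle, plus drag.
    speed += 5.0 * throttle - 0.1 * speed

print(speed)   # settles near the 30 m/s setpoint
```

Because the throttle accumulates the error over time, this loop eventually removes the steady-state error, at the cost of some oscillation on the way to the setpoint.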
Generic control strategies
Bang bang
PID
State Space Modelling
Discretisation
Euler discretisation
$$\dot{x}(t) = A_cx(t)+B_cu(t)$$
$$y(t) = C_cx(t) + D_cu(t)$$
We would like to discretise this system using the Euler method. Euler discretisation uses a forward divided difference, so it is computationally cheap, although it needs a sufficiently small timestep to be accurate.
Assume that we sample at integer multiples of a fixed interval, with $t_s$ as our timestep length and $k$ as the integer multiplier, so that $t = kt_s$.
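Writing $x_k = x(kt_s)$ and approximating the derivative with a forward divided difference gives:

$$\frac{x_{k+1} - x_k}{t_s} \approx A_cx_k + B_cu_k$$

$$x_{k+1} \approx (I + t_sA_c)x_k + t_sB_cu_k$$

So the Euler-discretised system matrices are simply $A_d = I + t_sA_c$ and $B_d = t_sB_c$.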
Then we can restate this as $\tau = -\lambda$. Yes, I know that redefining $\tau$ like this is mathematically sloppy, but I am an engineer: I had already used tau, and in the slide with the final equation I wanted, my professor used tau :)
Thus rather than compute integrals, and take many matrix exponentials, we only need to take one matrix exponential and then extract our terms.
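As a concrete sketch of the "one matrix exponential" trick, here is a SciPy version. The system ($A_c$, $B_c$, the 0.1 s timestep) is an assumed double-integrator example, not anything from the text: we exponentiate the augmented matrix $\begin{bmatrix} A_c & B_c \\ 0 & 0 \end{bmatrix}t_s$ once and read the discrete-time matrices off its top blocks.

```python
import numpy as np
from scipy.linalg import expm

# Illustrative continuous-time system: a double integrator sampled at 10 Hz.
A_c = np.array([[0.0, 1.0],
                [0.0, 0.0]])
B_c = np.array([[0.0],
                [1.0]])
t_s = 0.1
n, m = B_c.shape

# Build the augmented matrix [[A_c, B_c], [0, 0]], take a single matrix
# exponential, then extract the discrete-time A and B from the top blocks.
M = np.zeros((n + m, n + m))
M[:n, :n] = A_c
M[:n, n:] = B_c
E = expm(M * t_s)
A_d = E[:n, :n]   # [[1, 0.1], [0, 1]]
B_d = E[:n, n:]   # [[0.005], [0.1]]
```

This gives the exact zero-order-hold discretisation, with no explicit integral to evaluate.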
MPC
MPC is an advanced control strategy that uses a mathematical model to predict future states of the system. It then uses these predictions to compute the best sequence of control inputs it can find to reach a desired output. It predicts a fixed amount of time into the future, called the time horizon, and it creates a control sequence for a separate fixed amount of time, called the control horizon. The control horizon is never longer than the time horizon.
The thing is, this method alone is not robust: if we followed the sequence of inputs blindly and an unexpected disturbance appeared, our system would be derailed entirely and never reach our target output or state. So what we do is execute only the first input of our optimal input sequence, and then recompute the inputs over the next time horizon.
This has 2 distinct advantages:
The system can adapt to overcome disturbances as they happen, and using the model the disturbance can be observed and "rejected" in real time
The policy can be subject to constraints. Because we are dealing with a constantly computed optimisation problem, unlike traditional robust control methodologies, we can apply multiple constraints in the controller design itself
This also has 2 distinct disadvantages:
It is computationally expensive, especially for long time horizons and when a terminal cost is included. This means that for quickly sampled systems, or on low-powered devices, this technique won't be very effective.
It requires a good model, so it will not work for a black-box system, and it is not robust to unmodelled dynamics.
Unconstrained stabilisation with QP
We predict $x_{t+k}$ states, where $1 \leq k \leq N$ and $N$ is the number of steps in our time horizon
$$\begin{aligned}
x_t &= x(t) \\
x_{t+1} &= Ax_t + Bu_t \\
x_{t+2} &= Ax_{t+1} + Bu_{t+1} = A^2x_t + ABu_t + Bu_{t+1} \\
&\;\;\vdots \\
x_{t+N} &= A^Nx_t + A^{N-1}Bu_t + A^{N-2}Bu_{t+1} + \dots + Bu_{t+N-1}
\end{aligned}$$
Then we would like to state our states as a function of our initial state,
$$X_t = Fx(t) + \phi U_t$$
This equation allows us to work out all of our states for $N$ steps into the future, given our initial state at time $t$ and a sequence of inputs. We will use a quadratic program to work out the optimal input sequence $U_t^*$ given the current state, as well as our discrete-time $A$ and $B$ matrices.
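Written out under one common convention (stacking the predictions $x_{t+1}, \dots, x_{t+N}$ into $X_t$ and the inputs $u_t, \dots, u_{t+N-1}$ into $U_t$; other indexing conventions exist), $F$ stacks powers of $A$ and $\phi$ is block lower-triangular:

$$F = \begin{bmatrix} A \\ A^2 \\ \vdots \\ A^N \end{bmatrix}, \quad
\phi = \begin{bmatrix} B & 0 & \dots & 0 \\ AB & B & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A^{N-1}B & A^{N-2}B & \dots & B \end{bmatrix}$$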
In order to find our $U_t^*$ we must come up with an optimal control problem.
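A standard choice is a quadratic cost over the horizon (the exact index ranges vary between texts):

$$U_t^* = \arg\min_{U_t} \sum_{k=1}^{N} x_{t+k}^\top Q x_{t+k} + \sum_{k=0}^{N-1} u_{t+k}^\top R u_{t+k}$$

where $Q \succeq 0$ weights state deviation and $R \succ 0$ weights control effort.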
$Q$ and $R$ are tunable weights: a higher $Q$ penalises state deviation from 0, and a higher $R$ penalises control effort. Think about it: your controller wants to use the minimum effort to drive your state to 0. Sometimes you don't need to limit control effort much, and that gives you a quicker response. But sometimes low effort matters more than a quick response, such as an autopilot car's acceleration.
And that concludes our unconstrained MPC for stabilisation, that is, driving the state to 0.
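To make the whole pipeline concrete, here is a minimal NumPy sketch. The system is an assumed discrete-time double integrator, and $N$, $Q$, $R$ are illustrative choices. Since there are no constraints, the QP has a closed-form solution, $U_t^* = -(\phi^\top \bar{Q}\phi + \bar{R})^{-1}\phi^\top \bar{Q}Fx_t$, which stands in for a general QP solver here:

```python
import numpy as np

# Assumed discrete-time double integrator (0.1 s sample time), for illustration.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.005],
              [0.1]])
n, m = B.shape
N = 20                    # prediction horizon, in steps (illustrative)
Q = np.eye(n)             # penalises state deviation from 0
R = 0.1 * np.eye(m)       # penalises control effort

# Build F and Phi so that X = F x_t + Phi U stacks x_{t+1} ... x_{t+N}.
F = np.vstack([np.linalg.matrix_power(A, k) for k in range(1, N + 1)])
Phi = np.zeros((N * n, N * m))
for i in range(N):
    for j in range(i + 1):
        Phi[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B

Qbar = np.kron(np.eye(N), Q)   # block-diagonal stack of Q along the horizon
Rbar = np.kron(np.eye(N), R)   # block-diagonal stack of R along the horizon

def mpc_input(x):
    """Solve the unconstrained QP in closed form; return only the first input."""
    H = Phi.T @ Qbar @ Phi + Rbar
    f = Phi.T @ Qbar @ F @ x
    U = np.linalg.solve(H, -f)   # U* = -H^{-1} f
    return U[:m]                 # receding horizon: keep just u_t

# Simulate the closed loop: apply the first input, then re-solve every step.
x = np.array([1.0, 0.0])
for _ in range(200):
    x = A @ x + B @ mpc_input(x)

print(np.linalg.norm(x))   # the state is driven towards 0
```

Re-solving at every step is exactly the receding-horizon idea from earlier: only `U[:m]` is ever applied, and the rest of the sequence is discarded.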
Unconstrained tracking with QP
Track output to a reference $r(t)$.
We assume the reference is constant. Why? Because we implement only the first control action in our sequence, so over the interval $t_s$ our reference doesn't change.
So we can say $r(t) = r_c$
We want to minimise the difference between the output and our reference, so this time we need to express the next output in terms of the previous output, the change in state, and the change in input. These increments represent the state and input added over a single timestep.
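Concretely, these increments are the differences between consecutive samples:

$$\Delta x(t) = x(t) - x(t-1), \qquad \Delta u(t) = u(t) - u(t-1)$$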
Thus we will need a new set of equations, starting with state equations that contain a constant unknown disturbance
$$\begin{equation}\begin{bmatrix} \Delta x(t+1) \\ y(t+1) \end{bmatrix}
= \begin{bmatrix} A & 0 \\ CA & I \end{bmatrix}
\begin{bmatrix} \Delta x(t) \\ y(t) \end{bmatrix} +
\begin{bmatrix} B & 0 \\ CB & D \end{bmatrix} \begin{bmatrix} \Delta u(t) \\ \Delta u(t+1) \end{bmatrix}\end{equation}$$
And then our new output is
$$\begin{equation}y_a(t) = \begin{bmatrix} 0 & I \end{bmatrix}
\begin{bmatrix} \Delta x(t) \\ y(t) \end{bmatrix}\end{equation}$$
This gives us our augmented state space, so we have