Constrained Optimal Control Problem

The process of setting a specific optimal control problem involves a number of steps:

Creating a system of mathematical relationships that describe the controlled object.
Defining the control action.
Generating the criterion function and indicating the optimization direction.
Imposing constraints on the system trajectory and on the control action.
Defining the optimization period.

Depending on the phenomenon type and the required level of detail, various types of equations can be used to build the mathematical model: ordinary differential equations, equations with aftereffect, stochastic equations, partial differential equations, and so on.

The state of the object depends on the control. The optimal control is the one that optimizes the criterion function of the problem.

Problems of optimizing socio-economic system development belong to the class of discrete dynamic control problems. Depending on the problem and on the availability of retrospective data, a month or a year can be used as the time step.

Non-linear Optimal Control Problem

Generally, a non-linear optimal control problem can be represented as follows: the state of the controlled object at the time moment t is described by the phase coordinates vector x(t) and the control u(t). Thus, the process is completely determined once the control u(t) (for t > t0, where t0 is the initial moment in time) and the initial phase state x0 = x(t0) are defined.

Input parameters:

T. Optimization period.
X(t). Phase variables vector {xi(t)}. Each variable is described with a structure that looks as follows:
- Fi(t). Expression to calculate the value of xi(t) for the entire period from 1 to T. Generally, Fi(t) depends on X(1),…,X(t-1) and U(1),…,U(t).
- x_Lbi(t), x_Ubi(t). Constraints on the range of variable values for the entire period from 1 to T.
- Ri(t). Retrospective values of the phase variable for the moments 0, -1, -2,…, -MaxLag.
U(t). Vector of controlling variables {uj(t)}, each variable is described with a structure that looks as follows:
- u_Lbj(t), u_Ubj(t). Constraints for the range of controlling variable values for the entire period from 1 to T.
- Rj(t). Retrospective values of the controlling variable for the period 0, -1, -2,… -MaxLag.
- Initj(t). Initial approximation of the controlling variable for each of the moments of time t = 1…T.
Exp. Constraints on relationships between phase and controlling variables:
- Expk(X(t),U(t)). Main part of the expression.
- e_Lb(t), e_Ub(t). Lower and upper bounds of the expression at each moment of time t = 1…T.
ObjFun(X(t),U(t)). Criterion function.
Optimization direction.
Parameters that describe technical points of the calculation:
- Type of the recurrent expressions' estimation (recurrent substitution of values, expression substitution).
- A set of parameters for optimization method (optimization method, accuracy, maximum number of iterations).

Output parameters:

Optimal value of the criterion function.
Ũ(t). Controlling variables' values that correspond to optimal value of the criterion function.
X̃(t). Phase variables' values that correspond to optimal control Ũ(t).

Consider an example of a simple problem for T = 4.

Phase variables:
- x1. First phase variable.
- x2. Second phase variable.
Controlling variables:
- u(t). First controlling variable.
- v(t). Second controlling variable.
Constraints: -1000 < x1[t] * x2[t] + u[t] * v[t] < 1000.
Criterion function: ObjFun(X(4),U(4)) = x1[4]^2 + x2[4]^2.

Solution: the problem reduces to an equivalent non-linear programming problem. There are two ways of finding the solution:

Using recurrent substitution of values. To get the criterion function value, sequentially calculate the values of the phase variables x1[t] and x2[t] from t = 1 to t = T. This recurrent procedure must be executed again each time the values of the controlling variables change.
Using expression substitution. To get the criterion function value, sequentially express the phase variables x1[t] and x2[t] through the controlling variables from t = 1 to t = T. x1[4] and x2[4] are expressed at the last step. The obtained expressions are inserted into the criterion function.
The described process is executed only once, at the beginning of calculations. As a result, the phase variables are excluded from the problem, and only the controlling variables remain. This approach has the following disadvantage: the expression size grows very quickly as the number of variables or the control period T increases.
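
The recurrent substitution of values can be sketched in Python as follows. This is only an illustration: the dynamics equations for x1 and x2, their coefficients, and the initial values x1[0] = x2[0] = 1 are hypothetical, because the example above does not list them; the function names simulate, objective, and constraint_values are also chosen here for illustration. Only T = 4, the criterion function, and the form of the constraint expression are taken from the example.

```python
from typing import List, Sequence, Tuple

T = 4  # optimization period, as in the example


def simulate(u: Sequence[float], v: Sequence[float],
             x1_0: float = 1.0, x2_0: float = 1.0) -> Tuple[List[float], List[float]]:
    """Recurrent substitution of values: compute x1[t], x2[t] for t = 1..T.

    u[t-1] and v[t-1] hold the controls for the moment t.
    The initial values x1_0, x2_0 play the role of retrospective data (assumed).
    """
    x1 = [x1_0]
    x2 = [x2_0]
    for t in range(1, T + 1):
        # Hypothetical dynamics, not taken from the source document.
        x1_t = 0.5 * x1[t - 1] + u[t - 1]
        x2_t = 0.5 * x2[t - 1] + 0.25 * x1[t - 1] + v[t - 1]
        x1.append(x1_t)
        x2.append(x2_t)
    return x1, x2


def objective(u: Sequence[float], v: Sequence[float]) -> float:
    """Criterion function x1[4]^2 + x2[4]^2; reruns the recurrence on every call."""
    x1, x2 = simulate(u, v)
    return x1[T] ** 2 + x2[T] ** 2


def constraint_values(u: Sequence[float], v: Sequence[float]) -> List[float]:
    """Values of x1[t] * x2[t] + u[t] * v[t] for t = 1..T (each must lie in (-1000, 1000))."""
    x1, x2 = simulate(u, v)
    return [x1[t] * x2[t] + u[t - 1] * v[t - 1] for t in range(1, T + 1)]


zero = [0.0] * T
print(objective(zero, zero))
print(constraint_values(zero, zero))
```

An optimizer would call objective repeatedly with different control values, which is why the whole recurrence from t = 1 to t = T is recomputed on each call.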

Optimization: the obtained problem satisfies conditions of a non-linear programming problem regarding controlling variables U. It includes a criterion function and variables restricted with constraints of any kind. The following three optimization methods are implemented:

Grid Search. This method is computationally expensive, because the number of criterion function evaluations grows exponentially with the number of variables. It can be used only for small-scale problems.
Non-Linear Simplex. The method ignores non-linear constraints on the variables; therefore, its applicability is also limited.
Sequential Quadratic Programming Method. This is a method of solving non-linear programming problems. It requires initial approximations for all the controlling variables and takes non-linear constraints on the controlling variables into account.
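
To illustrate why grid search suits only small-scale problems, below is a minimal sketch of the idea in Python. It is not the implementation used by the product: the function grid_search, its parameters, the grid, and the test criterion function are all assumptions made for this illustration. The sketch enumerates every combination of grid values, skips infeasible points, and keeps the best feasible one (minimization is assumed).

```python
import itertools
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def grid_search(objective: Callable[[Point], float],
                feasible: Callable[[Point], bool],
                grid: Sequence[float],
                n_vars: int) -> Tuple[Optional[Point], float]:
    """Exhaustively evaluate the objective on a grid and return the best feasible point.

    The number of evaluated points is len(grid) ** n_vars, so the cost
    grows exponentially with the number of controlling variables.
    """
    best_point: Optional[Point] = None
    best_value = float("inf")
    for point in itertools.product(grid, repeat=n_vars):
        if not feasible(point):
            continue  # skip points that violate the constraints
        value = objective(point)
        if value < best_value:
            best_point, best_value = point, value
    return best_point, best_value


# Hypothetical two-variable problem used only to exercise the function.
grid = [-3, -2, -1, 0, 1, 2, 3]
best_point, best_value = grid_search(
    objective=lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2,
    feasible=lambda p: -1000 < p[0] * p[1] < 1000,
    grid=grid,
    n_vars=2,
)
print(best_point, best_value)
```

With 7 grid values and 2 variables the search evaluates 49 points; with 8 controlling values (for example, two controls over T = 4) the same grid would already require 7^8 = 5,764,801 evaluations.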

Linear Optimal Control Problem

This is a special case of the non-linear problem in which all expressions are linear. This enables the user to apply linear optimization methods to find the solution.

Similar to a non-linear problem, a linear problem includes a criterion function, a set of equations that describe the dynamics and state of phase variables, and constraints imposed on phase and controlling variables.

Introduce the notations:

X(t) - (k×1). Vector that describes system state at the time moment t.
U(t) - (l×1). Vector of controlling variables at the time moment t.
k. Number of phase variables.
l. Number of controlling variables.

System dynamics is described with an autoregression equation system:

X(t) = A_1·X(t-1) + … + A_p·X(t-p) + B_0·U(t) + B_1·U(t-1) + … + B_q·U(t-q), where t = 1…T

The system state at the time moment t depends on the p previous states, on the current control U(t), and on the q previous controls.
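
The autoregression equation can be evaluated step by step, as in the following Python sketch. The function simulate_linear, the helper functions, and all numeric values (k = 1, l = 1, p = 2, q = 1, the matrices, the initial values, and the controls) are hypothetical and serve only to show how the lags of X and U enter the calculation.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def simulate_linear(A: List[Matrix], B: List[Matrix],
                    x_hist: Dict[int, Vector], u: Dict[int, Vector],
                    T: int) -> Dict[int, Vector]:
    """Compute X(1)..X(T) from the autoregression equation.

    A = [A_1, ..., A_p], B = [B_0, ..., B_q].
    x_hist maps t = 0, -1, ..., -p+1 to the initial states X(t).
    u maps t = -q+1, ..., T to the controls U(t).
    """
    k = len(x_hist[0])  # number of phase variables
    x = dict(x_hist)
    for t in range(1, T + 1):
        state = [0.0] * k
        for i, a_i in enumerate(A, start=1):   # A_i * X(t-i), i = 1..p
            state = add(state, mat_vec(a_i, x[t - i]))
        for j, b_j in enumerate(B):            # B_j * U(t-j), j = 0..q
            state = add(state, mat_vec(b_j, u[t - j]))
        x[t] = state
    return x


# Hypothetical data: k = 1, l = 1, p = 2, q = 1, T = 2.
A = [[[0.5]], [[0.25]]]            # A_1, A_2
B = [[[1.0]], [[0.5]]]             # B_0, B_1
x_hist = {0: [1.0], -1: [2.0]}     # X(0), X(-1)
u = {0: [0.0], 1: [1.0], 2: [0.0]} # U(0), U(1), U(2)
x = simulate_linear(A, B, x_hist, u, T=2)
print(x[1], x[2])
```

Because every step is a linear combination of earlier states and controls, X(T) is itself a linear function of the controls U(1), …, U(T), which is what makes linear optimization methods applicable.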

It is required to optimize the linear function F(XT,UT) → extr subject to the constraints imposed on the phase and controlling variables and with the defined initial values: X(0), X(-1), …, X(-p+1), U(0), U(-1), …, U(-q+1)

The problem is solved by finding the values of the controlling variables that minimize or maximize the criterion function, depending on the specified optimization direction.