In mathematical terms, an optimization problem is simply the maximization (or minimization) of a function over its domain. This function is commonly referred to as the objective function in the literature. A straightforward approach to the optimization problem is to evaluate every possible solution and then pick the best one. This brute-force technique requires considerable computational resources and is therefore feasible only for small problems. The whole idea behind optimization algorithms is to find the extrema of the objective function with as few operations as possible.
Most often, the optimization problem is solved systematically by selecting values for the input variables from within the allowed domain and evaluating the output of the objective function. The rationale that generates the values of the input variables in each iteration is what defines the optimization algorithm itself. Through this process the solution is gradually “improved” and, ideally, it finally converges to the best solution available overall.
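This select-evaluate-improve loop can be sketched with a deliberately naive random search; the objective, bounds, and sample budget below are invented for illustration, not a practical optimizer:

```python
import random

def random_search(objective, lower, upper, iterations=10_000, seed=0):
    """Maximize `objective` over [lower, upper] by uniform random sampling.

    Each iteration selects a candidate from the allowed domain, evaluates
    the objective function, and keeps the best solution found so far.
    """
    rng = random.Random(seed)
    best_x, best_f = None, float("-inf")
    for _ in range(iterations):
        x = rng.uniform(lower, upper)   # select a value from the domain
        f = objective(x)                # evaluate the objective function
        if f > best_f:                  # keep the improvement
            best_x, best_f = x, f
    return best_x, best_f

# Toy objective with a known maximum at x = 3.
best_x, best_f = random_search(lambda x: -(x - 3) ** 2, 0.0, 10.0)
```

Real algorithms differ precisely in the rationale used to propose the next candidate; here it is pure chance, which is why the approach scales poorly.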
In practical applications, the objective function f(x), to be minimized or maximized, is subject to constraints in the form of equalities and inequalities. These constraints define the domain of f(x). Therefore, the optimization problem is stated mathematically as follows (Mathworks, 2019):

minimize f(x)

with respect to:

g_i(x) = 0, i = 1, ..., n

h_j(x) ≤ 0, j = 1, ..., m

In this formulation:
- The vector x represents the problem variables.
- The function f(x) represents the optimization objective(s).
- The expressions g_i(x) = 0 represent the n equality constraints.
- The expressions h_j(x) ≤ 0 represent the m inequality constraints.
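Assuming SciPy is available, a problem of this general form can be sketched with `scipy.optimize.minimize`; the specific objective and constraints below are invented for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Objective to be minimized: f(x) = x0^2 + x1^2.
f = lambda x: x[0] ** 2 + x[1] ** 2

constraints = [
    # One equality constraint: x0 + x1 - 1 = 0.
    {"type": "eq", "fun": lambda x: x[0] + x[1] - 1},
    # One inequality constraint, 0.2 - x0 <= 0; SciPy expects
    # inequalities in the form fun(x) >= 0, hence the sign flip.
    {"type": "ineq", "fun": lambda x: x[0] - 0.2},
]

result = minimize(f, x0=np.array([2.0, 0.0]), method="SLSQP",
                  constraints=constraints)
```

For this toy problem the minimizer of x0² + x1² on the line x0 + x1 = 1 is (0.5, 0.5), which also satisfies the inequality constraint.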
It is worth mentioning that, in the general case, the constraints of the objective function may even be non-linear.
The objective function f(x) can be either a scalar or a vector. When f(x) is a scalar, the optimization problem is “single-objective”: the input variables have to optimize a unique (numeric) objective. Consider a factory producing and selling multiple products with finite production capacity. The production planning process could be the result of single-objective optimization when the problem is formulated as follows:
Select the optimal production quantity per product (input variables) in order to maximize revenue from sales (objective function – single objective) with respect to production capacity constraints.
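A toy version of this single-objective plan can be sketched as a linear program, assuming SciPy is available; all revenue figures and capacity limits below are invented for illustration:

```python
from scipy.optimize import linprog

# Hypothetical data: two products with a revenue per unit sold.
revenue = [40, 30]

# linprog minimizes, so negate the revenue in order to maximize it.
c = [-r for r in revenue]

# Production capacity constraints, A_ub @ x <= b_ub:
#   machine-hours: 1*x1 + 2*x2 <= 40
#   labour-hours:  2*x1 + 1*x2 <= 50
A_ub = [[1, 2], [2, 1]]
b_ub = [40, 50]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
quantities = res.x       # optimal production quantity per product
max_revenue = -res.fun   # maximized revenue from sales
```

For these numbers the optimum produces 20 units of product 1 and 10 units of product 2, for a revenue of 1100.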
When f(x) is a vector, the optimization problem is “multi-objective”. As stated in Wikipedia, multi-objective optimization (MOO) is the area of multiple-criteria decision making concerned with mathematical optimization problems involving more than one objective function. This formulation involves multiple (numeric) objectives that need to be maximized (or minimized) simultaneously. The production planning process could be the result of multi-objective optimization when the problem is formulated as follows:
Select the optimal production quantity per product (input variables) in order to maximize revenue from sales and simultaneously minimize operating costs (objective function – double objective) with respect to production capacity constraints.
In general, MOO problems are hard, since the objectives are often conflicting; it is therefore very difficult, or even impossible, to design strategies that are optimal for all objectives simultaneously. The goodness of a solution in MOO is determined by the notion of dominance. A solution x1 is said to dominate another solution x2 if and only if (Deb, 2001):
- Solution x1 is no worse than x2 in all problem objectives.
- Solution x1 is strictly better than x2 in at least one problem objective.
When the problem objectives are conflicting, the multi-objective optimization problem does not have a single dominant optimal solution. However, it has a set of incomparable optimal solutions: each is inferior to the others in some objectives and superior in others. The non-dominated set of the entire feasible decision space (i.e., of the set of solutions in the objective function's domain) is the so-called Pareto set. The boundary defined by the set of all points mapped from the Pareto set is called the Pareto-optimal front (Deb, 2001).
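The dominance test and the extraction of the non-dominated (Pareto) set can be sketched in a few lines; the candidate objective vectors below are invented, and both objectives are assumed to be maximized:

```python
def dominates(f1, f2):
    """Return True if objective vector f1 dominates f2 (maximization).

    Following Deb (2001): f1 is no worse than f2 in every objective
    and strictly better in at least one.
    """
    no_worse = all(a >= b for a, b in zip(f1, f2))
    strictly_better = any(a > b for a, b in zip(f1, f2))
    return no_worse and strictly_better

def pareto_set(solutions):
    """Keep only the non-dominated objective vectors (the Pareto set)."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]

# Hypothetical objective vectors for five candidate solutions.
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
non_dominated = pareto_set(candidates)
```

Here (2, 2) and (1, 1) are dominated, while (1, 5), (2, 4) and (3, 3) are mutually incomparable and form the Pareto set.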
When maximizing two conflicting objectives, all the solutions lying on the Pareto front are optimal, but none dominates the rest. Selecting a single solution is therefore a trade-off between the objectives, and the decision relies largely on domain expertise.
Figure: Pareto-optimal front for two conflicting maximization objectives. Source: Santín et al., 2017.
The previous discussion describes important concepts in mathematical optimization and shows that optimization problems can be classified into:
- Unconstrained vs Constrained Optimization: depending on whether the input variables are subject to constraints.
- Single vs Multi-objective Optimization: depending on whether the objective function is a scalar or a vector.
- Convex vs Non-Convex Optimization: depending on whether the objective function has a unique, global optimum or not.
When the objective function is convex, it has a unique, global extremum (maximum or minimum), and convergence to the (true) optimal solution can be guaranteed. When the objective function is non-convex, it has multiple local extrema; convergence to the (true) optimal solution is uncertain and, if achieved at all, may take significant time. The formal definition of convexity is slightly more involved but very interesting for optimization: it has to do with the Hessian matrix formed by the second-order partial derivatives (a twice-differentiable function is convex if its Hessian is positive semidefinite everywhere). Boyd and Vandenberghe (2009) cover this in great detail.
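The practical consequence can be sketched with plain gradient descent: on a convex objective, any starting point reaches the global minimum, while on a non-convex objective the answer depends on where the search starts. The objectives and step size below are invented for illustration:

```python
def gradient_descent(grad, x0, lr=0.01, iters=500):
    """Plain gradient descent: repeatedly step against the gradient."""
    x = x0
    for _ in range(iters):
        x -= lr * grad(x)
    return x

# Convex objective f(x) = x^2 (gradient 2x): every start converges
# to the unique global minimum at x = 0.
convex_min = gradient_descent(lambda x: 2 * x, x0=5.0)

# Non-convex objective f(x) = x^4 - 2x^2 (gradient 4x^3 - 4x) has two
# local minima, at x = -1 and x = 1; the result now depends on the start.
right_min = gradient_descent(lambda x: 4 * x**3 - 4 * x, x0=2.0)
left_min = gradient_descent(lambda x: 4 * x**3 - 4 * x, x0=-2.0)
```

Starting from x = 2 the search settles near x = 1, while starting from x = -2 it settles near x = -1; neither run can tell on its own whether it has found the global minimum.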
Finally, an important taxonomy for optimization problems is the distinction between Continuous and Discrete Optimization. Some optimization problems only make sense if the input variables take values from a discrete set, often a subset of the integers, whereas other problems define variables that can take on any real value.
In continuous optimization, the objective function may be linear (see: simplex methods) or non-linear (see: descent methods, quadratic programming, or heuristics) with respect to the input variables. In discrete optimization, the input variables may take only integer values (see: integer programming, combinatorial methods) or a mix of integer and continuous values (see: mixed-integer programming).
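For a small discrete problem, the brute-force enumeration mentioned at the start becomes feasible again. A sketch with invented integer production data, maximizing revenue over all feasible integer plans:

```python
from itertools import product

# Hypothetical discrete production problem: integer quantities of two
# products, each unit consuming shared capacity.
revenue = (40, 30)       # revenue per unit of each product
capacity_use = (3, 2)    # capacity consumed per unit of each product
capacity = 12            # total available capacity

best_plan, best_revenue = None, float("-inf")
# Enumerate every integer plan in the (small) bounding box of quantities.
for plan in product(range(capacity // capacity_use[0] + 1),
                    range(capacity // capacity_use[1] + 1)):
    used = sum(q * c for q, c in zip(plan, capacity_use))
    if used <= capacity:  # keep only feasible plans
        value = sum(q * r for q, r in zip(plan, revenue))
        if value > best_revenue:
            best_plan, best_revenue = plan, value
```

For these numbers the best integer plan is 0 units of product 1 and 6 units of product 2, for a revenue of 180; on larger instances this enumeration explodes combinatorially, which is exactly why integer-programming methods exist.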