Fundamentals of Optimization

Optimization is the art of getting the greatest value, whether it's finding the best deal on a new pair of shoes or operating a catalytic cracker at its point of maximum efficiency. It can be a relatively simple exercise when the choices are limited and the cost of each is clear, but some optimization problems can involve thousands of variables requiring sophisticated mathematical analysis.

By Vance J. VanDoren, Ph.D., P.E. March 1, 2006
  • Optimization explained

  • Basic terms, techniques

  • Optimizing satellite dish position

  • Applications to control engineering


However, certain steps are common to all optimization problems:

  • Identify the cost to be minimized or the benefit to be maximized;

  • Determine the manipulatable variables or independent parameters most affecting the cost or benefit; and

  • Search for the solution or optimal parameters yielding the lowest cost or highest benefit.
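The three steps above can be sketched in a few lines of Python. The satellite position, the candidate grid, and the cost function here are all made-up illustrations, not part of the original example:

```python
def cost(elevation):
    # Step 1: cost to minimize -- squared pointing error,
    # assuming a hypothetical satellite at 30 degrees elevation
    return (elevation - 30.0) ** 2

# Step 2: the manipulatable variable is the dish elevation (degrees)
candidates = range(0, 91, 5)

# Step 3: search for the parameter yielding the lowest cost
best = min(candidates, key=cost)
```

Here the search is an exhaustive sweep over a coarse grid; the techniques described below replace that brute-force step with something smarter.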

Television signal optimization

The cost function (1) for the satellite dish-positioning problem penalizes errors between e, the actual elevation of the dish, and e_optimal, the elevation that points directly towards the satellite. It is a quadratic function since the positioning error is squared, yielding a progressively larger cost as the error increases in either direction. This particular cost function also happens to have a single independent parameter, e, so its optimal solution occurs where the derivative of C(e) is zero, as in equation (2).
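That caption can be restated directly in code. This sketch assumes the quadratic form C(e) = (e - e_optimal)² described above, with a hypothetical satellite elevation of 30 degrees:

```python
e_opt = 30.0  # hypothetical elevation that points straight at the satellite

def C(e):
    # quadratic cost (1): squared error between actual and optimal elevation
    return (e - e_opt) ** 2

def dC(e):
    # derivative of C(e); setting it to zero, as in equation (2),
    # gives 2*(e - e_opt) = 0, i.e. e = e_opt
    return 2.0 * (e - e_opt)
```

The derivative is negative below e_opt and positive above it, which is exactly the "downhill" information the gradient method exploits.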
Searching for the strongest signal—when there are multiple satellites in the sky—introduces local minimums into the optimization problem. These appear as “dips” in the cost function C(e) corresponding to the positions of the satellites emitting weaker signals. In this example, an optimization routine that searches strictly from left to right or right to left would quit before finding the strongest signal with the overall minimum cost.

For example, suppose that the strength of a satellite TV signal drops off exponentially as the receiving dish is rotated increasingly out of line with the satellite. The cost function for this optimization problem is shown in the “Single-variable quadratic cost function” example, momentarily assuming that only the dish elevation is adjustable.

In the unlikely event that satellite position is known, this problem can be solved easily by rotating the dish to point directly at the satellite, thereby reducing the cost function to zero. However, trial-and-error is a far more common technique for more complex optimization problems, either because the cost function is not explicitly known or the optimal parameters are difficult to calculate analytically.

Trial-and-error techniques work well so long as a set of possible solutions can be guessed and the costs of each compared. With the satellite dish, any guess that points the dish towards the southern U.S. sky would be reasonable. The cost of each can be measured directly using a meter that becomes audibly louder as the signal strength increases.

Trial-and-error is more effective if results of prior guesses can be used to make future estimates more accurate. For the dish-pointing problem, that would entail noting the relative volume of previous meter readings and rotating the dish in the direction that yields a louder reading—as opposed to randomly pointing the dish, hoping to find a stronger signal. Mathematically, that procedure corresponds to the gradient method —essentially moving the best guess “downhill” along the cost function to reach the parameter yielding lowest cost.
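The gradient procedure just described can be sketched for the single-variable dish problem. The cost function, satellite position, step size, and iteration count are all illustrative assumptions:

```python
def dC(e):
    # derivative of a hypothetical quadratic cost (e - 30)^2,
    # with the satellite assumed to sit at 30 degrees elevation
    return 2.0 * (e - 30.0)

e = 10.0      # initial guess at the elevation
step = 0.25   # step size; too large a step would overshoot the minimum
for _ in range(100):
    e -= step * dC(e)   # rotate toward the "louder" (lower-cost) direction
```

Each pass moves the guess downhill in proportion to the slope, so the steps shrink as the dish approaches the optimal elevation.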


On the other hand, checking hundreds of randomly selected parameters for the one yielding the lowest cost also can be useful, since plotting the parameters against their associated costs yields an empirical cost-function chart. Then, the optimal parameter can be read right from the chart.

This Monte Carlo method is especially useful if the cost function happens to have more than one local minimum ; that is, multiple points on the cost function that are lower than their neighbors but are not the lowest overall. See the “Local minimums” example. Like the gradient method, systematic trial-and-error techniques tend to home in on one of the local minimums, without seeking better solutions elsewhere. That’s not as much of a problem with the Monte Carlo method or any other optimization technique that considers essentially all possible solutions.
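A minimal Monte Carlo sketch of the two-satellite situation follows. The signal model (a strong satellite at 30 degrees, a weaker one at 60) is invented purely to produce a cost function with two dips:

```python
import math
import random

def C(e):
    # hypothetical cost: negative combined signal strength from two
    # satellites, giving the cost function two local minimums
    strong = 10.0 * math.exp(-((e - 30.0) / 5.0) ** 2)
    weak = 4.0 * math.exp(-((e - 60.0) / 5.0) ** 2)
    return -(strong + weak)

random.seed(1)  # fixed seed so the run is repeatable
samples = [random.uniform(0.0, 90.0) for _ in range(1000)]
best = min(samples, key=C)   # lands near 30 deg, not the weaker dip at 60
```

Because the samples cover the whole range rather than walking downhill from a single starting point, the weaker satellite's dip cannot trap the search.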

A house’s eaves prevent the satellite dish from being rotated above a maximum elevation of e_max. The optimization algorithm is therefore constrained to consider just those parameters that are less than e_max, as in equation (3). If a high-flying satellite emits a particularly strong signal, the solution to this problem may well be the elevation that is as close to e_optimal as the constraint will allow; that is, e = e_max. Petrochemical plants often derive optimal performance from parameters that are set right at their constraining values.
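A short sketch of that constrained search, with made-up numbers for the satellite elevation and the eave limit:

```python
e_opt = 70.0   # hypothetical high-flying satellite
e_max = 55.0   # eaves limit how far the dish can be rotated upward

def C(e):
    return (e - e_opt) ** 2   # quadratic pointing cost

# constraint (3): only elevations at or below e_max are considered
candidates = [0.5 * k for k in range(int(2 * e_max) + 1)]
best = min(candidates, key=C)
```

Because the unconstrained optimum lies outside the feasible range, the search ends up pinned right at the constraint, exactly as described above.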
This satellite dish can be rotated to an azimuth of a degrees as well as an elevation of e degrees. The quadratic cost function (4) penalizes errors in both directions. House eaves limit the elevation to no more than e_max and the side of the house restricts the azimuth to no more than a_max, as in constraints (5) and (6). Mathematically, this creates a three-dimensional, bowl-shaped cost function with the optimal solution at the bottom, and the closest achievable solution somewhere up the near side.

Constraints can also complicate an optimization problem. These conditions preclude a particular solution, even though the solution might otherwise yield the lowest overall cost. For example, a poorly located satellite dish might encounter physical barriers that would prevent it from being pointed directly towards the strongest satellite. See the “Constraints” example.

Multi-variable optimization

Optimization problems are trickier when more than one parameter can be manipulated to lower the cost or increase the benefit. Consider a satellite dish adjustable for azimuth and elevation. See the “Two-variable quadratic cost function” example.

This problem can be solved easily by optimizing each parameter independently—set the elevation to an arbitrary value and rotate the dish to its optimal azimuth, then rotate the dish to its optimal elevation without changing the azimuth. One-at-a-time optimization works in this case because each parameter makes a separate contribution to the cost function and each is constrained independently of the other.
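One-at-a-time optimization can be sketched as follows. The satellite position, candidate grids, and separable cost are all illustrative assumptions:

```python
e_opt, a_opt = 30.0, 200.0   # hypothetical satellite elevation and azimuth

def C(e, a):
    # separable quadratic cost: each parameter contributes independently
    return (e - e_opt) ** 2 + (a - a_opt) ** 2

elevations = [0.5 * k for k in range(181)]   # 0 to 90 degrees
azimuths = [0.5 * k for k in range(721)]     # 0 to 360 degrees

a = 180.0                                      # arbitrary starting azimuth
e = min(elevations, key=lambda ev: C(ev, a))   # optimize elevation first
a = min(azimuths, key=lambda az: C(e, az))     # then azimuth, e held fixed
```

Because the cost is a sum of two independent terms, the arbitrary starting azimuth has no effect on the best elevation, which is what makes the one-at-a-time shortcut valid here.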

However, most multi-variable optimization problems involve constraints that are functions of more than one parameter, such as “the total of flow rate A plus flow rate B must not exceed 100 gpm,” rather than “flow rate A must not exceed 70 gpm” and “flow rate B must not exceed 30 gpm.” Altering one parameter’s value changes limits on the others, so individual parameters cannot be optimized separately.

An angled downspout adds an additional constraint (10b), where the constants a, b, and c depend on the geometry of the downspout and the dish. The other two constraints are the same as for the previous example, but the cost function has been replaced with a linear objective or benefit function B(e, a). As shown in equation (7), it is a sum of the parameters rather than a sum of squared errors. It is structured to maximize the signal rather than minimize the cost of a poor signal. The higher up the inclined plane a possible solution lies, the closer it comes to being optimal. The feasible region—as defined by the constraints—is all of the positions assumable by the dish. Linear programming problems are always constructed so the parameters closest to optimum will be found at one corner of the feasible region. In this case, that would literally be the lower corner of the downspout, where the dish can come as close as possible to direct alignment with the satellite.

Such problems typically require linear programming, which uses a simplified benefit function rather than a cost function to reduce the total number of possible solutions. See the “Two-variable linear programming” example. Linear programming also provides the simplex method, an algorithm that searches for the optimal solution by manipulating all parameters in concert rather than individually.
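The corner-point property described in the example can be demonstrated directly. The numeric limits below (55, 200, and the downspout coefficients 2, 1, 280) are invented stand-ins for the geometry constants; a real problem would feed these to a simplex solver rather than enumerate vertices by hand:

```python
# Hypothetical two-variable linear program: maximize B(e, a) = e + a
# subject to  e <= 55 (eaves),  a <= 200 (side of house), and
# 2*e + a <= 280 (made-up downspout constraint).  For a linear benefit
# function the optimum always lies at a corner of the feasible region,
# so it suffices to check the vertices.
vertices = [
    (0.0, 0.0),
    (55.0, 0.0),
    (0.0, 200.0),
    (55.0, 170.0),   # where e = 55 meets 2e + a = 280
    (40.0, 200.0),   # where a = 200 meets 2e + a = 280
]

def B(point):
    e, a = point
    return e + a     # linear benefit: reward alignment in both directions

best = max(vertices, key=B)   # a corner of the downspout constraint
```

Interior points of the feasible region never win, because moving toward any boundary along the benefit gradient always increases B.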

Control applications

Many of these techniques are applicable to feedback-controller design and operation. To determine the optimal control effort, virtually all controllers use some sort of trial-and-error algorithm that minimizes a cost based on the error between the setpoint and the process variable. A PID controller, for example, measures how well it has minimized its cost using the current error, the sum of past errors, and the difference between the last two errors. If the error isn’t zero yet, it computes another guess for the optimal control effort using some variation on the gradient method.
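A minimal sketch of that idea, with a made-up first-order process and made-up tuning gains standing in for a real loop:

```python
# Hypothetical discrete PID loop driving a simple first-order process.
# Each pass forms the next control effort from the current error, the
# sum of past errors, and the difference between the last two errors.
Kp, Ki, Kd = 2.0, 0.5, 0.1        # illustrative tuning gains
setpoint, pv = 1.0, 0.0           # target and process variable
integral = prev_error = 0.0

for _ in range(200):
    error = setpoint - pv               # current error
    integral += error                   # sum of past errors
    derivative = error - prev_error     # difference of last two errors
    u = Kp * error + Ki * integral + Kd * derivative
    prev_error = error
    pv += 0.1 * (u - pv)                # first-order lag stands in for the process
```

The integral term keeps nudging the control effort until the error reaches zero, which is the controller's equivalent of searching until the cost stops improving.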

Minimum variance controllers use quadratic cost functions, integrating the squared error over time. In addition to past history, they use process models to guess what future control efforts will be required to minimize cost. Minimum variance controllers can also be designed with cost functions that penalize excessive control-effort variations as well as deviations from the setpoint.

PID loop tuning is a multi-variable optimization problem with three parameters:

  • Proportional gain, P;

  • Integral gain, I; and

  • Derivative gain, D.

Manual tuning procedures adjust each parameter one at a time to maximize the performance of the closed loop measured in terms of rise time, settling time, overshoot, and so on. Unfortunately, since the benefit function is far from linear, PID tuning is not suited to simultaneous parameter adjustments afforded by linear programming.

Linear programming is used more often for setpoint selection when multiple controllers are attempting to achieve competing objectives simultaneously. Linear programming can determine the setpoints that maximize overall profitability of a multi-variable process, provided each individual controller is successful in forcing its process variable to match its assigned setpoint. Furthermore, if all loops’ operating ranges are known, constraints can be placed on the problem to guarantee that controllers are assigned only achievable setpoints.