Modern society is obsessed with optimization. If there are multiple paths to work, why not choose the fastest one? What’s the quickest way to lose weight? Where can I get the most bang for my buck? So it’s no surprise that many mathematicians and engineers have studied how to solve optimization problems.
I want to tell you about a technology that is quite powerful, and much more pervasive than you’d think. It isn’t kept within the ivory tower of academia; it operates in the systems of Wall Street banks, industrial manufacturers, advertising agencies, and, my personal favorite, it helps land SpaceX rockets. Its name: convex optimization.
A quick aside on my use of the term technology: strictly speaking, convex optimization is a theory, not a technology. The distinction matters, because once someone takes a theory and implements it as a set of code instructions to solve a problem, it becomes an algorithm. Those who’ve developed convex optimization algorithms have contributed to the proliferation of the theory just as much as, if not more than, the theorists themselves. At this point, there is enough usable convex optimization software that you can essentially call it a technology. Anyway, I’m getting ahead of myself.
You may not remember, but in high school algebra class you likely learned about the concept of linear programming. We used it to solve problems that seem contrived but are actually the basis for very practical problems in finance and manufacturing. For example: You have $10,000 to invest, and three funds to choose from. The bank CDs have a 3% return, the index fund has an average 7% return, and the risky fund has an average 12% return. Say, for tax reasons, you must invest three times as much in the index fund as you do in the CDs, and for personal reasons you decide not to invest more than $1,000 in the risky fund. How then would you split the money to maximize the total interest returned?
We solved these problems using linear programming, which is itself a subset of the class of problems that fall under convex optimization. Much to the chagrin of my academic readers, I will explain how convex optimization works in a non-academic way, but that should make it no less insightful.
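To make this concrete, here is a minimal sketch of the investment problem above in Python, using the CVXPY modeling library. The variable names, and my reading of the tax rule as an exact 3:1 ratio, are assumptions; the returns and limits come straight from the problem statement.

```python
import cvxpy as cp

# Decision variables: dollars placed in each fund
c = cp.Variable(nonneg=True)  # CDs (3% return)
i = cp.Variable(nonneg=True)  # index fund (7% return)
r = cp.Variable(nonneg=True)  # risky fund (12% return)

objective = cp.Maximize(0.03 * c + 0.07 * i + 0.12 * r)
constraints = [
    c + i + r == 10_000,  # invest the whole $10,000
    i == 3 * c,           # tax rule: 3x as much in the index fund as in CDs
    r <= 1_000,           # personal cap on the risky fund
]

cp.Problem(objective, constraints).solve()
print(c.value, i.value, r.value)  # ≈ 2250, 6750, 1000 (total return: $660)
```

Notice that nowhere do we tell the solver how to search; we only describe the problem.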
Take a look at the graph below. Imagine we have a hockey-stick type of function that represents something we want to minimize, but we have to stay within a region defined by the dashed straight lines. The allowable region (called the feasible region) is shown in bold. In this case it is a feasible line: the portion of the function that lies between the dashed straight lines.
In this case, it is easy to see which point on the feasible line minimizes our function, as shown by the red dot. It’s only this easy because we can visualize it. In real problems, there are so many variables that you cannot visualize the feasible region.
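If you want to play with a problem like the one in the picture, here is a sketch of a 1D version using CVXPY. The particular hockey-stick function and bounds are made up for illustration.

```python
import cvxpy as cp

# A hockey-stick (absolute value) objective minimized over an interval
x = cp.Variable()
prob = cp.Problem(cp.Minimize(cp.abs(x - 2)), [x >= 0, x <= 1.5])
prob.solve()
print(x.value)  # ≈ 1.5: the feasible point closest to the kink at x = 2
```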
The most important thing to know is that this region is convex.
The idea of the feasible region being convex is very important, and here’s why: a convex region is easy to search. Start at any feasible point, and finding the best point in the region is only a matter of how efficiently you search. The mathematicians who work on this stuff have developed search methods with guaranteed time frames for finding the best point (based on the size of the problem).
On top of that, they have even developed methods for which you don’t need to supply an initial guess. You just state the constraints, and the method computes what the feasible region is, along with a starting point in that region. For these reasons, you can see why it is very nice when your specific optimization problem happens to be a convex optimization problem.
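Here is a small sketch of that idea in CVXPY (the toy constraints are mine): you never pass a starting point, only constraints, and if they contradict each other the solver reports that the feasible region is empty rather than returning a solution.

```python
import cvxpy as cp

# No initial guess anywhere: we only state constraints.
x = cp.Variable()
prob = cp.Problem(cp.Minimize(x), [x >= 5, x <= 3])
prob.solve()
print(prob.status)  # 'infeasible': the two constraints cannot both hold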
But that’s the thing: there are many optimization problems which inherently cannot be expressed as convex optimization problems. This is where the engineer’s perspective comes in. If we can afford to approximate our optimization problem in a way that makes it convex, then we can reap the benefits of convex optimization. But now we are solving approximate problems.
There has been a great track record of people turning their inherently non-convex problems into approximate convex ones. And the beautiful thing is—it can work! This is exactly how the SpaceX engineers program their rockets to land back down safely.
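To give a flavor of what such a conversion looks like, here is a classic textbook example, not SpaceX’s actual formulation: asking for a solution with few nonzero entries is a non-convex requirement, but swapping the count of nonzeros for the convex l1 norm gives a convex surrogate we can solve. The data and the trade-off weight below are arbitrary.

```python
import numpy as np
import cvxpy as cp

# Random least-squares data, just for the sketch
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)

x = cp.Variable(10)
lam = 0.5  # trade-off weight, chosen arbitrarily

# Non-convex original: minimize ||Ax - b||^2 s.t. few nonzeros in x
# Convex surrogate:    minimize ||Ax - b||^2 + lam * ||x||_1
prob = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b) + lam * cp.norm1(x)))
prob.solve()
print(np.round(x.value, 3))  # many entries driven to (near) zero
```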
In summary:
We love optimization, and we try to optimize in almost everything we do
In general, optimization is an extremely difficult problem to solve
There is a certain class of problems (convex problems) which we can solve efficiently with guaranteed solutions
The specific benefits of convex optimization:
You don’t need an initial guess
Guaranteed efficiency depending on the size of the problem
If a solution is found, it is the absolute best solution
Thanks for reading,
-William Fife
This is a great introduction to convex optimization! Thank you for putting in the time to write it! I have a couple of comments.
Firstly, you mention that for convex problems we have guaranteed time frames for finding the optimal solution. This is true even for nonconvex problems (see branch and bound), but for convex problems the important point is that the runtime is polynomial in the size of the problem, rather than exponential as in the case of branch and bound.
Also, many convex optimization algorithms do not compute a starting point in the feasible region; for example, primal-dual interior point methods, which are used in solvers such as CVXGEN and ECOS.
I also think the figure depicting a 1D convex optimization problem is slightly misleading. The two dashed lines should be vertical and the feasible region for the problem is the line segment on the x-axis between the dashed lines.
You also say that you do not need an initial guess to start from to solve convex optimization problems. You do need an initial guess, since the algorithms are inherently iterative, but no matter your initial guess you will find the optimal solution. So the important point is that the quality of the solution is independent of the initial guess.
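As a quick toy illustration of that point (plain gradient descent on a 1D convex function; the function and step size are arbitrary):

```python
# Gradient descent on f(x) = (x - 3)^2 reaches the same optimum
# from wildly different initial guesses.
def minimize(x0, steps=200, lr=0.1):
    x = x0
    for _ in range(steps):
        x -= lr * 2 * (x - 3)  # gradient of (x - 3)^2
    return x

print(minimize(-100.0), minimize(50.0))  # both ≈ 3.0
```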
Finally, you mention that SpaceX uses a convex approximation to solve their rocket landing problem. I assume you are referring to LCvx/GFOLD. These are lossless convex relaxations, which means that they increase the size of the feasible set and guarantee that a solution to this relaxed convex problem is the optimal solution to the initial nonconvex problem. A convex approximation means that the nonconvex set is approximated by a convex set, and the solutions to the two problems do not necessarily match. A lossless convex relaxation as used in LCvx/GFOLD is much stronger than a convex approximation.