1 s2.0 S0377221722008773 Main
1 s2.0 S0377221722008773 Main
Invited Review
a r t i c l e i n f o a b s t r a c t
Article history: Robust optimization and stochastic optimization are the two main paradigms for dealing with the uncer-
Received 9 February 2022 tainty inherent in almost all real-world optimization problems. The core principle of robust optimization
Accepted 11 November 2022
is the introduction of parameterized families of constraints. Sometimes, these complicated semi-infinite
Available online 30 November 2022
constraints can be reduced to finitely many convex constraints, so that the resulting optimization prob-
Keywords: lem can be solved using standard procedures. Hence flexibility of robust optimization is limited by certain
Conic programming and interior point convexity requirements on various objects. However, a recent strain of literature has sought to expand ap-
methods plicability of robust optimization by lifting variables to a properly chosen matrix space. Doing so allows
Quadratically constrained quadratic to handle situations where convexity requirements are not met immediately, but rather intermediately.
problems In the domain of (possibly nonconvex) quadratic optimization, the principles of copositive optimiza-
Two-stage stochastic standard quadratic tion act as a bridge leading to recovery of the desired convex structures. Copositive optimization has
problems
established itself as a powerful paradigm for tackling a wide range of quadratically constrained quadratic
Adjustable robust optimization
Distributionally robust optimization
optimization problems, reformulating them into linear convex-conic optimization problems involving only
linear constraints and objective, plus constraints forcing membership to some matrix cones, which can be
thought of as generalizations of the positive-semidefinite matrix cone. These reformulations enable ap-
plication of powerful optimization techniques, most notably convex duality, to problems which, in their
original form, are highly nonconvex.
In this text we want to offer readers an introduction and tutorial on these principles of copositive
optimization, and to provide a review and outlook of the literature that applies these to optimization
problems involving uncertainty.
© 2022 The Author(s). Published by Elsevier B.V.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
https://doi.org/10.1016/j.ejor.2022.11.020
0377-2217/© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
that a robust constraint of the form this circumstance sporadically throughout the text, our discussion
on the topic will be limited. The interested reader may refer to
f (x, u ) ≥ 0 for all u ∈ U is equivalent to inf { f (x, u )} ≥ 0 .
u∈U Ben-Tal, Goryashko, Guslitzer, & Nemirovski (2004); Bomze & Gabl
In case the infimum problem admits a dual (involving dual vari- (2021); Jeyakumar, Li, & Woolnough (2021); Pólik & Terlaky (2007);
ables λ, a dual feasible set D (which typically also involves x in Woolnough, Jeyakumar, & Li (2021).
some tractable manner) and an appropriate dual objective function The rest of this article is organized as follows: in Section 2 we
f˜(x, λ ), say) attaining its optimal value with zero duality gap, the will give a detailed but by no means exhaustive account of copos-
robust constraint can further be reformulated into itive optimization theory and related topics, concluding with a
guide through surrounding literature. After briefly introducing ba-
sup f˜(x, λ ) ≥ 0 , sic concepts of robust optimization and some of its variants in
λ∈D
Section 3, we will discuss in greater detail the various ways coposi-
where finally the supremum operator can be dropped, since any tive optimization has been applied in robust optimization contexts.
nonnegative feasible value certifies that the supremum is nonneg- A core technique in this regard is the reformulation of semi-infinite
ative as well, so that the robust constraint is fulfilled. constraints with quadratic index, which we will discuss extensively
The desired strong duality property is readily available in case in Section 4. Some of the adjustable robust models discussed there
the infimum problem is a convex optimization problem: here only can be tackled by an alternative approach which seeks to reformu-
mild additional regularity conditions, such as Slater’s condition, late the entire problem rather than individual constraints and is
need to be satisfied. Outside the domain of convex optimization, discussed in Section 5. We then review robust versions and a two-
such strong duality results are much more scarce. stage stochastic version of the so-called Standard Quadratic Opti-
In the domain of (possibly nonconvex) quadratic optimization, mization Problem in Sections 6 and 7, respectively. A copositive
the principles of copositive optimization act as a bridge leading to approach to mixed-binary linear optimization under objective un-
recovery of the desired convex structures. Copositive optimization certainty, that sits conceptually in-between stochastic optimization
has established itself as a powerful paradigm for tackling a wide and distributionally robust optimization, is presented in Section 8.
range of quadratically constrained quadratic optimization problems Finally, we discuss a conic approach to two-stage distributionally
(QCQPs). It aims at reformulating QCQPs into linear convex-conic robust optimization in Section 9.
optimization problems involving only linear constraints and objec-
tive, plus constraints forcing membership to so-called set-copositive 1.1. Notation
matrix cones, which can be thought of as generalizations of the
positive-semidefinite matrix cone. These reformulations allow for Throughout the paper, matrices are denoted with sans-serif
the application of the powerful tools of convex optimization, most capital letters, e.g., E is the matrix of all ones, I is the identity ma-
notably convex duality, to problems which, in their original form, trix and O the matrix of all zeros (the matrix order will depend on
are highly nonconvex. the context). Vectors will be given as boldface lower case letters,
In this text we want to offer readers an introduction and tuto- for instance the vector of all ones (a column of E) is e, the vector
rial on these principles of copositive optimization, and to provide of zeros is o and the vector ei is the ith column of I. By T we de-
a review and outlook of the literature that applies these to robust note transpose. For a square matrix M, diag M extracts its diagonal
optimization problems. We hope that the reader will acquire the as a column vector while Diag x produces a diagonal matrix with
following benefits: diagonal x. For any x = [xi ]i ∈ Rn we denote by x ◦ x = [x2i ]i ∈ Rn its
Hadamard square. We will also use the shorthand
• gaining an overview on existing copositive optimization ap-
proaches to robust optimization as well as open questions in 1 xT
Y (x, X ) := .
this field; x X
• understanding basic principles of convexifying nonconvex Sets will mostly be indicated using letters or acronyms in cap-
QCQPs in the style of copositive optimization with a focus ital calligraphic font. Most importantly: S n is the space of sym-
to practice-oriented applications; metric n × n matrices, N n ⊂ S n those of them with no negative
• being exposed to open problems and interesting research di- entries and S+ n those of them with no negative eigenvalues, i.e.,
rections, which hopefully inspire the pursuit of new research positive-semidefinite (psd) symmetric matrices of order n (some-
in this area. times the cone S+ n is referred to as the psd-cone in short), SOC n =
Regarding the final point we will discuss open problems (x0 , x ) ∈ R : x ≤ x0 is the second-order cone.
T T n
throughout the text. However, for the readers’ convenience we will There are occasional exceptions, e.g., the n-dimensional Eu-
attach a dedicated “section with open problems” at the end of clidean space Rn , its nonnegative orthant Rn+ , or the index set
each topic, where we will summarize interesting research direc- [i : j] = {i, i + 1, . . . , j − 1, j}, where i < j are integer numbers. For
tions point by point. a set A we denote cl(A ), int(A ), conv(A ) its closure, interior, and
In the sequel, we will not delve into much much detail on convex hull, respectively, and for a convex set A we denote by
robust optimization theory, since there are great tutorials avail- relint(A ) its relative interior, as well by ext(A ) the set of its ex-
able, providing excellent introductions to the field and its vari- tremal points. For a cone K ∈ Rn we denote the dual cone as
ous sub-genres, for example Bertsimas, Brown, & Caramanis (2011); K∗ := x ∈ Rn : yT x ≥ 0 for all y ∈ K . For any optimization prob-
Gorissen, Yanikoglu, & den Hertog (2015); Rahimian & Mehrotra lem (P ), we denote by val(P ) its optimal value, regardless whether
(2019); Wiesemann, Kuhn, & Sim (2014); Yanikoglu, Gorissen, & it is attained or not.
den Hertog (2019). In the interest of a focused and concise pre-
2. Convexifying QCQPs via set-copositive optimization
sentation, we will also omit discussions on another strain of lit-
erature dealing with convexifications of QCQPs by means of the
2.1. Basic lifting strategies and their core ingredients
so-called S-Lemma and its many variants. However, let us high-
light that this topic has strong ties with copositive optimization
A QCQP consists of minimizing a quadratic function subject to
as well as robust optimization. Most notably, copositive optimiza-
quadratic constraints, formally given by
tion is sometimes referred to as an alternative to the S-Lemma
0 x − ω0 : x Qi x + 2qi x ≤ ωi , i ∈ [1 : m]
inf xT Q0 x + 2qT T T
in the context of robust optimization. While we will comment on (1)
x∈K
450
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
where {Qi : i ∈ [0 : m]} ⊂ S n , {qi : i ∈ [1 : m]} ⊂ Rn and ωi are real demonstrated in the above example for the case where the orig-
numbers. K ⊆ Rn is a cone which one could choose to be any cone inal feasible set contained just two points can be generalized to
representable by (finitely many) linear or quadratic inequality con- the case where the feasible set, say F, is arbitrary. In this case a
straints
n n without leaving the domain of QCQPs, for instance K ∈ general convexification can by achieved via a lifted set given by
R , R+ , SOC n . Note that neither the objective nor the feasible set T
need be convex, the latter may even be disconnected. Indeed, gen- 1 1
eral QCQPs are NP-hard as they contain many NP-hard problems as G (F ) := clconv :x∈F .
x x
special cases (see e.g. Pardalos & Vavasis, 1991).
In our discussion we want to familiarize the reader with a spe-
Characterizing G (F ) for a given set F is challenging, and we will
cific type of convexification of QCQPs, that is simple, yet ultimately
spend a considerable part of this text discussing known strategies,
very powerful. To convince even readers who are unfamiliar with
and highlighting open questions in this regard. However, irrespec-
the subject of the simplicity of the approach, we will now discuss
tive of the characterization, optimizing a linear function over this
some simple examples that nonetheless exhibit all the ingredients
set will always yield optimal points that are dyadic matrices whose
that are necessary for understanding the machinery.
factors contain x ∈ F.
Example 1. Consider the following optimization problem: It is however noteworthy that not all optimal solutions to prob-
lems of the type (5) and its generalization have this quality. But
minn xT Qx + 2qT x : x ∈ {a, b} ⊂ Rn . (2)
x∈R in general optimal solutions are always in the convex hull of the
optimal dyadic solutions.
Clearly, the problem is easily solved by just evaluating the ob-
We also want to highlight the fact that all dyadic matrices are
jective at both feasible points and then choosing the minimizer.
positive-semidefinite. In fact, the psd-cone is the convex hull of all
Still, we have a (possibly) nonconvex objective that is optimized
symmetric dyadic matrices, which are also the generators of the
over a nonconvex feasible set, so that the problem belongs to a
extreme rays of that cone. This foreshadows the fact that, in prac-
class of actually difficult problems and it is in fact a nice take-off
tice, many characterizations of G (F ) are achieved via conic inter-
point for thinking about how to convexify more general problems
sections involving suitable sub-cones of the psd-cone, namely the
in this class. Firstly, observe the following equivalence: xT Qx =
so-called set-completely positive cones whose extreme rays are gen-
Tr(xT Qx ) = Tr(QxxT ) = Q • xxT , which holds since the trace of a
erated by the dyadic matrices the factors of which are elements of
number is the identity function and the trace-operator is invariant
certain sets. We will discuss these objects in more detail later in
under cyclic permutation of matrix products. Note that the Frobe-
the text.
nius product is bilinear, so that we can achieve a linearization of
At this point we want to further the intuition regarding our
the problem via the following modifications:
convexification strategy by repeating a neat example originally
min Q • X + 2qT x : X = xxT , x ∈ {a, b} (3) given in Burer (2015), which we will discuss in extensively in order
x∈Rn ,X∈S n
to highlight its connection to the rest of our exposition.
Further, we can eliminate the explicit relation between X and x by
pushing it into the description of the feasible set in order to obtain Example 2. The next example is an extended take on an exam-
ple discussed in Burer (2015). Consider the following optimization
problem:
min Q • X + 2qT x : ( x, X ) ∈ a, aaT , b, bbT . (4)
x∈Rn ,X∈S n
min Qx2 + 2qx : 1 ≥ x ≥ −1 . (6)
A convexification is now easily obtained by replacing the feasible x∈R
set with its convex hull, since the linear constraint will attain its
Depending on the sign of the coefficient Q this can be a nonconvex
optimum at an extreme point of the so obtained convex feasible
quadratic optimization problem, which we will now conexify in the
set. In our case, the latter is a line segment connecting the two
style discussed in this section. In some simple steps we can obtain
points in the feasible set of (4), which also are the extreme points
of this line segment. Rather than expressing this convexification in
the space of tuples of the form (x, X ), it is instructive to represent
min Qx2 + 2qx : 1 ≥ x ≥ −1
it entirely in the space S n+1 in the following manner: x∈R
min Q • X + 2qT x : Y (x, X ) ∈ conv Y (a, aaT ), Y (b, bbT ) . = min QX + 2qx : X = x2 , 1 ≥ x ≥ −1
x∈R ,X∈S n (x,X )∈R2
n
451
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
0 q x x Theorem 1. Let F := x ∈ K : xT Qi x + 2qT x ≤ ωi , i ∈ [1 : m] ⊆ Rn
min • 0 i
be a feasible set of a QCQP where K is a closed cone, and denote
(x0 ,x,X )∈R3 q Q x X
by
1 0 x x T
• 0 = 1,
0 0 x X 1 1
G (F ) = clconv :x∈F ,
x x
0 0 x x
• 0 ≤ 1,
0 1 x X where clconv(A ) stands for the closure of the convex hull of a set A.
Then for any Q0 ∈ S n and q0 ∈ Rn we have
x0 x
∈ S+2 .
0 x − ω0
val(P ) := inf xT Q0 x + 2qT
x X
x∈F
Here we explicitly write down the de-homogenizing equation x0 =
= inf 0 x − ω0 =: val (R ) .
Q0 • X + 2qT
1 in order to make the geometry of the feasible set as transparent Y (x,X )∈G (F )
as possible. We see that the set of feasible matrices is again a sub-
set of the psd-cone. More importantly, the extreme points of the Proof. See, e.g. Burer & Anstreicher (2013); Eichfelder & Povh
feasible set are all boundary points of the psd-cone, which, in case (2013). For the readers’ convenience we repeat the argument here.
of 2 × 2 matrices, are all dyadic matrices (in higher dimensions the We refer to the QCQP as (P) and to the reformulation as (R).
psd-cone has non-dyadic boundary points). Hence, the above opti- Let x be feasible for (P), then x, xxT is feasible for (R) with
identical objective function value given by Q0 • xxT + qT 0 x − ω0 =
mization problem will attain its optimal value at a point where
T xT Q0 x + 2qT
0
x − ω 0 . Thus val ( R ) ≤ val ( P ). For the converse, let
x0 x 1 x 1 1 (x, X ) be ε -optimal for (R), i.e., Q0 • X + 2qT0 x − ω0 ≤ val(R )+ε.
= = , with − 1 ≤ x ≤ 1 ,
x X x x2 x x (We need an arbitrarily small ε > 0 in case that is val(R ) not at-
or in other words, the set of feasible matrices of the relaxation is tained.) Then by definition of G (F ) as the closure we have like-
precisely G (F ), where F is the original feasible set. wise d (x, X ), λi xi , xi xTi < δ with xi ∈ F, ki=1 λi = 1 and
k
i=1
In the previous example, consider the case where Q = −1 and λi ≥ 0, i ∈ [1 : k], and δ > 0 so small that, by continuity,
q = 0, so that the original quadratic problem is a nonconvex prob- k
lem with optimal value given by −1, which is attained at x ∈ |Q0 • X + 2qT0 x − λi xTi Q0 xi + 2qT0 xi | < ε .
{−1, 1}. The convex reformulation gives the same optimal value i=1
and indeed the points (x, X ) ∈ {(−1, 1 ), (1, 1 )} are optimal solu-
So, on one hand, xT Q0 xi + 2qT x − ω0 ≥ val(P ) for all i ∈ [1 : k] and
tions. But so are all the points (x, X ) = λ(−1, 1 ) + (1 − λ )(1, 1 ), i 0 i
on the other hand,
λ ∈ [0, 1], or, expressed in the lifted space
T T val(R ) + ε ≥ Q0 • X + 2qT
0 x − ω0
x0 x 1 1 1 1
=λ + (1 − λ ) , λ ∈ [0, 1] , k
x X −1 −1 1 1
= Q0 • X + 2qT
0x − ω0 − λi xTi Q0 xi + 2qT0 xi − ω0
which illustrates that the optimal solutions of the relaxation are in i=1
the convex hull of its dyadic solutions. Since the latter correspond k
to optimal solutions of the original problem, the x component of + λi xTi Q0 xi + 2qT0 xi − ω0
the optimal solution to our relaxation are always in the convex i=1
hull of optimal solutions to the original problem. Hence, unless the k
original feasible set is already convex, the x components of a solu- ≥ −ε + λi xTi Q0 xi + 2qT0 xi − ω0
tion to the reformulation are not necessarily feasible to the original i=1
problem. k
One must however not confuse convex combinations in the ≥ −ε + λi val(P ) = val(P ) − ε ,
original space of variables with convex hulls in the lifted space! It i=1
is vital to understand that G (conv(F )) is always a strictly larger set
which shows val(R ) + 2ε ≥ val(P ). As ε was arbitrarily small, we
than G (F ), unless F is a singleton. Said differently: convex combi-
arrive at val(R ) ≥ val(P ).
nations in the original space do not correspond to convex combi-
nations in the lifted space. To illustrate this point, let us revisit the Despite the simplicity of the theorem we want to take a mo-
problem in Example 1 for the special case n = 1, a = −1 and b = 1. ment and reconsider the core ingredients that enable its valid-
As we can see, the feasible set of the problem in Example 2 is just ity. The first one is a linearization by lifting to matrix variables:
the convex hull of these points. However, the feasible set of the from a quadratic form xT Qx = Tr(xT Qx ) = Tr(QxxT ) = Q • xxT we
latter problems convexification is not just the convex hull of the pass on to a linear form Q • X, in substituting Xi j for xi x j . The
two lifted extreme points, but the convex hull of an entire curve second
ingredient is the set G (F ). Merely requiring that (x, X ) ∈
of points, each of which represents a lifting of a convex combina- (x, xxT ) : x ∈ F would obviously render the linearization to be
tion of the points {1, −1}. Merely considering the convex hull of exact. But linear optimization is invariant to taking the convex hull
the lifted extreme points of the interval yields G ({1, −1}), i.e., the of the feasible set, a fact often exploited in, for example, mixed in-
feasible set of the convexification problem in Example 1, which is a teger linear optimization, where one seeks to find the convex hull
much smaller lifted set. In fact, no dyadic matrix can be expressed of integer points.
as the convex combination of two dyadic matrices which are not The characterization of G (F ) is the major challenge when em-
just re-scalings of that matrix, i.e., xxT = λy1 yT1
+ ( 1 − λ )y2 yT
2
im- ploying the reformulation strategy depicted in Theorem 1 and a
plies yi yT
i
= μ i xx T , μ ≥ 0, i ∈ [1 : 2], as we prove later (see the
i general workable description of G (F ) is not known. There are,
proof of Proposition 11 in the appendix). however, characterizations for specific instances of F.
With the preceding discussion in mind, the following theorem, References to important examples of such reformulations in lit-
which is at the heart of all convexifications of QCQPs we will dis- erature will be given in the sequel and will be summarized in
cuss in this text, should be easily accessible to the reader. Section 2.4.
452
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
⎧ inf
2.1.1. Lower bounds by Shor relaxation: exactness and strengthening
⎪ X∈S+2 q11 X11 + 2q12 X12 + q22 X22
results ⎪
⎨ s.t. : 2X + X ≤ 12,
11 22
A natural starting point for the construction of G (F ) is based X11 + 2X22 ≤ 12
upon the so-called Shor relaxation introduced in Shor (1987). A ⎪
⎪ 4X11 + X22 ≥ 4
⎩
central role here is played by the set-completely positive matrix cone X11 + 4X22 ≥ 4 .
defined as
The feasible sets of these problems are depicted in Fig. 1. Since all
CPP (K ) := conv xxT : x ∈ K , matrices at the boundary of S+ 2 are dyadic matrices, we see that
for a cone K ⊆ Rn . The matrix cone CPP (K ) is a closed cone the extreme points of the lifted feasible set are also dyadic. There-
whenever K is closed, and with nonempty interior whenever K has fore the relaxation has optimal solutions of the form xxT and x
nonempty interior (see e.g., Mittal & Hanasusanto, 2021, Lemma 4 is feasible for the original QCQP, hence the relaxation is exact. Of
or Tuncel & Wolkowicz, 2012, Theorem 5.1). It is the convex hull course, there are more potentially optimal solutions to the Shor re-
of extreme rays spanned by dyadic matrices. These are precisely laxation (depending on the objective function), but these are con-
the positive-semidefinite matrices of rank equal to 1, except for vex combinations of dyadic optimal solutions. An example can be
the zero matrix O = ooT , which has rank equal to zero. In gen- seen in Fig. 1 as the line connecting the two lower vertices in the
eral, CPP (K ) is an intractable cone in that membership of a given lifted feasible set.
matrix is hard to decide (Dickinson & Gijben, 2014). Thus, when
working with this object, one is bound to use either approxima- Algorithm 1: Solving copositive optimization problems.
tions or clever tools to check membership. Since these tools are
Result: v∗
essential when working with CPP (K ), we will devote an entire
1 set k = 1;
section to this matter (see Section 2.3.1). In the present section,
2 construct outer approximation Ck ⊇ COP (K ) ;
we will merely focus on its relation to the Shor relaxation, which
3 repeat
can be best explained by looking at a homogeneous QCQP:
generate a feasible point for v(Ck ) to obtain (Sk , yk );
4
453
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
Fig. 1. (a) The feasible set F of the QCQP in Example 3. (b) The √ feasible set FShor of its Shor relaxation. As a consequence of Theorem 2 below, FShor coincides with G (F );
we show a projection of G (F ), given by the map (x, X ) → (X11 , 2 X12 , X22 )T from R2 × S 2 to R3 , illustrating the intersection of four half-spaces and the psd-cone.
that the matrices in which is feasible and yields the optimal value of −2. Thus, there is
an optimal dyadic solution to the Shor relaxation, which is enough
3 1 3 −1 2 0
M := , , to eliminate the relaxation gap.
1 3 −1 3 0 1
Let us summarize our observations. The dyadic matrices are at
n :M•X=α
the boundary of S+ n and for n = 2, this boundary is entirely com-
are all positive-definite, so that the set FM := X ∈ S+
is nonempty, compact, conic intersections whenever M ∈ M and prised of dyadic matrices so that it is actually ext CPP (R2 ). How-
ever, since we consider the convex hull of the latter, i.e., S+ 2 , we
α ≥ 0. The three linear inequalities are precisely of the form 2
M • X ≤ α where M is one of the matrices in M. produced an extreme point in the interior of S+ , which thus is of
Now let us examine those extreme points of the FShor where rank greater than one. For certain choices of the objective function
none of these linear inequalities are binding. Hence, we ask for coefficients, there will therefore be a gap between the two opti-
the extreme points of the psd-cone and the only one there is the mization problems. On the bright side, we also see that even if we
zero matrix O = ooT . If only one linear constraint is active, the ex- are far from describing G (F ), the Shor relaxation can be exact for
treme points are those extreme points of FM , with M ∈ M, which some choices of the objective function coefficients.
fulfill the other two inequalities in FShor strictly. But FM is a com- The above discussion makes it apparent that the Shor relaxation
pact conic intersection, so that its extreme points are points in the is not necessarily tight, since its feasible set FShor can have extreme
intersection of the hyperplane with extreme rays of S+ 2 , i.e. rays
points that are not dyadic matrices. Under additional assumptions,
spanned by dyadic matrices. Therefore they are themselves dyadic one can close the gap at least for the homogeneous case. To this
matrices. Finally, let us examine the extreme points that fulfill ex- end, we introduce the following geometric condition.
actly two of the linear inequalities. The points that fulfill two of
the inequalities must form either a line, a half line or a line seg- Condition 1. For a collection of matrices Qi ∈ S n and real numbers
ment that is a subset of FM for M ∈ M, but these are compact n with
bi , i ∈ [1 : m] we say that Condition 1 holds if for any X ∈ S+
sets, so that they form a line segment, given by the intersection Qi • X ≤ ωi for all i ∈ [1 : m],
of S+2 and a line. The extreme points of these sets are therefore
the two points where the respective lines intersect the boundary Qk • X < ωk for all k ∈ [1 : m] \ {i, j} whenever
of S+2 , which is entirely comprised of dyadic matrices. (Note that Qi • X = ωi and Q j • X = ω j for i = j .
this is the case for the psd-cone S+ 2 only, for S n with n > 2 there
+
are boundary points that are not dyadic. However, we will later see The condition requires that for any feasible X ∈ S n at
in Theorem 2 that the Shor relaxation is exact whenever only two most two constraints can be binding
n : Q • X ≤ ω , i ∈ [1 : m]
at the same time. If
inequality constraints are present, so that the argument would in FShor := X ∈ S+ is bounded (as as-
i i
fact stay valid if n > 2.) sumed in Theorem 2), one can check Condition 1 by solv-
In total, we see that all extreme points of FShor are dyadic ex- ing (m3 − 3m2 + 2m )/6 semidefinite optimization problems
cept for the one we have identified as the identity matrix I. Thus, if of the form supX∈FShor Qk • X − ωk : Qi • X = ωi , Q j • X = ω j .
we choose the objective function coefficients such that the optimal For Condition 1 to hold, all the optimal values must be strictly
solution of the Shor relaxation is attained at a point other than I, smaller than 0. Note that K = Rn here.
then the relaxation will be tight. As an example for the latter case,
let us consider q11 = q22 = −1 and q21 = 0. In this case, the√optimal Theorem 2. Suppose that Condition 1 holds for the matrices Qi ∈ S n
value of the QCQP is given by −2 attained at (x1 , x2 ) = (0, 2 ). The and real numbers ωi ∈ R, i ∈ [1 : m]. Further,
suppose that the set
n : Q • X ≤ ω , i ∈ [1 : m] is bounded. Then
FShor := X ∈ S+
Shor relaxation attains the same optimal value of −2 at X = I, but i i
clearly this is not the only optimal point since the dyadic matrix
formed from the optimal solution of the QCQP gives a feasible so- inf xT Q0 x : xT Qi x ≤ ωi , i ∈ [1 : m]
x∈R n
lution with the same optimal value, that is:
= infn {Q0 • X : Qi • X ≤ ωi , i ∈ [1 : m]}.
T X∈S+
0 √0 0 0
X= √ = ,
2 2 0 2
Proof. See Bomze & Gabl (2021).
454
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
While Theorem 1 and the results above clarify the role of of thelinearizedinequalities, but
the
geometry of the convex hull
G (F ) for optimization problems, explicit characterizations of the of ext CPP (R2+ ) , namely CPP R2+ = S+ 2 ∩ N that generated the
2
set G (F ) have been given in terms of FShor and additional cuts. problem.
The respective results are summarized in the following theorem:
The above example demonstrates that the complex geometry of
Theorem 3. Consider the following feasible sets of QCQPs: CPP (K ) may present a formidable challenge if one seeks to close
the relaxation gap. In the following section we introduce a pow-
• F1 := {x ∈ Rn : x ≤ 1, Ax ≤ b}, with A ∈ Rm×n , where the m
erful machinery that meets this challenge by exploiting this very
hyperplanes described by Ax = b do not intersect inside the
geometry in an elegant way.
unit ball.
• F2 := x ∈ R2 : Ax ≤ b with A ∈ R3×2 , b ∈ R3 such that F2
is a nondegenerate planar 2.1.2. Burer’s convex reformulation of a large class of QCQPs
triangle. One of the most celebrated examples of an application of
• F3 := x ∈ R2 : Ax ≤ b with A ∈ R4×2 , b ∈ R4 such that F3
is a nondegenerate planar quadrangle. Theorem 1 is Burer’s completely positive reformulation of a quite
large class of QCQPs:
Let ai be the ith row of A. Then
n+1 bi x − Xai ≤ bi − aTi x, i ∈ [1 : m],
• G ( F1 ) = Y ( x, X ) ∈ S+ : trace(X ) ≤ 1, ,
bi aTj x + b j aT x − aT Xa j ≤ bi b j , (i, j ) ∈ [1 : m]2
i i
• G ( F2 ) = Y ( x, X ) ∈ S+
3 : b aT x + b aT x − aT Xa ≤ b b , (i, j ) ∈ [1 : 3]2
i j j i i j i j
• G ( F3 ) = Y ( x, X ) ∈ S+
3 : b aT x + b aT x − aT Xa ≤ b b , (i, j ) ∈ [1 : 4]2 .
i j j i i j i j
455
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
(i) Ax = b and diag AXAT = b ◦ b; which is a nonconvex set. We now want to find a convex set J ⊆
(ii) MYMT = O; conv(K ) so that H ∩ J = conv(K ∩ H ). We claim that desired set is
(iii) MY = O.
1 −1
Proof. See Burer (2012, Proposition 3).
J := {x ∈ conv(K ) : x2 = 0} = λ1 0 + λ2 0 : λ1 , λ2 ≥ 0 .
The original proof of the theorem is quite algebraic and seems 1 1
somewhat removed from the simple, geometric motivation of (7)
Theorem 1. Fortunately (Kim et al., 2020) recently provided a
geometrical perspective on the subject. The concepts they intro- To see this, let us first check that conv(H ∩ K ∩ J ) = conv(H ∩ K ),
duce are quite versatile and allow proofs for generalizations of which follows by merely showing that H ∩ K ∩ J = H ∩ K. Clearly
Theorem 4 as well as exactness proofs for relaxations of polyno- H ∩ K ∩ J ⊆ H ∩ K, but also
mial optimization problems. The theorems presented in the re- 1 −1
mainder of this section are simplified (and thus less powerful) ver- J∩K= λ 0 :λ≥0 ∪ λ 0 :λ≥0 ,
sions of results in Kim et al. (2020) for presentational reasons. 1 1
Also, they will be strong enough to prove a weaker version of
which contains H ∩ K, so that the desired equivalence is obvious.
Theorem 4, under the additional assumption that L is bounded.
Now we can use Theorem 6 to establish conv(H ∩ K ∩ J ) = H ∩ J.
We start out by investigating a more general question. Let V
We see that H ∩ J is bounded since
be a vector space of dimension n. For a (possibly nonconvex) cone
K ⊆ V, and vectors Q, H0 ∈ V and a convex set J ⊆ conv(K ), we 1 −1
want to know which conditions establish the equality: λ1 0 + λ2 0 = λ1 + λ2 = 1 , λ1 , λ2 ≥ 0 ⇒ λi ∈ [0, 1] , i = 1, 2 ,
1 1
min {Q, X : X ∈ K ∩ J, H0 , X = 1} 3
(8)
X∈V
= min {Q, X : X ∈ J, H0 , X = 1}. so that all that is left to show is that J is a face of conv(K ).
X∈V
We have that x ∈ conv(K ) implies that x2 ≥ 0 so that J is such a
Defining H := {X : H0 , X = 1} ⊆ V, we can equivalently ask for face by Theorem 7. Geometrically, it is the convex hull of the two
conditions for the equality “legs” of K that point the z-direction. It is also an exposed face of
conv(H ∩ K ∩ J ) = H ∩ J. conv(K ), where the exposing hyperplane is described by x2 = 0.
Let us convince ourselves that the conclusion of the procedure
The following theorem gives an answer based on convex geometry.
is actually true. It is immediate that
Theorem 6. For H, K, J as above, assume that H ∩ J = ∅ is bounded
1 −1
and that J is a face of conv(K ). Then conv(H ∩ K ∩ J ) = H ∩ J.
conv(K ∩ H ) = λ 0 + (1 − λ ) 0 : λ ∈ [0, 1] ,
Proof. See Kim et al. (2020). 1 1
This theorem motivates the search for a condition that lets us on the other hand, in this simple example, (8) already tells us that
identify faces of convex cones, which are provided in the following H ∩ J is the same set.
theorem. We can use this simple setup to test the conditions of
Theorem 6. First let us study a failure of boundedness of H ∩ J,
Theorem 7. Assume that J = {X ∈ conv(K ) : Qi , X = 0, i ∈ [0 : m]}
which we can construct by choosing J = conv(K ). In this case J is
and define
still a (trivial) face of conv(K ) but
J p := {X ∈ conv(K ) : Qi , X = 0, i ∈ [0 : p]},
0
so that Jm = J and J−1 = conv(K ). If Q p ∈ J∗p−1 for all p ∈ [0 : m] H ∩ J = conv(K ∩ H ) + λ 1 : λ ≥ 0 ⊃ conv(H ∩ K ),
then J is a face of conv(K ). 0
Proof. See Kim et al. (2020). hence, we get a strictly bigger set than the desired convex hull.
Now, let us consider a slightly enlarged version of the J defined in
Before we apply this machinery to convexify QCQPs, we will
(7) given by
supply a small example for illustrating above theorems. The ex-
ample itself is not immediately connected to QCQPs, but the geo- 1 −1 0
metric intuition it seeks to convey may further the understanding J := λ1 0 + λ2 0 + λ3 1 : λ1 , λ2 , λ3 ≥ 0 ,
of the convexification strategy as a whole. 1 1 1
Example 6. Consider the nonconvex cone for which we can easily check both J ⊆ conv(K ) and H ∩ K ∩ J =
H ∩ K. Also, boundedness of H ∩ J is immediate from an argument
1 −1 0
K := λ 0 :λ≥0 ∪ λ 0 :λ≥0 ∪ λ 1 :λ≥0 ⊂ R3 , analogous to (8). However, J is no longer a face of conv(K ) and in
1 1 0 fact
which is the union of three half-rays emanating from the origin in 1 −1 0
H∩J= λ1 0 + λ2 0 + λ3 1 : λ1 + λ2 + λ3 = 1, λ1 , λ2 , λ3 ≥ 0 ,
three different directions, two of which form a “V” in the xz-plane
1 1 1
and the other one covers half of the y-axis. The intersection of K
with the hyperplane so that, again, the conclusion of the theorem is not sustained.
Finally we would like to point out that the present example is
H := x ∈ R3 : x3 = 1 ,
not entirely unrelated to QCQPs. Consider again Example 1 with
which is a plane parallel to the xy-plane at height 1, are the points a = [1, 1]T , b = [−1, 1]T . Then the feasible set of (3) can be de-
in scribed as a conic intersection given by
1 −1
x1 1 −1
K∩H= 0 , 0 , ∈ R2 : x2 = 1 , x ∈ K := λ , λ≥0 ∪ λ , λ≥0 .
x2 1 1
1 1
456
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
The Shor relaxation is then given by Step 4: Conclude that G (F ) = J ∩ H = Y : H0 • Y = 1, Q i •
Y = 0, i ∈ [0 : m], Y ∈ CPP (R+ × K )}, by Theorem 6.
x0 x
∈ S 2 : x0 = 1, Y (x, X ) ∈ CPP (K ) , (9)
x X As a reference and illustration we will prove a special case of
which can certainly be made exact by replacing CPP (K ) with Theorem 4, where the feasible set is bounded, using this recipe in
ext (CPP (K ) ). However, the preceeding discussion allows us to the appendix.
conclude that the Shor relaxation is tight anyways. We merely have
to consider isomorphism π : S 2 → R3 given by 2.1.3. Unions of feasible sets and subtractions of ellipsoids
Given a workable description of G (Fi ), i ∈ [1 : k] it is always
! " X
x0 x possible to derive characterizations of G (∪ki=1 Fi ) and it is also pos-
π → x , (10) sible to give a characterization of G (F1 \ ∪ki=2 int Fi ) in case Fi , i ∈
x X
x0 [2 : k] are ellipsoids that fulfill certain regularity conditions. We
to see that π (ext (CPP (K ) ) ) is essentially K where the third leg, summarize the respective procedures in the following two theo-
which was spurious for the derivation of the convexification, got rems.
removed. Also, the hyperplane spanned by x0 = 1 corresponds to
Theorem 8. Let Fi , i ∈ [1 : k] be feasible sets of QCQPs and such that
π −1 (H ). Finally, removing the constraint x0 = 1 from the set in
(9) leaves us with π −1 (J ), so that the set itself is the inverse im-
G (Fi ) = {X ∈ S n : H • X = 1, Ai (X ) = o, X ∈ Ci }, i ∈ [1 : k] ,
age π −1 (H ∩ J ) and therefore represents the exact convexification
of the feasible set of our underlying QCQP. where for all i ∈ [1 : m], Ai : Sn → Rm are appropriate linear operators
and Ci are appropriate convex matrix cones. Further, assume H • X > 0
To see how this is relevant for convex reformulations of QCQPs,
whenever, for at least one i ∈ [1 : k], we have X ∈ Ci and Ai (X ) = o.
consider the following simple reformulation:
Then
0 x − ω0 : Ax = b, x ∈ K, x Qi x + qi x − ωi = 0, i ∈ [1 : m]
min xT Q0 x + 2qT T T
x∈Rn k k
T G (∪ki=1 Fi ) = X= Xi : H • Xi = 1, Ai (Xi ) = o, Xi ∈ Ci , i ∈ [1 : k] .
= min i • Y = 0, i ∈ [0 : m], Y ∈ yy : y ∈ R+ × K
Q̄0 • Y : H0 • Y = 1, Q
n i=1 i=1
Y∈S 1
457
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
There are also recent results by Bomze & Peng (2022) on are still NP-hard in general. But this does not mean that the con-
the relaxation of part b). In Theorem 9 we discussed the in- vexification approach has no merit for solving the problems. We
troduction of holes via quadratic inequalities. However, the will now discuss two major routes by which the convexifications
question when a quadratic constraint can be added to the can be exploited in order to either solve the problem exactly or to
description of the feasible set without losing tightness of the give very good bounds.
set-completely positive reformulation remains, in general, an
open one. 2.3.1. Characterizations and inner/outer approximations of CPP (K )
and COP (K )
2.2. Duality of linear optimization over CPP (K ) As stated before, in general, certifying membership in either
CPP (K ) or COP (K ) is intractable save for some particular in-
One of the decisive advantages of convex reformulations of QC- stances of K. One justification for reformulating QCQPs into copos-
QPs is that the resulting optimization problems enjoy the rich du- itive optimization problems anyway is the fact that there are
ality theory that convex optimization offers. General results on powerful approximations of these cones and in some cases even
convex optimization duality, such as strong duality under Slater’s tractable characterizations. We will now discuss some of the more
condition, can be immediately applied to optimization problems prominent and easily explained approximations and give some in-
involving CPP (K ). For the readers’ convenience we formulate a teresting references to more involved theory on the matter. Before
general linear completely positive optimization problem and its we start this discussion, we want to provide some general and use-
dual here, to review the conditions for full strong conic duality in ful properties of the two cones: Most of them seem to be com-
the sequel. mon knowledge within the community, so attributing historically
So let correct credits is difficult. However, we believe the concise com-
pilation may be of some use here, and for completeness we will
inf Q0 • X
X∈S n provide a proof in the appendix.
s.t. : Qi • X ≤ bi , i ∈ [1 : m], Proposition 11. For any cones K, K1 , K2 ⊆ Rn we have the following
X ∈ CPP (K ), (11) relations:
then its dual is given by 1. COP (K ) = COP (−K ) = COP (K ∪ −K ), which also holds if
m COP is replaced with CPP ,
sup − bi λi 2. If K1 ⊆ K2 , then CPP (K1 ) ⊆ CPP (K2 ) with equality if and
λ∈ R m
+ i=1 only if K2 ⊆ K1 ∪ −K1 ,
m 3. If K1 ⊆ K2 , then COP (K1 ) ⊇ COP (K2 ); if in addition we as-
s.t. : Q0 + λi Qi ∈ COP (K ). (12) sume int K2∗ = ∅, we have COP (K1 ) = COP (K2 ) if and only if1
i=1 K2 ⊆ cl K1 .
Here we use the definition 4. CPP (K ) ⊆ S+ n ⊆ COP (K ); all three sets are equal if and only
if K ∪ −K = Rn , in particular
COP (K ) := CPP (K )∗ = {M ∈ S n : M • X ≥ 0 for all X ∈ CPP (K )} m+1
5. COP (R+ × Rm ) = CPP (R+ × Rm ) = S+ , more generally,
n
= M ∈ S : x Mx ≥ 0
T
for all x ∈ K , M11 MT
6. CPP (K × Rm ) = 21 ∈ S m+n : M
+ 11 ∈ CPP (K ) if
where the second equality is valid since all the extreme rays of M21 M22
CPP (K ) are of the form xxT with x ∈ K. The cone COP (K ) is o ∈ K,
called the set-copositive matrix cone, and can be thought of as 7. COP (K1 ∪ K2 ) = COP (K1 ) ∩ COP (K2 ),
a generalization of the positive-semidefinite matrix cone. It is a 8. CPP (K1 ∪ K2 ) = CPP (K1 ) + CPP (K2 ),
central object in our discussion and we provide a more thorough 9. CPP (convK ) ⊇ CPP (K ) with equality if K is convex,
treatment of this subject in Section 2.3.1. We now state a well 10. COP (convK ) ⊆
COP (K ) with equality if K is convex,
known theorem on strong duality between the two optimization 11. intCPP (K ) = k
i=1 xi xT
i
: xi ∈ intK, span{x1 , . . . , xk } = Rn
problems. if K is closed, convex and intK = ∅,
Theorem 10. For (11) and (12) we always have that val(11 ) ≥ 12. COP (K ) = clCOP (K ) = COP (clK ) while CPP (clK ) =
val(12 ). Further, clCPP (K );
13. COP (K ) = COP (relintK ), if K is convex,
• if (11) has a feasible point X ∈ relint CPP (K ) then val(11 ) = 14. intCOP (K ) = Q ∈ S n : xT Qx > 0 for all x ∈ K \ {o} .
val(12 ) and (12) attains its optimal value,
• if (12) has a feasible point λ ∈ Rm
i=1 λi Qi ∈
m Proof. See appendix.
+ such that Q0 +
relint COP (K ), then val(11 ) = val(12 ), and (11) attains its op- For the case of K = Rn+ we have the following chain of inclu-
timal value. sions
An immediate consequence of the above theorem is that, with- CPP (Rn+ ) ⊆ S+n ∩ N ⊆ S+n + N ⊆ COP (Rn+ ) (13)
out any assumptions, (12) offers a rigorous lower bound of any
where N is the orthant of nonnegative matrices. The cone N n ∩
S+
QCQP (1) whose Shor-relaxation is transformed into (11). This is
of particular importance in situations where primal values are of- is often call the doubly nonnegative matrix cone DN N n , and N n
S+ +
fered which are claimed to be nearly optimal. is often called the nonnegative-decomposable matrix cone N N Dn .
Despite their conceptual simplicity, these cones often turn out to
2.3. Solving copositive optimization problems be quite powerful in practice. We will also discuss some impres-
sive theoretical guarantees that involve these simple approxima-
The conic reformulations discussed so far introduce many of the tions later in this and other sections (see Theorem 13 and the suc-
comforts of convex optimization, most notably convex duality the- ceeding discussion, but also Section 5.2).
ory, to an area that is, in general, highly nonconvex. However, they
do not alleviate the core difficulty of these problems in most cases: 1
note that K1 ⊆ K2 ⊆ cl (K1 ∪ −K1 ) and int K2∗ = ∅ already implies K2 ⊆ cl K1 ,
set-copositive and set-completely positive optimization problems so that this criterion coincides with the criterion of 2. up to closure
458
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
For polyhedral cones K := {x ∈ Rn : Ax ≤ o}, with A ∈ Rm×n , • A = O, hence no linear inequalities are present.
there are simple polyhedral approximations given by • If x ∈ Rn+1 satisfies Bx ∈ SOC r and aT x = 0 for some i ∈ [1 : p],
then x ∈ K.
i
PI (K ) := AT NA : N ∈ N m ⊆ COP (K )
PO (K ) := M ∈ S n : AMAT ∈ N m ⊇ CPP (K ) Clearly the dual of SI (K ) is an outer approximation of CPP (K ),
It is always possible to use but we will not go through the effort of deriving it here. Instead,
Theorem 4 in order to character-
ize CPP (K̄ ), where K̄ := (x, s ) ∈ Rn × Rm + : Ax + s = 0 in order
we want to comment on the philosophy behind its construction.
to derive G (K̄ ) = CPP (K̄ ) so that CPP (K ) is the projection on Note that for any two convex cones K1 and K2 containing the ori-
the north-west n × n entries. The cone CPP (K̄ ) is thereby de- gin we have
scribed via linear constraints and a conic constraint involving K1 + K2 = conv(K1 ∪ K2 ) . (15)
CPP (Rn × Rm + ). The latter constraint can then be reformulated via
n+1
Proposition 11 point 6. where any approximation for CPP (Rm + ) can
Now, SI (K ) is such a sum where the components consist of S+ ,
be inserted in order to obtain inner and outer approximations of an instance of PI (K ), a single ray {λS : λ ≥ 0} and the fourth cone
CPP (K ). For the second-order cone case K = SOC n , the celebrated described in terms of V and T which differs from any of the previ-
S-Lemma (Yakubovich, 1971) allows for an exact characterization ous inner approximations, but whose containment in COP (K ) can
of both, the set-completely positive and the set-copositive matrix be easily checked. Hence, whenever a new inner approximation
cone in terms of psd-constraints, namely is identified, one can combine it with all other inner approxima-
tions to obtain a potentially much stronger inner approximation.
CPP (SOC n ) = M ∈ S+n : M • J ≤ 0 ,
We want to highlight that due to (15), even adding a single ray
COP (SOC n ) = M ∈ S n : M + λJ ∈ S+n , λ≥0 , may increase the size of the inner approximation substantially.
where J is the identity matrix up to the first entry in the first In addition, this inner approximation improves on another pop-
row, which is flipped to −1. Due to Proposition 11 point ular construction discussed in Ben-Tal, El Ghaoui, & Nemirovski
2. we (2009, Theorem B.3.1) where the authors propose the so-called ap-
have CPP (SOC n ) = CPP (K ) with K := x ∈ Rn : xT Jx ≤ 0 , hence
a cone described by a homogeneous quadratic inequality. For the proximate S-Lemma, which can be used to derive an alternative in-
case where multiple such inequalities are present, only limited re- ner approximation of COP (K ), with K as defined in Theorem 13.
sults are available. For example, Bomze & Gabl (2021) proved the However, in Xu & Hanasusanto (2018, Proposition 3) it is demon-
following theorem: strated that SI (K ) gives a superset of the approximations based
on the approximate S-Lemma.
Theorem 12. Let K := {x ∈ Rn : xT Qi x ≤ 0, i ∈ [1 : m]} with Qi ∈ S n .
Assume that there is some x0 with xT Q x < 0 for all i ∈ [1 : m]. Fur-
0 i 0 2.3.2. Algorithmic approaches via copositivity detection
ther, suppose that for all i ∈ [1 : m] Recently Badenbroek & de Klerk (2022) and Anstreicher & Gabl
n
X ∈ S+ \ {O} and Qi • X = 0 ⇒ Q j • X < 0 for all j ∈ [1 : m] \ {i} . (2022) proposed algorithmic approaches to solve a copositive opti-
(14) mization problem where the ground cone is either Rn+ or a polyhe-
Then dral cone, but it seems plausible that similar approaches are fea-
sible for other ground cones K ⊆ Rn . We will give a high-level ab-
m
straction of their approaches here.
COP (K ) = M:M+ λi Qi ∈ S+n for some λ ∈ Rm
+ ,
We consider a general set-copositive optimization problem
i=1
given by
CPP (K ) = M ∈ S+n : M • Qi ≤ 0, i ∈ [1 : m] .
m
The theorem does not cover the case where K is the inter-
∗
v = sup bT y : C − yi Ai = S, S ∈ COP (K ) .
y,S
i=1
section of (perhaps linearly transformed) second-order cones. A
respective characterization of set-copositivity/set-completely posi- The algorithms are based on relaxed problems:
tivity would provide a long desired convex reformulation of the
m
multi-trustregion subproblem. So far, this remains an open prob-
v(C ) := sup bT y : C − yi Ai = S, ( S, y ) ∈ C .
lem, despite substantial effort by the community. Still, one may y,S
i=1
study (Yang & Burer, 2013) to find inspirations for approximations
for instances of K that involve two second-order cone constraints. where C is a convex set such that its projection on the S-
In case K := {x ∈ SOC n : Ax ≤ o} where the hyperplanes en- coordinate contains COP (K ) and over which we can optimize effi-
coded by the linear inequalities do not intersect within the second- ciently. If v(C ) attains its optimum at a point (S, y ) such that S ∈
order cone, one may use a homogeneous version of Theorem 3 (re- COP (K ) then v(C ) = v∗ , and we solved the problem. If S ∈/ COP (K )
garding F1 ) in order do derive a tractable characterization of then there is a certificate x ∈ K such that xT Sx < 0. We assume
CPP (K ) and COP (K ). However, Xu & Hanasusanto (2018) found that we have an oracle that is capable of testing set-copostivity
an elegant way to neatly summarize approximations and exactness and produces a certificate in case of negative answer. The algo-
results for a slightly more general instance of K. rithm proceeds as follows:
The two papers employ different variations of this algorithm.
Theorem 13. Consider
Both have in common that in each iteration, set-copositivity of the
K := {x ∈ Rn × R+ : Ax ≥ o, Bx ∈ SOC r } iterate Sk is tested and the approximations Ck are updated via the
cut generated by the certificate xk , in case the test result is nega-
where A ∈ R p×(n+1 ) and B ∈ Rr×(n+1 ) and define
⎧ ⎫ tive. The algorithms differ in the generation of the feasible points
⎪ W ∈ S+n+1 , U ∈ N p , ⎪
⎨ ⎬ Sk and yk , in the method by which copositivity is checked and in
V ∈ S n+1 , T ∈ R p×r , λ ∈ R+
SI (K ) : = M ∈ S n+1 : , a set of additional cuts Ck , which we did not discuss so far.
⎪ M=W +T λS + A TUAT +V ⎪
T
⎩ r⎭ In (Badenbroek & de Klerk, 2022) the authors deal with the
V = 2 A TB + B T A , Rows(T ) ∈ S OC
1
case where K = Rn+ . The feasible points are generated by finding
i=2 B ei ei B. Then SI (K ) ⊆ COP (K ). Fur- the analytic center of the feasible set of v(Ck ) and at every iter-
r
where S := BT e1 eT1B −
T T
ther, equality holds under one of the following conditions: ation where in case Sk ∈ COP (K ) they implement an optimality
459
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
cut Ck = (S, y ) : bT y ≥ bT yk . In addition, at each iteration one ei- see Dür & Rendl (2021). Finally, accounts of the strengths of convex
ther obtains a lower bound on the problem in case Sk ∈ COP (K ) or relaxations of this style can be found in Anstreicher (2009, 2012);
an upper bound otherwise. The algorithm stops if the relative gap Anstreicher & Burer (2005); Bomze (2015), in which the reader
between the best lower and the best upper bound shrinks below may find theoretical guarantees as well as numerical studies.
predetermined threshold. Also, the copositivity check is performed Many more approximations have been proposed in literature,
by solving the standard quadratic often in the form of hierarchies approximate the cones COP (K )
optimization problem parameter-
ized by Sk , given by minx∈Rn xT Sk x : eT x = 1 via a mixed-integer or CPP (K ) to arbitrary good accuracy, at the cost of introducing
+
programming approach outlined in Xia, Vera, & Zuluaga (2020). an exponentially increasing number of additional constraints. The
In contrast, (Anstreicher & Gabl, 2022) solve v(Ck ) to opti- interested reader may be referred to Bomze & de Klerk (2002);
mality at every iteration. As long as Sk ∈ / COP (K ) they generate Bundfuss & Dür (20 08, 20 09); Dickinson & Povh (2013); de Klerk &
a cut based on the certificate xk . In addition they employ var- Pasechnik (20 02); Lasserre (20 01); Parrilo (20 0 0a,b); Peña, Vera, &
ious second-order cone cuts (which would take the role Ck in Zuluaga (2007); Sponsel, Bundfuss, & Dür (2012); Yıldırım (2012).
our present notation). The algorithm stops as soon as the copos- Due to significantly higher computational cost however, these ap-
itivity test is positive. In addition the authors provide their own proximations have not featured prominently in the literature on
mixed integer optimization based approach to set-copositivity test- optimization under uncertainty yet, which is why we do not go
ing, which is able to deal with cases where K is a polyhedral cone into detail here.
described by intersection of the non-negative orthant and arbitrar-
ily many hyperplanes. Their approach is of particular interest to 3. A brief account on robust optimization and some variants
this text since they apply their algorithm to copositive reformu-
lations of robust optimization problems (of the kind discussed in As mentioned above, we trust that most readers are familiar
Section 4 below), and show that it can be used in conjunction with the core concepts of robust optimization. Therefore, the fol-
with the approximation-based approaches discussed in the previ- lowing exposition is just exhaustive enough to make the subse-
ous section, in order to test the quality of the latter approxima- quent discussion understandable.
tions. In theory there are many types of optimization problems that
can be solved efficiently to any desired accuracy, provided the
2.4. Concise guide: convex reformulations, Shor lifting and structure of the problem, including the relevant data, is known.
copositivity However in practice the latter is often not the case and one is con-
fronted with an uncertain optimization problem:
In what follows we will provide the reader with a roadmap
through the literature which may assist in understanding and inf { f0 (x, u ) : fi (x, u ) ≥ 0, i ∈ [1 : m]} where u ∈ U. (16)
x∈R n
further developing the theory around convex reformulations and
The parameters of the functions fi , i ∈ [0 : m] are uncertain and
copositive optimization. This is by no means an exhaustive list,
governed by the uncertainty parameter vector u that lives in an un-
nor does it imply any judgements on articles not mentioned here.
certainty set U ⊆ Rq . This set encompasses all realizations of u, for
More complete accounts of the respective literature may be found
which the decision maker takes responsibility. Examples for de-
in Bomze, Schachinger, & Uchida (2012); Dür & Rendl (2021).
signing appropriate uncertainty sets can be found in Ben-Tal et al.
Historically, the idea of copositive matrices, hence matrices in
(2009); Bertsimas & Brown (2009); Bertsimas, Gupta, & Kallus
COP (Rn+ ) goes back to Motzkin (1818), where the term and the
(2018); Gorissen et al. (2015).
concept were introduced originally. The dual term of complete pos-
Under the robust optimization paradigm, one seeks to select a
itivity can be found in the early paper (Hall & Newman, 1963).
decision with the best worst-case performance among all decisions
However, the standard reference, as far as linear algebra is con-
that are feasible for any realization of the uncertain data (see Ben-
cerned, is the classic book (Berman & Shaked-Monderer, 2003),
Tal et al., 2009; Gorissen et al., 2015 and references therein). The
which mostly deals with CPP (Rn+ ). Further developments on the
mathematical model encompassing this philosophy, the so-called
analysis of COP (Rn+ ) and CPP (Rn+ ) can be found in Dickinson
robust counterpart of an uncertain optimization problem, is given
(2010, 2013); Dür & Still (2008), which present interesting geomet-
by
rical and topological insights on the two cones. For many of these
results it is still an open question, whether they can be general-
ized to cases where the ground cones differ from the non-negative infn sup { f0 (x, u )} : fi (x, u ) ≥ 0, i ∈ [1 : m] for all u ∈ U .
x∈R u∈U
orthant. Some results for a general closed, convex ground cones
(17)
can be found in Sturm & Zhang (2003). More extensive surveys
on copositive and completely positive matrices are (Bomze, 2012; In the rest of the text we will be mainly concerned with cases
Bomze et al., 2012; Dür, 2010). where fi , i ∈ [0 : m] are quadratic functions in u and affine or con-
The classical Shor relaxation where K = Rn was introduced cave quadratic in x. For many specifications of fi and U, the ro-
in Shor (1987). Exactness proofs of this relaxation are regularly bust counterpart can be reformulated into a tractable optimization
achieved via the results on the rank of extreme matrices of fea- problem, solvable via standard solutions strategies. The downside
sible sets of SDPs given in Pataki (1998), see for example Bomze of this framework is that it is inherently conservative due to its
& Gabl (2021); Burer & Anstreicher (2013). The first exactness re- pessimistic perspective on the eventual outcome of the uncertain
sult for a convex reformulation where K = Rn+ is given in Bomze process.
et al. (20 0 0), where a convex reformulation for the standard Many approaches have been proposed to remedy this short-
quadratic optimization was derived. The core papers that introduce coming of conservativeness. One such approach is called adjustable
the methodology based on G (F ) are (Anstreicher & Burer, 2010; robust optimization (ARO). The domain of this approach are sit-
Burer, 2009; 2012; Burer & Anstreicher, 2013; Eichfelder & Povh, uations where parts of the decision can be delayed until un-
2013; Yang et al., 2016). An earlier contribution is however given certainty is revealed. These adjustable decisions are modeled as
in Sturm & Zhang (2003), who laid out many fundamental ideas function-valued decision variables, hence one looks for the opti-
of that machinery. Still, for the purposes of introduction we rather mal policy which, conditional on the outcome of the uncertain
recommend (Burer, 2015), which will prepare the reader to deal process, will yield a good feasible solution of the optimization
with the more involved texts cited here. For a very recent survey problem. Adjustable robust optimization was first introduced in
460
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
Ben-Tal et al. (2004), for a detailed survey see Yanikoglu et al. d (u, u0 ) is a continuous reference metric on U. The r-Wasserstein
(2019). The adjustable robust counterpart can be written as distance between two distributions P1 , P2 ∈ Mr (U ) is defined as
W r (P1 , P2 ) = inf
inf sup { f0 (x, y(u ), u )} : fi (x, y(u ), u ) ≥ 0 for all u ∈ U , i ∈ [1 : m] .
x∈R
n1
,y ( u ) u∈U $% & 1r
Q is any joint distribution of (u1 , u2 )
(18) d ( u1 , u2 )r Q ( du1 , du2 ) : .
U2 with marginals P1 and P2
Compared to a robust optimization problem, the decision vec-
tor is split into two parts: the first-stage decision vector x ∈ Rn1
Based on this notion, the ambiguity sets are often modelled as
and the second-stage decision vector y(u ) : U → Rn2 , where y(u )
a ball induced by W r , centered around an empirical distribution:
is allowed to adapt to the uncertainty and is thus a function of
u. Since the space of all functions is intractable, so is (18), and Bεr (PˆI ) := P ∈ Mr (U ) : W r (P, PˆI ) ≤ ε , (20)
thus it is much harder to solve in practice than (17). However,
there are many powerful approaches to (approximately) solve it where PˆI isthe empirical probability measure based upon a sample
(see Yanikoglu et al., 2019 and references therein), for example u ˆ I , i.e., PˆI := 1I i∈[1:I] δuˆ where δuˆ is the Dirac measure,
ˆ 1, . . . , u
i
461
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
Readers experienced with robust optimization may have no- 4.2. Various applications of the general strategy for robust
ticed that the most of this general strategy is part of the standard optimization
repertoire of techniques used in this field. Indeed, if f (x, u ) were
linear in u, the lifting and convexification step could be skipped We will now discuss different instances of robust optimization
and the remaining steps would be the familiar way of reformu- that have appeared in the literature, where (21) takes a particu-
lating semi-infinite constraints via linear convex duality theory. Of lar form, and where a reformulation into (24) is possible, given
course the difficult part is the convexification step, which is one of that the requirements of our general strategy are fulfilled. We will
the main reasons why the techniques introduced in Section 2 are briefly describe the models, specify the values for (Q, q, ω ) in the
so vital for robust optimization. Once the hurdle of providing a respective reformulation and discuss some features of their appli-
convex reformulation of the inner QCQP is taken successfully, one cations as stated in the original literature.
may once again tread on familiar territory.
4.2.1. Linear ARO under uncertain recourse and affine decision rule
We also want to highlight that if C ∗ in (24) is replaced by an in- The generic linear ARO problem is given by
∗
ner approximation Cinner ⊆ C ∗ , then (24) and hence (21) is still im-
plied, so that we obtain a conservative approximation of the latter min cT x : (Ai u + ai )x + (Bi u + bi )y(u ) + (uT Di u + dTi u + di ) ≥ 0
x∈X ,y (u )
constraint. This is important since we will mostly work with cases
for all u ∈ U , i ∈ [1 : k]} , (25)
of C that involve CPP (K ) (so that C ∗ involves COP (K )) in some
capacity, and the latter cone is intractable, so that approximations hence we have a linear optimization problem with uncertain co-
are necessary, which is the major motivation behind the detailed efficients, which we model as affine functions and quadratic func-
discussion in Sections 2.3.1 and 2.3.2. tions in u. More specifically, we model the coefficients of the first-
stage decision x in the i-th constraint as affine functions involv-
At this point we also like to comment on a common mod- ing the matrices Ai ∈ Rn1 ×q and vectors ai ∈ Rn1 , and the respective
elling choice,
to construct the uncertainty
set as a conic intersec- coefficients of the second-stage decisions y(u ) as affine functions
tion U := u : (1, uT )T ∈ K ⊆ Rq+1 . This is in fact a generic way to involving matrices Bi ∈ Rn2 ×q and vectors bi := (b1 , . . . , bn2 ) ∈ Rn2 .
construct convex sets, as discussed in Rockafellar (2015, Section 8). Finally, the offsets independent of x, y are modeled as quadratic
The motivation behind this construction is a practical one: most functions involving matrices Di ∈ S q , vectors di ∈ Rq and numbers
studies that apply the general strategy do so in conjunction with di .
Theorem 4 as workhorse which delivers the convexification step, If the matrices Bi and Di , i ∈ [1 : k] were zero, then the above
and this theorem talks about feasible sets that are modelled as model would coincide with the one studied in Ben-Tal et al.
conic intersections. Hence, constructing U in this manner makes (2004), the seminal paper on ARO. In that case, if one applies an
the application of the theorem more straightforward. affine decision rule by specifying y(u ) = Yu + y0 , where the coeffi-
Finally, before reviewing literature where this general strategy cients Y ∈ Rn2 ×m and y0 ∈ Rn2 take the role of the decision vector,
has come to pass, we want to discuss the critical ingredients of then linear, convex duality is readily applicable, modulo some reg-
the above strategy. We already discussed extensively how to close ularity conditions on U, in order to obtain a finite reformulation
the relaxation gap in Section 2. The duality gap is usually easy to of the robust constraints. The complication arises if one consid-
close since U is a bounded set so that the conic reformulation will ers uncertain recourse, i.e., when Bi are not zero. Then, bilinear
also have a bounded feasible set, which is enough to guarantee a terms in u arise and duality of the implied, inner infimum is no
dual Slater point and thus a zero duality gap, albeit without dual longer guaranteed. However, the general strategy allows us to pro-
attainability. The boundedness of U is in fact a generic property of ceed anyway. Focusing on a single constraint of the above model,
an adequate uncertainty set. If it were unbounded, then the fea- we are concerned with
sible set could be empty in case there is no x such that the con- ( Au + a )T x + ( Bu + b )T ( Y u + y0 ) + uT D u + dT u + d ≥ 0 for all u ∈ U ,
straint function is unbounded in u over U. However, if there is a (26)
feasible x then the infinitely many constraints that are associated
where an affine decision rule has already been put into place. We
with u from the directions of recession of U are redundant. Hence,
omit an index indicating which of the k constraints we are con-
it does not make sense to consider unbounded uncertainty sets
cerned with, since they are all structurally identical. Also, letting
and in fact, to the best of our knowledge, uncertainty sets are gen-
D = O does not hinder the application of our techniques, which
erally assumed to be compact (see Ben-Tal et al., 2009; Gorissen
gives some additional modelling power aside from uncertain re-
et al., 2015; Yanikoglu et al., 2019). As a consequence, eliminating
course. Applying the general strategy in a straightforward manner
the duality gap is of little concern in most cases.
allows us to achieve the following result.
Howeve, dual attainability is the more elusive quality. For the
Theorem 14. Assume that (26) has an exact conic reformulation of
conic reformulations we discussed, a Slater point in the primal
the form (23) enjoying full strong duality. Then problem (26) is equiv-
problem, hence a feasible point in int CPP (K ), guarantees dual at-
alent to
tainability. While a simple generalization of the results in Tuncel
(2001) shows that G (F ) has interior whenever F has interior, for gT λ ≤ 0 ,
T
reformulations based on Theorem 4, the most important type of aT x + bT y 0 + d 1
AT x + Y T b + BT y0 + d m
reformulations, it is well known that the feasible set never has in-
2
+ λi G i ∈ C ∗ ,
1 1
2 AT x + Y T b + BT y0 + d D+ 2 BT Y + Y T B i=1
terior. However, the requirement of dual attainability can be loos-
ened quite a bit. As shown in Bomze & Gabl (2021), one loses λ ∈ Rm
+ .
462
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
S-Lemma provided the necessary convexification. Apart from this where Ai (x ) : Rn1 → Rk×q and ai (x ) : Rn1 → Rq are affine matrix
special case, the authors also provided a conservative approxima- and vector pencils, respectively, and ai (x ) : Rn1 → R is a real-
tion based on an approximate S-Lemma. In contrast, our general valued, affine function of x. This case was recently addressed
strategy in conjunction with the results discussed in Section 2 al- by Mittal, Gökalp, & Hanasusanto (2019), in a way similar to the
lows for a wider range of uncertainty sets to be utilized. The approach presented here. We will slightly generalize their result,
first paper to apply this machinery was (Xu & Hanasusanto, 2018), again focusing on an arbitrary constraint in (28) given by
where the convexification was achieved by means of Theorem 4.
− A ( x ) u 2 + ( a ( x ) ) u + a ( x ) + u T D u + d T u + d ≥ 0
T
for all u ∈ U .
Of course the involvement of CPP (K ) may again necessitate the
(29)
use of approximations, but the authors provide such approxima-
tions for interesting choices of U and prove that these perform at It is clear from the general strategy that we can reformulate (29) as
least as good as the approximations based on the approximate S-
Lemma (see Xu & Hanasusanto, 2018, Proposition 3).
bT λ ≤ 0,
d + a (x ) (a (x ) + d )
1 T m
4.2.2. Linear ARO under fixed recourse and quadratic decision rules
We again consider (25) with the slight modification that Bi , i ∈
2
+ λi Gi ∈ C ∗ . (30)
1
2 (a (x ) + d ) D − A ( x )T A ( x ) i=1
[1 : k] are set to zero, hence, we have fixed recourse. In this case
the introduction of an affine decision rule does not lead to bilinear The entries of the south-east diagonal block of the constraints
terms in u, and standard reformulation procedures can be applied. matrix in (30) are now quadratic functions. In case C ∗ = COP (K )
However, we can do better than that. Utilizing our general strategy for some closed, convex cone K ⊆ Rq+1 (which is the case for all
allows us to expand the search space for the second-stage decision the conic reformulations of QCQPs discussed in this text), we can
from the space of affine functions to the space of quadratic func- linearize the constraints by employing the following lemma, which
tions. Thus, we specify is a straightforward generalization of Mittal et al. (2019, Lemma 4).
uT Y1 u + yT 1 u + y1 Lemma 16. Assume C ∗ = COP (K ) for some cone K ⊆ Rq+1 . Then a
y (u ) = ... , vector x ∈ Rn fulfills the conic constraint in (30) if and only if there
uT Yn2 u + yTn2 u + yn2 exists a matrix H ∈ S q such that
so that the robust constraint can be written as d + a (x ) (a(x ) + d )T
1
T 1
( a (x ) + d )
2
D − A ( x )T A ( x )
n2 n2 2
u T
b jY j + D u + b jy j + A x + d
T
u + aT x m
H A ( x )T
j=1 j=1 + λi Gi ∈ COP (K ) and ∈ S+q+k .
A (x ) I
n2 i=1
+ b jy j + d ≥ 0 for all u ∈ U. (27)
j=1 Using this lemma we can derive the following theorem
Note, that under fixed recourse the coefficients of y(u ) reduce to Theorem 17. Assume that (29) has an exact conic reformulation of
the vector b ∈ Rn2 , and we again suppressed the row index. the form (23), with C = CPP (K ) for some appropriate cone K, enjoy-
ing full strong duality. Then (29) is equivalent to
Theorem 15. Assume that (27) has an exact conic reformulation of
the form (23) enjoying full strong duality. Then (27) is equivalent to gT λ ≤ 0 , λ ∈ Rm
+ ,
g λ ≤ 0,
T
d + a (x ) (a(x ) + d )T +
1 m
T 2 λi Gi ∈ COP (K )
a x+T n2
bi y j + d 1 n2
bi y j + A x + d
T m 1
2 (a (x ) + d ) D − A ( x )T A ( x )
i=1
n2
j=1
2 j=1
n2
+ λi G i ∈ C ,
∗
1
bi y j + AT x + d D+ bi Y j i=1 H A ( x )T
∈ S+q+k .
2 j=1 j=1
and
λ ∈ Rm
+ .
A (x ) I
Proof. The theorem follows immediately from our general Proof. The theorem follows immediately from our general
strategy. strategy.
Quadratic decision rules have been applied in various articles, The setting can be transferred to the ARO case in a straightfor-
usually under some restrictions regarding the uncertainty set or ward manner, using the tools discussed in this and the previous
the structure of the quadratic forms in y(u ). For example, in case section. The second-stage variables may enter linearly with fixed
the uncertainty set is ellipsoidal, the S-Lemma allows for a fi- or uncertain recourse, in which case the all the strategies that we
nite convex reformulation of (27), and an exhaustive list of sim- discussed apply immediately. In case the second-stage enters in a
ilar approaches can be found in Yanikoglu et al. (2019, Table 3). convex quadratic manner, analogous to the vector x in this sec-
The approaches often restrict the form of the quadratic decision tion, one can apply an affine policy and use Lemma 16 in order
rule, for example to separable quadratic functions, where no bilin- to obtain a convex conic formulation. At this point, for the sake of
ear terms are present. However, as first shown in Xu & Hanasu- brevity we leave the details to the reader and skip the respective
santo (2018) and again presented here, the quadratic decision rule presentation.
is much more generally applicable if one uses the general strategy
in conjunction with Theorem 4. 4.2.4. Distributionally robust, and two-stage distributionally robust,
optimization
4.2.3. Convex quadratic robust optimization Two recent papers exploit reformulations of distributionally ro-
The model of interest here is bust optimization problems into semi-infinite optimization prob-
min cT x : −Ai (x )u2 + (ai (x ) ) u + ai (x ) + uT Di u + dTi u + di ≥ 0 lems in order to arrive at representations of these problems where
T
x∈X
constraints are amenable to the general strategy. We will briefly
for all u ∈ U , i ∈ [1 : k]} , (28) discuss their approach in the following paragraphs.
463
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
The first paper in this regard (Jiang, Ryu, & Xu, 2019), deals with that the worst case CVaR of our second stage response to an un-
appointment scheduling under data ambiguity, where the ambi- certain parameter is optimized. The authors show that (34) can be
guity set is constructed using Wasserstein balls, the construction reformulated as
of which was discussed in Section 3.1. The authors investigate the 1
inf cT x + θ + sup EP (τ (u ) )
model x∈X ,θ ,τ (u ),y (u ) δ P∈P
s.t. :τ (u ) ≥ 0 ,
inf sup {EP [ f (x, u˜ )]} , (31) τ ( u ) ≥ uT DT y ( u ) − θ , for all u ∈ U .
x∈X
P∈Bεr (PˆI )
Tl (x )T u ≤ uT WlT y(u ) for all l ∈ [1 : L],
(35)
where X ⊆ Rn is a feasible set, not affected by uncertainty, and f
This reformulation should make it tangible for the reader that the
is an objective function. In addition, the metric in the definition of
second stage decision (τ (u ), y(u )) can be subjected to linear and
the Wasserstein distance is chosen to be the p-norm with p = r so
quadratic policies, so that the semi-infinite constraints can be tack-
that
led via the general strategy. However, the supremum term in the
W r (P1 , P2 ) = inf objective still needs to be taken care of first, which would require
detailing the intricate construction of the ambiguity set used in
$% & 1r
Q is any joint distribution of (u1 , u2 )
u1 , u2 rr Q (du1 , du2 ) : . Fan & Hanasusanto (2021) and some extensive massaging of that
U2 with marginals P1 and P2
term depicted therein. But this lies beyond the scope of this text,
and we refer the reader to the original source for these details.
For this model the authors derive the following semi-infinite Nonetheless, the general strategy is a core ingredient of the au-
representation thors’ derivations, the results of which are eventually applied to
I network inventory allocation and the multi-item newsvendor prob-
1
inf εr ρ + θj lem. We do, however, like to mention the fact, that said construc-
x∈X ,ρ ,θ I
j=1 tion of the ambiguity sets necessitates the introduction of addi-
s.t. : f (x, u ) − ρu − u
ˆ j rr ≤θ j for all u ∈ U , all j ∈ [1 : I] tional semi-infinite constraints, which are duplications of the ones
present in (35) corresponding to certain subsets of the support U.
ρ ≥0, θ ∈ RI . (32) The authors tackle the computational challenge of the potentially
In case r ∈ {1, 2} the second term in the semi-infinite constraint is large number of matrix blocks that arise from the general strategy
linear or quadratic in u respectively. If in addition via a Bender’s decomposition approach, which allows for a paral-
T lelization of the solution of the copositive sub-problems
u u
f (x, u ) := sup q(x, u, w ), where q(u, w ) := Q (x ) , 4.3. Viable uncertainty sets
w∈W w w
(33)
So far we have demonstrated how convex reformulations ex-
and hence, a pointwise maximum of quadratic functions involving pand the modeling capabilities with respect to the functional form
some matrix valued function Q : Rn → S k+n and an index set W ⊆ of the robust constraints. However, the theorems that enable these
Rk , we can reformulate the semi-infinite constraint in (33) as reformulations put some requirements on the feasible sets of the
inner QCQP and therefore on the uncertainty sets, while at the
q ( x, u, w ) − u − u
ˆ j rr ≤ θ j for all (uT , wT )T ∈ U × W , same time they are allowing new modeling choices there as well.
all j ∈ [1 : I]. We will now provide an overview over the uncertainty sets that
can be managed with the machinery outlined above, and discuss
Since we have produced semi-infinite constraints with quadratic
their benefits and limitations.
index, we can apply the general strategy in order to obtain a con-
vex reformulation. Note, that as long as the dependence of Q on
4.3.1. Primitive uncertainty sets
x is linear or convex quadratic, we can use the strategy directly
A number of uncertainty sets are regularly cited as being stan-
or consecutively invoke Lemma 16 in order to obtain a problem
dard or classic, among them ellipsoidal and polyhedral uncertainty
with only linear terms in x. The authors apply this methodology to
sets. We will briefly discuss how they are handled in context of
robust appointment scheduling, in which case f is a certain point-
our general strategy.
wise maximum of linear functions linear in u, so that q is bilinear
Ellipsoidal uncertainty sets are easily tackled by the general
in (uT , wT )T and W is and appropriate polyhedron.
strategy via Theorem 3 (regarding F1 with no linear constraints),
The second paper (Fan & Hanasusanto, 2021) deals with risk-
which in essence boils down to a roundabout way of using the
averse two-stage distributionally robust optimization under a the
S-Lemma since the respective characterization of G (F ) is based
conditional value at risk (CVaR) as risk measure. The respective
on that result. However, the S-Lemma can be employed directly
model is given by
to the infimum problem in (22) in order to obtain a dual supre-
inf cT x + sup CVaRPδ (Z (x, u ) ), (34) mum problem and thus a finite reformulation. While our frame-
x∈X P∈P work does not offer anything new in this respect, it is neither re-
strictive as well.
where CVaRPδ (. ) is the conditional value at risk at level δ of a risky
Polyhedral uncertainty sets can be tackled using Theorem 4.
position whose distribution is P , u is the uncertain parameter, P
However, there is some ambiguity to which we like to draw
is a set of plausible distributions supported on a conic intersec-
some attention. One way to generally represent polyhedra is
tion U := u : (1, uT )T ∈ K ⊆ Rq+1 , X ⊆ Rn2 is a feasible set not
P1 := x ∈ Rn+ : Ax = b in which case Theorem 4 readily pro-
affected by uncertainty, and Z (x, u ) is the recourse problem given
vides a description of G (P1 ) involving CPP (Rn+ ). However, another
by
generic description is given by P2 := {x ∈ Rn : Ax ≤ b} in which
Z (x, u ) := infn uT DT y : Tl (x )T u ≤ uT WlT y for all l ∈ [1 : L] , case Theorem 4 can be applied after introducing slack variables
y∈R 2
s ∈ Rm , where m the number of inequality constraints in the de-
with appropriate matrices D, Wl , l ∈ [1 : L] and matrix valued func- scription of P2 . The resulting characterization of G (P2 ) would in-
tions Tl (x ), l ∈ [1 : L]. Hence, we look for a first stage decision x so volve CPP (Rn × Rm + ) which by Proposition 11 point 6. can be ex-
464
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
n+m
pressed using only S+ and CPP (Rm ). Exploiting this ambigu- can easily introduce piecewise policies by merely updating the un-
ity, one might choose the description that yields the smaller com- certainty set accordingly, albeit at the price of a having to work
pletely positive constraint, which may reduce complexity. with a nonconvex uncertainty set. A simple argument shows that
In (Xu & Hanasusanto, 2018) the authors study combinations we have
of these of ellipsoidal and polyhedral uncertainty sets, where the
o ≤ w ≤ w̄ ,
facets of the polyhedron do not meet inside the ellipsoid. In that
U = ( u, w ) ⊆ U × RL : wl ≥ gT l
u − hl , l ∈ [1 : L] ,
case Theorem 13 provides an exact representation of the conic con-
w l ( w l − gl + hl ) = 0 , l ∈ [1 : L]
straints present in G (U ).
= (u, w ) : (1, u, w ) ∈ K , wl (wl − gl + hl ) = 0 , l ∈ [1 : L] ,
4.3.2. Mixed-integer uncertainty sets
where
In (Mittal et al., 2019) the authors use Theorem 4 in order to
introduce uncertainty sets with mixed-integer components, namely w ≤ u0 w̄ ,
K := ( u0 , u, w ) ∈ K × RL+ : .
wl ≥ gT
l
u − hl , l ∈ [1 : L]
U := u ∈ Rk+ : Au = b, ul ∈ Z for all l ∈ L (36) Note that U is a bounded set and for all (u0 , u, w ) ∈ K we
where L ⊆ [1 : k]. One can assume without loss of generality that have wl (wl − gT l
u + hl ) ≥ 0, l ∈ [1 : L], so that the key condition in
L := [1 : L] for some L ≤ k. Under the additional assumption that Theorem 4 is satisfied for any quadratic optimization problem over
U is bounded we can always express any integer component of a U . Hence, after replacing y by an affine or an quadratic policy (in
member of U by binary expansion as ul = Q 2i−1 vil = qT vl for case of fixed recourse) in (u, w ) we can use the general strategy
i=1
some integer Q. Hence the set in conjunction with Theorem 4 to obtain a finite reformulation of
(38) under a piecewise affine/quadratic policy. The final result in-
U : = (u, V, S ) ∈ Rk+ × {0, 1}Q×L × RQ×L
+ : Au = b, ul volves the cone COP (K ) for which the authors of Xu & Hanasu-
santo (2018) find tractable outer, hence conservative, approxima-
= qT vl , vl + sl = e, l ∈ [1 : L]
tions based on SI (K ) from Theorem 13.
has U as its projection on the u-coordinates. Note that next to the
variables in V we also had to introduce additional constraints and
4.4. Application: disjoint convex-convex quadratic optimization
slack variables. This is done in order to meet the requirements
of Theorem 4. Hence, any robust constraint with quadratic index
Following the core idea of Zhen, Marandi, de Moor, den Hertog,
in an uncertainty set U can be cast as a robust constraints over
& Vandenberghe (2022), the authors of Bomze & Gabl (2021) pro-
U , which can then be reformulated using the general strategy in
posed a convex lower bound of special type of QCQP based on a
conjunction with Theorem 4. The resulting copositive constraint
reformulation as an ARO problem that can be approximated, us-
will involve COP (Rk++2QL ), however, the authors of Mittal et al.
ing the general strategy and the results presented in the preced-
(2019) prove that even the simplest inner approximation based on
ing sections. The following theorem presents the QCQP and its ad-
N N Dk+2QL outperforms the classical approach based on the ap-
justable robust reformulation:
proximate S-Lemma introduced in (Ben-Tal et al., 2004).
Note that the convex formulation based on Theorem 4 scales Theorem 18. Let Qx ∈ S n1 , Qxy ∈ Rn1 ×n2 , F ∈ Rk×n2 and G ∈ Rr×n2 .
n
quadratically in the dimension of the original quadratic problem. Further, assume X ⊆ Rn1 is a compact set and Y := {y ∈ R+2 : Fy =
Hence, the introduction of the additional variables may come at a d} ⊆ R has a Slater point and let Z (x ) := {(z, w ) : F z + GT w ≤
n 2 T
xy x}. Then
potentially high cost of optimizing over a large set-copositive con- QT
straint. Providing reformulation strategies that do not require the
excessive lifting when changing from U to U is therefore a desir- inf xT Qx x + xT Qxy y + Gy2 (38)
x∈X ,y∈Y
able achievement to be pursued in future research.
4.3.3. Adapting the uncertainty set to piecewise affine/quadratic = sup{τ : ∀ x ∈ X ∃ (z(x ), w(x )) ∈ Z (x ) with xT Qx x + dT z(x )
decision rules τ
In (Xu & Hanasusanto, 2018) the authors skillfully exploited − 14 w(x )2 ≥ τ } , (39)
the modeling capabilities offered by Theorem 4 in order to enable
piecewise linear and quadratic decision rules. Given that the uncer- where the decision variables z : Rn1 → Rk and w : Rn1 → Rr are
tainty set is defined as a compact, convex, conic intersection given functions.
by: In the ARO problem the variables z(x ) and w(x ) take the role of
U := u : (1, uT )T ∈ K ⊆ Rq+1 , the second-stage variables, the decision vector x takes the role of
the uncertainty parameter vector and its former feasible set X be-
one can lift the uncertainty set to obtain comes the uncertainty set. If the adjustable variables are restricted
U := (u, w ) ⊆ U × RL : wl = max 0, gT to a quadratic and affine policy respectively, i.e., (z(x )) j = xT Z j x +
l u − hl , l ∈ [1 : L] .
xT z j + z j , j ∈ [1 : k], w(x ) = Wx + w, then all the semi-infinite con-
Here gl is interpreted as the folding direction of the lth piece of straints become quadratic in x and are thus amenable to a refor-
the piecewise policy and hl is its breakpoint. Clearly, a general ad- mulation based on the general strategy. Since the application of
justable robust constraint in (25) is equivalent to the policies contracts the feasible set of the supremum problem,
( Au + a )T x + ( Bu + b )T y ( u, w ) + uT Du + dT u + d ≥ 0 we generate a lower bound.
The authors test the resulting lower bound against lower
for all (u, w ) ∈ U , (37)
T T bounds based on relaxation of the completely positive refor-
since y(u ) := y u, max 0, g1 u − h1 , . . . , max 0, gL u − hL is a mulation from Theorem 4 on random instances with X :=
function that maps U into Rn2 , and vice versa any function of u {x ∈ K : Bx = c} given by a compact conic intersection. The results
can be generated from functions of (u, w ), with w defined as in U . are mixed, but it is noted that in case the number of constraints in
However, if we restrict y to be affine or quadratic in its arguments, Y is much bigger than the number of linear equality constraints in
then y(u ) is a piecewise affine/quadratic function in u. Hence, we X , the ARO lower bound has computational advantages. Currently,
465
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
a direct real-world application of this model is not in sight, but we Theorem 17 the copositive approach can be expected to be able
are confident that future research will reveal relevant areas where manage cases where the argument of h(· ) is a convex quadratic
we can profit from the strength of the ARO lower bound, and also function.
ways to exploit the special structure of the lower bound. The inter- Further, it would be interesting to study the relationship be-
ested reader may inspect these structural details in Bomze & Gabl tween said approximations and the approach from Bertsimas et al.
(2021). (2022) mentioned above. Specifically, their approach might inspire
approximations of G (U ) that can be used in other contexts. We
4.5. Outlook on new research direction: robust convex optimization would be interested to cooperate towards this goal, as it has sig-
nificant overlap with our research agenda.
Recently (Bertsimas, den Hertog, Pauphilet, & Zhen, 2022) intro-
duced a reformulation of a general robust convex constraint into a
robust bilinear constraint. The argument rests on the characteriza- 4.6. Open problems
tion of a closed, convex function as the bi-dual conjugate, hence
the conjugate function of its conjugate function (see Rockafellar, Robust convex optimization: The entirety of the discussion of
2015, Section 12). For the readers’ convenience we repeat their Section 4.5 is preliminary and hopefully inspires some read-
derivation here. So consider a robust constraint ers to take up the questions we outlined there.
466
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
5. ARO with uncertain right-hand side: an alternative Under Assumption 1a), the quadratic problem is feasible, since
copositive approach any x ∈ X that would render it infeasible would be optimal
for (41) with minus infinity as optimal value. Thus, the convex re-
In (Xu & Burer, 2018) the authors proposed a copositive re- laxation is exact by Theorem 4. Further, under the Assumption 1b)
formulation of a certain class of linear ARO problems based the dual of the innermost LP is feasible with finite value regard-
on Theorem 4 obtained by means very different from the general less of u, hence attaining its optimal value on an extreme point
strategy we outlined above, and consequently results in a distinc- of its feasible set. The latter set is polyhedral so that its ex-
tive type of copositive reformulation. The derivation is simple and treme points can be contained in a ball of sufficient size, rendering
elegant, and we will give a condensed account of their methodol- wT w ≤ rw redundant for the bilinear problem given that rw > 0 is
ogy in the sequel, extending their model by introducing additional large enough. Also, since U is bounded, the constraint uT u ≤ ru is
uncertainty, also on the left-hand side and in the objective. redundant for large enough ru > 0. It follows that we can always
introduce the constraint zT z ≤ r with sufficiently large r ≥ 0 to the
5.1. Copositive reformulation à la Xu and Burer bilinear optimization problem without changing the optimal value,
hence Z • I ≤ r is redundant for the conic optimization problem. Af-
The class of ARO models we consider here is given by ter doing so, the dual of the conic problem is given by
min max cT x + d(u )T y(u ) : a(x, u ) + By(u ) ≥ f(u ) for all u ∈ U min λ + rρ
x∈X ,y (u ) u∈U λ, ,ρ
= min cT x + max min d(u )T y : a(x, u ) + By ≥ f(u ) (41) 1
x∈X u∈U y s.t. : Q(x ) + λe1 eT
1 + E + ET T
+ ρ I ∈ COP Rm
+ ×K
2
where the latter reformulation is proved by using standard argu-
and since for the identity matrix I we have I ∈ int COP (K ) for any
ments from optimization theory. Also, d(u ) = d0 + Du, f(u ) = f0 +
cone K, the latter problem has a Slater point so that the duality
Fu for appropriate matrices and vectors, a(x, u ) := a0 (x ) + A(x )u
gap is zero. Thus, the original optimization problem can be equiv-
for appropriate vector-valued, affine mappings, and X ⊆ Rn is a
alently reformulated as
feasible set of the first-stage decision not affected by uncertainty.
Again, the uncertainty set is modeled as compact, conic intersec- min cT x + λ + rρ
tion: x,λ, ,ρ
1
U := u : (1, uT )T ∈ K , s.t. : Q(x ) + λe1 eT
1+ E+ET T
+ρ I ∈ COP Rm
+ ×K , x ∈ X .
2
for some closed, convex cone K ⊆ Rq+1 The reformulation strategy (42)
we are about to lay out rests on the following assumptions: The reformulation is exact but the cone COP (Rm
× K ) is in- +
Assumption 1. For problem (41) the following statements hold: tractable even if COP (K ) is tractable. Hence one has to resort
again to inner, hence conservative, approximations.
(a) it is feasible with finite optimal value;
(b) it possesses relatively complete recourse, i.e., for all x ∈ X 5.2. Improving the affine policy
and u ∈ U the innermost LP (in the min-max-min reformu-
lation) is feasible. This raises the question whether any benefit can be incurred by
this strategy when compared to other conservative approximations
The innermost minimization problem can be dualized to obtain
such as the ones based on affine decision rules. The authors of Xu
& Burer (2018) find an elegant answer to this question, at least for
min d(u )T y(u ) : By ≥ f(u ) − a(x, u ) the case where d(u ) is constant. We summarize their findings in
y
the following theorem:
= maxm wT []f(u ) − a(x, u )] : BT w = d(u ) .
w∈R +
Theorem 19. For (41) assume that d(u ) is a constant. Further denote
We can now plug in the dual and the definitions of the functions by vAff the optimal value of (41) under an affine policy and with vIA
representing the uncertain data, to obtain a bilinear optimization the optimal value of (42) after replacing COP Rm + × K with
problem that can be reformulated into a set-completely positive
ST21 : S11 = e1 g + ge1 , g ∈ K, .
T T
S11
optimization problem: IA K × Rm+ := S =
S21 S22 Rows(S21 ) ∈ K , S22 ≥ 0
∗
467
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
5.3. Open problems Despite its simplicity, this Standard Quadratic Problem (StQP) fea-
tures prominently in diverse application areas such as game the-
Improving the affine policy under nonconstant d(u ): In the ory, graph theory, machine learning and copositivity detection. It
original article (Xu & Burer, 2018), both d(u ) as well as A(u ) was the first problem for which an exact copositive reformulation
were assumed constant. While establishing the above theo- was presented in Bomze et al. (20 0 0):
rem for the case where the latter function is not constant
min xT Qx = min {Q • X : E • X = 1},
is simply a matter of carrying along some additional terms x ∈ n X∈CPP (Rn+ )
through the discussion presented in Xu & Burer (2018), the
or in other words it holds that G (n ) = {X ∈ CPP (Rn+ ) : E • X
same is not true for non-constant d(u ). The reason for this
= 1}. While the original proof is straightforward, this by now
lies in the proof strategy that achieves vIA ≤ vAff . It is based
classical result can also easily be derived via the method-
on first deriving the finite reformulation of (41) under an
ology discussed in Section 2.1.2. Specifically, one can apply
affine policy, and then showing that one can turn any fea-
Theorem 6 where the J is chosen to be all of CPP (Rn+ ) and H is
sible solution of that reformulation into a feasible solution
the hyperplane associated with E • X = 1; the details are left to the
of (42), under the required specifications. The finite refor-
reader.
mulation under the affine policy is thereby achieved using
In (Bomze, Kahr, & Leitner, 2021b) the authors investigate ro-
standard linear convex duality, which does not apply if d(u )
bust counterparts of this problem, which are generically given by
is not constant.
It is however possible to give a finite reformulation based
on the general strategy as discussed in Section 4.2.1. It re- min max xT (Q + U )x .
x∈n U∈U
mains to clarify how the resulting reformulation can be pro-
jected into the feasible set of (42). Answering this question, Since the constraints are a structural aspect of the problem
one may be able to find a modification of IA Rm + × K that
(e.g., probabilities are are always positive and sum up to one),
allows for similar performance guarantees. only the objective function is affected by uncertainty. An imme-
Characterizing implied policies: As noted in Xu & Burer (2018), diate question is whether the completely positive relaxation given
(42) is powerful enough to represent the original ARO prob- by
lem, and by the discussion in Section 5.2 we see that the min max (Q + U ) • X,
affine policy can be mapped into the solution space of the X∈G (n ) U∈U
reformulation. However, there is no similar analysis regard- is again tight. While the answer is negative in general, the authors
ing other types of policies. prove that the relaxation gap is exactly the min-max gap.
Improving the quadratic policy: On a related note, it is not
clear whether the conic constraint in (42) can be replaced Theorem 20. Consider the robust Standard Quadratic Problem with
by an inner approximation that performs at least as good as uncertainty set U.
the quadratic policy. Such a result seems tangible since we
(a) For general U we have
know from the discussion in Section 4.2.1 that (41) under
a quadratic policy does have a conic reformulation, where min max xT (Q + U )x ≤ min max (Q + U ) • X .
x∈n U∈U X∈G (n ) U∈U
each row of the constraints is reformulated individually, re-
sulting in a collection of conic constraints. However, there
(b) Suppose U is closed and convex. Then
seems to be no straightforward way in which the feasible
set of such a reformulation can be projected into the feasi- min max (Q + U ) • X = max min xT (Q + U )x ,
X∈G (n ) U∈U U∈U x∈n
ble set of (42).
The case of uncertain recourse: The reformulation presented
in the above section assumes that the matrix B is not af- so that the completely positive relaxation gap is exactly the gap in
fected by uncertainty. If this were the case, we would have the min-max inequality.
to deal with quadratic constraints, which would jeopardize
the application of Theorem 4 at the penultimate step of the Proof. See Bomze et al. (2021b, Theorem 1).
reformulation strategy. Specifically, instead of the linear con-
However, there are instances in which the inner maximization
straints Du + d0 u0 − BT w = o we would have to deal with
problem can be evaluated independently of x. In these cases the
the constraint Du + d0 u0 − (B(u ))T w = o which is bilinear
robust counterpart reduces to a deterministic standard quadratic
in case B(u ) is linear in u. Theorem 4 does not place any
problem so that the exactness of the relaxation stays intact. The
restrictions on linear constraints, but quadratic ones have to
first set of instances for which this is the case are those where the
respect the key condition, in order for the relaxation to be
uncertainty set is constructed via suitable cones.
exact. Hence, the case of uncertain recourse could be tackled
if the problem data is such that the key condition is either Theorem 21. Let C ⊆ COP (Rn ) be a sub-cone of the cone of copos-
satisfied or can be relaxed, e.g., as in Bomze & Peng (2022). itive matrices and L, B ∈ S n be given matrices. Assume that U =
However, we do not know whether either of these strategies {U : U − L ∈ C, B − U ∈ C }. Then the completely positive relaxation is
are feasible for interesting instances of (41) with uncertain an exact reformulation and the robust counterpart reduces to a stan-
recourse. dard quadratic problem with data Q + L.
468
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
Theorem 22. Let C ∈ Rn×n be a nonsingular matrix and known (more or less) exactly whereas the rest of the problem data
define,
for some scalar
ρ > 0, the uncertainty set U := are subject to uncertainty with known probability distribution:
U ∈ S n : CT UCF ≤ ρ . Then
A ˜T
B
−1 ˜ =
Q . (50)
˜
B C˜
min max (Q + U ) • X = min Q − ρ CCT • X, (47)
X∈G (n ) U∈U X∈G (n )
Here u˜ = [B˜,C
˜ ] are the uncertain data. Such a situation may for ex-
i.e., the robust counterpart reduces to a single standard quadratic op- ample arise in portfolio optimization, when the relevant statistics
timization problem. on younger securities can be assessed less accurately due to lack
of historical data.
n n
6.1. Open problems Decomposing z ∈ Rn via zT = [xT , yT ] with (x, y ) ∈ R+1 × R+2
and n1 + n2 = n we arrive at the following problem reformulation
Robust properties of StQP solutions: Many problems in graph of the (random) objective function as
theory such as the stable-set problem and the maximum- q ( z, u
˜ ) = zT Q
˜ z = xT Ax + 2xT B
˜ T y + yT C
˜y .
clique problem have a reformulation as an StQP, and one can
infer the solution to these problems from the optimal solu- Taking the expectation with respect to the probability distribu-
tion of the respective StQP. It is however, unclear if these tion of ξ , we obtain the so-called recourse function
properties also hold for the robustified StQP. Conversely it is
not known whether robust versions of the stable-set prob- r (x ) := Eu˜ min 2xT B
˜ T y + yT C
˜ y : eT y = 1 − eT x
y∈Rn+2
lem or the maximum-clique problem can be modelled as a
robust StQP or, perhaps, a generalization of the latter model. and the two-stage stochastic StQP can be formulated as follows:
Convexified robust StQP: While the above discussion presents min s(x ) := xT Ax + r (x ) ,
cases where the robust StQP reduces to an instance of x∈T n1
StQP which can be tackled via standard convexification ap- with T n1 = conv o, ei : i ∈ [1 : n1 ] = conv (n1 ∪ {o} ).
proaches, a general convexification approach applicable out- In most cases, a two-stage stochastic problem cannot be solved
side of these special cases is not known. The complication directly, since merely evaluating the expected value can be in-
arises from the fact that the pointwise maximum of linear tractable. Thus, in practice one resorts to approximating the true
function is itself not linear but convex, and convex functions uncertainty measure by a finite discretization. This gives rise to the
may attain their optimum at points which are not extreme. so-called scenario problem, which in our case is given by:
Hence, G (n ) fails to deliver the effectiveness we enjoy in
S
the deterministic case.
min xT Ax + ps ( 2xT B
˜Ts ys +ys Cs ys )
T˜
x,y1 ,...,yS
s=1
7. Two-stage stochastic optimization for StQPs
eT x + eT ys = 1, s ∈ [1 : S],
In (Bomze, Gabl, Maggioni, & Pflug, 2021a) the authors consid- ys ≥ o, s ∈ [1 : S],
ered a two-stage stochastic version of the StQP. x ≥ o. (51)
Stochastic optimization deals with optimizing expected out-
comes of uncertain optimization problems, i.e., As we can see, the discretization is achieved by condensing the
true probability measures to a set of S scenarios with associated
min {Eu˜ ( f (x, u
˜ ) )}. (48) probabilities ps s ∈ [1 : S]. There are many schemes on how to ob-
x∈X
tain these discretizations, and it would be beyond the scope of
where the expected value is taken with respect to the random vec-
this text to dicuss them here; the interested reader may consult
˜ , which is defined by a known probability space (, A, P ) with
tor u
the references given in Bomze et al. (2021a, Section 2). Other tech-
support , probability distribution P and σ -field A. Analogously to
niques are preoccupied with reducing the size of an existing dis-
the adjustable robust setting, in two stage stochastic optimization
cretization, in order to obtain a more manageable problem size.
one seeks a decision on the first-stage variables and on a second-
For example Bomze et al. (2021a) employed a dissection tech-
stage decision rule that adapts to the realization of the random
nique to the discretized probability measure. In essence, scenar-
event. Thus, we are considering
ios are grouped together into m groups. Then the smaller sce-
nario problems, that only involve scenarios from one group at time,
min f1 (x ) + Eu˜ min f2 (x, y, u
˜) . (49) are solved using probabilities conditional on the respective group.
x∈X y∈Y (x,u
˜)
The so obtained solutions are averaged, with weights given by
Here we make a choice on the first-stage variables x and second- the probability of the respective group, in order to obtain a lower
stage policies y so that we optimize the sum of a deterministic bound on the scenario problem. By varying the size of the groups
first-stage outcome and the expected value of the optimal second- one can trade-off accuracy against the benefit of having to solve
stage choice. Note, that the innermost minimization problem de- smaller problems.
pends on the random vector u ˜ so that the decision vector y is im- Since (51) describes a class of non-convex QCQPs, which con-
˜ . Hence, the setting is indeed analogous to
plicitly a function of u tains the StQP as a special case, it is NP-hard. However, it clearly
the adjustable robust setting. However, for our purposes it will not is amenable to a convex reformulation based on Theorem 4. Such
be necessary to model y explicitly as a function as it is done in n +Sn +1
a reformulation would involve the cone CPP (R+1 2 ), hence a
ARO. Also, in our case the constraints linking y to x are not uncer- lifting in a space that scales quadratically with S. This is prob-
tain but deterministic: y ∈ Y (x ). lematic as the quality of the approximations yielded by the sce-
Here we are dealing with the special case of the (typically non- nario problem depends on the number of scenarios considered. As
convex) StQP of the form a consequence, the classical convex reformulation becomes imprac-
˜z,
min zT Q tical for those very cases where the scenario problem is relevant,
z ∈ n namely when S is large. The authors of Bomze et al. (2021a) there-
where uncertainty is considered only in the objective function. fore propose an alternative, albeit weaker, relaxation that scales
˜ is
Suppose a (possibly) small n1 × n1 principal submatrix A of Q linearly with S:
469
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
Theorem 23. Consider the problem found so far, nor were the authors able to produce a coun-
S
terexample.
min xT Ax + aT x + xT Bs ys + yT
s Cy s + cs ys
T
x,y1 ,...,yS
s=1 8. Mixed-binary linear optimization problems under objective
s.t. : eT x + eT ys = 1 , s ∈ [1 : S] , uncertainty
x, y1 , . . . , yS ≥ o . (52)
In (Natarajan, Teo, & Zheng, 2011) the authors considered the
The following conic optimization problem gives a lower bound, and if following optimization problem
cs = αs e, Bs = bs eT , αs ∈ R, bs ∈ Rn1 , i ∈ [1 : S] the bound is actu-
ally tight: Z (c˜ ) := max c˜ T x : Ax = b, x j ∈ {0, 1}, j ∈ B
x∈R +
n
S
where c˜ are uncertain objective function coefficients whose true
min A • X + aT x + Bs • Zs + Cs • Ys + cT
s ys
X,Ys ,Zs . ys probability distribution P is assumed to have support in Rn+ and
s=1
apart from that is ambiguous up to its first two moments, the
s.t. : eT x + eT ys = 1 , s ∈ [1 : S] , mean μ := E(c˜ ) and covariance matrix := E(c˜ c˜ T ). The authors
E • X + 2 E • Zs + E • Ys = 1 , s ∈ [1 : S] , aim to give an upper bound on EP [Z (c˜ )] by considering
T
1 x yT
s sup E[Z (c˜ )] ≥ EP [Z (c˜ )]
x X ZT
s ∈ CP P (Rn+1 +n2 +1 ), s ∈ [1 : S] . c˜ ∼(μ,)+
ys Zs Ys
where (μ, )+ is the set of all distributions with nonnegative sup-
(53)
port, mean μ and covariance matrix . While the approach seems
related to the two-stage distributionally robust paradigm, since the
Compared to the classical reformulation which involves
n +Sn +1 decision variables are allowed to adjust to the outcome of the un-
CPP (R+1 2 ), the above relaxation merely exhibits S conic con- certainty the same way it would in a recourse problem, it is dif-
n +n +1
straints involving CPP (R+1 2 ), hence growing linearly with S. ferent in that we do not consider the worst-case distribution, but
This advantage comes at the cost of losing the exactness, so that rather the best-case distribution. However, the worst-case inter-
outside of the special cases mentioned in the theorem, the conic pretation remains valid if the underlying optimization problem al-
problem only provides a lower bound. However, numerical exper- ready is a worst-case estimation, such as for the longest path prob-
iments conducted in Bomze et al. (2021a), comparing the bounds lem. Another way to interpret this model is to see it as the second
obtained by solving the DN N -relaxation of both the classical re- stage of a two-stage distributionally robust optimization problem
formulation and (53), suggest that the gap between the two tends where Z (c˜ ) is the dual of recourse problem problem with uncer-
to be very small. In fact the gap is so small that the authors hinted tain right-hand sides (which is a valid interpretation if B = ∅). In-
at the possibility that it is merely a numerical artefact. The reduc- deed both interpretations have featured in literature following up
tion of computational effort on the other hand is substantially in (Natarajan et al., 2011), which we will briefly discuss at the end of
favor of the lower-dimensional bound. this section.
We also would like to stress that the proof of Theorem 23 relies The authors approach this bound by providing a copositive re-
heavily on the theory laid out in Section 2. By replacing the CPP formulation of
constraint with a more complicated conic constraint in a follow-up
paper (Gabl, 2022), it is even possible to close the relaxation gap sup E max c˜ T x : Ax = b, x j ∈ {0, 1} for all j ∈ B (54)
between (52) and (53) entirely. Among the two proofs of this re- c˜ ∼(μ,)+
nx∈R +
sult, one follows the strategy described in Section 2.1.2. The conic
which necessitates the following set of assumptions:
constraint used in Gabl (2022) is a structured generalization of
CPP -type cones and can be approximated via similar means. Assumption 2. The following statements on Z (c˜ ) hold :
Another interesting feature of the methodology proposed in
Bomze et al. (2021a) was their combination of upper bounds ob- 1. The set (μ, )+ is nonempty.
tained by relaxations, first-order methods and global optimization 2. x ∈ Rn+ : Ax = b implies x j ≤ 1, j ∈ B.
solvers. As it turns out, (53) preserves the original space of vari- 3. The feasible region of the inner maximization problem is
ables and thus yields not only a lower but also an upper bound. nonempty and bounded.
This feasible solution can be used as starting point for local algo-
rithms such as the pairwise Frank–Wolfe algorithm, or for global Note that the first assumption holds exactly if
solvers such as Gurobi. The quality of these refined upper bounds
1 μT
can then be assessed relative to the lower bound obtained by the ∈ CPP (Rn++1 ),
μ
relaxation. As numerical experiments suggest, optimality gaps can
be reduced substantially and with reasonable computational effort, which is, of course, an NP-hard task unless n + 1 ≤ 4. The other
and moreover the combination of procedures yields better results two assumptions are checked easily, the second one can even
than each method would produce on their own. be enforced generically by introducing additional constraints and
slack variables (see our discussion succeeding Theorem 4).
7.1. Open problems The reformulation rests on a particular mixed-moment lifting.
More precisely, let x(c ) denote the optimal solution (or in case of
Efficacy of the sparse model: As stated before, the bounds pro- non-uniqueness, a measurable selection from the set of optimal so-
duced by applying the DN N -relaxation to (53) are almost lutions) to Z (c ) where c is a realization of c˜ . Then x(c˜ ) is a random
identical to the ones obtained from the DN N -relaxation of vector, and we define the random vector
the classical model based on Theorem 4. Based on the ex-
periments in Bomze et al. (2021a), we cannot rule out the 1
possibility that the sparse relaxation is in fact tight. How- y(c˜ ) := c˜ ∈ R2+n+1
ever, despite some effort in Gabl (2022), no such result was x(c˜ )
470
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
471
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
472
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
min cT x + R̄δ (x ) . (60) readers to engage in these challenging questions and to expand on
x∈X
these ideas in future research.
Again we have a finite, conic optimization problem for which the
following theorem can be proved. Appendix A. Longer proofs
473
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
0 • Y = 0, Yn+1 = 1
= Y ∈ CPP (R+ × K ) : Q moreover, there is a c ∈ intK2∗ \ {o} such that cT x > 0. Since
= H ∩ J1 c ∈ intK2∗ ⊆ intK1∗ , the set
474
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
475
I.M. Bomze and M. Gabl European Journal of Operational Research 310 (2023) 449–476
Burer, S., & Anstreicher, K. M. (2013). Second-order-cone constraints for extended trust-region subproblems. SIAM Journal on Optimization, 23(1), 432–451.
Dickinson, P. (2010). An improved characterisation of the interior of the completely positive cone. The Electronic Journal of Linear Algebra, 20, 723–729.
Dickinson, P. J. (2013). The copositive cone, the completely positive cone and their generalisations. The Netherlands: University of Groningen Ph.D. dissertation.
Dickinson, P. J., & Povh, J. (2013). Moment approximations for set-semidefinite polynomials. Journal of Optimization Theory and Applications, 159(1), 57–68.
Dickinson, P. J. C., & Gijben, L. (2014). On the computational complexity of membership problems for the completely positive cone and its dual. Computational Optimization and Applications, 57(2), 403–415.
Dür, M. (2010). Copositive programming – a survey. In M. Diehl, F. Glineur, E. Jarlebring, & W. Michiels (Eds.), Recent advances in optimization and its applications in engineering (pp. 3–20). Berlin Heidelberg: Springer.
Dür, M., & Rendl, F. (2021). Conic optimization: A survey with special focus on copositive optimization and binary quadratic problems. EURO Journal on Computational Optimization, 9, 100021.
Dür, M., & Still, G. (2008). Interior points of the completely positive cone. The Electronic Journal of Linear Algebra, 17, 48–53.
Eichfelder, G., & Povh, J. (2013). On the set-semidefinite representation of nonconvex quadratic programs over arbitrary feasible sets. Optimization Letters, 7(6), 1373–1386.
Esfahani, P. M., & Kuhn, D. (2018). Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Mathematical Programming, 171(1), 115–166.
Fan, X., & Hanasusanto, G. A. (2021). A decision rule approach for two-stage data-driven distributionally robust optimization problems with random recourse. arXiv preprint arXiv:2110.00088
Gabl, M. (2022). Sparse conic reformulation of structured QCQPs based on copositive optimization with applications in stochastic optimization. arXiv preprint arXiv:2101.06219
Gorissen, B. L., Yanikoglu, I., & den Hertog, D. (2015). A practical guide to robust optimization. Omega, 53, 124–137.
Groetzner, P., & Dür, M. (2020). A factorization method for completely positive matrices. Linear Algebra and its Applications, 591, 1–24.
Hall, M., Jr., & Newman, M. (1963). Copositive and completely positive quadratic forms. Proceedings of the Cambridge Philosophical Society, 59, 329–339.
Hanasusanto, G. A., & Kuhn, D. (2018). Conic programming reformulations of two-stage distributionally robust linear programs over Wasserstein balls. Operations Research, 66(3), 849–869.
Iancu, D. A., Sharma, M., & Sviridenko, M. (2013). Supermodularity and affine policies in dynamic robust optimization. Operations Research, 61(4), 941–956.
Jeyakumar, V., Li, G., & Woolnough, D. (2021). Quadratically adjustable robust linear optimization with inexact data via generalized S-lemma: Exact second-order cone program reformulations. EURO Journal on Computational Optimization, 9, 100019.
Jiang, R., Ryu, M., & Xu, G. (2019). Data-driven distributionally robust appointment scheduling over Wasserstein balls. arXiv preprint arXiv:1907.03219
Kim, S., Kojima, M., & Toh, K.-C. (2020). A geometrical analysis on convex conic reformulations of quadratic and polynomial optimization problems. SIAM Journal on Optimization, 30(2), 1251–1273.
de Klerk, E., & Pasechnik, D. V. (2002). Approximation of the stability number of a graph via copositive programming. SIAM Journal on Optimization, 12(4), 875–892.
Kong, Q., Lee, C.-Y., Teo, C.-P., & Zheng, Z. (2013). Scheduling arrivals to a stochastic service delivery system using copositive cones. Operations Research, 61(3), 711–726.
Kong, Q., Li, S., Liu, N., Teo, C.-P., & Yan, Z. (2015). Appointment scheduling under schedule-dependent patient no-show behavior. Management Science, 68(8), 3480–3500.
Lasserre, J. B. (2001). Global optimization with polynomials and the problem of moments. SIAM Journal on Optimization, 11(3), 796–817.
Mittal, A., Gökalp, C., & Hanasusanto, G. A. (2019). Robust quadratic programming with mixed-integer uncertainty. INFORMS Journal on Computing, 32(2), 201–218.
Mittal, A., & Hanasusanto, G. A. (2021). Finding minimum volume circumscribing ellipsoids using generalized copositive programming. Operations Research, 70(5), 2867–2882.
Motzkin, T. (1952). Copositive quadratic forms. National Bureau of Standards Report, 1818, 11–22.
Natarajan, K., & Teo, C.-P. (2017). On reduced semidefinite programs for second order moment bounds with applications. Mathematical Programming, 161(1), 487–518.
Natarajan, K., Teo, C.-P., & Zheng, Z. (2011). Mixed 0–1 linear programs under objective uncertainty: A completely positive representation. Operations Research, 59(3), 713–728.
Padmanabhan, D., Natarajan, K., & Murthy, K. (2021). Exploiting partial correlations in distributionally robust optimization. Mathematical Programming, 186(1), 209–255.
Pardalos, P. M., & Vavasis, S. A. (1991). Quadratic programming with one negative eigenvalue is NP-hard. Journal of Global Optimization, 1(1), 15–22.
Parrilo, P. A. (2000a). Semidefinite programming based tests for matrix copositivity. In Proceedings of the 39th IEEE conference on decision and control (Cat. No. 00CH37187): vol. 5 (pp. 4624–4629). IEEE.
Parrilo, P. A. (2000b). Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization. California Institute of Technology Ph.D. Thesis.
Pataki, G. (1998). On the rank of extreme matrices in semidefinite programs and the multiplicity of optimal eigenvalues. Mathematics of Operations Research, 23(2), 177–203.
Peña, J., Vera, J., & Zuluaga, L. F. (2007). Computing the stability number of a graph via linear and semidefinite programming. SIAM Journal on Optimization, 18(1), 87–105.
Pólik, I., & Terlaky, T. (2007). A survey of the S-lemma. SIAM Review, 49(3), 371–418.
Rahimian, H., & Mehrotra, S. (2019). Distributionally robust optimization: A review. arXiv preprint arXiv:1908.05659
Rockafellar, R. T. (2015). Convex analysis. Princeton and Oxford: Princeton University Press.
Shor, N. Z. (1987). Quadratic optimization problems. Soviet Journal of Computer and Systems Sciences, 25(6), 1–11.
Sponsel, J., Bundfuss, S., & Dür, M. (2012). An improved algorithm to test copositivity. Journal of Global Optimization, 52(3), 537–551.
Sturm, J. F., & Zhang, S. (2003). On cones of nonnegative quadratic functions. Mathematics of Operations Research, 28(2), 246–267.
Tuncel, L. (2001). On the Slater condition for the SDP relaxations of nonconvex sets. Operations Research Letters, 29(4), 181–186.
Tuncel, L., & Wolkowicz, H. (2012). Strong duality and minimal representations for cone optimization. Computational Optimization and Applications, 53(2), 619–648.
Wiesemann, W., Kuhn, D., & Sim, M. (2014). Distributionally robust convex optimization. Operations Research, 62(6), 1358–1376.
Woolnough, D., Jeyakumar, V., & Li, G. (2021). Exact conic programming reformulations of two-stage adjustable robust linear programs with new quadratic decision rules. Optimization Letters, 15, 25–44.
Xia, W., Vera, J. C., & Zuluaga, L. F. (2020). Globally solving nonconvex quadratic programs via linear integer programming techniques. INFORMS Journal on Computing, 32(1), 40–56.
Xu, G., & Burer, S. (2018). A copositive approach for two-stage adjustable robust optimization with uncertain right-hand sides. Computational Optimization and Applications, 70(1), 33–59.
Xu, G., & Hanasusanto, G. A. (2018). Improved decision rule approximations for multi-stage robust optimization via copositive programming. Preprint, Available at http://www.optimization-online.org/DB_HTML/2018/08/6776.html.
Yakubovich, V. A. (1971). S-procedure in nonlinear control theory. Vestnik Leningrad University, 1, 62–77.
Yan, Z., Gao, S. Y., & Teo, C.-P. (2018). On the design of sparse but efficient structures in operations. Management Science, 64(7), 3421–3445.
Yang, B., Anstreicher, K., & Burer, S. (2016). Quadratic programs with hollows. Mathematical Programming, 170(2), 541–553.
Yang, B., & Burer, S. (2013). A two-variable analysis of the two-trust-region subproblem. SIAM Journal on Optimization, 26(1), 661–680.
Yanikoglu, I., Gorissen, B., & den Hertog, D. (2019). A survey of adjustable robust optimization. European Journal of Operational Research, 277, 799–813.
Yıldırım, E. A. (2012). On the accuracy of uniform polyhedral approximations of the copositive cone. Optimization Methods and Software, 27(1), 155–173.
Zhao, C., & Guan, Y. (2018). Data-driven risk-averse stochastic optimization with Wasserstein metric. Operations Research Letters, 46(2), 262–267.
Zhen, J., Marandi, A., de Moor, D., den Hertog, D., & Vandenberghe, L. (2022). Disjoint bilinear optimization: A two-stage robust optimization perspective. INFORMS Journal on Computing, 34(5), 2410–2427.