SIAM J. OPTIM. © 2020 Society for Industrial and Applied Mathematics
Vol. 30, No. 1, pp. 1–30

A SMOOTHING ACTIVE SET METHOD FOR LINEARLY CONSTRAINED NON-LIPSCHITZ NONCONVEX OPTIMIZATION*

CHAO ZHANG† AND XIAOJUN CHEN‡
Abstract. We propose a novel smoothing active set method for linearly constrained non-Lipschitz nonconvex problems. At each step of the proposed method, we approximate the objective function by a smooth function with a fixed smoothing parameter and employ a new active set method for minimizing the smooth function over the original feasible set, until a special updating rule for the smoothing parameter is met. The updating rule is always satisfied within a finite number of iterations, since the new active set method for smooth problems proposed in this paper forces at least one subsequence of projected gradients to zero. Any accumulation point of the smoothing active set method is a stationary point associated with the smoothing function used in the method, which is necessary for local optimality of the original problem. Moreover, any accumulation point for the $\ell_2$-$\ell_p$ $(0 < p < 1)$ sparse optimization model is a limiting stationary point, which is a local minimizer under a certain second-order condition. Numerical experiments demonstrate the efficiency and effectiveness of our smoothing active set method for hyperspectral unmixing on a three-dimensional image cube of large size.

Key words. non-Lipschitz, nonconvex, linearly constrained, smoothing active set method, stationary point

AMS subject classifications. 65K10, 90C26, 90C46

DOI. 10.1137/18M119611X
1. Introduction. Active set methods have been successfully used for linearly
constrained smooth optimization problems of large size; see [8, 13, 17, 18, 25, 42]
and references therein. Hager and Zhang developed a novel active set algorithm for
the bound constrained smooth optimization problems in [17], and ten years later they
extended the method to solve linearly constrained smooth optimization problems [18].
The active set method in [18] switches between phase one that employs the gradient
projection algorithm for the original problem and phase two that uses an algorithm
with certain requirements for solving linearly constrained optimization problems on a
face of the original feasible set. Hager and Zhang [18] showed that any accumulation
point of the sequence generated by their method is a stationary point, and only phase
two is performed after a finite number of iterations under certain conditions.
For linearly constrained nonsmooth convex optimization problems, Panier proposed an active set method [29] in which the search direction is computed by a bundle principle, and the convergence result is obtained under a certain nondegeneracy assumption. Wen et al. developed an active set algorithm for unconstrained $\ell_1$ minimization with good numerical performance and convergence results [36, 37]. For bound-constrained nonsmooth nonconvex optimization, Keskar and Wächter proposed a limited-memory quasi-Newton algorithm which uses an active set selection strategy to define the subspace in which search directions are computed [21].
*Received by the editors June 22, 2018; accepted for publication (in revised form) August 23, 2019; published electronically January 2, 2020.
https://doi.org/10.1137/18M119611X
Funding: The first author's work was supported in part by NSFC grants 11571033 and 11431002. The second author's work was supported in part by Hong Kong Research Council grant PolyU15300/17P.
†Department of Applied Mathematics, Beijing Jiaotong University, Beijing 100044, China.
‡Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong.
Numerical experiments were conducted to show the efficacy of the algorithm, but theoretical
convergence guarantees are elusive even for the unconstrained case. To the best of
our knowledge, there is no active set method that tackles linearly constrained non-
Lipschitz nonconvex optimization problems with solid convergence results.
One effective way to overcome nonsmoothness in optimization is the class of smoothing methods, which use the structure of the problem to define smoothing functions and then apply algorithms for smooth problems. Nesterov proposed a smoothing
scheme [27] for minimizing a nonsmooth convex function over a convex set. Zhang
and Chen proposed a smoothing projected gradient method [41] for minimizing a Lip-
schitz continuous function over a convex set. Bian and Chen developed a smoothing
quadratic regularization method [4] for a class of linearly constrained non-Lipschitz
optimization problems arising from image restoration. Xu, Ye, and Zhang proposed
a smoothing sequential quadratic programming method [38] for solving degenerate
nonsmooth and nonconvex constrained optimization problems with applications to
bilevel programs. Liu et al. proposed a smoothing sequential quadratic programming
framework [26] for a class of composite
p
(0 < p < 1) minimization over a polyhedron.
Inspired by the active set method [18] and the smoothing technique, we develop
a novel smoothing active set method with solid convergence results for the following
minimization problem
\[
\min\ f(x) \quad \text{s.t.} \quad x \in \Omega, \tag{1.1}
\]
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuous but not necessarily Lipschitz continuous and
\[
\Omega = \{x \in \mathbb{R}^n : c_i^T x = d_i,\ i \in M_E;\ c_i^T x \le d_i,\ i \in M_I\}. \tag{1.2}
\]
Here $M_E = \{1, 2, \ldots, m_e\}$, $M_I = \{m_e+1, m_e+2, \ldots, m\}$, $M = M_E \cup M_I$, and $c_i \in \mathbb{R}^n$, $d_i \in \mathbb{R}$ for $i = 1, 2, \ldots, m$.
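As a concrete instance of (1.2), the standard simplex $\{x : \sum_i x_i = 1,\ x \ge 0\}$ fits this format with one equality and $n$ inequalities. The following minimal sketch (a hypothetical data layout of our own, not taken from the paper) encodes such an $\Omega$ as a matrix $C$ with rows $c_i^T$ and a vector $d$; the later sketches in this paper reuse this layout.

```python
import numpy as np

def simplex_data(n):
    """Encode the standard simplex {x : sum(x) = 1, x >= 0} in the form (1.2).

    Rows of C are the c_i^T; the first m_e rows are equalities, the rest are
    inequalities c_i^T x <= d_i. Here x >= 0 is written as -x_i <= 0.
    (Illustrative layout; not from the paper.)
    """
    C = np.vstack([np.ones((1, n)),   # 1^T x = 1   (index in M_E)
                   -np.eye(n)])       # -x_i <= 0   (indices in M_I)
    d = np.concatenate([np.ones(1), np.zeros(n)])
    m_e = 1
    return C, d, m_e
```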
Problem (1.1) involving a sparsity penalized term in the objective function has recently intrigued a lot of researchers. It serves as a basic model for a variety of important applications, including compressed sensing [1], edge-preserving image restoration [4, 28], sparse nonnegative matrix factorization for data classification [40], and sparse portfolio selection [9, 15]. For example, the widely used $\ell_2$-$\ell_p$ $(0 < p < 1)$ sparse optimization model is
\[
\min\ \|Ax - b\|^2 + \tau \|x\|_p^p \quad \text{s.t.} \quad x \ge 0, \tag{1.3}
\]
where $\|\cdot\|$ refers to the Euclidean norm, $\|x\|_p^p = \sum_{i=1}^n |x_i|^p$, and $A \in \mathbb{R}^{l \times n}$, $b \in \mathbb{R}^l$, and $\tau > 0$ are given. The non-Lipschitz nonconvex term $\|x\|_p^p$ in the objective function and the nonnegative constraints help to recover prior knowledge such as the sparsity of the signal or the range of pixel values. It is worth mentioning that in typical compressive sensing or image restoration tasks, the dimension of the optimization problem is large.
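To make the smoothing idea concrete: a standard way to smooth the non-Lipschitz term in (1.3) is to replace $|x_i|^p$ by $(x_i^2 + \mu^2)^{p/2}$, which is continuously differentiable for $\mu > 0$ and recovers $|x_i|^p$ as $\mu \to 0$. The sketch below uses this choice purely for illustration; it is an assumption on our part, not necessarily the particular smoothing function constructed in the paper.

```python
import numpy as np

def smoothed_objective(x, A, b, tau, p, mu):
    """Smooth approximation of ||Ax - b||^2 + tau * ||x||_p^p from (1.3).

    Each |x_i|^p is replaced by (x_i^2 + mu^2)^(p/2), a common smoothing
    that is continuously differentiable for mu > 0 (illustrative choice).
    """
    r = A @ x - b
    return r @ r + tau * np.sum((x**2 + mu**2) ** (p / 2))

def smoothed_gradient(x, A, b, tau, p, mu):
    """Gradient of the smoothed objective above."""
    r = A @ x - b
    # d/dx_i (x_i^2 + mu^2)^(p/2) = p * x_i * (x_i^2 + mu^2)^(p/2 - 1)
    return 2 * (A.T @ r) + tau * p * x * (x**2 + mu**2) ** (p / 2 - 1)
```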
In order to develop the smoothing active set method, we first assume f is smooth
in (1.1) in section 2 and develop an efficient new active set method for the linearly
constrained smooth problems, which can be considered as a modification of the active
set algorithm [18]. The new active set method combines the projected gradient (PG)
method [8] and a linearly constrained optimizer (LCO) that satisfies mild require-
ments. We show in Theorem 2.2 that the new active set method forces at least one
subsequence of projected gradients to zero. This property is essential in developing
the smoothing active set method with global convergence in section 3. It is guaran-
teed that any accumulation point of the sequence generated by the new active set
method is a stationary point. Moreover, if the sequence generated by the new active set method converges to a stationary point $x^*$, then the sequence can identify the set of strongly active constraints and hence is trapped by the face exposed by $-\nabla f(x^*)$ after a finite number of iterations. The convergence and identification properties are not guaranteed by the active set method in [18] for smooth problems. Based on the identification properties, we also prove the local convergence result that if the sequence converges to $x^*$ and the strong second-order sufficient optimality condition holds, then only the LCO is executed after a finite number of iterations.
Combining the new active set method for linearly constrained smooth minimiza-
tion problem with delicate smoothing strategies, we then develop in section 3 a novel
smoothing active set method that solves the linearly constrained non-Lipschitz min-
imization problem (1.1). The new active set method for smooth problems is used to
solve the smoothing problems. We give the concept of a stationary point associated
with the smoothing function and show that it is necessary for optimality of the original
problem. We show that any accumulation point generated by the smoothing active
set method is a stationary point of the original problem. Moreover, it is a limiting
stationary point of problem (1.3). If in addition a second-order condition holds, it is
also a strict local minimizer of (1.3).
We conduct numerical experiments on real applications of large scale in hyper-
spectral unmixing in section 4. The numerical results show that the smoothing active set method performs favorably in comparison with several state-of-the-art methods for hyperspectral unmixing.
Throughout the paper, we use the following notation. $\langle x, y \rangle = x^T y$ denotes the inner product of two vectors $x$ and $y$ of the same dimension. $\mathbb{R}^n_+ = \{x \in \mathbb{R}^n : x \ge 0\}$ and $\mathbb{R}^n_{++} = \{x \in \mathbb{R}^n : x > 0\}$. $|S|$ denotes the cardinality of a finite set $S$. If $S$ is a subset of $\{1, 2, \ldots, n\}$, then for any vector $u \in \mathbb{R}^n$ and matrix $M \in \mathbb{R}^{n \times n}$, $u_S$ is the subvector of $u$ whose entries are those of $u$ indexed by $S$, and $M_{SS}$ denotes the submatrix of $M$ whose rows and columns are indexed by $S$. $N(M)$ is the null space of $M$. Let $\mathbb{N}$ be the set of all natural numbers and $\mathbb{N}_\infty$ be the collection of infinite subsets of $\mathbb{N}$. We use the notation $\xrightarrow{N}$ for convergence indexed by $N \in \mathbb{N}_\infty$. The normal cone to a closed convex set $\Omega$ at $x$ is denoted by $N_\Omega(x)$, and $P_\Omega[x] = \mathrm{argmin}\{\|z - x\| : z \in \Omega\}$ is the orthogonal projection of $x$ onto $\Omega$. The ball with center $x^*$ and radius $\delta$ is denoted by $B_\delta(x^*)$. For any $x \in \mathbb{R}^n$, the active and free index sets are defined by
\[
\mathcal{A}(x) := M_E \cup \{i \in M_I : c_i^T x = d_i\}, \qquad \mathcal{F}(x) := \{i \in M_I : c_i^T x < d_i\}.
\]
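As a small numerical illustration, the active and free index sets can be computed with a tolerance in place of exact equality. The helper below is a hypothetical sketch (names and tolerance are our own) using the $C$, $d$, $m_e$ layout sketched in section 1.

```python
import numpy as np

def active_free_sets(x, C, d, m_e, tol=1e-10):
    """Return the active set A(x) and free set F(x) for the polyhedron (1.2).

    C is m-by-n with rows c_i^T; the first m_e constraints are equalities.
    A small tolerance replaces exact equality in floating point (our choice).
    """
    vals = C @ x - d                     # c_i^T x - d_i for all i
    M_E = list(range(m_e))               # equality indices are always active
    M_I = range(m_e, len(d))
    A = M_E + [i for i in M_I if abs(vals[i]) <= tol]   # active inequalities
    F = [i for i in M_I if vals[i] < -tol]              # strictly inactive
    return A, F
```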
2. A new active set method for linearly constrained smooth minimization. In this section, we consider the following linearly constrained smooth problem
\[
\min\ f(x) \quad \text{s.t.} \quad x \in \Omega, \tag{2.1}
\]
where $f$ is continuously differentiable and $\Omega$ is defined in (1.2).

Recall that the projected gradient $\nabla_\Omega f(x)$ is defined by
\[
\nabla_\Omega f(x) \equiv P_{T(x)}[-\nabla f(x)] = \mathrm{argmin}\{\|v + \nabla f(x)\| : v \in T(x)\},
\]
where $T(x)$ is the tangent cone to $\Omega$ at $x$. Calamai and Moré (Lemma 3.1 of [8]) showed that $x^* \in \Omega$ is a stationary point of (2.1) if and only if $\nabla_\Omega f(x^*) = 0$. It is worth mentioning that $\|\nabla_\Omega f(x)\|$ can be bounded away from zero in a neighborhood of a stationary point $x^*$, since $\|\nabla_\Omega f(\cdot)\|$ is not continuous, but only lower semicontinuous on $\Omega$ according to Lemma 3.3 of [8]. That is, for any $\{x^k\} \subset \Omega$ converging to $x$,
\[
\|\nabla_\Omega f(x)\| \le \liminf_{k \to \infty} \|\nabla_\Omega f(x^k)\|.
\]
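For intuition, when $\Omega = \mathbb{R}^n_+$ the tangent cone is $T(x) = \{v : v_i \ge 0 \text{ whenever } x_i = 0\}$, so the projection defining $\nabla_\Omega f(x)$ separates coordinatewise. The sketch below covers only this special case (our simplification; for a general polyhedron the cone projection must be solved numerically).

```python
import numpy as np

def projected_gradient_orthant(x, g, tol=1e-12):
    """Projected gradient grad_Omega f(x) = P_{T(x)}[-g] for Omega = R^n_+.

    At a coordinate with x_i > 0 the tangent cone is unrestricted, so the
    projection keeps -g_i; at x_i = 0 it keeps only the nonnegative part.
    """
    v = -g.copy()
    on_boundary = x <= tol               # coordinates where x_i = 0 (active)
    v[on_boundary] = np.maximum(v[on_boundary], 0.0)
    return v

# x is stationary for f over R^n_+ iff projected_gradient_orthant(x, grad_f(x))
# is the zero vector.
```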
A stationary point $x^*$ of (2.1) is often characterized as
\[
d_1(x^*) := \|P_\Omega[x^* - \nabla f(x^*)] - x^*\| = 0.
\]
We find that the convergence analysis of most existing active set methods for (2.1), such as the active set method in [18], consists of showing $\liminf_{k \to \infty} d_1(x^k) = 0$. However, since the norm of the projected gradient is not continuous, $\liminf_{k \to \infty} d_1(x^k) = 0$ does not imply $\liminf_{k \to \infty} \|\nabla_\Omega f(x^k)\| = 0$; see Example 1 in section 2. The new active set method proposed in this section aims to have
\[
\liminf_{k \to \infty} \|\nabla_\Omega f(x^k)\| = 0,
\]
which is essential for showing the convergence result of the smoothing active set method for solving the nonsmooth problem (1.1) proposed in section 3.
2.1. Structure of the new active set method. We now introduce the necessary notation used in the new active set method. Let us denote $g(x) = \nabla f(x)$. Given an index set $S$ satisfying $M_E \subseteq S \subseteq M$, we define $g_S(x) \in \mathbb{R}^n$ by
\[
g_S(x) = P_{N(C_S^T)}[g(x)] = \mathrm{argmin}\{\|y - g(x)\| : y \in \mathbb{R}^n \text{ and } C_S^T y = 0\}, \tag{2.2}
\]
where $C_S \in \mathbb{R}^{n \times |S|}$ is the matrix whose columns are $c_i$, $i \in S$. In particular, we write $g_A(x)$ for $S = \mathcal{A}(x)$, and if $\mathcal{A}(x) = \emptyset$, then $g_A(x) = g(x)$. Thus $g_A(x)$ is the unique optimal solution of the strongly convex problem
\[
\min\ \tfrac{1}{2}\|y - g(x)\|^2 \quad \text{s.t.} \quad c_i^T y = 0,\ i \in \mathcal{A}(x). \tag{2.3}
\]
From the first-order optimality conditions, it is easy to see that for $x \in \Omega$, $g_A(x) = 0$ if and only if $x$ is a stationary point of $f$ on its associated face
\[
\breve{\Omega}(x) := \{y \in \Omega : c_i^T y = d_i \text{ for all } i \in \mathcal{A}(x)\}. \tag{2.4}
\]
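Numerically, the projection in (2.2) admits the closed form $g_S(x) = g(x) - C_S (C_S^T C_S)^{\dagger} C_S^T g(x)$, since $N(C_S^T)$ is the orthogonal complement of the range of $C_S$. A minimal sketch under that formula follows (a QR factorization would be the more careful implementation choice; the pseudoinverse covers linearly dependent columns).

```python
import numpy as np

def project_null_space(g, C_S):
    """Projection of g onto N(C_S^T) as in (2.2).

    Implements g_S = g - C_S (C_S^T C_S)^+ C_S^T g, where ^+ is the
    Moore-Penrose pseudoinverse.
    """
    if C_S.shape[1] == 0:                # A(x) empty: g_A(x) = g(x)
        return g
    CtC = C_S.T @ C_S
    return g - C_S @ (np.linalg.pinv(CtC) @ (C_S.T @ g))

# x in Omega is stationary on its face (2.4) iff project_null_space(grad_f(x), C_A)
# vanishes, where C_A collects the columns c_i with i in A(x).
```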
Let $x^*$ be a stationary point of (2.1) and $\Lambda(x^*)$ be the set of Lagrange multipliers associated with the constraints. That is, $x^* \in \Omega$ and for any $\lambda^* \in \Lambda(x^*)$, $(x^*, \lambda^*)$ satisfies
\[
\begin{aligned}
& g(x^*) + \sum_{i \in M} \lambda_i^* c_i = 0, \\
& \lambda_i^* \ge 0 \ \text{if } i \in M_I \cap \mathcal{A}(x^*), \qquad \lambda_i^* = 0 \ \text{if } i \in \mathcal{F}(x^*), \\
& \lambda_i^* (c_i^T x^* - d_i) = 0 \ \text{for all } i \in M_I.
\end{aligned} \tag{2.5}
\]
Consider
\[
y(x, \alpha) = P_\Omega[x - \alpha g(x)] = \mathrm{argmin}\{\|x - \alpha g(x) - y\|^2 : y \in \Omega\}, \tag{2.6}
\]
where $\alpha > 0$ is a given number. Thus there exists $\lambda \in \mathbb{R}^m$ such that $(y(x, \alpha), \lambda)$ satisfies
\[
\begin{aligned}
& y(x, \alpha) - (x - \alpha g(x)) + \sum_{i \in M} \lambda_i c_i = 0, \\
& \lambda_i \ge 0 \ \text{if } i \in M_I \cap \mathcal{A}(y(x, \alpha)), \qquad \lambda_i = 0 \ \text{if } i \in \mathcal{F}(y(x, \alpha)), \\
& \lambda_i (c_i^T y(x, \alpha) - d_i) = 0 \ \text{for all } i \in M_I.
\end{aligned} \tag{2.7}
\]
Let $\Lambda(x, \alpha)$ be the set of Lagrange multipliers satisfying (2.7) at the solution $y = y(x, \alpha)$ of (2.6). It is easy to see that
\[
y(x^*, \alpha) = x^* \quad \text{and} \quad \Lambda(x^*, \alpha) = \alpha \Lambda(x^*). \tag{2.8}
\]
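The projection (2.6) onto a general polyhedron has no closed form and is itself a convex quadratic program. The following is a minimal sketch using a general-purpose solver; the solver choice is ours for illustration, not the one used in the paper's experiments, and a dedicated QP solver would be preferable at scale.

```python
import numpy as np
from scipy.optimize import minimize

def project_onto_polyhedron(z, C, d, m_e):
    """P_Omega[z] for Omega in (1.2): rows 0..m_e-1 of C are equalities,
    the remaining rows are inequalities c_i^T y <= d_i.

    Solves min ||y - z||^2 over Omega with SLSQP (sketch only).
    """
    cons = []
    if m_e > 0:
        cons.append({'type': 'eq', 'fun': lambda y: C[:m_e] @ y - d[:m_e]})
    if m_e < len(d):
        # SLSQP convention: inequality constraints are fun(y) >= 0
        cons.append({'type': 'ineq', 'fun': lambda y: d[m_e:] - C[m_e:] @ y})
    res = minimize(lambda y: np.dot(y - z, y - z), x0=z,
                   jac=lambda y: 2 * (y - z),
                   method='SLSQP', constraints=cons)
    return res.x

# y(x, alpha) in (2.6) is then project_onto_polyhedron(x - alpha * g(x), C, d, m_e).
```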
The new active set method employs either an iteration of the PG method or an iteration of the LCO, according to given rules. Let $x^k$ be the current iterate and suppose the LCO is chosen to produce the new iterate. Then the LCO solves the problem
\[
\min\ f(y) \quad \text{s.t.} \quad y \in \breve{\Omega}(x^k), \tag{2.9}
\]
which operates on the faces of $\Omega$. Compared to the original problem (2.1), there are usually many more equality constraints in (2.9), which can improve the efficiency of the LCO. This is obviously true when the feasible set is defined by bound constraints or the simplex constraint (these are sometimes called "hard constraints," and it is better to satisfy them strictly rather than penalize them in the objective function). The PG step comes from the classic "piecewise PG method" proposed in [8], and an arbitrary LCO can be chosen as long as it satisfies the requirements listed below.

• PG method:
Given $\rho, \beta \in (0, 1)$. For $k = 1, 2, \ldots$, set $d^k = -g(x^k)$ and let $x^{k+1} = P_\Omega[x^k + \alpha_k d^k]$, where $\alpha_k$ is determined by the Armijo line search; i.e., $\alpha_k = \max\{\rho^0, \rho^1, \ldots\}$ is chosen such that
\[
f(x^{k+1}) \le f(x^k) + \beta \langle g(x^k),\ x^{k+1} - x^k \rangle. \tag{2.10}
\]
(A code sketch of this step is given after this list.)

• LCO requirements:
For $k = 1, 2, \ldots$,
F1: $x^k \in \Omega$ and $f(x^{k+1}) \le f(x^k)$ for each $k$.
F2: $\mathcal{A}(x^k) \subseteq \mathcal{A}(x^{k+1})$ for each $k$.
F3: If there exists $\bar{k} > 0$ such that $\mathcal{A}(x^j) \equiv \bar{\mathcal{A}}$ for all $j \ge \bar{k}$, then $\liminf_{j \to \infty} \|g_A(x^j)\| = 0$.
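As referenced in the PG bullet above, here is a minimal sketch of one PG iteration with the Armijo rule (2.10). It reuses the hypothetical project_onto_polyhedron helper from the sketch after (2.8); the cap on backtracking steps is our own safeguard, not part of the method's statement.

```python
def pg_step(x, f, grad_f, C, d, m_e, rho=0.5, beta=1e-4, max_backtracks=50):
    """One projected gradient step with the Armijo line search (2.10).

    Tries alpha = rho^0, rho^1, ... and accepts the first trial point with
    f(x+) <= f(x) + beta * <g(x), x+ - x>.
    """
    g = grad_f(x)
    fx = f(x)
    alpha = 1.0                          # rho^0
    for _ in range(max_backtracks):
        x_new = project_onto_polyhedron(x - alpha * g, C, d, m_e)
        if f(x_new) <= fx + beta * (g @ (x_new - x)):
            return x_new, alpha
        alpha *= rho
    return x, 0.0                        # no acceptable step within the cap
```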
F1 and F2 of the LCO requirements are satisfied as long as the LCO adopts a monotone line search and, whenever a new constraint becomes active, changes the corresponding inequality constraint to an equality constraint in (2.9). Later we always assume these two strategies are incorporated into the LCO. F3 requires that if the active set becomes stable as $\mathcal{A}(x^j) \equiv \bar{\mathcal{A}}$, then at least one accumulation point $x^*$ of the sequence $\{x^k\}$ generated by the LCO is a stationary point of problem (2.9) with $\breve{\Omega}(x^k) = \{y \in \Omega : c_i^T y = d_i \text{ for all } i \in \bar{\mathcal{A}}\}$. Note that in this case $x^*$ is a stationary point if and only if $g_{\bar{\mathcal{A}}}(x^*) = 0$. Since $g_{\bar{\mathcal{A}}}(x) = P_{N(C_{\bar{\mathcal{A}}}^T)}[g(x)]$, we know that $g_{\bar{\mathcal{A}}}(\cdot) : \mathbb{R}^n \to \mathbb{R}^n$ is a continuous function. Thus $g_{\bar{\mathcal{A}}}(x^*) = 0$ implies
\[
\liminf_{j \to \infty} \|g_A(x^j)\| = \liminf_{j \to \infty} \|g_{\bar{\mathcal{A}}}(x^j)\| = 0.
\]
Therefore the LCO requirements can be easily fulfilled by many algorithms based on gradient or Newton type iterations that employ a monotone line search and add constraints to the active set whenever a new constraint becomes active, e.g., the PG method [8], the method of Zoutendijk (section 10.1 of [2]), the Frank–Wolfe algorithm [16], the first-order interior-point method [33], and the affine-scaling interior-point method [19]. When $\Omega = \mathbb{R}^n_+$, we can employ the LCO using essentially unconstrained optimization methods such as the conjugate gradient method, as in [17].

Now we are ready to outline the new active set method for problem (2.1).
2.2. Convergence analysis.

Assumption 2.1. For any $\Gamma \in \mathbb{R}$, the level set
\[
L_\Gamma = \{x \in \Omega : f(x) \le \Gamma\}
\]
is bounded.