
Optimization with R – Tips and Tricks

Hans W Borchers, DHBW Mannheim
R User Group Meeting, Köln, September 2017

Introduction

Optimization

"optimization: an act, process, or methodology of making something (such as a design, system, or decision) as fully perfect, functional, or effective as possible; specifically: the mathematical procedures (such as finding the maximum of a function) involved in this."
– Merriam-Webster Online Dictionary, 2017 (*)

Forms of optimization (cf. Netspeak: "? optimization"):

- Code / program / system optimization
- Search / website / server ... optimization
- Business / process / chain ... optimization
- Engine / design / production optimization
- etc.

(*) First Known Use: 1857

Mathematical Optimization

A mathematical optimization problem consists of maximizing (or minimizing) a real objective function on a defined domain:
Given a set A ⊆ R^n and a function f: A → R from A to the real numbers, find an element x0 ∈ A such that f(x0) ≤ f(x) for all x in an environment of x0.

Typical problems:

- finding an optimum will be computationally expensive
- different types of objective functions and domains
- need to compute the optimum with very high accuracy
- need to find a global optimum, restricted resources
Classification of Optimization Tasks

- Unconstrained optimization
- Nonlinear least-squares fitting (parameter estimation)
- Optimization with constraints
- Non-smooth optimization (e.g., minimax problems)
- Global optimization (stochastic programming)
- Linear and quadratic programming (LP, QP)
- Convex optimization (resp. SOCP, SDP)
- Mixed-integer programming (MIP, MILP, MINLP)
- Combinatorial optimization (e.g., graph problems)

100+ Packages on the Optimization Task View

adagio alabama BB boot bvls cccp cec2005benchmark cec2013 CEoptim clpAPI CLSOCP clue cmaes cmaesr copulaedas cplexAPI crs dclone DEoptim DEoptimR desirability dfoptim ECOSolveR GA genalg GenSA globalOptTests glpkAPI goalprog GrassmannOptim gsl hydroPSO igraph irace isotone kernlab kofnGA lbfgs lbfgsb3 limSolve linprog localsolver LowRankQP lpSolve lpSolveAPI matchingMarkets matchingR maxLik mcga mco minpack.lm minqa neldermead NlcOptim nleqslv nlmrt nloptr nls2 NMOF nnls onls optimx optmatch parma powell pso psoptim qap quadprog quantreg rcdd RCEIM Rcgmin rCMA Rcplex RcppDE Rcsdp Rdsdp rgenoud Rglpk rLindo Rmalschains Rmosek rneos ROI Rsolnp Rsymphony Rvmmin scs smoof sna soma subplex tabuSearch trust trustOptim TSP ucminf

Optimization in Statistics

- Maximum Likelihood
- Parameter estimation
- Quantile and density estimation
- LASSO estimation
- Robust regression
- Nonlinear equations
- Geometric programming problems
- Deep Learning / Support Vector Machines
- Engineering and Design, e.g. optimal control
- Operations Research, e.g. network flow problems
- Economics, e.g. portfolio optimization

Goals for this Talk

- Overview of the (large, rapidly changing, still incomplete) set of tools for solving optimization problems in R
- Appreciation of the types of problems and types of methods to solve them
- Advice on setting up problems and solvers
- Suggestions for interpreting results
- Some almost real-world examples

Unfortunately, there is no time to talk about the new and exciting developments in convex optimization and optimization modelling languages.
Unconstrained Optimization

Univariate (1-dim.) Minimization

optimize(f = , interval = , ..., lower = min(interval),
         upper = max(interval), maximum = FALSE,
         tol = .Machine$double.eps^0.25)

optim(par = , fn = , gr = NULL, ...,
      method = "Brent",
      lower = -Inf, upper = Inf)

optimizeR(f, lower, upper, ..., tol = 1e-20,           # package pracma
          method = c("Brent", "GoldenRatio"),
          maximum = FALSE,
          precFactor = 2.0, precBits = -log2(tol) * precFactor,
          maxiter = 1000, trace = FALSE)
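
A minimal usage sketch (not from the slides, assumes only base R): the same quadratic minimized once with optimize() and once with optim(method = "Brent"), which requires finite bounds and a scalar start value.

f <- function(x) (x - 2)^2 + 1                 # minimum at x = 2, value 1
optimize(f, interval = c(0, 5))                # $minimum ~ 2, $objective ~ 1
optim(par = 1, fn = f, method = "Brent",
      lower = 0, upper = 5)$par                # ~ 2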

1-dimensional Example

f <- function(x) exp(-0.5*x) * sin(10*pi*x)
curve(f, 0, 1, n = 200, col=4); grid()
opt <- optimize(f, c(0, 1))
points(opt$minimum, opt$objective, pch = 20, col = 2)

[Plot: f(x) on [0, 1] with the minimum found by optimize() marked in red]

optim() and Friends

optim(par, fn, gr = NULL, ...,
      method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B",
                 "SANN", "Brent"),
      lower = -Inf, upper = Inf,
      control = list(), hessian = FALSE)

Methods / Algorithms:

- Nelder-Mead - downhill simplex method
- BFGS - "variable metric" quasi-Newton method
- CG - conjugate gradient method
- L-BFGS-B - limited-memory BFGS (Broyden-Fletcher-Goldfarb-Shanno) with bound constraints
- Brent - univariate minimization, same as optimize
- SANN - simulated annealing [don't use!]

Nelder-Mead

Nelder-Mead iteratively generates a sequence of simplices to approximate a minimal point. At each iteration, the vertices of the simplex are ordered according to their objective function values and the simplex 'distorted' accordingly.

- Sort function values on the simplex
- Reflect: compute the reflection point
- Expand: compute the expansion point
- Contract (outside | inside)
- Shrink the simplex

Stop when the simplex is small enough ('tolerance').

Nelder-Mead in Action

[Figure 1: Animation of the Nelder-Mead simplex iterations. Source: de.wikipedia.org]

Showcase Rosenbrock

As a showcase we use the Rosenbrock function, defined for n ≥ 2. It has a very flat valley leading to its minimal point:

    f(x_1, ..., x_n) = Σ_{i=1}^{n-1} [ 100 (x_{i+1} − x_i^2)^2 + (1 − x_i)^2 ]

The global minimum obviously is (1, ..., 1) with value 0.

fnRosenbrock <- function (x) {
    n <- length(x)
    x1 <- x[2:n]; x2 <- x[1:(n - 1)]
    sum(100 * (x1 - x2^2)^2 + (1 - x2)^2)
}

Available in package adagio as fnRosenbrock(), with exact gradient grRosenbrock().

optim() w/ Nelder-Mead

fn <- adagio::fnRosenbrock; gr <- adagio::grRosenbrock
sol <- optim(rep(0, 2), fn, gr, control=list(reltol=1e-12))
sol$par

## [1] 0.9999996 0.9999992

sol <- optim(rep(0, 10), fn, gr,
             control=list(reltol=1e-12, maxit=10000))
sol$par; sol$counts

## [1] 0.487650105 0.218747555 0.074772474 0.008069353 0.007936313
## [6] 0.037545739 0.013695922 0.027284322 0.023147646 0.043194172
## function gradient
##     9707       NA
Nelder-Mead Solvers

- dfoptim

  nmk(par, fn, control = list(), ...)
  nmkb(par, fn, lower=-Inf, upper=Inf,
       control = list(), ...)

- adagio

  neldermead(fn, x0, ..., adapt = TRUE,
             tol = 1e-10, maxfeval = 10000,
             step = rep(1.0, length(x0)))

- pracma [new]

  anms(fn, x0, ...,
       tol = 1e-10, maxfeval = NULL)

Adaptive Nelder-Mead

anms in pracma implements a new (Gao and Han, 2012) adaptive Nelder-Mead algorithm, adapting to the size of the problem (i.e., the dimension of the objective function).

fn <- adagio::fnRosenbrock
pracma::anms(fn, rep(0, 20), tol = 1e-12, maxfeval = 25000)

## $xmin
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##
## $fmin
## [1] 5.073655e-25
##
## $nfeval
## [1] 22628
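
A small sketch of the bounded variant nmkb() (assumes dfoptim and adagio are installed); the start point must lie strictly inside the bounds.

fn <- adagio::fnRosenbrock
sol <- dfoptim::nmkb(c(0.5, 0.5), fn, lower = 0, upper = 2)
sol$par       # should be close to (1, 1)
sol$feval     # number of function evaluations used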

Gradient-Based Approaches

Exploiting the direction of "steepest descent" as computed by the negative gradient −∇f(x) of a multivariate function.

- Steepest descent:
  d_k = −∇f(x_k)
- Conjugate Gradient (CG):
  d_k = −∇f(x_k) + β_k d_{k−1},  d_0 = −∇f(x_0),
  e.g. β_k = ‖∇f(x_{k+1})‖^2 / ‖∇f(x_k)‖^2 (Fletcher and Reeves)
- BFGS and L-BFGS-B:
  d_k = −H_f(x_k)^{−1} ∇f(x_k),  with H_f(x) the Hessian of f at x

Line Searches

Given a function f: R^n → R and a direction d ∈ R^n, a line search method approximately minimizes f along the line {x + t d | t ∈ R}.

Armijo-Goldstein inequality (backtracking, 0 < c, ν < 1):

    f(x_0 + t d) ≤ f(x_0) + c t f′(x_0; d),   t = ν^k, k = 0, 1, 2, ...

(Weak) Wolfe conditions (0 < c_1 < c_2 < 1):

    f(x_k + t_k d_k) ≤ f(x_k) + c_1 t_k f′(x_k; d_k)
    c_2 f′(x_k; d_k) ≤ f′(x_k + t_k d_k; d_k)
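
To make the Armijo condition concrete, here is a minimal backtracking sketch in plain R (the values c = 1e-4 and nu = 0.5 and the test function are illustrative assumptions, not from the slides).

armijo <- function(f, gradf, x0, d, c = 1e-4, nu = 0.5, kmax = 30) {
  slope <- sum(gradf(x0) * d)          # directional derivative f'(x0; d)
  t <- 1
  for (k in seq_len(kmax)) {
    if (f(x0 + t * d) <= f(x0) + c * t * slope) break
    t <- nu * t                        # shrink the step until accepted
  }
  t
}

# steepest descent direction on f(x) = x1^2 + 10 x2^2
f  <- function(x) x[1]^2 + 10 * x[2]^2
gf <- function(x) c(2 * x[1], 20 * x[2])
x0 <- c(1, 1)
armijo(f, gf, x0, -gf(x0))             # a step length satisfying the Armijo condition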
Rosenbrock with Line Search

[Figure: steepest descent direction vs. BFGS direction on the Rosenbrock function, with a Wolfe line search along these two directions]

BFGS and L-BFGS-B

The Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm

Iteration: while ‖∇f_k‖ > ε do

- compute the search direction: d_k = −H_k ∇f_k
- proceed with line search: x_{k+1} = x_k + α d_k
- update the approximate Hessian inverse: H_{k+1} ≈ H_f(x_{k+1})^{−1}

L-BFGS – limited-memory BFGS, stores the matrix H_k in O(n) storage.
L-BFGS-B – BFGS with bound constraints ('active set' approach).
optim() w/ BFGS

optim(rep(0, 20), fn, gr, method = "BFGS",
      control=list(reltol=1e-12, maxit=1000))$par

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

optim(rep(0, 20), fn, method = "L-BFGS-B",      # factr vs. reltol
      control=list(factr=1e-12, maxit=1000))$par

## [1] 0.9999987 0.9999984 0.9999982 0.9999981 0.9999980 0.9999980
## [8] 0.9999979 0.9999977 0.9999974 0.9999969 0.9999958 0.9999935
## [15] 0.9999797 0.9999613 0.9999243 0.9998500 0.9997011 0.9994022

optim(rep(0, 20), fn, gr, method = "L-BFGS-B",  # works ...
      control=list(factr=1e-12, maxit=1000))$par

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Best optim() usage

optim(par, fn, gr = function(x) pracma::grad(fn, x), ...,
      method = "L-BFGS-B",
      lower = -Inf, upper = Inf,
      control = list(factr = 1e-10,
                     maxit = 50*length(par)))

- use only method = "L-BFGS-B" (faster, more accurate, less memory, bound constraints)
- use factr = 1e-10 for the tolerance (default is 1e7)
- set maxit = 50*d ... 50*d^2 (default is 100)
- use dfoptim or pracma for gradients (if you don't have an analytical or exact gradient)
- look carefully at the output
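
Applying these recommendations to the 20-dimensional Rosenbrock function with a numerical gradient from pracma (a sketch, assuming adagio and pracma are installed):

fn <- adagio::fnRosenbrock
gr_num <- function(x) pracma::grad(fn, x)
sol <- optim(rep(0, 20), fn, gr_num, method = "L-BFGS-B",
             control = list(factr = 1e-10, maxit = 50 * 20))
sol$convergence                # 0 indicates successful convergence
round(sol$par, 4)              # should be close to rep(1, 20)
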
More BFGS Packages

- lbfgsb3 interfaces the Nocedal et al. 'L-BFGS-B.3.0' (2011) FORTRAN minimizer with bound constraints.
  BUT: Options like "maximum number of function calls" are not accessible. (And the result is returned as 'invisible'.)

  sol <- lbfgsb3(par, fn, gr = NULL, lower=-Inf, upper=Inf)
  sol

- lbfgs interfaces the 'libLBFGS' C library by Okazaki with Wolfe line search (based on Nocedal).
  BUT: Bound constraints are not accessible through the API.

  lbfgs(fn, gr, par, invisible=1)

More quasi-Newton type Algorithms

- stats::nlm [don't ever use!]
- stats::nlminb [PORT routine]

  nlminb(start, objective, gradient = NULL, hessian = NULL,
         scale = 1, control = list(), lower = -Inf, upper = Inf)

- trustOptim::trust.optim [trust-region approach]
  no line search, suitable for sparse Hessians

  trust.optim(x, fn, gr, hs = NULL, control = list(),
              method = c("SR1", "BFGS", "Sparse"), ...)

- ucminf::ucminf [BFGS + line search + trust region]

  ucminf(par, fn, gr = NULL, ..., control = list(), hessian = 0)
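
For comparison, a one-line sketch of nlminb on a bound-constrained Rosenbrock problem (assumes adagio is installed):

fn <- adagio::fnRosenbrock; gr <- adagio::grRosenbrock
nlminb(rep(0, 10), fn, gradient = gr, lower = 0, upper = 2)$par   # should be close to rep(1, 10)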

ucminf with Rosenbrock

fn <- adagio::fnRosenbrock; gr <- adagio::grRosenbrock
sol <- ucminf::ucminf(rep(0, 100), fn, gr, control = list(maxeval = ...))
list(par = sol$par, value = sol$value, conv = sol$conv, mess = sol$mess)

## $par
##   [1] 1 1 1 1 1 1 1 1 1 1 ...     (all 100 components equal to 1)
##
## $value
## [1] 1.223554e-15
##
## $conv
## [1] 1
##
## $mess
## [1] "Stopped by small gradient (grtol)."

More John Nash Work

Thorough implementation of quasi-Newton solvers in pure R:

- Rcgmin ("conjugate gradient")
- Rvmmin ("variable metric")
- Rtnmin ("truncated Newton")

Apply, test, and compare different nonlinear optimization solvers for smooth, possibly bound-constrained multivariate functions:

- optimx, optimr, or optimrx?

optimrx::opm(rep(0, 10), fnRosenbrock, grRosenbrock,
             method = "ALL")
Comparison of Nonlinear Solvers

method        value          fevals  gevals  convd  xtime
BFGS          3.127628e-21      291      98      0  0.003
CG            1.916095e-12     1107     408      0  0.010
Nelder-Mead   8.147198e+00     1501      NA      1  0.008
L-BFGS-B      5.124035e-10       78      78      0  0.001
nlm           4.342036e-13       NA      55      0  0.002
nlminb        4.243607e-18      121      97      0  0.002
lbfgsb3       5.124035e-10       78      78      0  0.029
Rcgmin        3.656125e-19      300     136      0  0.004
Rtnmin        5.403094e-13      105     105      0  0.013
Rvmmin        2.935561e-27      116      72      0  0.007
ucminf        1.470165e-15       77      77      0  0.002
newuoa        3.614733e-11     1814      NA      0  0.022
bobyqa        6.939585e-10     2142      NA      0  0.025
nmkb          9.099242e-01     1500      NA      1  0.083
hjkb          8.436900e-07     4920      NA      0  0.033
lbfgs         9.962100e-13       NA      NA      0  0.001

Excurse: Computing Gradients

- manually
- symbolically: package Deriv
- numerically: packages numDeriv or pracma

  gr <- function(x) numDeriv::grad(fn, x)               # simple, or:
  gr <- function(x) pracma::grad(fn, x, heps = 6e-06)   # central difference

- complex-step derivatives

  gr <- function(x) pracma::grad_csd(fn, x)

- automatic differentiation [not yet available]
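
Before handing a numerical gradient to a solver, it is worth checking it against an exact one where available; a quick sketch (assumes adagio, numDeriv, and pracma are installed):

fn <- function(x) {                          # Rosenbrock, as defined above
  n <- length(x); x1 <- x[2:n]; x2 <- x[1:(n - 1)]
  sum(100 * (x1 - x2^2)^2 + (1 - x2)^2)
}
gr <- adagio::grRosenbrock                   # exact gradient
x0 <- c(0.1, 0.2, 0.3, 0.4, 0.5)
max(abs(numDeriv::grad(fn, x0)   - gr(x0)))  # finite-difference error, small
max(abs(pracma::grad_csd(fn, x0) - gr(x0)))  # complex-step, near machine precision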

Central-difference Formula

∇f(x) = (∂f(x)/∂x_1, ..., ∂f(x)/∂x_n), where each partial derivative is approximated by the central difference

    df(x)/dx ≈ (f(x + h) − f(x − h)) / (2·h)

pracma::grad

function (f, x0, heps = .Machine$double.eps^(1/3), ...)
{
    # [... input checking ...]
    n <- length(x0)
    hh <- rep(0, n)
    gr <- numeric(n)
    for (i in 1:n) {
        hh[i] <- heps
        gr[i] <- (f(x0 + hh) - f(x0 - hh))/(2 * heps)
        hh[i] <- 0
    }
    return(gr)
}

Optimization with Constraints
Constraints

- box/bound constraints: l_i ≤ x_i ≤ u_i
  [trick: the 'transfinite' approach]
- linear inequality constraints: A x ≤ b
- linear equality constraints: A x = b
  [trick: the 'hyperplane' approach]
- quadratic constraints
- inequality constraints in general
- equality and inequality constraints

The 'transfinite' Trick

If the solver does not support bound constraints l_i ≤ x_i ≤ u_i, the transfinite approach will do the trick.
Generate a smooth (surjective) function h: R^n → [l, u], e.g.

    h: x_i → l_i + (u_i − l_i)/2 · (1 + tanh(x_i)),

and optimize the composite function g(x) = f(h(x)), i.e.

    g: R^n → [l, u] → R,    x* = argmin_x g(x) = f(h(x));

then x_min = h(x*) will be a minimum of f in [l, u].
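
A tiny sketch of the tanh mapping itself, for a single bound pair (this is the formula above, not the adagio implementation):

h <- function(x, li = 0, ui = 1) li + (ui - li)/2 * (1 + tanh(x))
h(c(-5, 0, 5))     # ~ 0.00  0.50  1.00  -- all of R is mapped into [0, 1]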

Example: 'Transfinite' Approach

Minimize the Rosenbrock function in 10 dimensions with 0 ≤ x_i ≤ 0.5.

Tf <- adagio::transfinite(0, 0.5, 10)
h <- Tf$h; hinv <- Tf$hinv
p0 <- rep(0.25, 10)
f <- function(x) fn(hinv(x))        # f: R^n --> R
g <- function(x) pracma::grad(f, x)

sol <- lbfgs::lbfgs(f, g, p0, epsilon=1e-10, invisible=1)
hinv(sol$par); sol$value

## [1] 0.5000000000 0.2630659827 0.0800311137 0.0165742342 ...
## [6] 0.0102120052 0.0102084108 0.0102042121 0.0100040850 ...
## [1] 7.594813

Linear Inequality Constraints

Optimization with linear constraints only: A x ≥ b (or A x ≤ b)

constrOptim(theta, f, grad, ui, ci, mu = 1e-04, control = list(),
            method = if(is.null(grad)) "Nelder-Mead" else "BFGS",
            outer.iterations = 100, outer.eps = 1e-05, ...,
            hessian = FALSE)

- ui %*% theta - ci >= 0 corresponds to A x ≥ b
- bounds are formulated as linear constraints (even x_i ≥ 0)
- theta must be in the interior of the feasible region
- the inner iteration still calls optim

Recommendation: Do not use constrOptim. Instead, use an 'augmented Lagrangian' solver, e.g. alabama::auglag.
Trick: Linear Equality Constraints

Task: min! f(x_1, ..., x_n)  s.t.  A x = b

Let b_1, ..., b_m be a basis of the nullspace of A, i.e. A b_i = 0, and x_0 a special solution with A x_0 = b. Define a new function

    g(s_1, ..., s_m) = f(x_0 + s_1 b_1 + ... + s_m b_m)

and solve this as a minimization problem without constraints:

    s = argmin g(s_1, ..., s_m)

Then x_min = x_0 + s_1 b_1 + ... + s_m b_m is a (local) minimum.

xmin <- lineqOptim(rep(0, 3), fnRosenbrock, grRosenbrock,
                   Aeq = c(1,1,1), beq = 1)
xmin

## [1] 0.5713651 0.3263519 0.1022830

Example: Linear Equality

A <- matrix(1, 1, 10)                # x1 + ... + xn = 1
N <- pracma::nullspace(A)            # size 10 x 9
x0 <- qr.solve(A, 1)                 # A x = 1
fun <- function(x) fn(x0 + N %*% x)  # length(x) = 9
sol <- ucminf::ucminf(rep(0, 9), fun)
xmin <- c(x0 + N %*% sol$par)
xmin; sum(xmin)

## [1] 0.559312323 0.314864715 0.102103618 0.013695782 ...
## [6] 0.003318010 0.003316801 0.003316309 0.003252102 ...
## [1] 1

fn(xmin)

## [1] 7.421543
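
lineqOptim() above is not defined in the slides; a minimal sketch of such a helper, wrapping the nullspace trick around optim() (assumes pracma; the gradient argument and p0 are accepted only for call compatibility and ignored here):

lineqOptim <- function(p0, fn, gr = NULL, Aeq, beq) {
  A  <- matrix(Aeq, nrow = length(beq))      # constraint matrix
  N  <- pracma::nullspace(A)                 # basis of the nullspace of A
  x0 <- qr.solve(A, beq)                     # special solution of A x = b
  f  <- function(s) fn(c(x0 + N %*% s))      # objective on the reduced space
  sol <- optim(rep(0, ncol(N)), f, method = "BFGS",
               control = list(reltol = 1e-12))
  c(x0 + N %*% sol$par)                      # map back to the full space
}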

Augmented Lagrangian Approach

Task: min! f(x)  s.t.  g_i(x) ≥ 0, h_j(x) = 0

Define the augmented Lagrangian function L as

    L(x, λ; μ) = f(x) − Σ_j λ_j h_j(x) + 1/(2μ) Σ_j h_j(x)^2

The inequality constraints g_i(x) ≥ 0 are included by introducing slack variables s_i and replacing the inequality constraints with

    g_i(x) − s_i = 0,  s_i ≥ 0

The bound constraints are treated differently (e.g., through the LANCELOT algorithm).

Augmented Lagrangian Solvers

- alabama

  auglag(par, fn, gr, hin, hin.jac, heq, heq.jac,
         control.outer=list(), control.optim = list(), ...)

- nloptr

  auglag(x0, fn, gr = NULL, lower = NULL, upper = NULL,
         hin = NULL, hinjac = NULL, heq = NULL, heqjac = NULL,
         localsolver = c("COBYLA"), localtol = 1e-6, ineq2local = FALSE,
         nl.info = FALSE, control = list(), ...)

- Rsolnp
- NlcOptim (Sequential Quadratic Programming, SQP)
- Rdonlp2 (removed from CRAN, see R-Forge's Rmetrics)
Example with alabama::auglag

Minimize the Rosenbrock function with constraints x_1 + ... + x_n = 1 and 0 ≤ x_i ≤ 1 for all i = 1, ..., n.

fheq <- function(x) sum(x) - 1
fhin <- function(x) c(x)

sol <- alabama::auglag(rep(0, 10), fn, gr, heq = fheq, hin = fhin,
                       control.outer = list(trace = FALSE, method = "nlminb"))
print(sol$par, digits=5)

## [1] 5.5707e-01 3.1236e-01 1.0052e-01 1.3367e-02 3.4742e-03
## [6] 3.3082e-03 3.3071e-03 3.3069e-03 3.2854e-03 -7.6289e-09

sum(sol$par)

## [1] 1

The nloptr Package (NLopt Library)

- COBYLA (Constrained Optimization BY Linear Approximations)

  cobyla(x0, fn, lower = NULL, upper = NULL, hin = NULL,
         nl.info = FALSE, control = list(), ...)

- slsqp (Sequential Quadratic Programming, SQP)

  slsqp(x0, fn, gr = NULL, lower = NULL, upper = NULL,
        hin = NULL, hinjac = NULL, heq = NULL, heqjac = NULL,
        nl.info = FALSE, control = list(), ...)

- auglag (Augmented Lagrangian)

  auglag(x0, fn, gr = NULL, lower = NULL, upper = NULL,
         hin = NULL, hinjac = NULL, heq = NULL, heqjac = NULL,
         localsolver = c("COBYLA", "LBFGS", "MMA", "SLSQP"),
         nl.info = FALSE, control = list(), ...)
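
The same constrained Rosenbrock problem can also be handed to nloptr's SLSQP; a sketch (assumes nloptr and adagio; the feasible start rep(0.1, 10) is an arbitrary choice):

fn <- adagio::fnRosenbrock
heq <- function(x) sum(x) - 1
sol <- nloptr::slsqp(rep(0.1, 10), fn,
                     lower = rep(0, 10), upper = rep(1, 10), heq = heq)
round(sol$par, 4); sum(sol$par)    # the sum should satisfy the equality constraint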

Quadratic Optimization

Quadratic Programming

Quadratic Programming (QP) is the problem of optimizing a quadratic expression of several variables subject to linear constraints:

    Minimize  (1/2) x^T Q x + c^T x
    s.t.      A x ≤ b

where Q is a symmetric, positive (semi-)definite n × n matrix, c an n-dim. vector, A an m × n matrix, and b an m-dim. vector.
For some solvers, linear equality constraints are also allowed.

Example: the enclosing ball problem.

Quadratic Solvers

The standard solver for quadratic problems in R is solve.QP in package quadprog. The matrix Q has to be positive definite.

solve.QP(Dmat, dvec, Amat, bvec, meq=0, factorized=FALSE)

Package     Function        Matrix   Timings
quadprog    solve.QP        pdef     1
kernlab     ipop            spdef    50
LowRankQP   LowRankQP       spdef    2
DWD         solve_QP_SOCP   pdef     9500
coneproj    qprog           pdef     –
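
A minimal solve.QP sketch (assumes quadprog; the toy problem is an assumption). Note that solve.QP minimizes 1/2 x'Dx − d'x subject to A'x ≥ b, so the signs differ slightly from the formulation above.

library(quadprog)
Dmat <- diag(2, 2)                  # Q = 2*I, so the objective is x1^2 + x2^2
dvec <- c(0, 0)                     # linear term (note the minus-sign convention)
Amat <- cbind(c(1, 1), diag(2))     # constraints as columns: x1 + x2 >= 1, x >= 0
bvec <- c(1, 0, 0)
solve.QP(Dmat, dvec, Amat, bvec)$solution   # c(0.5, 0.5)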

Nonsmooth Optimization

Nonsmoothness: Minimax Problems

Functions defined as a maximum are not smooth and cannot be optimized through a straightforward gradient-based approach.

Task: min! f(x) = max(f_1(x), ..., f_m(x))

Instead, define a smooth function g(x_1, ..., x_n, x_{n+1}) = x_{n+1} and minimize it under the constraints

    x_{n+1} ≥ f_i(x_1, ..., x_n)  for all i = 1, ..., m

The solution (x_1, ..., x_n, x_{n+1}) returns the minimum point x_min = (x_1, ..., x_n) as well as the minimal value f_min = x_{n+1}.

[Cf. the example in Chapter ?? in the bookdown text.]
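
A small sketch of this reformulation with alabama::auglag (an assumed toy problem, not from the slides): minimize max(x^2, (x − 2)^2), whose minimax point is x = 1 with value 1.

f1 <- function(x) x^2
f2 <- function(x) (x - 2)^2

obj <- function(z) z[2]                      # z = c(x, t): minimize t
hin <- function(z) c(z[2] - f1(z[1]),        # t >= f1(x)
                     z[2] - f2(z[1]))        # t >= f2(x)

sol <- alabama::auglag(c(0, 10), obj, hin = hin,
                       control.outer = list(trace = FALSE))
round(sol$par, 4)                            # approximately c(1, 1)
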
Least Squares Solvers

Linear Least-squares

A linear least-squares (LS) problem means solving min ‖A x − b‖_2, possibly with bounds or linear constraints.
The function qr.solve(A, b) from Base R solves over- and underdetermined linear systems in the least-squares sense.

- nnls (Lawson-Hanson algorithm)
  linear LS with non-negative/-positive constraints
- bvls (Stark-Parker algorithm)
  linear LS with bound constraints l ≤ x ≤ u
- pracma::lsqlincon(A, b, ...)
  linear LS with linear equality and inequality constraints (applies a quadratic solver)

Nonlinear Least-squares

The standard nonlinear LS estimator for model parameters, given some data, in Base R is:

nls(formula, data, start, control, algorithm[="plinear|port"],
    trace, subset, weights, na.action, model,
    lower, upper, ...)

Problems:

- too small or zero residuals
- "singular gradient" error message (R-help, Stackoverflow)
- too many local minima, need a proper starting point (cf. nls2 with random or grid-based start points)
- bounds require the 'port' algorithm (PORT library) (recommended anyway)
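
A short nls() sketch with the 'port' algorithm and bounds, on simulated data (the model and data are an assumption for illustration, not from the slides):

set.seed(42)
x <- seq(0, 5, by = 0.1)
y <- 2 * exp(-0.7 * x) + rnorm(length(x), sd = 0.02)   # true a = 2, b = 0.7
nls(y ~ a * exp(-b * x), start = list(a = 1, b = 1),
    algorithm = "port", lower = c(0, 0), upper = c(10, 10))
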

'Stabilized' Nonlinear LS

Modern nonlinear LS solvers use the Levenberg-Marquardt method (not Gauss-Newton) to minimize sums of squares.

- minpack.lm

  nlsLM(formula, data = parent.frame(), start, jac = NULL,
        algorithm = "LM", control = nls.lm.control(),
        lower = NULL, upper = NULL, trace = FALSE, ...)

- nlmrt

  nlxb(formula, start, trace=FALSE, data, lower=-Inf, upper=Inf,
       masked=NULL, control, ...)

Cf. also pracma::lsqnonlin(fun, x0, options = list(), ...)

Tip: Rosenbrock as LS Problem

Redefine Rosenbrock as a vector-valued function:

fn <- function(x) {
  n <- length(x)
  x1 <- x[2:n]; x2 <- x[1:(n - 1)]
  c(10*(x1 - x2^2), 1 - x2)
}

and now apply pracma's lsqnonlin:

lsqnonlin(fn, rep(0, 20))

## $x
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## $ssq
## [1] 3.037124e-19
Quantile Regression

Median (or: L1) Regression: min! |y − A x|
(aka "least absolute deviation" (LAD) regression)

- quantreg

  rq(formula, tau = 0.5, data, subset, weights, na.action,
     method = "br", model = TRUE, contrasts, ...)

- pracma

  L1linreg(A, b, p = 1, tol = 1e-07, maxiter = 200)

  solves the linear system A x = b in an Lp sense, i.e. minimizes the term |b − A x|_p (for 0 < p ≤ 1) by applying an "iteratively reweighted least squares" (IRLS) method.
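
A one-line sketch of median regression on a built-in data set (assumes quantreg; mtcars is just a convenient example):

quantreg::rq(mpg ~ wt, tau = 0.5, data = mtcars)$coefficients   # LAD fit
lm(mpg ~ wt, data = mtcars)$coefficients                        # least-squares fit, for comparison
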

Global Optimization

DE Solvers

Differential Evolution (DE) is a relatively simple genetic algorithm variant, specialized for real-valued functions (10-20 dimensions).

- DEoptim

  DEoptim(fn, lower, upper,
          control = DEoptim.control(trace = FALSE), ..., fnMap = NULL)

- RcppDE

  DEoptim(fn, lower, upper, control = DEoptim.control(), ...)

- DEoptimR

  JDEoptim(lower, upper, fn,
           constr = NULL, meq = 0, eps = 1e-05, NP = 10*d [, ...])

CMA-ES Solvers

Covariance Matrix Adaptation – Evolution Strategy (CMA-ES) is an evolutionary algorithm for continuous optimization problems (adapting the covariance matrix). It is quite difficult to implement, but is applicable to dimensions up to 50 or more.

Packages that contain CMA-ES solvers:

- cmaes
- cmaesr
- rCMA
- parma::cmaes
- Rmalschains
- adagio::pureCMAES

  pureCMAES(par, fun, lower = NULL, upper = NULL, sigma = ...,
            stopfitness = -Inf, stopeval = 1000*length(par) ...)
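
A minimal DEoptim sketch on a 5-dimensional Rosenbrock function (assumes DEoptim and adagio; itermax = 500 is an arbitrary budget):

fn <- adagio::fnRosenbrock
set.seed(1)
ctrl <- DEoptim::DEoptim.control(trace = FALSE, itermax = 500)
sol <- DEoptim::DEoptim(fn, lower = rep(-2, 5), upper = rep(2, 5), control = ctrl)
round(sol$optim$bestmem, 4)     # should approach rep(1, 5)
sol$optim$bestval
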
More Evolutionary Approaches

- Simulated Annealing (SA): GenSA
- Genetic Algorithms (GA): GA, genalg, SOMA, rgenoud
- Particle Swarm Optimization (PSO): pso, psoptim, hydroPSO
- NMOF: DEopt, GAopt, PSopt
- nloptr: crs2lm, direct, mlsl, isres, stogo

The gloptim Package

Package gloptim incorporates and compares 25 stochastic solvers. The following is a typical output, here only showing the results of CMA-ES and DE solvers for the 'Runge' problem:

    solver      package      fmin        time
1   purecmaes   adagio       0.06546780  43.583
2   cmaes       parma        0.06546780  23.523
3   cmaoptim    rCMA         0.06546780  91.257
4   malschains  Rmalschains  0.06546781  76.457
5   deopt       NMOF         0.06546876  75.809
6   deoptimr    DEoptimR     0.06549435  57.712
7   simplede    adagio       0.06573988  84.000
8   cma_es      cmaes        0.07430865   7.208
9   cmaes       cmaesr       0.07503498   8.305
...
22  cppdeoptim  RcppDE       6.82525344  17.050
23  deoptim     DEoptim      7.28454226  39.287

Future Developments

ROI – R Optimization Infrastructure

Available plugins: glpk, symphony, quadprog, ipop, ecos, scs, nloptr, cplex, ...

library(ROI); library(ROI.plugin.glpk) # ...

v <- c(15, 100, 90, 60, 40, 15, 10, 1)
w <- c( 2,  20, 20, 30, 40, 30, 60, 10)

mat <- matrix(w, nrow = 1)
con <- L_constraint(L = mat, dir = "<=", rhs = 105)

pro <- OP(objective = v, constraints = con,
          types = rep("B", 8), maximum = TRUE)

ROI_applicable_solvers(pro)   # [1] "clp" "glpk" ...

sol <- ROI_solve(pro, solver = "ecos")
## Optimal solution found.
## The objective value is: 2.800000e+02
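
Continuing the example above, ROI's solution() accessor extracts the chosen 0/1 vector (a small sketch; the printed values are not shown on the slide):

x <- solution(sol)
sum(v * x)     # 280, the reported objective value
sum(w * x)     # total weight of the chosen items, <= 105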
CVXR

CVXR provides an R modeling language for convex optimization problems (announced at UseR! 2016, not yet ready).

Example: estimating a discrete distribution, e.g.

    max!  Σ_{i=1}^{m} −w_i log w_i
    s.t.  w_i ≥ 0,  Σ w_i = 1,  X^T w = b

library(CVXR)
w <- Variable(m)
obj <- SumEntries(Entr(w))     # entropy function
constr <- list(w >= 0, SumEntries(w) == 1, t(X) %*% w == b)
pro <- Problem(Maximize(obj), constr)
sol <- solve(pro)
sol$w

Using Julia Solvers

Ipopt (Interior Point OPTimizer) is a software package for large-scale nonlinear optimization (with nonlinear equality and inequality constraints).

- difficult to install (extra components needed)
- Eclipse license (not allowed on CRAN?)

There is an easy-to-install Ipopt.jl package for Julia.
With the R packages XR and XRJulia (John Chambers, 2016) it will be possible to utilize this with a new R package ipoptjlr.

library(ipoptjlr)
julia_setup("path_to_julia")
IPOPT(x, x_L, x_U, g_L, g_U, eval_f, eval_g,
      eval_grad_f, jac_g1, jac_g2, h1, h2)

Using the NEOS Solvers

"The NEOS Server https://neos-server.org/neos/ is a free internet-based service for solving numerical optimization problems. [It] provides access to more than 60 state-of-the-art [free and commercial] solvers."

rneos: XML-RPC Interface to NEOS

# submit a job to the NEOS solver
neosjob <- NsubmitJob(xmlstring, user = "hwb", interface = ...,
                      id = 8237, nc = CreateNeosComm())
neosjob
# The job number is: 3838832
# The password is:   wBgHomLT

# getting info about the job
NgetJobInfo(neosjob)   # "nco" "MINOS" "AMPL" "Done"
NgetFinalResults(neosjob)

Epilogue
"What can go wrong?"

- Model, constraints, gradients, ...
- Local: bad starting values; Global: no guaranteed optimum
- Applying appropriate solvers
- Setting solver controls
- Special problems, e.g. non-smooth objective functions, noise, ...
- Understanding solver output (and error messages): convergence, accuracy, number of loops and function calls
- Checking results

"Most methods work most of the time." – John Nash

References

- Theussl, S., and H. W. Borchers (2017). CRAN Task View: Optimization and Mathematical Programming. URL: https://CRAN.R-project.org/view=Optimization
- Nash, J. C. (2014). Nonlinear Parameter Optimization Using R Tools. John Wiley and Sons, Chichester, UK.
- Varadhan, R., Editor (2014). Special Issue: Numerical Optimization in R: Beyond optim. Journal of Statistical Software, Vol. 60.
- Bloomfield, V. A. (2014). Using R for Numerical Analysis in Science and Engineering. CRC Press, USA. (Chapter 7, 40 pp.)
- Cortez, P. (2014). Modern Optimization With R. Use R! Series, Springer Intl. Publishing, Switzerland.
