Optimization and Control

Showing new listings for Friday, 7 November 2025

Total of 46 entries
New submissions (showing 14 of 14 entries)

[1] arXiv:2511.03955 [pdf, html, other]
Title: Hidden Convexity in Queueing Models
Xin Chen, Linwei Xin, Minda Zhao
Subjects: Optimization and Control (math.OC); Probability (math.PR)

We study the joint control of arrival and service rates in queueing systems with the objective of minimizing long-run expected cost minus revenue. Although the objective function is non-convex, first-order methods have been empirically observed to converge to globally optimal solutions. This paper provides a theoretical foundation for this empirical phenomenon by characterizing the optimization landscape and identifying a hidden convexity: the problem admits a convex reformulation after an appropriate change of variables. Leveraging this hidden convexity, we establish the Polyak-Lojasiewicz-Kurdyka (PLK) condition for the original control problem, which excludes spurious local minima and ensures global convergence for first-order methods. Our analysis applies to a broad class of $GI/GI/1$ queueing models, including those with Gamma-distributed interarrival and service times. As a key ingredient in the proof, we establish a new convexity property of the expected queue length under a square-root transformation of the traffic intensity.

[2] arXiv:2511.04230 [pdf, html, other]
Title: Towards optimal control of ensembles of discrete-time systems
Christian Fiedler, Alessandro Scagliotti
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY); Dynamical Systems (math.DS)

The control of ensembles of dynamical systems is an intriguing and challenging problem, arising for example in quantum control. We initiate the investigation of optimal control of ensembles of discrete-time systems, focusing on minimising the average finite horizon cost over the ensemble. For very general nonlinear control systems and stage and terminal costs, we establish existence of minimisers under mild assumptions. Furthermore, we provide a $\Gamma$-convergence result which enables consistent approximation of the challenging ensemble optimal control problem, for example, by using empirical probability measures over the ensemble. Our results form a solid foundation for discrete-time optimal control of ensembles, with many interesting avenues for future research.

[3] arXiv:2511.04232 [pdf, html, other]
Title: An Efficient Algorithm for Learning-Based Visual Localization
Jindi Zhong, Ziyuan Guo, Hongxia Wang, Huanshui Zhang
Subjects: Optimization and Control (math.OC)

This paper addresses the visual localization problem in Global Positioning System (GPS)-denied environments, where computational resources are often limited. To achieve efficient and robust performance under these constraints, we propose a novel algorithm. The algorithm stems from the optimal control principle (OCP). It incorporates an estimate of the diagonal of the Hessian matrix, which yields a higher-performing deep neural network and accelerates convergence of the optimization. Experimental results on public datasets demonstrate that the final model achieves competitive localization accuracy and exhibits remarkable generalization capability. This study provides new insights for developing high-performance offline positioning systems.

[4] arXiv:2511.04287 [pdf, html, other]
Title: Some obstacle problems for partially hinged plates and related optimization issues
Elvise Berchio, Filomena Feo, Antonio Giuseppe Grimaldi
Subjects: Optimization and Control (math.OC)

We study optimization problems for partially hinged rectangular plates, modeling bridge roadways, in the presence of real and artificial obstacles. Real obstacles represent structural constraints to avoid, while artificial ones are introduced to enhance stability. For the former, aiming to prevent collisions, we set up a worst-case optimization problem in which we minimize the amplitude of oscillations with respect to the density distribution; for the latter, aiming to improve the torsional stability, we minimize, with respect to the obstacles, the maximum of a gap function quantifying the displacement between the long edges of the plate. For both problems, existence results are provided, along with a discussion about qualitative properties of optimal density distributions and obstacles.

[5] arXiv:2511.04303 [pdf, html, other]
Title: Signature-Based Universal Bilinear Approximations for Nonlinear Systems and Model Order Reduction
Martin Redmann, Justus Werner
Subjects: Optimization and Control (math.OC); Classical Analysis and ODEs (math.CA); Numerical Analysis (math.NA); Probability (math.PR)

This paper deals with non-Lipschitz nonlinear systems. Such systems can be approximated by a linear map of so-called signatures, which play a crucial role in the theory of rough paths and can be interpreted as collections of iterated integrals involving the control process. As a consequence, we identify a universal bilinear system, solved by the signature, that can approximate the state or output of the original nonlinear dynamics arbitrarily well. In contrast to other (bi)linearization techniques, the signature approach remains feasible in large-scale settings, as the dimension of the associated bilinear system grows only with the number of inputs. However, the signature model is typically of high order, requiring an optimization process based on model order reduction (MOR). We derive an MOR method for unstable bilinear systems with non-zero initial states and apply it to the signature, yielding a potentially low-dimensional bilinear model. An advantage of our method is that the original nonlinear system need not be known explicitly, since only data are required to learn the linear map of the signature. The subsequent MOR procedure is model-oriented and specifically designed for the signature process. Consequently, this work has two main applications: (1) efficient modeling/data fitting using small-scale bilinear systems, and (2) MOR for nonlinear systems. We illustrate the effectiveness of our approach in the second application through numerical experiments.

[6] arXiv:2511.04350 [pdf, other]
Title: On the relationship between MESP and 0/1 D-Opt and their upper bounds
Gabriel Ponte, Marcia Fampa, Jon Lee
Subjects: Optimization and Control (math.OC); Computational Engineering, Finance, and Science (cs.CE); Information Theory (cs.IT); Statistics Theory (math.ST)

We establish strong connections between two fundamental nonlinear 0/1 optimization problems coming from the area of experimental design, namely maximum entropy sampling and 0/1 D-Optimality. The connections are based on maps between instances, and we analyze the behavior of these maps. Using these maps, we transport basic upper-bounding methods between these two problems, and we are able to establish new domination results and other inequalities relating various basic upper bounds. Further, we establish results on how different branch-and-bound schemes based on these maps compare. Additionally, we observe some surprising numerical results, where bounding methods that did not seem promising in their direct application to real-data MESP instances are now useful for MESP instances that come from 0/1 D-Optimality.

[7] arXiv:2511.04364 [pdf, other]
Title: Lower and Upper Bounds for Small Canonical and Ordered Ramsey Numbers
Daniel Brosch, Bernard Lidický, Sydney Miyasaki, Diane Puges
Subjects: Optimization and Control (math.OC); Combinatorics (math.CO)

In this paper, we investigate three extensions of Ramsey numbers to other combinatorial settings.
We first consider ordered Ramsey numbers. Here, we ask for a monochromatic copy of a linearly ordered graph $G$ in every $2$-edge-coloring of a linearly ordered complete graph $K_n$. The smallest such $n$ is denoted by $\vec{R}(G)$.
Next, we study canonical Ramsey numbers. A canonical coloring of a linearly ordered graph $G$ is an edge-coloring in which $G$ is monochromatic, rainbow, or min/max-lexicographic. In the latter case, each pair of edges receives the same color if and only if they share the same first (respectively, second) vertex. Erdős and Rado showed that for every $p$ there exists $n$ such that every edge-coloring of a linearly ordered $K_n$ contains a canonical copy of $K_p$; the smallest such $n$ is denoted by $ER(G)$.
Finally, we examine unordered canonical Ramsey numbers, introduced by Richer. An edge-coloring of $G$ is orderable if there exists a linear ordering of its vertices such that the color of each edge is determined by its first vertex. Unlike lexicographic colorings, this notion also includes monochromatic colorings. Richer proved that for all $s$ and $t$, there exists $n$ such that every edge-coloring of $K_n$ contains an orderable copy of $K_s$ or a rainbow $K_t$. The smallest such $n$ is denoted by $CR(s,t)$.
In all three settings, we focus on determining the corresponding Ramsey numbers for small graphs $G$. We use tabu search and integer programming to obtain lower bounds, and flag algebras or integer programming to establish upper bounds. Among other results, we determine $\vec{R}(G)$ for all graphs $G$ on up to four vertices except $K_4^-$, $ER(P_4)$ for all orderings of $P_4$, and the exact values $CR(6,3)=26$ and $CR(3,5)=13$.

[8] arXiv:2511.04515 [pdf, html, other]
Title: Robust mean-field control under common noise uncertainty
Mathieu Laurière, Ariel Neufeld, Kyunghyun Park
Subjects: Optimization and Control (math.OC); Probability (math.PR); Mathematical Finance (q-fin.MF)

We propose and analyze a framework for discrete-time robust mean-field control problems under common noise uncertainty. In this framework, the mean-field interaction describes the collective behavior of infinitely many cooperative agents' state and action, while the common noise -- a random disturbance affecting all agents' state dynamics -- is uncertain. A social planner optimizes over open-loop controls on an infinite horizon to maximize the representative agent's worst-case expected reward, where worst-case corresponds to the most adverse probability measure among all candidates inducing the unknown true law of the common noise process. We refer to this optimization as a robust mean-field control problem under common noise uncertainty. We first show that this problem arises as the asymptotic limit of a cooperative $N$-agent robust optimization problem, commonly known as propagation of chaos. We then prove the existence of an optimal open-loop control by linking the robust mean field control problem to a lifted robust Markov decision problem on the space of probability measures and by establishing the dynamic programming principle and Bellman--Isaac fixed point theorem for the lifted robust Markov decision problem. Finally, we complement our theoretical results with numerical experiments motivated by distribution planning and systemic risk in finance, highlighting the advantages of accounting for common noise uncertainty.

[9] arXiv:2511.04549 [pdf, html, other]
Title: On the feasibility of generalized inverse linear programs
Christoph Buchheim, Lowig T. Duer
Subjects: Optimization and Control (math.OC)

We investigate the feasibility problem for generalized inverse linear programs. Given an LP with affinely parametrized objective function and right-hand side as well as a target set Y, the goal is to decide whether the parameters can be chosen such that there exists an optimal solution that belongs to Y (optimistic scenario) or such that all optimal solutions belong to Y (pessimistic scenario). We study the complexity of this decision problem and show how it depends on the structure of the set Y, the form of the LP, the adjustable parameters, and the underlying scenario. For a target singleton Y = {y}, we show that the problem is tractable if the given LP is in standard form, but NP-hard if the LP is given in natural form. If instead we are given a target basis B, the problem in standard form becomes NP-complete in the optimistic case, while remaining tractable in the pessimistic case. For partially fixed target solutions, the problem becomes NP-hard almost immediately, but we prove fixed-parameter tractability in the number of non-fixed variables. Moreover, we give a rigorous proof of membership in NP for any polyhedral target set, and discuss how this property can be extended to more general target sets using an oracle-based approach.

[10] arXiv:2511.04569 [pdf, html, other]
Title: Unified Theory of Adaptive Variance Reduction
Aleksandr Shestakov, Valery Parfenov, Aleksandr Beznosikov
Subjects: Optimization and Control (math.OC)

Variance reduction is a family of powerful mechanisms for stochastic optimization that appears to be helpful in many machine learning tasks. It is based on estimating the exact gradient with some recursive sequences. Previously, many papers demonstrated that methods with unbiased variance-reduction estimators can be described in a single framework. We generalize this approach and show that the unbiasedness assumption is excessive; hence, we include biased estimators in this analysis. However, the main contribution of our work is the proposal of new variance reduction methods with adaptive step sizes that are adjusted throughout the algorithm iterations and, moreover, do not need hyperparameter tuning. Our analysis covers finite-sum problems, distributed optimization, and coordinate methods. Numerical experiments in various tasks validate the effectiveness of our methods.
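
To make the idea of "estimating the exact gradient with a recursive sequence" concrete, here is a minimal sketch of classical SVRG on a scalar least-squares problem. This is a textbook variance-reduced estimator with a fixed, hand-picked step size, not the adaptive methods proposed in the paper; the function name, the data, and all parameter values are illustrative assumptions.

```python
import random

def svrg_least_squares(a, b, lr=0.005, epochs=30, inner_steps=100, seed=0):
    """Minimize (1/n) * sum_i 0.5 * (a[i] * x - b[i])**2 over a scalar x with SVRG.

    Assumption: fixed step size `lr`, chosen by hand for this toy problem.
    """
    rng = random.Random(seed)
    n = len(a)
    x = 0.0
    for _ in range(epochs):
        snapshot = x
        # Exact (full) gradient at the snapshot, recomputed once per epoch.
        full_grad = sum(a[i] * (a[i] * snapshot - b[i]) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            grad_x = a[i] * (a[i] * x - b[i])
            grad_snapshot = a[i] * (a[i] * snapshot - b[i])
            # Unbiased estimate of the full gradient at x; its variance shrinks
            # as both x and the snapshot approach the minimizer.
            estimate = grad_x - grad_snapshot + full_grad
            x -= lr * estimate
    return x

# Toy data (hypothetical); the exact minimizer is sum(a*b) / sum(a*a).
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.5]
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
x_hat = svrg_least_squares(a, b)
print(x_hat, x_star)
```

The step size here has to be tuned against the curvature of the problem, which is the kind of hyperparameter the abstract says its adaptive methods avoid.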

[11] arXiv:2511.04579 [pdf, html, other]
Title: Knothe-Rosenblatt maps via soft-constrained optimal transport
Ricardo Baptista, Franca Hoffmann, Minh Van Hoang Nguyen, Benjamin Zhang
Comments: 29 pages
Subjects: Optimization and Control (math.OC); Probability (math.PR); Methodology (stat.ME)

In the theory of optimal transport, the Knothe-Rosenblatt (KR) rearrangement provides an explicit construction to map between two probability measures by building one-dimensional transformations from the marginal conditionals of one measure to the other. The KR map has been shown to be useful in different realms of mathematics and statistics, from proving functional inequalities to designing methodologies for sampling conditional distributions. It is known that the KR rearrangement can be obtained as the limit of a sequence of optimal transport maps with a weighted quadratic cost. We extend these results in this work by showing that one can obtain the KR map as a limit of maps that solve a relaxation of the weighted-cost optimal transport problem with a soft constraint for the target distribution. In addition, we show that this procedure also applies to the construction of triangular velocity fields via dynamic optimal transport, yielding optimal velocity fields. This justifies various variational methodologies for estimating KR maps in practice by minimizing a divergence between the target and pushforward measure through an approximate map. Moreover, it opens up possibilities for novel static and dynamic OT estimators for KR maps.

[12] arXiv:2511.04580 [pdf, html, other]
Title: Computational Modeling and Learning-Based Adaptive Control of Solid-Fuel Ramjets
Gohar T. Khokhar, Kyle Hanquist, Parham Oveissi, Alex Dorsey, Ankit Goel
Subjects: Optimization and Control (math.OC); Computational Physics (physics.comp-ph); Fluid Dynamics (physics.flu-dyn)

Solid-fuel ramjets (SFRJs) offer a compact, energy-dense propulsion option for long-range, high-speed flight but pose significant challenges for thrust regulation due to strong nonlinearities, limited actuation authority, and complex multi-physics coupling between fuel regression, combustion, and compressible flow. This paper presents a computational and control framework that combines a computational fluid dynamics (CFD) model of an SFRJ with a learning-based adaptive control approach. A CFD model incorporating heat addition was developed to characterize thrust response, establish the operational envelope, and identify the onset of inlet unstart. An adaptive proportional-integral controller, updated online using the retrospective cost adaptive control (RCAC) algorithm, was then applied to regulate thrust. Closed-loop simulations demonstrate that the RCAC-based controller achieves accurate thrust regulation under both static and dynamic operating conditions, while remaining robust to variations in commands, hyperparameters, and inlet states. The results highlight the suitability of RCAC for SFRJ control, where accurate reduced-order models are challenging to obtain, and underscore the potential of learning-based adaptive control to enable robust and reliable operation of SFRJs in future air-breathing propulsion applications.

[13] arXiv:2511.04607 [pdf, html, other]
Title: Closing the Gap: Efficient Algorithms for Discrete Wasserstein Barycenters
Jiaqi Wang, Weijun Xie
Subjects: Optimization and Control (math.OC)

The Wasserstein barycenter problem seeks a probability measure that minimizes the weighted average of the Wasserstein distances to a given collection of probability measures. We study the discrete setting, where each measure has finite support, a regime that frequently arises in machine learning and operations research. The discrete Wasserstein barycenter problem is known to be NP-hard, which motivates us to study approximation algorithms with provable guarantees. The best-known algorithm to date achieves an approximation ratio of two. We close this gap by developing a polynomial-time approximation scheme (PTAS) for the discrete Wasserstein barycenter problem that generalizes and improves upon the 2-approximation method. In addition, for the special case of equally weighted measures, we obtain a strictly tighter approximation guarantee. Numerical experiments show that the proposed algorithms are computationally efficient and produce near-optimal barycenter solutions.

[14] arXiv:2511.04622 [pdf, html, other]
Title: ODE approximation for the Adam algorithm: General and overparametrized setting
Steffen Dereich, Arnulf Jentzen, Sebastian Kassing
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Probability (math.PR)

The Adam optimizer is presumably the most popular optimization method in deep learning at present. In this article we develop an ODE-based method to study the Adam optimizer in a fast-slow scaling regime. For fixed momentum parameters and vanishing step-sizes, we show that the Adam algorithm is an asymptotic pseudo-trajectory of the flow of a particular vector field, which is referred to as the Adam vector field. Leveraging properties of asymptotic pseudo-trajectories, we establish convergence results for the Adam algorithm. In particular, in a very general setting we show that if the Adam algorithm converges, then the limit must be a zero of the Adam vector field, rather than a local minimizer or critical point of the objective function.
In contrast, in the overparametrized empirical risk minimization setting, the Adam algorithm is able to locally find the set of minima. Specifically, we show that in a neighborhood of the global minima, the objective function serves as a Lyapunov function for the flow induced by the Adam vector field. As a consequence, if the Adam algorithm enters a neighborhood of the global minima infinitely often, it converges to the set of global minima.
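
For readers who want the recursion under study in front of them, the following is a minimal sketch of the standard Adam update (first- and second-moment averages with bias correction) applied to a scalar objective. It uses a constant step size and a deterministic gradient, so it does not reproduce the vanishing step-size regime or the Adam vector field analyzed in the paper; the function name and all default values are assumptions chosen for illustration.

```python
import math

def adam_minimize(grad, x0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Run the standard Adam recursion on a scalar variable.

    grad: callable returning the gradient at x (deterministic in this sketch).
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g        # first-moment (momentum) average
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment average
        m_hat = m / (1.0 - beta1 ** t)           # bias corrections
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Toy objective f(x) = x**2 with gradient 2*x and minimizer 0.
x_final = adam_minimize(lambda x: 2.0 * x, x0=1.0)
print(x_final)
```

The momentum parameters `beta1` and `beta2` are held fixed, matching the "fixed momentum parameters" setting of the abstract; only the step size would be sent to zero in the scaling regime it describes.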

Cross submissions (showing 7 of 7 entries)

[15] arXiv:2511.03972 (cross-list from cs.LG) [pdf, html, other]
Title: Non-Asymptotic Optimization and Generalization Bounds for Stochastic Gauss-Newton in Overparameterized Models
Semih Cayci
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)

An important question in deep learning is how higher-order optimization methods affect generalization. In this work, we analyze a stochastic Gauss-Newton (SGN) method with Levenberg-Marquardt damping and mini-batch sampling for training overparameterized deep neural networks with smooth activations in a regression setting. Our theoretical contributions are twofold. First, we establish finite-time convergence bounds via a variable-metric analysis in parameter space, with explicit dependencies on the batch size, network width and depth. Second, we derive non-asymptotic generalization bounds for SGN using uniform stability in the overparameterized regime, characterizing the impact of curvature, batch size, and overparameterization on generalization performance. Our theoretical results identify a favorable generalization regime for SGN in which a larger minimum eigenvalue of the Gauss-Newton matrix along the optimization path yields tighter stability bounds.

[16] arXiv:2511.03983 (cross-list from cs.LG) [pdf, html, other]
Title: TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training
Michael Menezes, Barbara Su, Xinze Feng, Yehya Farhat, Hamza Shili, Anastasios Kyrillidis
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)

We introduce TwIST, a distributed training framework for efficient large language model (LLM) sparsification. TwIST trains multiple subnetworks in parallel, periodically aggregates their parameters, and resamples new subnetworks during training. This process identifies high-quality subnetworks ("golden tickets") without requiring post-training procedures such as calibration or Hessian-based recovery. As a result, TwIST enables zero-cost pruning at deployment time while achieving perplexity competitive with state-of-the-art post-training sparsification methods. The benefits are most pronounced under aggressive sparsity (e.g., 50%+), where TwIST significantly outperforms baseline methods; for example, reaching 23.14 PPL compared to 31.64 for the closest prior approach. Unlike unstructured pruning, TwIST produces structured, dense matrices that offer practical inference speedups and memory reductions on commodity hardware (e.g., CPUs) that do not support efficient sparse computation. TwIST provides an efficient training-time path to deployable sparse LLMs without additional fine-tuning or recovery overhead.

[17] arXiv:2511.04252 (cross-list from math.DS) [pdf, html, other]
Title: Koopman Kalman Filter (KKF): An asymptotically optimal nonlinear filtering algorithm with error bounds and its application to parameter estimation
Diego Olguín, Axel Osses, Héctor Ramírez
Comments: 27 pages, 5 figures
Subjects: Dynamical Systems (math.DS); Optimization and Control (math.OC)

In this article, we propose a new filtering algorithm based on the Koopman operator, showing that a nonlinear filtering problem can be seen as an equivalent problem where the dynamics are infinite-dimensional but linear. Using Extended Dynamic Mode Decomposition (EDMD), we create a finite-dimensional approximation of the filtering problem of dimension $N$, in state and error covariance matrix, that achieves an error bound of order $O(N^{-1/2})$ in both, where $N$ denotes the number of points used in the Koopman approximation. The algorithm is called the Koopman Kalman Filter (KKF), and has computational complexity $O(T \cdot N^3)$ in time and $O(T \cdot N^2)$ in space, where $T$ is the number of iterations of the filtering problem. We test the algorithm on linear and nonlinear dynamics, showing an effective error bound with respect to the Kalman filter, which corresponds to the optimal solution in the linear case, and matching the error performance of other state-of-the-art methods, but with a much lower execution time. We also propose a parameter estimation algorithm based on KKF and compare it with Markov chain Monte Carlo techniques, showing similar performance with lower execution time.

[18] arXiv:2511.04369 (cross-list from math.NA) [pdf, html, other]
Title: Normalized tensor train decomposition
Renfeng Peng, Chengkai Zhu, Bin Gao, Xin Wang, Ya-xiang Yuan
Comments: 26 pages, 9 figures, 4 tables
Subjects: Numerical Analysis (math.NA); Optimization and Control (math.OC); Quantum Physics (quant-ph)

Tensors with unit Frobenius norm, which can represent normalized eigenvectors and pure quantum states, are fundamental objects in many fields, including scientific computing and quantum physics. While the tensor train decomposition provides a powerful low-rank format for tackling high-dimensional problems, it does not intrinsically enforce the unit-norm constraint. To address this, we introduce the normalized tensor train (NTT) decomposition, which aims to approximate a tensor by unit-norm tensors in tensor train format. The low-rank structure of NTT decomposition not only saves storage and computational cost but also preserves the underlying unit-norm structure. We prove that the set of fixed-rank NTT tensors forms a smooth manifold, and the corresponding Riemannian geometry is derived, paving the way for geometric methods. We propose NTT-based methods for low-rank tensor recovery, high-dimensional eigenvalue problems, estimation of stabilizer rank, and calculation of the minimum output Rényi 2-entropy of quantum channels. Numerical experiments demonstrate the superior efficiency and scalability of the proposed NTT-based methods.

[19] arXiv:2511.04454 (cross-list from cs.CE) [pdf, html, other]
Title: Fitting Reinforcement Learning Model to Behavioral Data under Bandits
Hao Zhu, Jasper Hoffmann, Baohe Zhang, Joschka Boedecker
Subjects: Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG); Optimization and Control (math.OC); Neurons and Cognition (q-bio.NC)

We consider the problem of fitting a reinforcement learning (RL) model to some given behavioral data under a multi-armed bandit environment. These models have received much attention in recent years for characterizing human and animal decision making behavior. We provide a generic mathematical optimization problem formulation for the fitting problem of a wide range of RL models that appear frequently in scientific research applications, followed by a detailed theoretical analysis of its convexity properties. Based on the theoretical results, we introduce a novel solution method for the fitting problem of RL models based on convex relaxation and optimization. Our method is then evaluated in several simulated bandit environments to compare with some benchmark methods that appear in the literature. Numerical results indicate that our method achieves comparable performance to the state-of-the-art, while significantly reducing computation time. We also provide an open-source Python package for our proposed method to empower researchers to apply it in the analysis of their datasets directly, without prior knowledge of convex optimization.

[20] arXiv:2511.04485 (cross-list from cs.LG) [pdf, html, other]
Title: Q3R: Quadratic Reweighted Rank Regularizer for Effective Low-Rank Training
Ipsita Ghosh, Ethan Nguyen, Christian Kümmerle
Journal-ref: 39th Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Optimization and Control (math.OC)

Parameter-efficient training, based on low-rank optimization, has become a highly successful tool for fine-tuning large deep-learning models. However, these methods fail at low-rank pre-training tasks where maintaining the low-rank structure and the objective remains a challenging task. We propose the Quadratic Reweighted Rank Regularizer, dubbed Q3R, which leads to a novel low-rank inducing training strategy inspired by the iteratively reweighted least squares (IRLS) framework. Q3R is based on a quadratic regularizer term which majorizes a smoothed log determinant serving as a rank surrogate objective. Unlike other low-rank training techniques, Q3R is able to train weight matrices with prescribed, low target ranks of models that achieve predictive performance comparable to dense models, with small computational overhead, while remaining fully compatible with existing architectures. For example, in one experiment we are able to truncate $60\%$ and $80\%$ of the parameters of a ViT-Tiny model with $\sim 1.3\%$ and $\sim 4\%$ accuracy drop in CIFAR-10 performance, respectively. The efficacy of Q3R is confirmed on Transformers across both image and language tasks, including for low-rank fine-tuning.

[21] arXiv:2511.04522 (cross-list from cs.LG) [pdf, html, other]
Title: End-to-End Reinforcement Learning of Koopman Models for eNMPC of an Air Separation Unit
Daniel Mayfrank, Kayra Dernek, Laura Lang, Alexander Mitsos, Manuel Dahmen
Comments: manuscript (8 pages, 5 figures, 1 table), supplementary materials (5 pages, 1 figure, 1 table)
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)

With our recently proposed method based on reinforcement learning (Mayfrank et al. (2024), Comput. Chem. Eng. 190), Koopman surrogate models can be trained for optimal performance in specific (economic) nonlinear model predictive control ((e)NMPC) applications. So far, our method has exclusively been demonstrated on a small-scale case study. Herein, we show that our method scales well to a more challenging demand response case study built on a large-scale model of a single-product (nitrogen) air separation unit. Across all numerical experiments, we assume observability of only a few realistically measurable plant variables. Compared to a purely system identification-based Koopman eNMPC, which generates small economic savings but frequently violates constraints, our method delivers similar economic performance while avoiding constraint violations.

Replacement submissions (showing 25 of 25 entries)

[22] arXiv:2109.05059 (replaced) [pdf, html, other]
Title: The Speed-Robustness Trade-Off for First-Order Methods with Additive Gradient Noise
Bryan Van Scoy, Laurent Lessard
Comments: 32 pages
Subjects: Optimization and Control (math.OC)

We study the trade-off between convergence rate and sensitivity to stochastic additive gradient noise for first-order optimization methods. Ordinary Gradient Descent (GD) can be made fast-and-sensitive or slow-and-robust by increasing or decreasing the stepsize, respectively. However, it is not clear how such a trade-off can be navigated when working with accelerated methods such as Polyak's Heavy Ball (HB) or Nesterov's Fast Gradient (FG) methods. We consider two classes of functions: (1) strongly convex quadratics and (2) smooth strongly convex functions. For each function class, we present a tractable way to compute the convergence rate and sensitivity to additive gradient noise for a broad family of first-order methods, and we present algorithm designs that trade off these competing performance metrics. Each design consists of a simple analytic update rule with two states of memory, similar to HB and FG. Moreover, each design has a scalar tuning parameter that explicitly trades off convergence rate and sensitivity to additive gradient noise. We numerically validate the performance of our designs by comparing their convergence rate and sensitivity to those of many other algorithms, and through simulations on Nesterov's "bad function".
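
The fast-and-sensitive versus slow-and-robust behaviour of plain gradient descent can be worked out in closed form on a one-dimensional quadratic. The sketch below is only an illustration under stated assumptions: the objective is $f(x) = \tfrac{m}{2}x^2$, the gradient noise is i.i.d. with standard deviation `sigma`, and "sensitivity" is taken to be the steady-state standard deviation of the iterate, which may differ from the paper's exact definition. The function name is hypothetical.

```python
import math

def gd_rate_and_noise_std(alpha, m=1.0, sigma=1.0):
    """Scalar model: f(x) = 0.5 * m * x**2 with noisy gradient m * x + w.

    Gradient descent gives x_next = (1 - alpha * m) * x - alpha * w,
    where w has zero mean and standard deviation sigma.
    """
    rho = abs(1.0 - alpha * m)  # per-step contraction factor (convergence rate)
    if alpha <= 0.0 or rho >= 1.0:
        raise ValueError("need 0 < alpha < 2 / m for convergence")
    # Stationary variance of the linear recursion: alpha^2 * sigma^2 / (1 - rho^2).
    steady_std = alpha * sigma / math.sqrt(1.0 - rho * rho)
    return rho, steady_std

slow = gd_rate_and_noise_std(alpha=0.1)  # small stepsize
fast = gd_rate_and_noise_std(alpha=0.5)  # larger stepsize
print(slow, fast)
```

For stepsizes up to $1/m$, increasing `alpha` lowers the contraction factor while raising the steady-state error; beyond $1/m$ both quantities get worse in this scalar model.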

[23] arXiv:2311.05045 (replaced) [pdf, html, other]
Title: Exact Solutions for the NP-hard Wasserstein Barycenter Problem using a Doubly Nonnegative Relaxation and a Splitting Method
Woosuk L. Jung, Henry Wolkowicz
Subjects: Optimization and Control (math.OC)

The so-called simplified Wasserstein barycenter problem, also known as the cheapest hub problem, consists in selecting one point from each of $k$ given sets, each set consisting of $n$ points, with the aim of minimizing the sum of distances to the barycenter of the $k$ chosen points. This problem is known to be NP-hard. We compute the Wasserstein barycenter by exploiting the Euclidean distance matrix structure to obtain a facially reduced doubly nonnegative (DNN) relaxation. The facial reduction provides a natural splitting for applying the symmetric alternating direction method of multipliers (sADMM) to the DNN relaxation. The sADMM method exploits structure in the subproblems to find strong upper and lower bounds. In addition, we extend the problem to allow varying $n_j$ points for the $j$-th set.
The purpose of this paper is twofold. First, we want to illustrate the strength of this DNN relaxation with the natural splitting approach mentioned above. Our numerical tests then illustrate the surprising success on random problems, as we generally and efficiently find the provably exact solution of this NP-hard problem. Comparisons with current commercial software illustrate this surprising efficiency. However, we demonstrate and prove that there is a duality gap for problems with enough multiple optimal solutions, and that this arises from problems with highly symmetrized structure.

[24] arXiv:2403.02835 (replaced) [pdf, html, other]
Title: Low-rank Tensor Autoregressive Predictor for Third-Order Time-Series Forecasting
Haoning Wang, Liping Zhang
Comments: Accepted for publication in Expert Systems with Applications
Subjects: Optimization and Control (math.OC); Statistics Theory (math.ST)

Tensor time-series forecasting has recently gained increasing attention, and its core requirement is effective dimensionality reduction. In this paper, we establish a least-squares optimization model by combining tensor singular value decomposition (t-SVD) with autoregression (AR) to forecast third-order tensor time series, which offers substantial benefits in computational complexity and dimensionality reduction. Using the fast Fourier transform and t-SVD, we divide this optimization problem into four decoupled subproblems, whose variables are the regressive coefficient, the f-diagonal tensor, and the left and right orthogonal tensors, and we propose an efficient forecasting algorithm based on alternating minimization, called Low-rank Tensor Autoregressive Predictor (LOTAP), in which each subproblem has a closed-form solution. Numerical experiments indicate that, compared to Tucker-decomposition-based algorithms, LOTAP achieves a speedup of $2$ to $6$ times while maintaining accurate forecasting performance in all four baseline tasks. In addition, the algorithm is applicable to a wider range of tensor forecasting tasks because of its more effective dimensionality reduction.

[25] arXiv:2411.15776 (replaced) [pdf, html, other]
Title: Proximal methods for structured nonsmooth optimization over Riemannian submanifolds
Qia Li, Na Zhang, Junyu Feng, Hanwei Yan
Subjects: Optimization and Control (math.OC)

In this paper, we consider a class of structured nonsmooth optimization problems over an embedded submanifold of a Euclidean space, where the first part of the objective is the sum of a difference-of-convex (DC) function and a smooth function, while the remaining part is a weakly convex function over a smooth function. This model problem has many important applications in machine learning and scientific computing, for example, sparse Fisher discriminant analysis. We propose a manifold proximal-gradient-subgradient algorithm (MPGSA) and show that, under mild conditions, any accumulation point of the solution sequence it generates is a critical point of the underlying problem. By assuming the Kurdyka-Łojasiewicz property of an auxiliary function, we further establish the convergence of the full sequence generated by MPGSA under some suitable conditions. When the second component of the DC function involved is the maximum of finitely many continuously differentiable convex functions, we also propose an enhanced MPGSA with guaranteed subsequential convergence to a lifted B-stationary point of the optimization problem. Finally, some preliminary numerical experiments are conducted to illustrate the efficiency of the proposed algorithms.

[26] arXiv:2412.14903 (replaced) [pdf, html, other]
Title: Long Time Behavior and Stabilization for Displacement Monotone Mean Field Games
Marco Cirant, Alpár R. Mészáros
Comments: 44 pages
Subjects: Optimization and Control (math.OC); Analysis of PDEs (math.AP)

This paper is devoted to the study of the long time behavior of Nash equilibria in Mean Field Games within the framework of displacement monotonicity. We first show that any two equilibria defined on the time horizon $[0,T]$ must be close as $T \to \infty$, in a suitable sense, independently of initial/terminal conditions. The way this stability property is made quantitative involves the $L^2$ distance between solutions of the associated Pontryagin system of FBSDEs that characterizes the equilibria. Therefore, this implies in particular the stability in the 2-Wasserstein distance for the two flows of probability measures describing the agent population density and the $L^2$ distance between the co-states of agents, that are related to the optimal feedback controls. We then prove that the value function of a typical agent converges as $T \to \infty$, and we describe this limit via an infinite horizon MFG system, involving an ergodic constant. All of our convergence results hold true in a unified way for deterministic and idiosyncratic noise driven Mean Field Games, in the case of strongly displacement monotone non-separable Hamiltonians. All these are quantitative at exponential rates.

[27] arXiv:2501.00080 (replaced) [pdf, other]
Title: A Data-driven Approach to Risk-aware Robust Design
Luis G. Crespo, Bret Stanford, Natalia Alexandrov
Journal-ref: Reliability engineering and system safety, 2025
Subjects: Optimization and Control (math.OC)

This paper proposes risk-averse and risk-agnostic formulations for robust design in which solutions that satisfy the system requirements for a set of scenarios are pursued. These scenarios, which correspond to realizations of uncertain parameters or varying operating conditions, can be obtained either experimentally or synthetically. The proposed designs are made robust to variations in the training data by considering perturbed scenarios. This practice allows accounting for error and uncertainty in the measurements, thereby preventing data overfitting. Furthermore, we use relaxation to trade off a lower optimal objective value against lesser robustness to uncertainty. This is attained by eliminating a given number of optimally chosen outliers from the dataset, and by allowing the perturbed scenarios to violate the requirements with an acceptably small probability. For instance, we can seek a design that satisfies the requirements for as many perturbed scenarios as possible, or pursue a riskier design that attains a lower objective value in exchange for a few scenarios violating the requirements. These ideas are illustrated by considering the design of an aeroelastic wing.

[28] arXiv:2501.02752 (replaced) [pdf, html, other]
Title: Douglas--Rachford algorithm for nonmonotone multioperator inclusion problems
Jan Harold Alcantara, Akiko Takeda
Comments: 35 pages. Major update: added numerical experiment; code available at this https URL
Subjects: Optimization and Control (math.OC)

The Douglas--Rachford algorithm is a classic splitting method for finding a zero of the sum of two maximal monotone operators. It has also been applied to settings that involve one weakly and one strongly monotone operator. In this work, we extend the Douglas--Rachford algorithm to address multioperator inclusion problems involving $m$ ($m\geq 2$) weakly and strongly monotone operators, reformulated as a two-operator inclusion in a product space. By selecting appropriate parameters, we establish the convergence of the algorithm to a fixed point, from which solutions can be extracted. Furthermore, we illustrate its applicability to sum-of-$m$-functions minimization problems characterized by weakly convex and strongly convex functions. For general nonconvex problems in finite-dimensional spaces, comprising Lipschitz continuously differentiable functions and a proper closed function, we provide global subsequential convergence guarantees.
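
For orientation, the classic two-operator Douglas--Rachford iteration that this work extends can be sketched on a scalar convex problem. This is only an illustration of the standard method under simplifying assumptions, not the multioperator or nonmonotone scheme of the paper: it minimizes $|x| + \tfrac{1}{2}(x-b)^2$, whose minimizer is the soft-thresholding of $b$ at level 1, and the function names and the choice $\gamma = 1$ are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal map of t*|.| evaluated at v."""
    magnitude = max(abs(v) - t, 0.0)
    return magnitude if v >= 0 else -magnitude


def douglas_rachford(b, gamma=1.0, iters=200):
    """Classic Douglas--Rachford splitting for min_x |x| + 0.5*(x - b)**2."""
    z = 0.0  # governing sequence; the solution is read off through the prox
    for _ in range(iters):
        x = soft_threshold(z, gamma)                    # prox of gamma*|.| at z
        y = (2.0 * x - z + gamma * b) / (1.0 + gamma)   # prox of gamma*0.5*(.-b)**2 at 2x - z
        z += y - x                                      # fixed-point update
    return soft_threshold(z, gamma)


if __name__ == "__main__":
    print(douglas_rachford(3.0))   # approaches 2.0
    print(douglas_rachford(0.5))   # approaches 0.0
```

The solution is extracted from the fixed point of `z` by one more proximal step, which mirrors the abstract's remark that solutions are extracted from a fixed point rather than being the fixed point itself.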

[29] arXiv:2501.04225 (replaced) [pdf, html, other]
Title: A black-box optimization method with polynomial-based kernels and quadratic-optimization annealing
Yuki Minamoto, Yuya Sakamoto
Comments: 32 pages, 11 figures, and 1 table
Subjects: Optimization and Control (math.OC)

We introduce kernel-QA, a black-box optimization (BBO) method that constructs surrogate models analytically using low-order polynomial kernels within a quadratic unconstrained binary optimization (QUBO) framework, enabling efficient utilization of Ising machines. The method has been evaluated on artificial landscapes, ranging from uni-modal to multi-modal, with input dimensions extending to 80 for real variables and 640 for binary variables. The results demonstrate that kernel-QA is particularly effective for optimizing black-box functions characterized by local minima and high-dimensional inputs, showcasing its potential as a robust and scalable BBO approach.

[30] arXiv:2504.13063 (replaced) [pdf, html, other]
Title: An exact approach for the multi-depot electric vehicle scheduling problem
Xenia Haslinger, Elisabeth Gaar, Sophie N. Parragh
Subjects: Optimization and Control (math.OC); Discrete Mathematics (cs.DM)

The "avoid - shift - improve" framework and the European Clean Vehicles Directive set the path for improving the efficiency and ultimately decarbonizing the transport sector. While electric buses have already been adopted in several cities, regional bus lines may pose additional challenges due to the potentially longer distances they have to travel. In this work, we model and solve the electric bus scheduling problem, lexicographically minimizing the size of the bus fleet, the number of charging stops, and the total energy consumed, to provide decision support for bus operators planning to replace their diesel-powered fleet with zero-emission vehicles. We propose a graph representation which allows partial charging without explicitly relying on time variables and derive 3-index and 2-index mixed-integer linear programming formulations for the multi-depot electric vehicle scheduling problem. While the 3-index model can be solved by an off-the-shelf solver directly, the 2-index model relies on an exponential number of constraints to ensure the correct depot pairing. These are separated in a cutting plane fashion. We propose a set of instances with up to 80 service trips to compare the two approaches, showing that, with a small number of depots, the compact 3-index model performs very well. However, as the number of depots increases, the developed branch-and-cut algorithm proves to be of value. These findings not only offer algorithmic insights; the developed approaches also provide actionable guidance for transit agencies and operators, allowing them to quantify trade-offs between fleet size, energy efficiency, and infrastructure needs under realistic operational conditions.

[31] arXiv:2504.15914 (replaced) [pdf, html, other]
Title: Continuity Conditions for Piecewise Quadratic Functions on Simplicial Conic Partitions are Equivalent
Magne Erlandsen, Tomas Meijer, W. P. M. H. Heemels, Sebastiaan van den Eijnden
Comments: 8 pages, 3 figures. Nov 2025: Fixed author name typo; no other content changes
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

Analysis of continuous-time piecewise linear systems based on piecewise quadratic (PWQ) Lyapunov functions typically requires continuity of these functions over a partition of the state space. Several conditions for guaranteeing continuity of PWQ functions over state space partitions can be found in the literature. In this technical note, we show that these continuity conditions are equivalent over so-called simplicial conic partitions. As a consequence, the choice of which condition to impose can be based solely on practical considerations such as specific application or numerical aspects, without introducing additional conservatism in the analysis.

[32] arXiv:2504.18798 (replaced) [pdf, html, other]
Title: Anticipated backward stochastic evolution equations and maximum principle for path-dependent systems in infinite dimensions
Guomin Liu, Jian Song, Meng Wang
Subjects: Optimization and Control (math.OC)

For a class of path-dependent stochastic evolution equations driven by a cylindrical $Q$-Wiener process, we study Pontryagin's maximum principle for the stochastic recursive optimal control problem. In this infinite-dimensional control system, the state process depends on its past trajectory, the control is delayed via an integral with respect to a general finite measure, and the final cost relies on the delayed state. To obtain the maximum principle, we introduce a functional adjoint operator for the non-anticipative path derivative and establish the well-posedness of an anticipated backward stochastic evolution equation in the path-dependent form, which serves as the adjoint equation.

[33] arXiv:2505.21274 (replaced) [pdf, html, other]
Title: Sample complexity of optimal transport barycenters with discrete support
Léo Portales, Edouard Pauwels, Elsa Cazelles
Subjects: Optimization and Control (math.OC); Statistics Theory (math.ST)

Computational implementation of optimal transport barycenters for a set of target probability measures requires a form of approximation, a widespread solution being empirical approximation of measures. We provide an $O(\sqrt{N/n})$ statistical generalization bound for the empirical sparse optimal transport barycenter problem, where $N$ is the maximum cardinality of the barycenter (sparse support) and $n$ is the sample size of the empirical approximation of the target measures. Our analysis covers various optimal transport divergences, including Wasserstein, Sinkhorn and Sliced-Wasserstein. We discuss the application of our result to specific settings including K-means, constrained K-means, and free and fixed support Wasserstein barycenters.

[34] arXiv:2510.00801 (replaced) [pdf, html, other]
Title: Global Convergence of Oja's Component Flow for General Square Matrices and Its Applications
Daiki Tsuzuki, Kentaro Ohki
Comments: 15 pages, 6 figures. Added two references and fixed errors and typos
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

In this study, the global convergence properties of the Oja flow, a continuous-time algorithm for principal component extraction, are established for general square matrices. The Oja flow is a matrix differential equation on the Stiefel manifold designed to extract a dominant subspace. Although its analysis has traditionally been restricted to symmetric positive-definite matrices, where it acts as a gradient flow, recent applications have extended its use to general matrices. In this non-symmetric case, the flow extracts the invariant subspace corresponding to the eigenvalues with the largest real parts. However, prior convergence results have been purely local, leaving the global behavior as an open problem. The findings of this study fill this gap by providing a comprehensive global convergence analysis, establishing that the flow converges exponentially for almost all initial conditions. We also propose a modification to the algorithm that enhances its numerical stability. As an application of this theory, we develop novel methods for model reduction of linear dynamical systems and the synthesis of low-rank stabilizing controllers. The study advances the theoretical understanding of the Oja flow and demonstrates its potential as a reliable and versatile tool for analyzing and controlling complex linear systems.

[35] arXiv:2510.05455 (replaced) [pdf, html, other]
Title: Optimization via a Control-Centric Framework
Liraz Mudrik, Isaac Kaminer, Sean Kragelund, Abram H. Clark
Comments: This work has been submitted to the IEEE for possible publication. 12 pages, 3 figures
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

Optimization plays a central role in intelligent systems and cyber-physical technologies, where speed and reliability of convergence directly impact performance. In control theory, optimization-centric methods are standard: controllers are designed by repeatedly solving optimization problems, as in linear quadratic regulation, $H_\infty$ control, and model predictive control. In contrast, this paper develops a control-centric framework for optimization itself, where algorithms are constructed directly from Lyapunov stability principles rather than being proposed first and analyzed afterward. A key element is the stationarity vector, which encodes first-order optimality conditions and enables Lyapunov-based convergence analysis. By pairing a Lyapunov function with a selectable decay law, we obtain continuous-time dynamics with guaranteed exponential, finite-time, fixed-time, or prescribed-time convergence. Within this framework, we introduce three feedback realizations of increasing restrictiveness: the Hessian-gradient, Newton, and gradient dynamics. Each realization shapes the decay of the stationarity vector to achieve the desired rate. These constructions unify unconstrained optimization, extend naturally to constrained problems via Lyapunov-consistent primal-dual dynamics, and broaden the results for minimax and generalized Nash equilibrium seeking problems beyond exponential stability. The framework provides systematic design tools for optimization algorithms in control and game-theoretic problems.

[36] arXiv:2510.06112 (replaced) [pdf, html, other]
Title: Lagrangian Dual Sections: A Topological Perspective on Hidden Convexity
Venkat Chandrasekaran, Timothy Duff, Jose Israel Rodriguez, Kevin Shu
Subjects: Optimization and Control (math.OC)

Hidden convexity is a powerful idea in optimization: under the right transformations, nonconvex problems that are seemingly intractable can be solved efficiently using convex optimization. We introduce the notion of a Lagrangian dual section of a nonlinear program defined over a topological space, and we use it to give a sufficient condition for a nonconvex optimization problem to have a natural convex reformulation. We emphasize the topological nature of our framework, using only continuity and connectedness properties of a certain Lagrangian formulation of the problem to prove our results. We demonstrate the practical consequences of our framework in a range of applications and by developing new algorithmic methodology. First, we present families of nonconvex problem instances that can be transformed to convex programs in the context of spectral inverse problems -- which include quadratically constrained quadratic optimization and Stiefel manifold optimization as special cases -- as well as unbalanced Procrustes problems. In each of these applications, we both generalize prior results on hidden convexity and provide unifying proofs. For the case of the spectral inverse problems, we also present a Lie-theoretic approach that illustrates connections with the Kostant convexity theorem. Second, we introduce new algorithmic ideas that can be used to find globally optimal solutions to both Lagrangian forms of an optimization problem as well as constrained optimization problems when the underlying topological space is a Riemannian manifold.

[37] arXiv:2510.23791 (replaced) [pdf, html, other]
Title: A Family of Convex Models to Achieve Fairness through Dispersion Control
Abhay Singh Bhadoriya, Deepjyoti Deka, Kaarthik Sundar
Comments: 16 pages, 4 figures
Subjects: Optimization and Control (math.OC)

Controlling the dispersion of a subset of decision variables in an optimization problem is crucial for enforcing fairness or load-balancing across a wide range of applications. Building on the well-known equivalence of finite-dimensional norms, the note develops a family of parameterized convex models that regulate the dispersion of a vector of decision-variable values through its coefficient of variation. Each model contains a single parameter that takes a value in the interval [0,1]. When the parameter is set to zero, the model imposes only a trivial constraint on the optimization problem; when set to one, it enforces equality of all the decision variables. As the parameter varies, the coefficient of variation is provably bounded above by a monotonic function of that parameter. The note also presents theoretical results that relate the space of feasible solutions to all the models.

[38] arXiv:2510.24489 (replaced) [pdf, html, other]
Title: Nonlinear forward-backward-half forward splitting with momentum for monotone inclusions
Liqian Qin, Yuchao Tang, Jigen Peng
Comments: 34 pages
Subjects: Optimization and Control (math.OC)

In this work, we propose a new splitting algorithm for solving structured monotone inclusion problems composed of a maximally monotone operator, a maximally monotone and Lipschitz continuous operator, and a cocoercive operator. Our method augments the forward-backward-half forward splitting algorithm with a nonlinear momentum term. Under appropriate conditions on the step-size, we prove the weak convergence of the proposed algorithm. A linear convergence rate is also obtained under the strong monotonicity assumption. Furthermore, we investigate a stochastic variance-reduced forward-backward-half forward splitting algorithm with momentum for solving finite-sum monotone inclusion problems. Weak almost sure convergence and linear convergence are also established under standard conditions. Preliminary numerical experiments on synthetic datasets and real-world quadratic programming problems in portfolio optimization demonstrate the effectiveness and superiority of the proposed algorithm.

[39] arXiv:2511.00288 (replaced) [pdf, html, other]
Title: A non-exchangeable mean field control problem with controlled interactions
Mao Fabrice Djete
Subjects: Optimization and Control (math.OC); Probability (math.PR)

This paper introduces and analyzes a new class of mean-field control (\textsc{MFC}) problems in which agents interact through a \emph{fixed but controllable} network structure. In contrast with the classical \textsc{MFC} framework -- where agents are exchangeable and interact only through symmetric empirical distributions -- we consider systems with heterogeneous and possibly asymmetric interaction patterns encoded by a structural kernel, typically of graphon type. A key novelty of our approach is that this interaction structure is no longer static: it becomes a genuine \emph{control variable}. The planner therefore optimizes simultaneously two distinct components: a \emph{regular control}, which governs the local dynamics of individual agents, and an \emph{interaction control}, which shapes the way agents connect and influence each other through the fixed structural kernel.
We develop a generalized notion of relaxed (randomized) control adapted to this setting, prove its equivalence with the strong formulation, and establish existence, compactness, and continuity results for the associated value function under minimal regularity assumptions. Moreover, we show that the finite $n$-agent control problems with general (possibly asymmetric) interaction matrices converge to the mean-field limit when the corresponding fixed step-kernels converge in cut-norm, with asymptotic consistency of the optimal values and control strategies. Our results provide a rigorous framework in which the \emph{interaction structure itself is viewed and optimized as a control object}, thereby extending mean-field control theory to non-exchangeable populations and controlled network interactions.

[40] arXiv:2511.03566 (replaced) [pdf, html, other]
Title: Improving Directions in Mixed Integer Bilevel Linear Optimization
Federico Battista, Ted K. Ralphs
Subjects: Optimization and Control (math.OC); Mathematical Software (cs.MS)

We consider the central role of improving directions in solution methods for mixed integer bilevel linear optimization problems (MIBLPs). Current state-of-the-art methods for solving MIBLPs employ the branch-and-cut framework originally developed for solving mixed integer linear optimization problems. This approach relies on oracles for two kinds of subproblems: those for checking whether a candidate pair of leader's and follower's decisions is bilevel feasible, and those required for generating valid inequalities. Typically, these two types of oracles are managed separately, but in this work, we explore their close connection and propose a solution framework based on solving a single type of subproblem: determining whether there exists a so-called improving feasible direction for the follower's problem. Solution of this subproblem yields information that can be used both to check feasibility and to generate strong valid inequalities. Building on prior works, we expose the foundational role of improving directions in enforcing the follower's optimality condition and extend a previously known hierarchy of optimality-based relaxations to the mixed-integer setting, showing that the associated relaxed feasible regions coincide exactly with the closure associated with intersection cuts derived from improving directions. Numerical results with an implementation using a modified version of the open source solver MibS show that this approach can yield practical improvements.

[41] arXiv:2112.09408 (replaced) [pdf, html, other]
Title: Numerical method to solve impulse control problems for partially observed piecewise deterministic Markov processes
Alice Cleynen, Benoîte de Saporta
Subjects: Statistics Theory (math.ST); Optimization and Control (math.OC)

Designing efficient and rigorous numerical methods for sequential decision-making under uncertainty is a difficult problem that arises in many application frameworks. In this paper we focus on the numerical solution of a subclass of impulse control problems for piecewise deterministic Markov processes (PDMPs) when the jump times are hidden. We first state the problem as a partially observed Markov decision process (POMDP) on a continuous state space and with controlled transition kernels corresponding to some specific skeleton chains of the PDMP. Then we proceed to build a numerically tractable approximation of the POMDP by tailor-made discretizations of the state spaces. The main difficulty in evaluating the discretization error comes from the possible random jumps of the PDMP between consecutive epochs of the POMDP and requires special care. Finally we discuss the practical construction of discretization grids and illustrate our method on simulations.

[42] arXiv:2304.09575 (replaced) [pdf, html, other]
Title: Approximate non-linear model predictive control with safety-augmented neural networks
Henrik Hose, Johannes Köhler, Melanie N. Zeilinger, Sebastian Trimpe
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG); Optimization and Control (math.OC)

Model predictive control (MPC) achieves stability and constraint satisfaction for general nonlinear systems, but requires computationally expensive online optimization. This paper studies approximations of such MPC controllers via neural networks (NNs) to achieve fast online evaluation. We propose safety augmentation that yields deterministic guarantees for convergence and constraint satisfaction despite approximation inaccuracies. We approximate the entire input sequence of the MPC with NNs, which allows us to verify online if it is a feasible solution to the MPC problem. We replace the NN solution by a safe candidate based on standard MPC techniques whenever it is infeasible or has worse cost. Our method requires a single evaluation of the NN and forward integration of the input sequence online, which is fast to compute on resource-constrained systems. The proposed control framework is illustrated using two numerical non-linear MPC benchmarks of different complexity, demonstrating computational speedups that are orders of magnitude higher than online optimization. In the examples, we achieve deterministic safety through the safety-augmented NNs, where a naive NN implementation fails.
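
The selection rule described above (keep the NN input sequence only if it is feasible and not worse in cost than a safe candidate) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the scalar integrator $x^+ = x + u$, the box constraints, the quadratic cost, and all function names are hypothetical, and the safe candidate is simply supplied by hand rather than built from standard MPC techniques.

```python
def rollout(x0, inputs):
    """Forward-simulate the toy integrator x_next = x + u (assumed dynamics)."""
    xs = [x0]
    for u in inputs:
        xs.append(xs[-1] + u)
    return xs


def make_checks(x0, x_max=1.0, u_max=0.5):
    """Build feasibility and cost functions for an input sequence starting at x0."""
    def is_feasible(inputs):
        xs = rollout(x0, inputs)
        return all(abs(u) <= u_max for u in inputs) and all(abs(x) <= x_max for x in xs)

    def cost(inputs):
        xs = rollout(x0, inputs)
        return sum(x * x for x in xs) + sum(u * u for u in inputs)

    return is_feasible, cost


def safe_select(nn_seq, candidate_seq, is_feasible, cost):
    """Use the NN sequence only if it is feasible and no worse than the safe candidate."""
    if is_feasible(nn_seq) and cost(nn_seq) <= cost(candidate_seq):
        return nn_seq
    return candidate_seq


if __name__ == "__main__":
    is_feasible, cost = make_checks(x0=0.8)
    candidate = [-0.4, -0.4, 0.0]   # hand-made safe fallback
    print(safe_select([0.5, 0.0, 0.0], candidate, is_feasible, cost))    # NN violates |x| <= 1
    print(safe_select([-0.5, -0.3, 0.0], candidate, is_feasible, cost))  # NN feasible and cheaper
```

The check needs only one forward simulation of each sequence, which is consistent with the abstract's point that the online work is a single NN evaluation plus forward integration.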

[43] arXiv:2502.02132 (replaced) [pdf, other]
Title: How Memory in Optimization Algorithms Implicitly Modifies the Loss
Matias D. Cattaneo, Boris Shigida
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Optimization and Control (math.OC); Machine Learning (stat.ML)

In modern optimization methods used in deep learning, each update depends on the history of previous iterations, often referred to as memory, and this dependence decays fast as the iterates go further into the past. For example, gradient descent with momentum has exponentially decaying memory through exponentially averaged past gradients. We introduce a general technique for identifying a memoryless algorithm that approximates an optimization algorithm with memory. It is obtained by replacing all past iterates in the update by the current one, and then adding a correction term arising from memory (also a function of the current iterate). This correction term can be interpreted as a perturbation of the loss, and the nature of this perturbation can inform how memory implicitly (anti-)regularizes the optimization dynamics. As an application of our theory, we find that Lion does not have the kind of implicit anti-regularization induced by memory that AdamW does, providing a theory-based explanation for Lion's better generalization performance recently documented.
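
The first step of the technique, replacing all past iterates in the update by the current one, can be illustrated for gradient descent with momentum. The sketch below is an assumption-laden toy, not the paper's construction: it uses the heavy-ball form $v_{k+1} = \beta v_k + \nabla f(x_k)$, $x_{k+1} = x_k - \eta v_{k+1}$, for which freezing the gradient at the current iterate sums the geometric weights to an effective stepsize $\eta/(1-\beta)$. The memory-induced correction term discussed in the abstract is deliberately omitted, and the function names are hypothetical.

```python
def heavy_ball(grad, x0, lr, beta, steps):
    """Gradient descent with momentum (exponentially averaged past gradients)."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad(x)   # memory: geometric average of past gradients
        x = x - lr * v
    return x


def memoryless_gd(grad, x0, lr, beta, steps):
    """Leading-order memoryless surrogate: plain GD with stepsize lr / (1 - beta)."""
    x = x0
    effective_lr = lr / (1.0 - beta)
    for _ in range(steps):
        x = x - effective_lr * grad(x)
    return x


if __name__ == "__main__":
    grad = lambda x: x   # gradient of f(x) = 0.5 * x**2
    hb = heavy_ball(grad, 1.0, lr=1e-3, beta=0.9, steps=100)
    gd = memoryless_gd(grad, 1.0, lr=1e-3, beta=0.9, steps=100)
    print(f"heavy ball: {hb:.4f}   memoryless surrogate: {gd:.4f}")
```

On this quadratic the two trajectories stay close for a small stepsize; the remaining gap is the kind of discrepancy that a correction term would have to account for.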

[44] arXiv:2502.04582 (replaced) [pdf, other]
Title: The Mini Wheelbot: A Testbed for Learning-based Balancing, Flips, and Articulated Driving
Henrik Hose, Jan Weisgerber, Sebastian Trimpe
Subjects: Robotics (cs.RO); Systems and Control (eess.SY); Optimization and Control (math.OC)

The Mini Wheelbot is a balancing, reaction wheel unicycle robot designed as a testbed for learning-based control. It is an unstable system with highly nonlinear yaw dynamics, non-holonomic driving, and discrete contact switches in a small, powerful, and rugged form factor. The Mini Wheelbot can use its wheels to stand up from any initial orientation - enabling automatic environment resets in repetitive experiments and even challenging half flips. We illustrate the effectiveness of the Mini Wheelbot as a testbed by implementing two popular learning-based control algorithms. First, we showcase Bayesian optimization for tuning the balancing controller. Second, we use imitation learning from an expert nonlinear MPC that uses gyroscopic effects to reorient the robot and can track higher-level velocity and orientation commands. The latter allows the robot to drive around based on user commands - for the first time in this class of robots. The Mini Wheelbot is not only compelling for testing learning-based control algorithms, but it is also just fun to work with, as demonstrated in the video of our experiments.

[45] arXiv:2503.04620 (replaced) [pdf, other]
Title: Interpolation-based coordinate descent method for parameterized quantum circuits
Zhijian Lai, Jiang Hu, Taehee Ko, Jiayuan Wu, Dong An
Comments: 29+20 pages, 13 figures
Subjects: Quantum Physics (quant-ph); Optimization and Control (math.OC)

Parameterized quantum circuits (PQCs) are ubiquitous in the design of hybrid quantum-classical algorithms. In this work, we propose an interpolation-based coordinate descent (ICD) method to address the parameter optimization problem in PQCs. The ICD method provides a unified framework for existing structure optimization techniques such as Rotosolve, sequential minimal optimization, ExcitationSolve, and others. ICD employs interpolation to approximate the PQC cost function, effectively recovering its underlying trigonometric structure, and then performs an argmin update on a single parameter in each iteration. In contrast to previous studies on structure optimization, we determine the optimal interpolation nodes to mitigate statistical errors arising from quantum measurements. Moreover, in the common case of $r$ equidistant frequencies, we show that the optimal interpolation nodes are equidistant nodes with spacing $2\pi/(2r+1)$ (under constant variance assumption), and that our ICD method simultaneously minimizes the mean squared error, the condition number of the interpolation matrix, and the average variance of the approximated cost function. We perform numerical simulations and test on the MaxCut problem, the transverse field Ising model, and the XXZ model. Numerical results imply that our ICD method is more efficient than the commonly used gradient descent and random coordinate descent method.
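
A single coordinate update of this kind can be sketched for the simplest case $r = 1$, where the cost restricted to one parameter is assumed to have the form $c + a\cos\theta + b\sin\theta$. The sketch samples the cost at three equidistant nodes with spacing $2\pi/3$ (the abstract's $2\pi/(2r+1)$ with $r=1$), recovers the coefficients by trigonometric interpolation, and jumps to the exact minimizer. It assumes noise-free cost evaluations and a hypothetical function name, so it resembles a Rotosolve-style update rather than the full ICD method.

```python
import math


def single_frequency_update(cost):
    """Interpolate cost(theta) = c + a*cos(theta) + b*sin(theta) and return its argmin.

    Returns (theta_star, predicted minimum value).
    """
    n = 3  # 2r + 1 nodes for r = 1
    nodes = [2.0 * math.pi * k / n for k in range(n)]
    values = [cost(t) for t in nodes]
    # Discrete orthogonality of cos/sin on equidistant nodes gives closed forms.
    c = sum(values) / n
    a = 2.0 / n * sum(v * math.cos(t) for v, t in zip(values, nodes))
    b = 2.0 / n * sum(v * math.sin(t) for v, t in zip(values, nodes))
    # a*cos + b*sin = R*cos(theta - phi), minimized at theta = phi + pi.
    theta_star = (math.atan2(b, a) + math.pi) % (2.0 * math.pi)
    return theta_star, c - math.hypot(a, b)


if __name__ == "__main__":
    toy_cost = lambda t: 0.3 + 1.2 * math.cos(t) - 0.5 * math.sin(t)
    theta, fmin = single_frequency_update(toy_cost)
    print(theta, fmin, toy_cost(theta))
```

With measurement noise, the recovered coefficients become random, which is where the choice of interpolation nodes studied in the paper starts to matter.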

[46] arXiv:2509.17595 (replaced) [pdf, html, other]
Title: Impossibility Results of Card-Based Protocols via Mathematical Optimization
Shunnosuke Ikeda, Kazumasa Shinagawa
Subjects: Cryptography and Security (cs.CR); Optimization and Control (math.OC)

This paper introduces mathematical optimization as a new method for proving impossibility results in the field of card-based cryptography. While previous impossibility proofs were often limited to cases involving a small number of cards, this new approach establishes results that hold for a large number of cards. The research focuses on single-cut full-open (SCFO) protocols, which consist of performing one random cut and then revealing all cards. The main contribution is that for any three-variable Boolean function, no new SCFO protocols exist beyond those already known, under the condition that all additional cards have the same color. The significance of this work is that it provides a new framework for proving impossibility results and delivers a proof that is valid for any number of cards, as long as all additional cards have the same color.
