This repository contains the code for *Learning Multiple Initial Solutions to Optimization Problems* by Elad Sharony, Heng Yang, Tong Che, Marco Pavone, Shie Mannor, and Peter Karkus.
The main idea of MISO is to train a single neural network to predict multiple initial solutions to
an optimization problem, such that the initial solutions cover promising regions of the optimization
landscape, eventually allowing a local optimizer to find a solution close to the global optimum.
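The idea can be illustrated with a toy one-dimensional example. This is a hedged sketch: the objective, the gradient-descent optimizer, and the hand-picked initializations below are illustrative stand-ins, not the paper's models or solvers.

```python
def f(x):
    # Nonconvex objective: local minimum near x = +2, global minimum near x = -2.
    return (x**2 - 4.0)**2 + x

def local_optimize(f, x0, steps=200, lr=0.01):
    # Plain gradient descent stands in for the local optimizers used in the
    # paper (DDP, MPPI, iLQR); it only finds the minimum nearest its start.
    x = x0
    for _ in range(steps):
        grad = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6  # numerical gradient
        x -= lr * grad
    return x

# A single initialization can land in the wrong basin...
x_single = local_optimize(f, 3.0)   # converges near x = +2 (local minimum)

# ...while several spread-out initializations cover both basins, so the best
# resulting solution is (near-)globally optimal. The learned network's job is
# to propose such a covering set from the problem parameters.
inits = [-3.0, 0.5, 3.0]            # stand-in for the K network outputs
x_best = min((local_optimize(f, x0) for x0 in inits), key=f)
```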
To set up the project environment, follow these steps:
1. **Update Paths**: Run the `setup.py` script to configure necessary paths:

   ```shell
   python setup.py
   ```

2. **Install Dependencies**: Create and activate the Conda environment using `environment.yml`:

   ```shell
   conda env create -f environment.yml
   conda activate miso
   ```

3. **Set Up nuPlan**: If you plan to use the nuPlan environment:

   - **Install nuPlan Dataset**: Follow the nuPlan installation guide to set up the dataset.
   - **Install tuPlan Garage**: Install tuPlan Garage by following their getting started instructions.
   - **Configure Dataset Path**: Update `NUPLAN_ROOT` in `nuplan/config.py` to point to your nuPlan dataset root directory.
This codebase supports all three environments discussed in the paper (cart-pole, reacher, and autonomous driving), using different optimizers: DDP, MPPI, and iLQR.
We first run the warm-start heuristic and then use the problem instances to generate a dataset of (near-)optimal solutions using an oracle proxy:

1. **Closed-Loop with Warm-Start**

   ```shell
   python eval.py --env [env_name] --exp closed_loop --method warm_start --eval_set train
   ```

2. **Open-Loop with Oracle**

   ```shell
   python eval.py --env [env_name] --exp open_loop --method oracle --eval_set train
   ```

Replace `[env_name]` with `cartpole`, `reacher`, or `nuplan`.
After data collection, generate the dataset required for training:

```shell
python training/dataset.py --env [env_name]
```

The files generated are:

- **Dataset File**: `open_loop_oracle.pth`
- **Scaler File**: `open_loop_oracle_scaler.pkl` (contains standardization statistics)

These files are saved under the data directory of the corresponding environment, e.g., `${MISO_ROOT}/envs/[env_name]/data`.
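The scaler file's exact contents are repo-specific; as a hedged sketch, assuming it pickles per-dimension standardization statistics (mean and std), writing, reading, and applying it might look like this (the key names `mean`/`std` are an assumption):

```python
import os
import pickle
import tempfile

# Hypothetical scaler contents: per-dimension mean/std of the oracle solutions.
# The actual keys and layout of open_loop_oracle_scaler.pkl may differ.
stats = {"mean": [1.0, -2.0], "std": [2.0, 0.5]}

path = os.path.join(tempfile.mkdtemp(), "open_loop_oracle_scaler.pkl")
with open(path, "wb") as fh:
    pickle.dump(stats, fh)

with open(path, "rb") as fh:  # how training code would read it back
    scaler = pickle.load(fh)

x = [3.0, -2.5]  # a raw (unscaled) target
# Standardize for training, then invert the transform at inference time.
x_std = [(v - m) / s for v, m, s in zip(x, scaler["mean"], scaler["std"])]
x_raw = [v * s + m for v, m, s in zip(x_std, scaler["mean"], scaler["std"])]
```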
Train the MISO model with the specified parameters:

```shell
python train.py --env [env_name] --num_predictions 16 --miso_method [miso_method] --seed 0
```

`--miso_method` options:

- `miso-pd`: Penalize the pairwise distance between all outputs. The overall loss combines this dispersion-promoting term with the regression loss.
- `miso-wta`: Select the best-predicted output at training time and minimize the regression loss only for this specific prediction ("winner takes all").
- `miso-mix`: A combination of the previous two approaches that can further improve performance, as it provides a tunable measure of dispersion.
- `none`: A simple regression loss without a dispersion term. For $K>1$, the loss of each prediction is summed up.
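The `--miso_method` options above can be sketched in code. This is a schematic reconstruction, not the paper's exact formulas: the hinge-with-margin form of the dispersion term, the weight `lam`, and the margin value are assumptions made for illustration.

```python
import math

def regression_losses(preds, target):
    # Squared error of each of the K predictions to the oracle target.
    return [sum((p - t) ** 2 for p, t in zip(pred, target)) for pred in preds]

def dispersion_penalty(preds, margin=1.0):
    # Hinge on pairwise distances: pairs closer than `margin` are penalized,
    # pushing the K outputs apart (one plausible dispersion-promoting term).
    total = 0.0
    for i in range(len(preds)):
        for j in range(i + 1, len(preds)):
            total += max(0.0, margin - math.dist(preds[i], preds[j]))
    return total

def miso_loss(preds, target, method, lam=0.1):
    reg = regression_losses(preds, target)
    if method == "miso-pd":
        return sum(reg) + lam * dispersion_penalty(preds)  # all K + dispersion
    if method == "miso-wta":
        return min(reg)                                    # winner takes all
    if method == "miso-mix":
        return min(reg) + lam * dispersion_penalty(preds)  # WTA + dispersion
    return sum(reg)                                        # "none": summed regression

preds = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]  # K=3 spread-out predictions
target = [0.0, 0.0]                           # oracle solution
```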
Evaluate the trained model on the test set:

```shell
python eval.py --env [env_name] --exp closed_loop --method [miso_method] --optimizer_mode [optimizer_mode] --eval_set test
```

`--optimizer_mode` options: `single` or `multiple`.
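One plausible reading of the two modes (an assumption; check the repo's evaluation code for the exact selection rule): `single` warm-starts one optimizer run from the best-scoring prediction, while `multiple` runs the optimizer from every prediction and keeps the best result. Schematically:

```python
def solve(predictions, optimize, cost, mode="multiple"):
    # `optimize` is the local optimizer (DDP/MPPI/iLQR in the paper); `cost`
    # scores a candidate solution. Both are stand-ins here.
    if mode == "single":
        x0 = min(predictions, key=cost)  # pick one initialization, run once
        return optimize(x0)
    # "multiple": optimize from every prediction, keep the best outcome.
    return min((optimize(x0) for x0 in predictions), key=cost)

# Toy usage: quadratic cost, one exact Newton step as the "optimizer".
cost = lambda x: (x - 2.0) ** 2
optimize = lambda x0: x0 - (x0 - 2.0)  # Newton step on a quadratic lands at 2
best_multi = solve([-3.0, 0.0, 5.0], optimize, cost, mode="multiple")
best_single = solve([-3.0, 0.0, 5.0], optimize, cost, mode="single")
```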
- Each training session runs for 125 epochs, consistent with the settings in the paper.
- The model reads data from preconfigured paths specified in `training/configs/dataset.yaml`; these paths correspond to where the data is generated during the data preparation steps. No changes are needed unless you have customized the data directories.
While efforts have been made to ensure reproducibility, results may vary slightly from those reported in the paper, especially in the nuPlan environment with the PDM planner. However, the relative improvements between the different methods and baselines should remain consistent.
If you find this project useful in your research, please cite:
```bibtex
@article{sharony2024learningmultipleinitialsolutions,
  title={Learning Multiple Initial Solutions to Optimization Problems},
  author={Elad Sharony and Heng Yang and Tong Che and Marco Pavone and Shie Mannor and Peter Karkus},
  journal={arXiv preprint arXiv:2411.02158},
  year={2024},
}
```