Black-box optimization and machine learning

A principal challenge in optimization practice is how to optimize in the absence of an algebraic model of the system to be optimized. We are interested in problems for which algebraic models are (1) intractable to conventional optimization software (for instance, due to discontinuities, non-smoothness, or excessive computational cost of a function evaluation), or are (2) entirely unavailable (as in the case of many experimental systems and operating processes to be optimized). We refer to systems of this sort as ``black-box systems'' and assume that the black box can be queried through a simulation or experimental measurements that provide a system output for specified values of system inputs.

Our work in this area focuses on the development of methodologies that rely on statistical and machine learning techniques to handle experimental and simulation data in conjunction with deterministic optimization methods to aid decision-making and model-building for black-box systems.

The methods we develop have been applied to problems in bioinformatics, portfolio optimization, protein-ligand docking, carbon capture and sequestration, parameter tuning for optimization solvers, antenna optimization, optimization of polymerase chain reactions, protein structure alignment, and powder diffraction.

Derivative-free optimization

Obtaining derivative information for many complex and expensive simulations is impractical. To tackle such systems, we maintain a comprehensive library of existing derivative-free algorithms, and perform extensive studies of their performance in various domains. Results from a recent comparison of 22 software implementations are summarized here.

In addition, we are developing novel algorithms to address classes of derivative-free optimization problems while performing the fewest number of experiments.

Learning process models

We are developing a model generation methodology that uses derivative-based and derivative-free optimization alongside machine learning and statistical techniques to learn algebraic models of detailed simulations and experimental systems. Once a candidate set of models is defined, they are tested, exploited, and improved through the use of derivative-free solvers to adaptively sample new points. Of particular importance is the algorithm's ability to generate models that are simple yet accurate. We have implemented this strategy in ALAMO, a code for the Automatic Learning of Algebraic Models for Optimization.

Simulation optimization

An added complication in the case of discrete-event simulations is the inherent stochasticity associated with their outputs. Our goals in this area are to (1) compile and compare existing methods for simulation optimization on new problem testbeds, and to (2) provide algorithms and implementations that handle uncertainty in data in a robust manner while locating optimal solutions.

Related Publications

  1. Amaran, S., N. V. Sahinidis, B. Sharda, and S. J. Bury, Simulation optimization: A review of algorithms and applications, Annals of Operations Research, 240, 351–380, 2016.
  2. Rios, L. M. and N. V. Sahinidis, Derivative-free optimization: A review of algorithms and comparison of software implementations, Journal of Global Optimization, 56, 1247–1293, 2013.
  3. Shah, S. B. and N. V. Sahinidis, SAS-Pro: Simultaneous residue assignment and structure superposition for protein structure alignment, PLoS ONE, 7(5): e37493, 2012.