Optimization Research in AMCR
Optimization Algorithm Development
Hyperparameter Optimization for Deep Learning Models
Juliane Mueller: JulianeMueller@lbl.gov
We are developing efficient and effective optimization methods for tuning the hyperparameters of deep learning (DL) models (architectures defined by the number of layers, nodes, etc.). With the increasing interest in using AI and DL in DOE-relevant science applications for prediction, pattern recognition, and classification, the question arises of which DL model should be used. Our goal is to help scientists answer this question by automating the search over DL model hyperparameters. We have developed a surrogate-model-based optimizer that guides the iterative optimization search using active learning strategies. The optimizer can use either Gaussian process models or radial basis functions. We have demonstrated the performance of the method on a regression task whose goal is to predict groundwater levels in California (see Figure 1), and we have also applied it to CT image reconstruction (see Figure 2). In both cases, we obtained good agreement with the ground truth data.
Figure 1: Prediction of groundwater levels (red curves) obtained with a multilayer perceptron whose architecture was optimized with our surrogate model algorithm. The ground truth data (blue) from 2016-2018 was not used in training, validating, or optimizing the architecture of the model.
Figure 2: A DL model's architecture was optimized with the goal of finding an inpainted sinogram (right) that is as close as possible to the ground truth (left).
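The surrogate-model-based search described above can be sketched as an active learning loop: fit a surrogate to the configurations evaluated so far, use an acquisition function to pick the next configuration, evaluate it, and repeat. The sketch below uses a Gaussian process surrogate with an expected improvement acquisition; the toy validation-loss function and the candidate hyperparameter grid are illustrative assumptions, not the actual AMCR optimizer.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def validation_loss(x):
    # Toy stand-in for training a DL model and returning its validation loss;
    # x = (log-learning-rate, number-of-layers). Purely illustrative.
    return (x[0] + 3.0) ** 2 + 0.1 * (x[1] - 4.0) ** 2

rng = np.random.default_rng(0)
# Candidate hyperparameter configurations (log-lr in [-6, 0], 1-10 layers).
cands = np.column_stack([rng.uniform(-6, 0, 500), rng.integers(1, 11, 500)])

# Initial design: evaluate a handful of configurations.
X = cands[:5].copy()
y = np.array([validation_loss(x) for x in X])

for _ in range(15):
    # Fit the Gaussian process surrogate to all evaluations so far.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                  normalize_y=True)
    gp.fit(X, y)
    mu, sigma = gp.predict(cands, return_std=True)
    best = y.min()
    # Expected improvement (for minimization) as the active-learning criterion.
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (best - mu) / sigma
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    ei[sigma < 1e-12] = 0.0  # no improvement expected at sampled points
    x_next = cands[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, validation_loss(x_next))

print("best configuration found:", X[np.argmin(y)], "loss:", y.min())
```

In practice the expensive step is the call to `validation_loss`, i.e., training the DL model; the surrogate loop exists to keep the number of such calls small.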
Large-Dimensional Optimization
Juliane Mueller: JulianeMueller@lbl.gov
Many parameter tuning tasks, such as event generator tuning, involve many parameters whose effect on the outcome is not well understood. Within the scope of SciDAC-5 FASTMath Institute research, we are exploiting dimension reduction methods (sensitivity analysis, active subspaces) to enable efficient optimization.
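The active-subspace idea can be illustrated in a few lines: estimate the matrix C = E[∇f ∇fᵀ] from sampled gradients and keep its dominant eigenvectors, which span the directions along which the objective actually varies. The quadratic test function below is an assumption chosen so that the active subspace is one-dimensional by construction.

```python
import numpy as np

def f_grad(x):
    # Toy objective f(x) = (a.x)^2 depends on x only through the direction a,
    # so its active subspace is one-dimensional. Gradient: 2 (a.x) a.
    a = np.array([1.0, 2.0, 0.5, 0.0, 0.0])
    return 2.0 * (a @ x) * a

rng = np.random.default_rng(1)
d, n = 5, 200
grads = np.array([f_grad(rng.uniform(-1, 1, d)) for _ in range(n)])

# Monte Carlo estimate of the gradient outer-product matrix C.
C = grads.T @ grads / n
eigvals, eigvecs = np.linalg.eigh(C)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # sort descending

# A sharp eigenvalue gap after the first component reveals a 1-D active
# subspace; the optimization can then be restricted to that direction.
print("eigenvalues:", eigvals)
print("active direction:", eigvecs[:, 0])
```

For a genuinely high-dimensional tuning problem the gap is rarely this clean, but the same eigenvalue decay guides how many directions the reduced optimization needs to keep.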
Optimizers for Quantum Computing Applications
Juliane Mueller: JulianeMueller@lbl.gov
The variational quantum eigensolver (VQE) is a hybrid quantum-classical algorithm that combines classical optimizers with function evaluations on a quantum chip. The VQE algorithm minimizes the expectation value of a Hamiltonian (the energy of the system), whose evaluation on the quantum chip is computationally expensive. In many cases the energy landscape is multimodal, and the energy values are affected by noise. Gradient-based, stencil-based, and mesh-based search methods are therefore not suitable, because their iterative sampling decisions are impacted by the noise, which can lead to premature convergence. Within the AIDE-QC project, we are developing optimizers that are robust to this noise. We employ a combination of local and global searches based on Gaussian process models with white kernels to smooth the noise (see Figure 3), allowing us to find promising regions of the search space that contain the true optimum.
Figure 3: An interpolating Gaussian process model trained on noisy data (left) leads to a bumpy surface with noise-induced local minima. A noninterpolating Gaussian process model (right) trained on the same data leads to a smooth surface that captures the global shape and allows us to quickly find the vicinity of the true global optimum.
Collaborations with Other Divisions
Analysis of Solid-Liquid Interfaces with Standing Waves
With Osman Karslioglu, Hendrik Bluhm, Chuck Fadley, Mathias Gehlmann, Slavomír Nemšák
Analysis of solid-liquid interfaces can provide important insights into electrochemical devices such as batteries, fuel cells, and electrolyzers, as well as electrochemical processes such as corrosion.
We would like to study solid-liquid interfaces using X-ray photoelectron spectroscopy, which until recently was nearly impossible due to the strong interaction of electrons with matter. Using ultrathin liquid layers (10-20 nm) and hard X-rays, this goal has been achieved. However, the signal is integrated over the whole thickness of the liquid, which makes interpretation difficult.
In order to obtain “depth-resolved” information, we use X-ray standing waves* to create photoemission. The obtained signal then needs to be deconvoluted by an iterative trial-and-error process involving the simulation of experimental data using an X-ray optics simulation code.
* Spectroscopic techniques utilizing X-ray standing waves are useful for obtaining structural information at the atomic scale. The standing waves are formed due to diffraction of X-rays from a periodic structure. This structure can be the atomic lattice of a single crystal or the layers of a multilayered solid prepared in a laboratory; in our case it is the latter.
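The iterative fitting step can be sketched schematically: adjust sample parameters until a simulated rocking curve matches the measured one. The one-parameter cosine "simulator" below is a placeholder assumption standing in for the actual X-ray optics simulation code, and the parameters (depth phase, modulation amplitude) are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

def simulate_rocking_curve(theta, depth, amplitude):
    # Placeholder standing-wave intensity model: the photoemission yield
    # modulates with incidence angle, with a phase set by emitter depth.
    # A real calculation would come from the X-ray optics simulation code.
    return 1.0 + amplitude * np.cos(theta + depth)

theta = np.linspace(0, 2 * np.pi, 100)
true_params = (0.7, 0.4)  # hypothetical depth phase and amplitude
measured = simulate_rocking_curve(theta, *true_params)

def residuals(p):
    # Mismatch between the simulated curve and the measured rocking curve.
    return simulate_rocking_curve(theta, p[0], p[1]) - measured

# Least-squares fit automates the trial-and-error loop described above.
fit = least_squares(residuals, x0=[0.0, 0.2])
print("fitted depth/amplitude:", fit.x)
```

In the actual analysis each core-level rocking curve constrains the depth distribution of one chemical species, and the fit is over the full layered-sample configuration rather than two scalar parameters.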
This work has been published in the Journal of Electron Spectroscopy and Related Phenomena.
A comparison of the experimental rocking curves for various core-level intensities with theoretical calculations based on the optimized sample configuration defined in the Figure on the right.
The final depth distributions in the sample as derived by fitting experimental rocking curves to theory.
Figures by S. Nemšák
Current projects
As part of the Simulation Toolkit team, we are developing an optimization tool ("co-optimizer") that allows for multi-objective bilevel optimization of fuels and engine operating conditions, with the goal of maximizing efficiency and minimizing fuel costs. We are using statistical surrogate models trained on experimental data to guide future experimentation.
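The bilevel structure can be sketched as an outer search over fuel parameters with an inner optimization of engine operating conditions for each candidate fuel, and the two objectives combined by a weighted scalarization. The quadratic efficiency and cost models below are illustrative assumptions, not the project's trained surrogates.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def efficiency(fuel, speed):
    # Toy surrogate: efficiency peaks at an operating speed that
    # depends on the fuel parameter.
    return 1.0 - (speed - 0.5 * fuel) ** 2

def fuel_cost(fuel):
    # Toy cost model with a cheapest blend at fuel = 1.
    return 0.2 * (fuel - 1.0) ** 2

def outer_objective(fuel, w=0.8):
    # Inner level: find the best operating condition for this fuel.
    inner = minimize_scalar(lambda s: -efficiency(fuel, s),
                            bounds=(0, 2), method="bounded")
    best_eff = -inner.fun
    # Scalarized multi-objective: maximize efficiency, minimize cost.
    return -(w * best_eff - (1 - w) * fuel_cost(fuel))

# Outer level: search over the fuel parameter.
outer = minimize_scalar(outer_objective, bounds=(0, 4), method="bounded")
print("optimal fuel parameter:", outer.x)
```

In the co-optimizer setting the inner evaluations are expensive experiments or simulations, which is why surrogate models trained on experimental data replace direct evaluation at both levels.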
More to come. Check back soon.