# Schedule for: 18w5102 - Distributionally Robust Optimization

Beginning on Sunday, March 4 and ending Friday March 9, 2018

All times in Banff, Alberta time, MST (UTC-7).

Sunday, March 4 | |
---|---|

16:00 - 17:30 | Check-in begins at 16:00 on Sunday and is open 24 hours (Front Desk - Professional Development Centre) |

17:45 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

20:00 - 22:00 | Informal gathering (Corbett Hall Lounge (CH 5210)) |

Monday, March 5 | |
---|---|

07:00 - 08:45 |
Breakfast ↓ Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

08:45 - 09:00 | Introduction and Welcome by BIRS Station Manager (TCPL 201) |

09:00 - 09:35 |
Yinyu Ye: Distributionally Robust Stochastic and Online Optimization ↓ We present decision/optimization models/problems driven by uncertain and online data, and show how analytical models and computational algorithms can be used to achieve solution efficiency and near optimality. First, we describe recent applications of distributionally robust optimization in medical decision making. Secondly, we consider a common practice of estimating only marginal distributions and substituting the joint distribution by the independent (product) distribution in stochastic optimization, where we study the possible loss incurred by ignoring correlations and quantify that loss as the Price of Correlations (POC). Thirdly, we describe an online combinatorial auction problem using online linear programming technologies. We discuss near-optimal algorithms for solving this surprisingly general class of distribution-free online problems under the assumption of random order of arrivals and some conditions on the data and size of the problem. (TCPL 201) |

09:35 - 10:10 |
Melvyn Sim: A Tractable Format for Distributionally Robust Optimization ↓ We present a unified and tractable framework for distributionally robust optimization that can encompass a variety of statistical information including, among other things, constraints on expectations, scenario-wise expectations, the Wasserstein metric, and uncertain probabilities defined by $\phi$-divergence. To address a distributionally robust optimization problem with recourse, we introduce the scenario-wise linear decision rule, which is based on the classical linear decision rule and can also be applied in situations where the recourse decisions are discrete. Based on this format, we have also developed a new MATLAB-based algebraic modeling language to model and solve distributionally robust optimization problems with recourse.
This is joint work with Zhi Chen and Peng Xiong. (TCPL 201) |
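In notation common to this line of work (the symbols below are ours, not necessarily the speaker's), a moment-based distributionally robust problem has the generic form:

```latex
\min_{x \in \mathcal{X}} \; \sup_{\mathbb{P} \in \mathcal{F}} \;
\mathbb{E}_{\mathbb{P}}\big[ f(x, \tilde{z}) \big],
\qquad
\mathcal{F} = \Big\{ \mathbb{P} \;:\;
\mathbb{P}\big[ \tilde{z} \in \mathcal{W} \big] = 1,\;
\mathbb{E}_{\mathbb{P}}\big[ g(\tilde{z}) \big] \le \mu \Big\},
```

where the ambiguity set $\mathcal{F}$ may also be specified through scenario-wise expectations, a Wasserstein ball, or $\phi$-divergence constraints, as in the framework above.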

10:10 - 10:40 | Coffee Break (TCPL Foyer) |

10:40 - 11:15 |
Huifu Xu: Quantitative Stability Analysis in Distributionally Robust Optimization ↓ The ambiguity set is a key element in distributionally robust optimization models. Here we investigate the impact of perturbations of the ambiguity set on the optimal value and the optimal solutions. We consider the case where the ambiguity set is defined through generalized prior moment conditions and the perturbation is caused by (a) increasing the sample data to be used in the moment system and (b) discretization of the moment system. We quantify the perturbation against changes of sample data or refinement of the discretization and its impact on the underlying optimization problem. We also consider the case where the ambiguity set is constructed through a $\zeta$-ball and extend the analysis to a two-stage distributionally robust risk minimization problem and a distributionally robust chance constrained optimization problem. (TCPL 201) |

11:15 - 11:50 |
Krzysztof Postek: Distributionally Robust Optimization with SOS Polynomial Density Functions and Moment Conditions ↓ Numerous decision problems are solved using the tools of distributionally robust optimization. In this framework, the distribution of the problem's random parameter is assumed to be known only partially in the form of, for example, the values of its first moments. The aim is to minimize the expected value of a function of the decision variables, assuming the worst-possible realization of the unknown probability measure. In the general moment problem approach, the worst-case distributions are atomic. We propose to model smooth uncertain density functions using sum-of-squares polynomials with known moments over a given domain. We show that in this setup, one can evaluate the worst-case expected values of the functions of the decision variables in a computationally tractable way.
Joint work with Etienne de Klerk and Daniel Kuhn. (TCPL 201) |

11:50 - 12:25 |
JianQiang Cheng: Distributionally Robust Optimization with Principal Component Analysis ↓ In this talk, we propose a new approximation method to solve distributionally robust optimization problems with moment-based ambiguity sets. Our approximation method relies on principal component analysis (PCA) for optimal lower dimensional representation of variability in random samples. We show that the PCA approximation yields a relaxation of the original problem and derive theoretical bounds on the gap between the original problem and its PCA approximation. Furthermore, an extensive numerical study shows the strength of the proposed approximation method in terms of solution quality and runtime. (TCPL 201) |

12:25 - 12:35 |
Group Photo ↓ Meet in foyer of TCPL to participate in the BIRS group photo. The photograph will be taken outdoors, so dress appropriately for the weather. Please don't be late, or you might not be in the official group photo! (TCPL Foyer) |

12:35 - 13:45 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

13:45 - 14:20 |
Tito Homem-de-Mello: Distributionally Robust Newsvendor Problems with Variation Distance ↓ We use distributionally robust stochastic programs (DRSP) to model a general class of newsvendor problems where the underlying demand distribution is unknown, and so the goal is to find an order quantity that minimizes the worst-case expected cost among an ambiguity set of distributions. The ambiguity set consists of those distributions that are not far---in the sense of the so-called variation distance---from a nominal distribution. The maximum distance allowed in the ambiguity set (called the level of robustness) places the DRSP between the ``classical'' stochastic programming and robust optimization models, which correspond to setting the level of robustness to zero and infinity, respectively. The structure of the newsvendor problem allows us to analyze the problem from multiple perspectives: First, we derive explicit formulas and properties of the optimal solution as a function of the level of robustness. Moreover, we determine the regions of demand that are critical (in a precise sense) to optimal cost from the viewpoint of a risk-averse decision maker. Finally, we establish quantitative relationships between the distributionally robust model and the corresponding risk-neutral and classical robust optimization models, which include the price of optimism/pessimism, and the nominal/worst-case regrets, among others. Our analyses can help the decision maker better understand the role of demand uncertainty in the problem and can guide him/her to choose an appropriate level of robustness. We illustrate our results with numerical experiments on a variety of newsvendor problems with different characteristics. (TCPL 201) |
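As a toy illustration of the kind of worst-case computation involved (the function name and the convention TV$(q,p) = \tfrac{1}{2}\sum_i |q_i - p_i|$ are our own assumptions; the talk derives explicit formulas rather than solving LPs), the inner maximization over a variation-distance ball around a finite-support nominal distribution can be written as a small linear program:

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_expectation(f, p, r):
    """Largest expectation of scenario costs f over all distributions q
    on the same finite support with total-variation distance at most r
    from the nominal distribution p, where TV(q, p) = 0.5 * sum|q - p|.
    Solved as an LP in variables (q, s) with s_i >= |q_i - p_i|."""
    f = np.asarray(f, dtype=float)
    p = np.asarray(p, dtype=float)
    n = len(f)
    c = np.concatenate([-f, np.zeros(n)])               # maximize f'q
    A_ub = np.vstack([
        np.hstack([np.eye(n), -np.eye(n)]),             #  q_i - s_i <= p_i
        np.hstack([-np.eye(n), -np.eye(n)]),            # -q_i - s_i <= -p_i
        np.concatenate([np.zeros(n), np.ones(n)])[None, :],  # sum s <= 2r
    ])
    b_ub = np.concatenate([p, -p, [2.0 * r]])
    A_eq = np.concatenate([np.ones(n), np.zeros(n)])[None, :]  # sum q = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (2 * n))
    return -res.fun
```

For example, with two equally likely scenarios of cost 0 and 1 and radius 0.2, the worst case shifts 0.2 of probability mass to the costlier scenario, giving an expected cost of 0.7; radius 0 recovers the nominal expectation and a large radius recovers the robust (maximum-cost) model, matching the interpolation described above.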

14:20 - 14:55 |
Jonathan Yu-Meng Li: Worst-Case Law Invariant Risk Measures and Distributions: The Case of Nonlinear DRO ↓ The class of law invariant coherent risk measures contains many risk measures that one would encounter in a distributional setting. In this talk, we present some general results about worst-case law invariant risk measures where a set of distributions sharing the same first few moments are considered for estimating the worst possible risk. In particular, its distributionally robust optimization (DRO) formulation is generally nonlinear in the distribution and thus requires additional care in studying its tractability. We show cases where worst-case risk measures and distributions admit closed-form expressions and discuss their implication for future research. Our analysis exploits the structure of spectral risk measures and their connection to law invariant risk measures. (TCPL 201) |

14:55 - 15:30 |
Ruiwei Jiang: Ambiguous Risk Constraints with Moment and Structural Information ↓ Optimization problems face random constraint violations when uncertainty arises in constraint parameters. Effective ways of controlling such violations include risk constraints, e.g., chance constraints and conditional Value-at-Risk (CVaR) constraints. This talk discusses risk constraints when the distributional information of the uncertain parameters consists of moment information (e.g., mean, covariance, support) and certain structural information, for which we mention two specific examples: logconcavity and dominance on the tail. We find that the ambiguous risk constraints in these settings can be recast or approximated using conic constraints that facilitate computation. Finally, we demonstrate the theoretical results via case studies on power system operation and appointment scheduling. (TCPL 201) |
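As a reminder of the objects involved (these formulas are standard and not taken from the talk), an ambiguous chance constraint and its CVaR counterpart over an ambiguity set $\mathcal{F}$ can be written as:

```latex
\inf_{\mathbb{P} \in \mathcal{F}} \mathbb{P}\big\{ a(x)^\top \tilde{\xi} \le b(x) \big\} \ge 1 - \epsilon,
\qquad
\sup_{\mathbb{P} \in \mathcal{F}} \mathrm{CVaR}_{\epsilon}^{\mathbb{P}}\big( a(x)^\top \tilde{\xi} - b(x) \big) \le 0,
```

where $\mathrm{CVaR}_{\epsilon}(L) = \min_{t \in \mathbb{R}} \big\{ t + \epsilon^{-1}\, \mathbb{E}\big[ (L - t)^+ \big] \big\}$; the CVaR constraint is a convex conservative approximation of the chance constraint, which is one reason both appear together in this line of work.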

15:30 - 16:00 | Coffee Break (TCPL Foyer) |

16:00 - 16:35 |
Chung Piaw Teo: Disruption Risk Mitigation in Supply Chains - The Risk Exposure Index Revisited ↓ Simchi-Levi et al. (2014, 2015a) proposed a novel approach using the Time-To-Recover (TTR) parameters to analyze the Risk Exposure Index (REI) of supply chains under disruption. This approach is able to capture the cascading effects of disruptions in the supply chains, albeit in simplified environments – TTRs are deterministic, and at most one node in the supply chain can be disrupted. In this paper, we propose a new method to integrate probabilistic assessment of disruption risks into the REI approach, and measure supply chain resiliency by analyzing the Worst-case CVaR (WCVaR) of total lost sales under disruptions. We show that the optimal strategic inventory positioning strategy in this model can be fully characterized by a conic program. We identify appropriate cuts that can be added to the formulation to ensure zero duality gap in the conic program. In this way, the optimal primal and dual solutions to the conic program can be used to shed light on comparative statics in the supply chain risk mitigation problem. This information can help supply chain risk managers focus their mitigation efforts on critical suppliers and/or installations that will have greater impact on the performance of the supply chain when disrupted.
This is joint work with Sarah Gao (SMU), Zhenzhen Yan (NUS) and David Simchi-Levi (MIT). (TCPL 201) |

16:35 - 17:10 |
Selin Damla Ahipasaoglu: Flexibility of Distributionally Robust Choice Models in Traffic Equilibrium ↓ Traffic equilibrium models are fundamental to the analysis of transportation systems. The stochastic user equilibrium (SUE) model relaxes the perfect information assumption of the deterministic user equilibrium. SUE models predict traffic equilibrium flow assuming that users choose their perceived maximum utility paths (or perceived shortest paths) while accounting for the effects of congestion that arise due to users sharing links. Inspired by recent work on distributionally robust optimization, we develop two new user equilibrium models. The CMM-SUE model uses the mean and covariance information on path utilities but does not assume a particular form for the distribution. In the MDM-SUE model, the marginal distributions of the path utilities are specified, but the joint distribution is not. Robustness to distributional assumptions is obtained by minimizing the worst-case expected cost over all distributions with the first two moments fixed for the CMM model and over all distributions with the given marginals for the MDM model. We show that under mild conditions, both equilibria exist and are unique. We provide convex formulations for both and develop customized algorithms to calculate the equilibrium flows. Preliminary computational results indicate that CMM-SUE provides a practical alternative to the well-known MNP-SUE (Multinomial Probit-Stochastic User Equilibrium) model that requires distributional (normality) assumptions to model correlation effects from overlapping paths. For specific choices of marginal distributions, the MDM-SUE model recreates the optimization formulation of logit SUE and weibit SUE. Moreover, the model is flexible since it can capture perception variance scaling at the route level and allows for modeling different user preferences by allowing for skewed distributions and heavy tailed distributions. (TCPL 201) |

17:10 - 17:45 |
Phebe Vayanos: Robust Multiclass Queuing Theory for Wait Time Estimation in Resource Allocation Systems ↓ In this paper we study systems that allocate different types of scarce resources to heterogeneous allocatees based on predetermined priority rules, e.g., the U.S. deceased-donor kidney allocation system or the public housing program. We tackle the problem of estimating the wait time of an allocatee who possesses incomplete system information with regard, for example, to his relative priority, other allocatees’ preferences, and resource availability. We model such systems as multiclass, multiserver queuing systems that are potentially unstable or in transient regime. We propose a novel robust optimization solution methodology that builds on the assignment problem. For first-come, first-served systems, our approach yields a mixed-integer programming formulation. For the important case where there is a hierarchy in the resource types, we strengthen our formulation through a drastic variable reduction and also propose a highly scalable heuristic, involving only the solution of a convex optimization problem (usually a second-order cone problem). We back the heuristic with an approximation guarantee that becomes tighter for larger problem sizes. We illustrate the generalizability of our approach by studying systems that operate under different priority rules, such as class priority. Numerical studies demonstrate that our approach outperforms simulation.
We showcase how our methodology can be applied to assist patients in the U.S. deceased-donor kidney waitlist. We calibrate our model using historical data to estimate patients’ wait times based on their kidney quality preferences, blood type, location and rank in the waitlist.
This is joint work with Chaitanya Bandi and Nikolaos Trichakis. (TCPL 201) |

17:45 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

Tuesday, March 6 | |
---|---|

07:00 - 09:00 | Breakfast (Vistas Dining Room) |

09:00 - 09:35 |
Dimitris Bertsimas: From Data to Decisions ↓ In this talk, I review some developments in my research group at MIT regarding taking decisions directly from data.
Starting with my paper with Nathan Kallus, ``From Predictive to Prescriptive Analytics'', first written in 2014, which presents a framework for extending predictive machine learning algorithms of considerable generality to prescriptive ones for two-stage problems, we discuss a number of exciting new developments:
With Chris McCord, in the paper ``From Predictions to Prescriptions in Multistage Optimization Problems'', written in 2017, we extend the earlier framework to multistage problems and also prove rates of convergence.
With Bart van Parys, in the paper ``Bootstrap Robust Prescriptive Analytics'', written in 2017, we provide an approach to make our prescriptive approaches immune to overfitting phenomena.
With Nihal Koduri, in the paper ``Data-Driven Optimization: A Kernel Regression Approach'', written in 2018, we provide non-parametric methods that outperform earlier approaches for two-stage problems. (TCPL 201) |

09:35 - 10:10 |
Anton Kleywegt: Distributional Robustness and Regularization in Statistical Learning ↓ A central problem in statistical learning is to design prediction algorithms that not only perform well on training data, but also perform well on new and unseen, but similar, data. We approach this problem by formulating a distributionally robust stochastic optimization (DRSO) problem, which seeks a solution that minimizes the worst-case expected loss over a family of distributions that are close to the empirical distribution as measured by the Wasserstein distance. We establish a connection between such Wasserstein DRSO and regularization. Specifically, we identify a broad class of loss functions, for which the Wasserstein DRSO is asymptotically equivalent to a regularization problem with a gradient-norm penalty. This relation provides a new interpretation for approaches that use regularization, including a variety of statistical learning problems and discrete choice models. The connection also suggests a principled way to regularize high-dimensional, non-convex problems, which is demonstrated with the training of Wasserstein generative adversarial networks (WGANs) in deep learning.
This is joint work with Rui Gao and Xi Chen. (TCPL 201) |
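Schematically (in our notation; the precise statement depends on the loss class and on the transport cost underlying the Wasserstein distance), the asymptotic equivalence described above reads:

```latex
\min_{x} \ \sup_{\mathbb{Q} \,:\, W(\mathbb{Q}, \hat{\mathbb{P}}_n) \le \rho}
\mathbb{E}_{\mathbb{Q}}\big[ \ell(x, \xi) \big]
\ \approx\
\min_{x} \ \mathbb{E}_{\hat{\mathbb{P}}_n}\big[ \ell(x, \xi) \big]
+ \rho\, \mathbb{E}_{\hat{\mathbb{P}}_n}\big[ \| \nabla_{\xi} \ell(x, \xi) \|_{*} \big],
```

where $\hat{\mathbb{P}}_n$ is the empirical distribution, $\rho$ is the radius of the Wasserstein ball, and $\|\cdot\|_*$ is the dual norm; the radius plays the role of the regularization parameter.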

10:10 - 10:40 | Coffee Break (TCPL Foyer) |

10:40 - 11:15 |
Andrew Lim: Calibration of Distributionally Robust Empirical Optimization Problems ↓ We study the out-of-sample properties of robust empirical optimization and develop a theory for data-driven calibration of the ``robustness parameter'' for worst-case maximization problems with concave reward functions. Building on the intuition that robust optimization reduces the sensitivity of the expected reward to errors in the model by controlling the spread of the reward distribution, we show that the first-order benefit of a ``little bit of robustness'' is a significant reduction in the variance of the out-of-sample reward while the corresponding impact on the mean is almost an order of magnitude smaller. One implication is that a substantial reduction in the variance of the out-of-sample reward (i.e. sensitivity of the expected reward to model misspecification) is possible at little cost if the robustness parameter is properly calibrated. To this end, we introduce the notion of a robust mean-variance frontier to select the robustness parameter and show that it can be approximated using resampling methods like the bootstrap. Our examples show that robust solutions resulting from ``open loop'' calibration methods (e.g. selecting a 90% confidence level regardless of the data and objective function) can be very conservative out-of-sample, while those corresponding to the ambiguity parameter that optimizes an estimate of the out-of-sample expected reward (e.g. via the bootstrap) with no regard for the variance are often insufficiently robust.
This is joint work with Jun-ya Gotoh and Michael J. Kim. (TCPL 201) |

11:15 - 11:50 |
Vishal Gupta: Small-Data, Large-Scale Linear Optimization ↓ Optimization applications often depend upon a huge number of uncertain parameters. In many contexts, however, the amount of relevant data per parameter is small, and hence, we may have only imprecise estimates. We term this setting -- where the number of uncertainties is large, but all estimates have fixed and low precision -- the ``small-data, large-scale regime.'' We formalize a model for this regime, focusing on linear programs with uncertain objective coefficients, and prove that the small-data, large-scale regime is distinct from the traditional large-sample regime. Consequently, methods like sample average approximation, data-driven robust optimization, regularization, and ``estimate-then-optimize'' policies can perform poorly. We propose a novel framework that, given a policy class, identifies an asymptotically best-in-class policy, where the asymptotics hold as the number of uncertain parameters grows large, but the amount of data per uncertainty (and hence the estimate's precision) remains small. We apply our approach to two natural policy classes for this problem: the first inspired by the empirical Bayes literature in statistics and the second by the regularization literature in optimization and machine learning. In both cases, the sub-optimality gap between our proposed method and the best-in-class policy decays exponentially fast in the number of uncertain parameters, even for a fixed amount of data. We also show that in the usual large-sample regime our policies are comparable to the sample average approximation. Thus, our policies retain the strong large-sample performance of traditional methods, and additionally enjoy provably strong performance in the small-data, large-scale regime. Numerical experiments confirm the significant benefits of our methods.
Joint work with Prof. Paat Rusmevichientong, USC Marshall. (TCPL 201) |

11:50 - 12:25 |
Karthyek R. A. Murthy: Distributionally Robust Optimization with Optimal Transport Distances: Some Statistical and Algorithmic Advances ↓ The objective of this talk is to showcase optimal transport distances, which include the well-known Wasserstein distances as a special case, as an attractive tool to model distributional ambiguities when performing optimization under uncertainty. Unlike traditional stochastic optimization methods where we look for optimal decision choices assuming a fixed probability model, here we look for decision choices that perform uniformly well for any choice of probability distribution that is within a fixed radius (quantified by optimal transport distance) from a baseline model. Interestingly, we show that various popular machine learning algorithms that employ regularization can be recovered as particular examples of this optimal transport based DRO approach.
Drawing inspiration from empirical likelihood theory in Statistics, we develop a systematic approach to choose the radius of ambiguity sets in data-driven optimization settings. Along with beating the curse of dimensionality that is associated with the traditional approaches based on concentration inequalities, the proposed choice of radius also allows us to systematically choose regularization parameters in large scale machine learning problems without having to use the brute-force tuning approach of cross-validation.
We also develop a stochastic gradient descent based iterative scheme to solve the proposed DRO formulation for a large class of problems involving affine decision rules. Interestingly, the proposed computational algorithm is at least ``as fast'' as the non-robust sample average approximation, thus making optimal transport based DRO scalable and attractive for computational purposes. (TCPL 201) |

12:25 - 13:45 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

14:00 - 14:35 |
Bart Paul Gerard Van Parys: Bootstrap Robust Prescriptive Analytics ↓ We discuss prescribing optimal decisions in a framework where their cost depends on uncertain problem parameters that need to be learned from supervised data. Prescriptive analytics consists in making optimal decisions specific to a particular covariate context. Prescriptive methods need to factor in additional observed contextual information on a potentially large number of covariates, as opposed to static decision methods that only use sample data. Any naive use of training data may, however, lead to gullible decisions over-calibrated to one particular data set. In this presentation, we use ideas from distributionally robust optimization and the statistical bootstrap to propose two novel prescriptive methods based on (nw) Nadaraya-Watson and (nn) nearest neighbors learning which safeguard against overfitting and lead to improved out-of-sample performance. Both resulting robust prescriptive methods reduce to tractable convex optimization problems and enjoy a limited disappointment on bootstrap data. We illustrate the data-driven decision-making framework and our novel robustness notion on a small newsvendor problem as well as a small portfolio allocation problem. (TCPL 201) |

14:35 - 15:10 |
Henry Lam: Parameter Calibration for Optimization under Uncertainty ↓ Optimization formulations to handle decision-making under uncertainty often contain parameters needed to be calibrated from data. Examples include uncertainty set sizes in robust optimization, and Monte Carlo sample sizes in constraint sampling or scenario generation. We investigate strategies to select good parameter values based on data splitting and the validation of their performances in terms of feasibility and optimality. We analyze the effectiveness of these strategies in relation to the complexity of the optimization class and problem dimension. (TCPL 201) |

15:10 - 15:45 |
Nathan Kallus: Distributionally Robust Optimization for Learning Causal-Effect-Maximizing Policies ↓ Policy learning from observational data seeks to extract personalized interventions from passive interaction data to maximize causal effects. The aim is to transform electronic health records to personalized treatment regimes, transactional records to personalized pricing strategies, and click-streams to personalized advertising campaigns. The task is made difficult by the observational nature of the data: only outcomes of the interventions performed are available, and the distributions of units exposed to one intervention or another differ systematically. In such a purely observational setting, existing methods adapted from experimental settings tenuously rely on unstable plug-in approaches and heuristic stopgaps to address ensuing complications. In this talk I will describe a new approach based on distributionally robust optimization that overcomes these failures and its application to personalized medicine. By showing that estimation error reduces to the discrepancy in a moment of a particular unknown function, the approach relies on protecting against any possible realization thereof. On the one hand, this leads to unparalleled finite-sample performance, as demonstrated by experiments. On the other hand, theoretical results show that the asymptotic optimality and convergence rates of plug-in approaches are preserved. Time permitting, I will also outline advances in handling continuous treatments and in representation learning for causal inference using deep neural networks. (TCPL 201) |

15:30 - 16:00 | Coffee Break (TCPL Foyer) |

16:20 - 16:55 |
Sanjay Mehrotra: Decomposition Methods For Solving Distributionally Robust Two-Stage Stochastic Integer Programs ↓ We introduce and study a two-stage distributionally robust mixed binary problem (TSDR-MBP) where the random parameters follow the worst-case distribution belonging to an uncertainty set of probability distributions. We present a decomposition algorithm, which utilizes a distribution separation procedure and parametric cuts within Benders’ algorithm or the L-shaped method, to solve TSDR-MBPs with binary variables in the first stage and mixed binary programs in the second stage. We refer to this algorithm as the distributionally robust integer (DRI) L-shaped algorithm. Using a similar decomposition framework, we provide another algorithm to solve the TSDR linear problem where both stages have only continuous variables. We investigate conditions and the families of ambiguity sets for which our algorithms are finitely convergent. We present two examples of ambiguity sets, defined using moment matching or the Kantorovich-Rubinstein distance (Wasserstein metric), which satisfy the foregoing conditions. We also present a cutting surface algorithm to solve TSDR-MBPs. We computationally evaluate the performance of the DRI L-shaped algorithm and the cutting surface algorithm in solving distributionally robust versions of a few instances from the Stochastic Integer Programming Library, in particular stochastic server location and stochastic multiple binary knapsack problem instances. We also discuss the usefulness of incorporating partial distribution information in two-stage stochastic optimization problems. (TCPL 201) |

16:55 - 17:30 |
John Gunnar Carlsson: Wasserstein Distance and the Distributionally Robust TSP ↓ Motivated by a districting problem in multi-vehicle routing, we consider a distributionally robust version of the Euclidean travelling salesman problem in which we compute the worst-case spatial distribution of demand against all distributions whose Wasserstein distance to an observed demand distribution is bounded from above. This constraint allows us to circumvent common overestimation that arises when other procedures are used, such as fixing the center of mass and the covariance matrix of the distribution. Numerical experiments confirm that our new approach is useful when used in a decision support tool for dividing a territory into service districts for a fleet of vehicles when limited data is available. (TCPL 201) |

17:30 - 18:05 |
Dan Iancu: Discrete Convexity and Dynamic Robust Optimization ↓ We discuss necessary and sufficient conditions for the optimality of specific classes of policies in dynamic robust optimization. We then focus on the specific case of affine policies, and show how our conditions can be used to recover and generalize several existing results in the literature. Our treatment draws interesting connections with the theory of discrete convexity (L-natural / M-natural convexity and multimodularity) and global concave envelopes, which may be of independent interest. Time permitting, we also discuss some related applications of the results in the context of a learning and stopping problem.
This is joint work with Yehua Wei. (TCPL 201) |

17:45 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

Wednesday, March 7 | |
---|---|

07:00 - 09:00 | Breakfast (Vistas Dining Room) |

09:00 - 09:35 |
Georg Pflug: Distributional Robustness for Multiperiod Stochastic Programs ↓ We introduce ambiguity sets based on the nested distance for stochastic processes. We show how these sets can be constructed as nonparametric confidence sets from statistical observations with a prescribed confidence level. For the problem of finding the acceptability price for a contingent claim under model ambiguity, we derive a duality result and show the relationship between the size of the ambiguity set and the distributionally robust price. Since we consider both the optimal ask and the optimal bid prices, we can relate the bid-ask spread to the size of the ambiguity set and indirectly to the sample size of the observations. It turns out that lowering the acceptance level decreases the bid-ask spread, while increasing the model uncertainty increases this spread. We do not assume a complete market situation. We may also report on similar results for an ambiguity model for the multiperiod optimal management problem of hydrostorage electricity production. (TCPL 201) |

09:35 - 10:10 |
Huan Xu: Practicable Robust Markov Decision Processes ↓ Markov decision processes (MDPs) are a standard modeling tool for sequential decision making in a dynamic and stochastic environment. When the model parameters are subject to uncertainty, the "optimal strategy" obtained from an MDP can significantly underperform relative to the model's prediction. To address this, the robust MDP has been developed, which is based on worst-case analysis. However, several restrictions of the robust MDP model prevent it from practical success, which I will address in this talk. The first restriction of the standard robust MDP is that the modeling of uncertainty is not flexible and can lead to conservative solutions. In particular, it requires that the uncertainty set is "rectangular" - i.e., it is a Cartesian product of uncertainty sets of each state. To lift this assumption, we propose an uncertainty model which we call “k-rectangular” that generalizes the concept of rectangularity, and we show that this can be solved efficiently via state augmentation. The second restriction is that it does not take into account the learning issue - i.e., how to adapt the model in an efficient way to reduce the uncertainty. To address this, we devise an algorithm inspired by reinforcement learning that, without knowing the true uncertainty model, is able to adapt its level of protection to uncertainty, and in the long run performs as well as the minimax policy as if the true uncertainty model were known. Indeed, the algorithm achieves similar regret bounds as a standard MDP where no parameter is adversarial, which shows that with virtually no extra cost we can adapt robust learning to handle uncertainty in MDPs. (TCPL 201) |

10:10 - 10:25 | Coffee Break (TCPL Foyer) |

10:25 - 11:00 |
Angelos Georghiou: Robust Dual Dynamic Programming ↓ We propose a Robust Dual Dynamic Programming (RDDP) scheme for multi-stage robust optimization problems. The RDDP scheme takes advantage of the decomposable nature of these problems by bounding the costs arising in the future stages through inner and outer approximations. In contrast to Stochastic Dual Dynamic Programming, we refine the approximations using our inner approximations as a device to determine the points of refinement. We prove that RDDP converges deterministically in finite time. We demonstrate the promising performance of our algorithm on stylized instances of inventory management problems. (TCPL 201) |

11:00 - 11:35 |
Jin Qi: Distributionally Robust Adaptive Decision Making in Inventory Routing Problems ↓ We study a finite-horizon stochastic inventory routing problem. The supplier acts as a central planner who determines the replenishment quantities as well as the times and routes to all retailers. We allow ambiguity in the probability distribution of each retailer's uncertain demand. We take a service-level point of view and minimize the risk of the uncertain inventory levels violating a pre-specified range. To quantify this risk, we propose a decision criterion that takes into account both the frequency and the magnitude of violations of the inventory requirement. The solutions are fully adaptive at each period and vary with the realizations of uncertain demand. We provide algorithms to solve the problem exactly and compare the performance of our solutions with several benchmarks to show their benefits. (TCPL 201) |

11:35 - 12:30 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

13:45 - 17:45 | Free Afternoon (Banff National Park) |

17:45 - 19:30 |
Dinner ↓ |

Thursday, March 8 | |
---|---|

07:00 - 09:00 | Breakfast (Vistas Dining Room) |

09:00 - 09:35 |
Dick den Hertog: Is DRO the Only Approach for Optimization Problems with Convex Uncertainty? ↓ Uncertain constraints with convex uncertainty are in general difficult to tackle with "normal" RO. However, several DRO approaches are well suited for such cases. In this talk we first discuss two such DRO approaches. Then, as a non-DRO alternative, we propose an RO method to obtain approximate solutions for some problems with convex uncertainty and a polyhedral uncertainty region. For example, an uncertain SOC constraint with polyhedral uncertainty is reformulated as an adjustable robust linear optimization problem with an ellipsoidal uncertainty region, for which linear and non-linear decision rules can be used to obtain approximate solutions. For two numerical examples, linear decision rules already lead to (near-)optimal solutions. (TCPL 201) |

09:35 - 10:10 |
Patrick Jaillet: Satisficing Models to Mitigate Uncertainty ↓ Satisficing, as an approach to decision-making under uncertainty, aims at achieving solutions that satisfy the problem's constraints as well as possible. Mathematical optimization problems that are related to this form of decision-making include the P-model of Charnes and Cooper (1963), where satisficing is the objective, as well as chance-constrained and robust optimization problems, where satisficing is articulated in the constraints. In this paper, we first propose the R-model, where satisficing is the objective, and where the problem consists in finding the most "robust" solution, feasible in the problem's constraints when uncertain outcomes arise over a maximally sized uncertainty set. We then study the key features of satisficing decision making that are associated with these problems and provide the complete functional characterization of a satisficing decision criterion. As a consequence, we are able to provide the most general framework of a satisficing model, termed the S-model, which seeks to maximize a satisficing decision criterion in its objective, and the corresponding satisficing-constrained optimization problem that generalizes robust optimization and chance-constrained optimization problems. Next, we focus on a tractable probabilistic S-model, termed the T-model, whose objective is a lower bound of the P-model. We show that when probability densities of the uncertainties are log-concave, the T-model can admit a tractable concave objective function. In the case of discrete probability distributions, the T-model is a linear mixed integer program of moderate dimensions. We also show how the T-model can be extended to multi-stage decision-making and present the conditions under which the problem is computationally tractable. 
Our computational experiments on a stochastic maximum coverage problem strongly suggest that the T-model solutions can be highly effective, thus allaying misconceptions of having to pay a high price for the satisficing models in terms of solution conservativeness. This is joint work with Sanjay Dominik Jena, Tsan Sheng Ng, and Melvyn Sim. (TCPL 201) |

10:10 - 10:40 | Coffee Break (TCPL Foyer) |

10:40 - 11:15 |
Vineet Goyal: On the Power of Affine Policies in Two-Stage Adjustable Robust Optimization ↓ Affine policies are widely used as a solution approach in dynamic optimization, where computing an optimal adjustable solution is usually intractable. While the worst-case performance of affine policies can be significantly bad, their empirical performance is observed to be near-optimal for a large class of problem instances. For instance, in the two-stage dynamic robust optimization problem with linear covering constraints and uncertain right-hand side, the worst-case approximation bound for affine policies is O(√m), which is also tight (see Bertsimas and Goyal [8]), whereas the observed empirical performance is near-optimal. This work aims to address this stark contrast between the worst-case and the empirical performance of affine policies.
We show that affine policies are provably a good approximation for the two-stage adjustable robust optimization problem with high probability on random instances where the constraint coefficients are generated i.i.d. from a large class of distributions, thereby providing a theoretical justification of the observed empirical performance. We also consider the performance of affine policies for an important class of uncertainty sets, namely budget of uncertainty sets and intersections of budget of uncertainty sets. We show that, surprisingly, affine policies provide nearly the best possible approximation for this class of uncertainty sets, matching the hardness of approximation and thereby further confirming the power of affine policies.
This talk is based on joint work with my student Omar El Housni. (TCPL 201) |

11:15 - 11:50 |
Anthony Man-Cho So: On the Approximation Guarantee for a Semidefinite Relaxation of a Class of Robust Quadratic Optimization Problems ↓ We consider a class of robust quadratic optimization problems that arise in various applications in signal processing and wireless communications. Although the class of problems under consideration is NP-hard in general, by applying the well-known lifting technique and S-lemma, one can obtain a semidefinite relaxation that yields an approximate solution to the original problem in polynomial time. However, so far there is no approximation guarantee for such a semidefinite relaxation. In fact, despite the many available approximation results for semidefinite relaxations of quadratically constrained quadratic optimization problems, none of them apply to the setting where robust constraints are present. In this talk, we present the first approximation guarantee for the aforementioned class of robust quadratic optimization problems. The key to establishing such a guarantee is the so-called epsilon-net technique from functional analysis, which allows us to handle the semi-infinite robust quadratic constraints in the problem. If time permits, we will illustrate our result via the problem of robust beamforming with cognitive radio constraints in wireless communications. (TCPL 201) |

11:50 - 12:25 |
Grani A. Hanasusanto: Robust Quadratic Programming with Mixed-Integer Uncertainty ↓ We study robust convex quadratically constrained quadratic programs where the uncertain problem parameters can contain both continuous and integer components. Under the natural boundedness assumption on the uncertainty set, we show that the generic problems are amenable to exact copositive programming reformulations of polynomial size. The emerging convex optimization problems are NP-hard but admit a conservative semidefinite programming (SDP) approximation that can be solved efficiently. We prove that this approximation is stronger than the popular approximate S-lemma scheme for problem instances with only continuous uncertainty. We also show that all results can be extended to the two-stage robust optimization setting if the problem has complete recourse. We assess the effectiveness of our proposed SDP reformulations and demonstrate their superiority over the state-of-the-art solution schemes on stylized instances of least squares, project management, and multi-item newsvendor problems. (TCPL 201) |

12:25 - 13:45 |
Lunch ↓ |

13:45 - 14:20 |
Marco Campi: Risk and Complexity in Scenario Optimization ↓ In scenario optimization, decisions are made in the light of past situations (or scenarios), and the “risk” associated to a scenario decision refers to the possibility that the decision does not perform as expected in a new case at hand. In the presentation, we will discuss a deep, and universal, link that relates the risk to the “complexity” of the decision, where complexity is a concept associated to the amount of information by which the decision can be constructed. This theoretical result provides a support to the use of inductive methods in decision making problems. (TCPL 201) |

14:20 - 14:55 |
Simone Garatti: Scenario Optimization: The Performance-Risk Tradeoff ↓ Scenario optimization considers decisions made on the grounds of past experience. A scenario decision is characterized by two fundamental elements: one is the performance the decision is expected to achieve and the other is the risk that this expected performance will not be met in a new case at hand. A mathematical theory offers an opportunity to study the trade-off between these two elements and in this presentation we shall try to highlight the principal conclusions of this theory. The significance of these results in the practical use of scenario optimization will also be discussed. (TCPL 201) |

15:10 - 15:45 |
Guzin Bayraksan: Effective Scenarios in Distributionally Robust Optimization ↓ Traditional stochastic optimization assumes that the probability distribution of uncertainty is known. However, in practice, the probability distribution oftentimes is not known or cannot be accurately approximated. One way to address such distributional ambiguity is to work with distributionally robust optimization (DRO), which minimizes the worst-case expected cost with respect to a set of probability distributions. In this talk, we illustrate that not all, but only some, scenarios might have an effect on the optimal value, and we formally define this notion for DRO. We also examine the properties of effective scenarios. In particular, we investigate problems where the distributional ambiguity is modeled by the total variation distance with a finite number of scenarios under convexity assumptions. We propose easy-to-check conditions to identify effective and ineffective scenarios for this class of DRO. Computational results show that identifying effective scenarios provides useful insight into the underlying uncertainties of the problem. (TCPL 201) |

15:30 - 16:00 | Coffee Break (TCPL Foyer) |

16:15 - 16:50 |
Abdel Lisser: Distributionally Robust Chance Constrained Geometric Optimization ↓ In this talk, we discuss distributionally robust geometric programs with individual and joint chance constraints. We consider three groups of uncertainty sets: uncertainty sets with known first two moments, uncertainty sets capturing uncertainty in both the distribution and the moments, and uncertainty sets defined by the Kullback-Leibler divergence from a normal reference distribution. We present tractable reformulations for geometric programs with individual chance constraints for the three uncertainty sets. Efficient approximations are given for distributionally robust programs with joint chance constraints using piecewise linear functions. Numerical results are given for a shape optimization problem. (TCPL 201) |

16:50 - 17:25 |
Siqian Shen: Ambiguous Chance-Constrained Bin Packing under Mean-Covariance Information ↓ The bin packing structure arises in a wide range of service operational applications, where a set of items are assigned to multiple bins with fixed capacities. With random item weights, a chance-constrained bin packing problem bounds, for each bin, the probability that the total weight of packed items exceeds the bin's capacity. Different from the stochastic programming approaches relying on full distributional information of the random item weights, we assume that only the information of the mean and covariance matrix is available, and consider distributionally robust chance-constrained bin packing (DCBP) models in this paper. Using two types of ambiguity sets, we equivalently reformulate the DCBP models as 0-1 second-order cone (SOC) programs. We further exploit the submodularity of the 0-1 SOC constraints under special and general covariance matrices, and utilize the submodularity as well as lifting and bin-packing structure to derive extended polymatroid inequalities to strengthen the 0-1 SOC formulations. We incorporate the valid inequalities in a branch-and-cut algorithm for efficiently solving the DCBP models. Finally, we demonstrate the computational efficacy of our approaches and performance of DCBP solutions on diverse test instances.
This is joint work with Yiling Zhang and Ruiwei Jiang. (TCPL 201) |
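The single-bin reformulation the abstract refers to follows the classical one-sided Chebyshev bound: under mean-covariance ambiguity, requiring the chance constraint to hold for every consistent distribution is equivalent to the 0-1 SOC constraint μᵀx + √((1−ε)/ε)·√(xᵀΣx) ≤ C. A minimal sketch of that feasibility check, with illustrative data of our own (not from the paper):

```python
import math

def soc_feasible(x, mu, sigma, capacity, eps):
    """Check the SOC reformulation of the distributionally robust chance
    constraint P(w^T x <= capacity) >= 1 - eps, where the item-weight
    vector w has known mean mu and covariance sigma (one-sided
    Chebyshev / mean-covariance ambiguity set):
        mu^T x + sqrt((1 - eps)/eps) * sqrt(x^T Sigma x) <= capacity.
    x is a 0-1 item-selection vector for a single bin."""
    n = len(x)
    mean = sum(mu[i] * x[i] for i in range(n))
    var = sum(sigma[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    kappa = math.sqrt((1.0 - eps) / eps)
    return mean + kappa * math.sqrt(var) <= capacity

# Illustrative data: three items with independent weights.
mu = [3.0, 4.0, 5.0]
sigma = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 4.0]]
# Packing items 1 and 2: mean 7, std sqrt(2); with eps = 0.1, kappa = 3.
assert soc_feasible([1, 1, 0], mu, sigma, 12.0, 0.1)
# Adding item 3 (mean 12, variance 6) violates the same capacity.
assert not soc_feasible([1, 1, 1], mu, sigma, 12.0, 0.1)
```

The DCBP models in the talk apply this constraint per bin and then strengthen the resulting 0-1 SOC program with polymatroid cuts; the sketch above covers only the basic reformulation.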

17:45 - 19:30 |
Dinner ↓ |

Friday, March 9 | |
---|---|

07:00 - 09:00 | Breakfast (Vistas Dining Room) |

09:00 - 09:30 |
Wolfram Wiesemann: "Dice"-sion Making under Uncertainty: When Can a Random Decision Reduce Risk? ↓ Consider an Ellsberg experiment in which one can win by calling the color (red or blue) of the ball that will be drawn from an urn in which the balls are of unknown proportions. It is well known (yet rarely advertised) that selecting the color based on a fair coin flip completely eradicates the ambiguity about the odds of winning. In this talk, we explore the conditions under which a decision maker that employs a risk measure should have their action depend on the outcome of an independent random device. Surprisingly, we show that for any ambiguity-averse risk measure there always exists a decision problem in which a randomized decision strictly dominates all deterministic decisions. This is joint work with Erick Delage and Daniel Kuhn. (TCPL 201) |
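The coin-flip observation in the abstract is a one-line computation; as a minimal illustration (ours, not from the talk), the randomized call wins with probability 1/2 for every urn composition:

```python
# An urn holds an unknown fraction p of red balls; calling "red" wins
# with probability p, calling "blue" with probability 1 - p.  A call
# randomized by a fair coin wins with probability
#   0.5 * p + 0.5 * (1 - p) = 0.5,
# independent of p: the ambiguity about the odds of winning vanishes.

def win_probability(p_red: float, p_call_red: float) -> float:
    """Winning probability given urn composition and call probability."""
    return p_call_red * p_red + (1.0 - p_call_red) * (1.0 - p_red)

# A deterministic call is fully exposed to the ambiguity in p ...
assert win_probability(0.1, 1.0) == 0.1
# ... while the fair-coin call wins with probability 1/2 for any p.
for p in (0.0, 0.3, 0.5, 0.9, 1.0):
    assert abs(win_probability(p, 0.5) - 0.5) < 1e-12
```

The talk's result generalizes this effect from the Ellsberg urn to arbitrary ambiguity-averse risk measures.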

09:30 - 10:00 |
Erick Delage: Measuring the Value of Randomized Solutions in Distributionally Robust Optimization ↓ This talk studies the value of randomized solutions (VRS) in distributionally robust mixed integer programming problems. We show different methods for obtaining upper bounds on VRS and identify conditions under which some of them are tight. We also devise and implement a column-generation algorithm for identifying optimal randomized solutions in two-stage distributionally robust optimization with right-hand-side uncertainty. We empirically illustrate our findings in a capacitated facility location problem where the distribution is known to be part of a Wasserstein ambiguity set. This is joint work with Ahmed Saif. (TCPL 201) |

10:00 - 10:30 | Coffee Break (TCPL Foyer) |

10:30 - 11:00 |
Daniel Kuhn: From Data to Decisions: Distributionally Robust Optimization is Optimal ↓ Data-driven stochastic programming aims to find a procedure that transforms time series data to a near-optimal decision (a prescriptor) and to a prediction of this decision's expected cost under the unknown data-generating distribution (a predictor). We propose a meta-optimization problem to find the least conservative predictors and prescriptors subject to constraints on their out-of-sample disappointment. Leveraging tools from large deviations theory, we prove that the best predictor-prescriptor pair is obtained by solving a distributionally robust optimization problem. (TCPL 201) |

11:00 - 11:30 |
Karthik Natarajan: On the Heavy-Tail Behavior of the Distributionally Robust Newsvendor Model with Moment Constraints ↓ Since the seminal work of Scarf (1958) on the newsvendor problem with ambiguity in the demand distribution, there has been a growing interest in the study of the distributionally robust newsvendor problem. Scarf's solution is criticized at times for being overly conservative since the worst-case distribution is discrete with a few support points. A simple observation however indicates that the optimal order quantity in this two moment model is also optimal for a censored student-t distribution with infinite variance for all possible values of the critical ratio. In this paper, we generalize this "heavy-tail optimality" property of the distributionally robust newsvendor to the case when information on the first and the nth moment is known for any rational number n > 1. This is joint work with Anulekha Dhara (University of Michigan) and Bikramjit Das (SUTD). (TCPL 201) |
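For reference, the two-moment model mentioned above has Scarf's well-known closed-form order quantity q* = μ + (σ/2)(√(r/(1−r)) − √((1−r)/r)) for critical ratio r ∈ (0, 1); a minimal sketch (variable names ours):

```python
import math

def scarf_order_quantity(mu: float, sigma: float, r: float) -> float:
    """Scarf's (1958) distributionally robust newsvendor order quantity
    when only the mean mu, standard deviation sigma, and critical
    ratio r in (0, 1) of demand are known."""
    return mu + 0.5 * sigma * (math.sqrt(r / (1 - r)) - math.sqrt((1 - r) / r))

# At r = 0.5 the worst-case optimal order equals the mean demand.
assert abs(scarf_order_quantity(100.0, 20.0, 0.5) - 100.0) < 1e-12
# Higher critical ratios order above the mean, lower ones below it.
assert scarf_order_quantity(100.0, 20.0, 0.8) > 100.0
assert scarf_order_quantity(100.0, 20.0, 0.2) < 100.0
```

The talk's observation is that this same q* is also the optimal order for a censored Student-t demand with infinite variance, which motivates the generalized heavy-tail optimality result.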

11:30 - 12:00 |
Checkout by Noon ↓ 5-day workshop participants are welcome to use BIRS facilities (BIRS Coffee Lounge, TCPL and Reading Room) until 3 pm on Friday, although participants are still required to checkout of the guest rooms by 12 noon. (Front Desk - Professional Development Centre) |

12:00 - 13:30 | Lunch (Vistas Dining Room) |