# Schedule for: 19w5091 - Multi-Stage Stochastic Optimization for Clean Energy Transition

Arriving in Oaxaca, Mexico on Sunday, September 22 and departing Friday, September 27, 2019

Sunday, September 22 | |
---|---|

14:00 - 23:59 | Check-in begins (Front desk at your assigned hotel) |

19:30 - 22:00 | Dinner (Restaurant Hotel Hacienda Los Laureles) |

20:30 - 21:30 | Informal gathering (Hotel Hacienda Los Laureles) |

Monday, September 23 | |
---|---|

07:30 - 08:45 | Breakfast (Restaurant at your assigned hotel) |

08:45 - 09:00 | Introduction and Welcome (Conference Room San Felipe) |

09:00 - 09:30 | Roundtable (Conference Room San Felipe) |

09:30 - 10:00 | Energy transition panorama and challenges (Conference Room San Felipe) |

10:00 - 10:30 |
Jyoti U. Devkota: Results from Sample Surveys of Renewable Energy Users of Nepal ↓ Nepal is an agriculture-based society. According to the 2011 Census, 64% of households use wood/firewood to meet their energy needs. With the aim of studying the energy consumption dynamics of rural Nepal in detail, three surveys were conducted: of 400 households of biogas consumers, of 300 households of national-grid electricity users, and of 51 households of micro hydro project users. This presentation is based on the results obtained from these surveys. Conducting such surveys is of great significance for a country like Nepal, which lacks a strong backbone of good-quality official data; remote geographical locations, lack of awareness, and lack of incentives are the main reasons behind this plight of official statistics. These surveys generated data on more than 350 variables. The base questionnaire, which was pretested, is the same in all three surveys: a structured questionnaire with multiple-choice answers that yield categorical data. This categorical data can be analyzed on an ordinal scale and, because of the large sample size, can be treated as continuous data by application of the central limit theorem. Multivariate statistics is used here to quantify and explain the relationships between these variables, and consumer profile databases were constructed. Data-based research is evidence-based research, which is objective and indisputable. This presentation highlights this evidence-based approach to handling an issue, for what gets measured also gets done. (Conference Room San Felipe) |

10:30 - 11:00 | Coffee Break (Conference Room San Felipe) |

11:00 - 11:45 |
Andy Philpott: Multistage Stochastic Capacity Planning Using JuDGE ↓ Julia Dynamic Generation Expansion (JuDGE) is a Julia package for solving stochastic capacity expansion problems formulated in a “coarse-grained” scenario tree that models long-term uncertainties. The user provides JuDGE with a coarse-grained tree and a JuMP formulation of a stage problem to be solved in each node of this tree. JuDGE then applies Dantzig-Wolfe decomposition to this framework based on the general model of Singh et al. (2009). The stage problems are themselves single-stage capacity expansion problems with integer capacity variables, but quite general constraints that can model, for example, operations in random environments, or even equilibrium constraints, as long as they can be solved exactly (e.g. via reformulation as mixed integer programs).
This presentation outlines the theoretical background for JuDGE, and shows the results of applying it to several problem instances:
i. a knapsack problem with expanding capacity;
ii. optimal capacity expansion in an electricity distribution network subject to reliability constraints;
iii. national capacity expansion to meet renewable energy targets;
iv. optimal transmission expansion for an electricity wholesale market with imperfectly competitive agents.
References
Singh, K., Philpott, A.B. and Wood, K., Dantzig-Wolfe decomposition for solving multi-stage stochastic capacity planning problems, Operations Research, 57, 1271-1286, 2009. (Conference Room San Felipe) |
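As a rough sketch of the kind of deterministic equivalent that such a decomposition targets (the notation below is assumed for illustration, following the general shape of Singh et al. (2009) rather than the package's exact formulation):

```latex
\min_{x,\,y}\ \sum_{n \in \mathcal{N}} p_n \left( c_n^{\top} x_n + q_n^{\top} y_n \right)
\quad \text{s.t.}\quad
y_n \in Y_n\!\left( \sum_{m \preceq n} x_m \right),\quad
x_n \in \mathbb{Z}_{+}^{d}, \qquad n \in \mathcal{N},
```

where $\mathcal{N}$ is the node set of the coarse-grained tree, $p_n$ the node probabilities, $x_n$ the integer capacity expansions, $y_n$ the operating decisions, and $Y_n(\cdot)$ the feasible set of the stage problem at node $n$ given the cumulative capacity installed along the path $m \preceq n$. Dantzig-Wolfe decomposition then generates columns from the nodal stage problems.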

11:45 - 12:30 |
Michel Gendreau: A Hybrid Dynamic Programming-Tabu Search Approach for the Long-Term Energy Planning Problem ↓ We consider the long-term energy planning of an extensive hydroelectric power system. The problem ultimately aims at evaluating the impact of additional firm load contracts on the energy reliability of the system and the future revenues for the next fifteen years, taking into account the uncertainty of future energy inflows. Energy inflows are obtained from water inflows using an energy aggregation process and therefore behave like hydrological models. Various forms of long-term persistence assumptions can be modeled in the energy inflow process using hydrological models. However, these forms can be challenging to include in the framework required by state-of-the-art methods. We propose a method combining stochastic dynamic programming and Tabu Search approaches to solve the long-term energy-planning problem without the need to assume a prior form for the long-term persistence of future energy inflows.
We compare the policies resulting from this hybrid method with policies obtained from two other versions of stochastic dynamic programming known to explicitly handle persistence of inflows: one with an additional state variable and the other coupled with a Hidden Markov Model. The results show the effectiveness of the hybrid method in long-term persistence cases. (Conference Room San Felipe) |

12:30 - 13:00 |
Rüdiger Schultz: Towards a Decomposition Method for Linear Multi-Stage Stochastic Integer Programs with Discrete Distributions ↓ Thanks to linear programming duality, the classical linear stochastic program with recourse, as introduced independently by Beale and Dantzig in 1955, obeys convexity and duality properties that considerably facilitate the development of theory and the design of algorithms. Practical needs in operations research and elsewhere quickly drove the modeling beyond the convex case: for instance, $0$-$1$ decisions and related integer variables to model switching and indivisibility, or nonlinearities of a physical nature, such as differences of squared expressions marking the drop of potential related to flow along pipes. In the talk, extensions of the classical model concerning all key ingredients, namely the measure, the integrand, the second-stage optimization problem, and the nonanticipative first-stage variable, will be put into perspective. Since convexity is lost instantly when extending the model, identifying proper mathematical alternatives becomes crucial. A promising source of such alternatives is computer algebra, which opens up a rich world of approaches in mathematical structures that had not been in the research focus until very recently; foremost, it is ideal theory that has been addressed in this respect. In the talk, some first successful approaches using algebraically motivated structures and methods will be addressed. (Conference Room San Felipe) |

13:20 - 13:30 | Group Photo (Hotel Hacienda Los Laureles) |

13:30 - 15:00 | Lunch (Restaurant Hotel Hacienda Los Laureles) |

16:00 - 16:30 | Coffee Break (Conference Room San Felipe) |

16:30 - 17:15 |
Michael C. Ferris: Computation in Markets with Risk ↓ We consider modeling of coupled systems using an equilibrium framework. The systems are coupled by market constraints and can be effectively modeled in the EMP framework. Standard and new mechanisms to model the resulting complementarity constraints will be outlined, and the agents' problems will be formulated to allow for risk preferences. Some computational results related to energy systems and other applications will be given. (Conference Room San Felipe) |

17:15 - 17:45 |
Napat Rujeerapaiboon: A Day-Ahead Decision Rule Method for Multi-Market Multi-Reservoir Management ↓ Peak/off-peak spreads in European electricity spot markets are eroding due to the nuclear phaseout and the recent growth in photovoltaic capacity. The reduced profitability of peak/off-peak arbitrage forces hydropower producers to participate in the reserve markets. We propose a two-layer stochastic programming framework for the optimal operation of a multi-reservoir hydropower plant which sells energy on both the spot and the reserve markets. The backbone of this approach is a combination of decomposition and decision rule techniques. Numerical experiments demonstrate its effectiveness. (Conference Room San Felipe) |

17:45 - 18:15 |
Julio Deride: A computational strategy for a two-stage stochastic equilibrium problem ↓ In this talk, we present a problem of strategic planning for capacity investment and production under uncertainty, in a competitive market. We model it as a general equilibrium problem, i.e., a collection of multi-agent optimization problems with an equilibrium constraint (supply meets demand), and we propose a solution method based on a scheme that considers: i) a stagewise-decomposition procedure using Progressive Hedging, ii) a representative agent representation, and iii) a decomposition-type method for its solution, such as the Alternating Direction Method of Multipliers (ADMM). We illustrate the numerical performance of the algorithm by solving a stochastic infrastructure planning problem for electric vehicle fast-charging stations over a small network. (Conference Room San Felipe) |

19:00 - 21:00 | Dinner (Restaurant Hotel Hacienda Los Laureles) |

Tuesday, September 24 | |
---|---|

07:30 - 09:00 | Breakfast (Restaurant at your assigned hotel) |

09:00 - 09:45 | Alejandro Jofre: Massive Entry of Nonconventional Renewable Energies, Strategic Behavior and Risk Analysis for Network Electricity Markets (Conference Room San Felipe) |

09:45 - 10:30 |
Darinka Dentcheva: Statistical Estimation of Composite Risk Functionals ↓ We analyze composite functionals representing distributional characteristics of random data. The functionals depend on the decision maker’s choice when used as objectives in optimization problems. Very frequently, models of risk are non-linear with respect to the underlying distributions; however, we can represent them as structured compositions. Composite functionals also arise in the context of machine learning problems.
We consider the use of smooth estimators, with particular attention paid to kernel estimators, for composite functionals and for the optimal value of optimization problems using those as objectives. Strong laws of large numbers for the estimators, the optimal values, and the optimal solutions are established under mild conditions on the functions involved. Central limit theorems for the estimated composite functionals and the optimal value of composite optimization problems are presented as well. We compare the performance of the estimators to the empirical estimators numerically. Several popular risk measures are discussed as illustrative examples.
While we show that many known coherent measures of risk can be cast in the presented structures, we emphasize that the results are of a more general nature with wider applicability. Applications of the results to hypothesis testing of stochastic orders and portfolio efficiency are outlined. (Conference Room San Felipe) |
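As a minimal illustration of such a composition, the Average Value-at-Risk (a popular coherent risk measure) composes a quantile with a tail expectation; a plug-in empirical estimator can be sketched as follows (the plug-in estimator and the Gaussian sample are illustrative assumptions, much simpler than the kernel estimators discussed in the talk):

```python
import random

def empirical_cvar(sample, alpha):
    """Plug-in estimator of AVaR_alpha (a.k.a. CVaR), written as the
    composite functional  min_t { t + E[(X - t)^+] / (1 - alpha) }:
    the inner part estimates the alpha-quantile, the outer part a
    tail expectation."""
    xs = sorted(sample)
    n = len(xs)
    t = xs[int(alpha * n)]                     # empirical alpha-quantile
    tail = sum(max(x - t, 0.0) for x in xs)    # empirical tail excess
    return t + tail / ((1 - alpha) * n)

random.seed(0)
losses = [random.gauss(0.0, 1.0) for _ in range(100_000)]
print(empirical_cvar(losses, 0.95))  # ≈ 2.06 for a standard normal
```

The asymptotic behavior of exactly such plug-in estimators (consistency, central limit theorems) is what the talk's general theory covers.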

10:30 - 11:00 | Coffee Break (Conference Room San Felipe) |

11:00 - 11:45 |
Andrzej Ruszczynski: Risk-Averse Optimization and Control of Partially Observable Systems ↓ We introduce the concept of a risk form, which is a real functional on the product of two spaces: the space of measurable functions and the space of measures on a Polish space. We present a dual representation of risk forms and generalize the classical Kusuoka representation to this setting. For a risk form acting on a product space, we define marginal and conditional forms and we prove a disintegration formula, which represents a risk form as a composition of its marginal and conditional forms. We apply the proposed approach to two-stage optimization problems with partial information and decision-dependent observation distribution. Next, we consider risk measurement in controlled partially observable Markov systems in discrete time. In such systems, part of the state vector is not observed, but affects the transition kernel and the costs. We introduce new concepts of risk filters and study their properties. We also introduce the concept of conditional stochastic time consistency. We derive the structure of risk filters enjoying this property and prove that they can be represented by a collection of law invariant risk measures on the space of functions of the observable part of the state. We also derive the corresponding dynamic programming equations. (Conference Room San Felipe) |

11:45 - 12:05 |
Thomas Martin: Towards a stochastic dynamic formulation of procurement problems ↓ Procurement is the way a company acquires the resources needed for its activity: components, primary resources and/or natural resources. With goods available on a large-scale market, companies often need to either buy what is available at the moment, or wait for later opportunities. In our work we try to leverage the stopping-time-like structure of such problems to scale them, and to add more complexity to the models. Naturally, what a company buys has a direct impact on what it sells. We first investigate the case in which selling prices are stochastic, before adding stochasticity and dynamics in the buying process. Finally, we discuss replacing the mathematical expectation by a risk measure to capture the possible risk aversion of a company. (Conference Room San Felipe) |

12:05 - 12:15 |
Cyrille Vessaire: Optimisation of reservoir development and design under uncertainty ↓ In this talk, I present the subject of my thesis. Consider given resources in a reservoir: how can we design a network to extract them while minimizing an economic indicator in a context of high uncertainty? (Conference Room San Felipe) |

12:15 - 13:00 |
Michael Ludkovski: Stochastic Control with Local Probabilistic Constraints for Microgrid Management ↓ We investigate microgrid management where the controller tries to optimally dispatch a diesel generator as backup to primary renewable sources while maintaining low probability of blackouts. Dispatch takes place at discrete epochs (15 min in our example), while balancing takes place continuously, so only probabilistic guarantees are possible. Moreover, the likelihood of a blackout during the next dispatch period is not available analytically and can only be estimated. We formulate the problem as stochastic control where the Bellman equation features local probabilistic constraints that lead to an implicit state-dependent admissible control set. To tackle this challenge we develop novel Monte Carlo based algorithms, in particular empirical simulation procedures for learning the admissible control set as a function of system state. We propose a variety of relevant statistical tools including logistic regression, Gaussian process regression, quantile regression and support vector machines, which we then incorporate into an overall Regression Monte Carlo (RMC) framework for approximate dynamic programming. Our results indicate that using logistic or Gaussian process regression to estimate the admissibility probability outperforms the other options. Our algorithms offer an efficient and reliable extension of RMC to probability-constrained control. We illustrate our findings with two case studies for the microgrid setup with time-stationary and daily-seasonal net load dynamics. (Conference Room San Felipe) |
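A minimal sketch of the core idea of an estimated, state-dependent admissible control set (the Gaussian net-load model, the dispatch grid, and the threshold below are invented for illustration; the talk embeds regression-based estimators of this probability within Regression Monte Carlo):

```python
import random

_rng = random.Random(1)

def blackout_prob(dispatch, net_load_mean, n_sims=2000):
    """Monte Carlo estimate of the blackout probability over one dispatch
    period: a blackout occurs if the realized net load exceeds the
    dispatched capacity (toy Gaussian net-load model)."""
    hits = sum(_rng.gauss(net_load_mean, 1.0) > dispatch for _ in range(n_sims))
    return hits / n_sims

def admissible_set(candidates, net_load_mean, eps=0.05):
    """Estimated state-dependent admissible control set: dispatch levels
    whose estimated blackout probability is at most eps."""
    return [d for d in candidates if blackout_prob(d, net_load_mean) <= eps]

print(admissible_set([4.0, 5.0, 6.0, 7.0], net_load_mean=4.0))
```

Because the probability is only estimated, the admissible set itself is learned from simulation, which is precisely the challenge the talk's statistical tools (logistic regression, Gaussian processes, etc.) address.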

13:30 - 15:00 | Lunch (Restaurant Hotel Hacienda Los Laureles) |

16:00 - 16:30 | Coffee Break (Conference Room San Felipe) |

16:30 - 17:15 |
Michel De Lara: An Overview of Decomposition-Coordination Methods in Multistage Stochastic Optimization ↓ Multistage stochastic optimization problems are, by essence, complex because
their solutions are indexed both by stages (time) and by uncertainties
(scenarios). Quite often, solutions are also indexed by decision units,
like nodes in a graph (space), or agents in a team.
Hence, their large scale nature makes decomposition methods appealing.
We present, in a unified framework, three main approaches and methods
to decompose multi-stage stochastic optimization problems for numerical
resolution:
time decomposition (and state-based resolution methods,
like Stochastic Dynamic Programming, in Stochastic Optimal Control);
scenario decomposition (like Progressive Hedging in Stochastic Programming);
spatial decomposition (price or resource decompositions).
We show how writing a dynamic programming equation on the increasing sets
of histories paves the way for state reduction at specified stages; this
makes it possible to develop what we call time block decomposition.
We also show how price or resource decompositions quite naturally provide
decomposed lower and upper bounds for minimization problems.
Finally, we point to some mathematical questions raised by the mixing
(blending) of different decomposition methods to tackle large-scale problems.
We hint at the potential of blending for the management of new energy systems
(smart grids), as they will be developed in the next two talks. (Conference Room San Felipe) |

17:15 - 17:45 |
Jean-Philippe Chancelier: Mixing Time Blocks and Price/Resource Decompositions Methods ↓ We provide a method to decompose multistage stochastic optimization problems by
time blocks. This method is based on reducing the so-called history space
using a compressed “state” variable. It leads to a reduced dynamic
programming equation. Then we apply the reduction method by
time blocks to two-time-scale stochastic optimization problems
arising from long term storage management of batteries.
We present a stochastic optimization model aiming at minimizing the investment
and maintenance costs of batteries for a house with solar panels. For any given
capacity of battery it is necessary to compute a charge/discharge strategy
as well as maintenance to maximize revenues provided by intraday energy
arbitrage while ensuring a long term aging of the storage devices. Long term
aging is a slow process while charge/discharge control of a storage handles
fast dynamics. For this purpose, we have designed algorithms that take into
account this two time scales aspect in the decision making process.
We show on instances with a huge number of time steps how one
of our algorithms can be used for the optimal sizing of a storage taking into
account charge/discharge strategy as well as aging.
Numerical results show that it is economically significant to control aging.
We also compare our algorithms to SDP and Stochastic Dual Dynamic Programming
on small instances and we observe that they are less computationally costly while displaying similar performances on the control of a storage. (Conference Room San Felipe) |

17:45 - 18:30 |
Pierre Carpentier: Mixing Dynamic Programming and Spatial Decomposition Methods ↓ We consider a stochastic optimization problem in which different units
are connected together via a network. Each unit is a (small) control
system, located at a node. Each unit state evolution is affected by
uncertainties and controls of the neighboring nodes transmitted through
edges. Static constraints couple all units at each time. We formulate
the associated global stochastic optimization problem. We propose two
decomposition methods, whether we decouple the constraints by prices
or by resources. We show that the optimal value of the global problem
can be bounded above by a sum of resource-decomposed nodal values,
and below by a sum of price-decomposed nodal values. We provide
conditions under which these nodal values can be computed by dynamic
programming. We illustrate these results with numerical studies that
tackle the optimization of urban micro-grids of large size. Finally,
we introduce two different information structures for these microgrids,
namely the centralized and the decentralized ones, and we analyse the
lower and upper bounds when considering these information structures. (Conference Room San Felipe) |
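The bounding structure described above can be summarized, for a coupling constraint $\sum_i \theta_i(x_i) = 0$, by the standard weak-duality sandwich (notation assumed for illustration):

```latex
\max_{\lambda}\ \sum_{i} \underline{V}_i(\lambda)
\;\le\; V^{\star} \;\le\;
\min_{r \,:\, \sum_i r_i \,=\, 0}\ \sum_{i} \overline{V}_i(r_i),
```

where $\underline{V}_i(\lambda)$ is the price-decomposed nodal value with the coupling constraint priced at $\lambda$, and $\overline{V}_i(r_i)$ is the resource-decomposed nodal value with the allocation $r_i$ fixed; each nodal value can then be computed by dynamic programming under the conditions given in the talk.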

19:00 - 21:00 | Dinner (Restaurant Hotel Hacienda Los Laureles) |

Wednesday, September 25 | |
---|---|

07:30 - 09:00 | Breakfast (Restaurant at your assigned hotel) |

09:00 - 09:45 |
R.Terry Rockafellar: Progressive Hedging in Nonconvex Stochastic Programming ↓ The progressive hedging algorithm minimizes an expected "cost" by
iteratively decomposing into separate subproblems for each scenario.
Up to now it has depended on convexity of the underlying "cost"
function with respect to the decision variables and the constraints on
them. However, a new advance makes it possible to obtain convergence
to a locally optimal solution when the procedure is executed close
enough to it and a kind of second-order local sufficiency condition is
satisfied.
Besides applications in which costs and associated constraints may
directly be nonconvex, there are applications to stochastic
programming problems in which those are convex but the probabilities
for the scenarios may be decision-dependent. For example, in a
two-stage problem the probabilities in the recourse stage could be
influenced by the first-stage decision. (Conference Room San Felipe) |
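As a toy illustration of the scenario decomposition that progressive hedging performs (the quadratic costs, penalty parameter, and closed-form subproblem solves below are illustrative assumptions, not the talk's setting):

```python
def progressive_hedging(xis, probs, r=1.0, iters=100):
    """Progressive hedging for min_x E[(x - xi)^2] over finitely many
    scenarios xi. Each scenario subproblem
        min_x (x - xi)^2 + w*x + (r/2)*(x - xbar)^2
    has the closed-form minimizer used below; the multipliers w enforce
    nonanticipativity (all scenario copies of x must agree)."""
    xbar = 0.0
    w = [0.0] * len(xis)
    for _ in range(iters):
        # scenario subproblems, solved separately (here in closed form)
        xs = [(2 * xi - wi + r * xbar) / (2 + r) for xi, wi in zip(xis, w)]
        # aggregation and multiplier update
        xbar = sum(p * x for p, x in zip(probs, xs))
        w = [wi + r * (x - xbar) for wi, x in zip(w, xs)]
    return xbar

print(round(progressive_hedging([1.0, 3.0], [0.5, 0.5]), 4))  # → 2.0
```

In this convex toy the iterates converge to the expected-value minimizer; the talk's advance concerns exactly what can still be guaranteed when the per-scenario costs are nonconvex.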

09:45 - 10:30 |
Claudia Sagastizabal: Algorithms for Two-Stage Stochastic Optimization Problems ↓ For nonconvex optimization problems with nonlinear, possibly nonsmooth, constraints, a convergent primal-dual solution algorithm is proposed.
The approach applies a proximal bundle method to a dual problem
that arises in the context of
generalized augmented Lagrangians and that yields zero duality gap.
The methodology is tailored so that
Lagrangian subproblems can be solved inexactly without hindering the
primal-dual convergence properties of the algorithm.
Primal convergence is ensured even when the dual solution set is empty. (Conference Room San Felipe) |

10:30 - 11:00 | Coffee Break (Conference Room San Felipe) |

11:00 - 11:45 |
Tito Homem-de-Mello: Effective Scenarios in Multistage Stochastic Programs ↓ Traditional stochastic programs optimize the expected value of some function that depends on the decision variables as well as on some random variables that represent the uncertainty in the problem. Such formulations assume that the probability distribution of those random variables is known. However, in practice the probability distribution oftentimes is not known or cannot be accurately approximated. One way to address such ambiguity is to work with distributionally robust stochastic programs (DRSPs), which minimize the worst-case expected value with respect to a set of probability distributions. In this presentation we discuss some recent advances in the research on DRSPs. In particular, we study the question of how to identify the critical scenarios obtained after solving a multistage DRSP. Recent research has studied the notion of effective scenarios for static distributionally robust stochastic programs. Roughly speaking, a scenario is deemed effective if its removal changes the optimal value of the problem. We discuss the extension of these ideas to the case of multistage stochastic programs. Such extension requires proper ways of defining the meaning of removing a scenario in a dynamic context. Computational and analytical results show that identifying such effective scenarios may provide useful insight on the underlying uncertainties of the problem. (Conference Room San Felipe) |

12:00 - 13:00 | Lunch (Restaurant Hotel Hacienda Los Laureles) |

13:00 - 19:00 | Free Afternoon (Oaxaca) |

19:00 - 21:00 | Dinner (Restaurant Hotel Hacienda Los Laureles) |

Thursday, September 26 | |
---|---|

07:30 - 09:00 | Breakfast (Restaurant at your assigned hotel) |

09:00 - 09:45 |
Alexander Shapiro: Computational Approaches to Solving Multistage Stochastic Programs ↓ In this talk we discuss computational approaches to solving convex stochastic programming problems. We start with a discussion of sample complexity of solving static problems and argue that this is essentially different from sample complexity of solving multistage programs. In some applications the considered multistage stochastic programs have a periodical behavior. We demonstrate that in such cases it is possible to drastically reduce the number of stages by introducing a periodical analog of the so-called Bellman equations, used in Markov
Decision Processes and Stochastic Optimal Control. Furthermore, we describe a variant of
the Stochastic Dual Dynamic Programming algorithm, applied to the constructed periodical
Bellman equations, and show numerical experiments for the Brazilian interconnected
power system problem. (Conference Room San Felipe) |
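One standard way to write such a periodical analog of the Bellman equations, for a problem with period $m$ and a discount factor $\alpha \in (0,1)$ (symbols assumed for illustration):

```latex
V_t(x) \;=\; \min_{u}\ \mathbb{E}\left[\, c_t(x, u, \xi_t) \;+\; \alpha\, V_{t+1}\!\left( f_t(x, u, \xi_t) \right) \right],
\qquad t = 1, \dots, m, \qquad V_{m+1} \equiv V_1,
```

so that only $m$ value functions need to be computed regardless of the length of the horizon.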

09:45 - 10:30 |
Johannes Royset: Diametrical Stochastic Optimization with Application to Statistical Learning ↓ It is well known that Sample Average Approximations (or Empirical Risk Minimization) can lead to arbitrarily poor solutions and slow learning when the objective function is poorly behaved. The presentation describes a surprisingly simple remedy that we coin Diametrical Stochastic Optimization (Diametrical Risk Minimization). In contrast to common robustification strategies based on perturbing the data set and probability distribution, our approach “diametrically” modifies any solution and thereby obtains theoretical stability guarantees even for poorly behaved functions. We show that in challenging machine learning problems the approach generalizes even if obtained after aggressive minimization of the diametrical risk. (Conference Room San Felipe) |

10:30 - 11:00 | Coffee Break (Conference Room San Felipe) |

11:00 - 11:45 |
Angelos Georghiou: Robust Dual Dynamic Programming ↓ Multi-stage robust optimization problems, where the decision maker can dynamically react to consecutively observed realizations of the uncertain problem parameters, pose formidable theoretical and computational challenges. As a result, the existing solution approaches for this problem class typically determine suboptimal solutions under restrictive assumptions. In this paper, we propose a robust dual dynamic programming (RDDP) scheme for multi-stage robust optimization problems. The RDDP scheme takes advantage of the decomposable nature of these problems by bounding the costs arising in the future stages through lower and upper cost-to-go functions. For problems with uncertain technology matrices and/or constraint right-hand sides, our RDDP scheme determines an optimal solution in finite time. If the objective function and/or the recourse matrices are also uncertain, our method converges asymptotically (but deterministically) to an optimal solution. Our RDDP scheme does not require relatively complete recourse, and it offers deterministic upper and lower bounds throughout the execution of the algorithm. We demonstrate the promising performance of our algorithm on a stylized inventory management problem. (Conference Room San Felipe) |

11:45 - 12:30 |
Vincent Leclere: Upper Bounds for Stochastic Dual Dynamic Programming ↓ The Stochastic Dual Dynamic Programming (SDDP) algorithm has been used successfully for the past 25 years in the energy industry, especially for mid- and long-term hydro management problems. In essence, SDDP is a dynamic programming algorithm that approximates the value function by outer polyhedral approximations. These outer polyhedral approximations yield exact lower bounds (in a minimization framework), but no upper bounds.
The classical way of obtaining upper bounds consists in simulating the policy over multiple scenarios and estimating the expected cost. This approach has multiple drawbacks. First, the bound is only estimated, hence using it in a stopping test can lead to false positives; the theoretical probability of a false positive becomes very high if the test is performed often.
A second way of obtaining upper bounds consists in leveraging the monotonicity of the Bellman operators and the convexity of the value function. More precisely, from an upper bound at time $t+1$ one can compute an upper bound at time $t$ on a finite set of points; convexity allows extending the upper bound to the convex hull of this set.
Finally, we present a third approach consisting in applying SDDP to the Fenchel transform of the value function. The exact lower bound on the Fenchel transform becomes a deterministic upper bound on the original value function. This approach leads to exact upper bounds converging toward the true value. Incidentally, it also defines a new policy with a guaranteed expected cost. (Conference Room San Felipe) |
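The sandwich between a cut-based outer (lower) approximation and a convexity-based inner (upper) approximation can be sketched in one dimension (the toy value function V(x) = x², the cut coefficients, and the evaluation points are illustrative assumptions):

```python
def cut_lower_bound(cuts, x):
    """Outer (lower) polyhedral approximation from a list of SDDP-style
    cuts (alpha_k, beta_k): V_lower(x) = max_k alpha_k + beta_k * x."""
    return max(a + b * x for a, b in cuts)

def inner_upper_bound(points, x):
    """Inner (upper) approximation by convexity: the chord between two
    evaluated points (x_i, V(x_i)) lies above a convex V."""
    pts = sorted(points)
    for (x0, v0), (x1, v1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return (1 - t) * v0 + t * v1
    raise ValueError("x outside the range of evaluated points")

# toy convex value function V(x) = x**2 on [0, 2]
cuts = [(0.0, 0.0), (-1.0, 2.0)]               # tangents at x = 0 and x = 1
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]  # exact evaluations of V
x = 0.5
print(cut_lower_bound(cuts, x), x ** 2, inner_upper_bound(points, x))  # → 0.0 0.25 0.5
```

Both approximations tighten as more cuts and more evaluation points are added, which is the mechanism behind the deterministic bound gaps discussed in the talk.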

12:30 - 13:00 |
Regan Baucke: SDDP-Like Algorithm for Infinite Horizon Multistage Stochastic Programs ↓ In this talk, I will discuss how recent advances in SDDP methods for finite horizon problems can be transferred to the infinite horizon setting. These types of problems arise when modelling energy systems over long time horizons. We propose a convergent algorithm with several attractive properties; chiefly, no Monte Carlo simulation is required to obtain an upper bound. (Conference Room San Felipe) |

13:30 - 15:00 | Lunch (Restaurant Hotel Hacienda Los Laureles) |

16:00 - 16:30 | Coffee Break (Conference Room San Felipe) |

16:30 - 17:15 |
Bernardo Freitas Paulo da Costa: Stochastic Lipschitz Dynamic Programming ↓ In this talk we present a new algorithm for solving multistage stochastic mixed integer linear programming (MILP) problems with complete continuous recourse. A typical example of such problems is energy planning with disjunctive operational constraints. Similar to cutting plane methods, we introduce nonlinear Lipschitz cuts that are building blocks for lower approximations of non-convex cost-to-go functions. Examples of such cuts are those derived from (exact) Augmented Lagrangian Duality for MILPs. If one chooses a family of Lipschitz cuts that is MILP representable, the introduction of these cuts does not change the class of the original stochastic optimization problem.
We illustrate the application of this algorithm, comparing our approach with the convex relaxation of the stagewise problem, for which we can apply SDDP, and with a discretized approximation, for which we apply SDDiP. (Conference Room San Felipe) |

17:15 - 18:00 |
Aditya Mahajan: Information State (and its Approximations) for Stochastic Control ↓ The standard approach for modeling partially
observed systems is to model them as partially observable
Markov decision processes (POMDPs) and obtain a dynamic
program in terms of a belief state. The belief state formulation
works well for planning but is not ideal for online reinforcement
learning because the belief state depends on the model and, as
such, is not observable when the model is unknown.
In this talk, we present an alternative notion of an information
state for obtaining a dynamic program in partially observed
models. In particular, an information state is a sufficient
statistic for the current reward which evolves in a controlled
Markov manner. We show that such an information state leads
to a dynamic programming decomposition. Then we present
a notion of an approximate information state and an approximate
dynamic program based on it. The approximate information state
is defined in terms of properties that can be estimated using
sampled trajectories; therefore, it provides a constructive method
for reinforcement learning in partially observed systems. (Conference Room San Felipe) |

18:00 - 18:30 | Wrap-up session (Conference Room San Felipe) |

19:00 - 21:00 | Dinner (Restaurant Hotel Hacienda Los Laureles) |

Friday, September 27 | |
---|---|

07:30 - 09:00 | Breakfast (Restaurant at your assigned hotel) |

09:00 - 10:30 | Open discussion (Conference Room San Felipe) |

10:30 - 11:00 | Coffee Break (Conference Room San Felipe) |

11:00 - 12:00 | Workshop wrap-up (Conference Room San Felipe) |

12:00 - 14:00 | Lunch (Restaurant Hotel Hacienda Los Laureles) |