Lévy Processes, Phase-type Distributions, and Martingales
Søren Asmussen
Aarhus University
Wednesday, January 11, 2017
4:15 - 5:15 PM
Location: Spilker 232
Abstract:
Lévy processes are defined as processes with stationary, independent increments and have become increasingly popular as models in queueing, finance, etc. Apart from Brownian motion and compound Poisson processes, some popular examples are stable processes, variance Gamma processes, CGMY Lévy processes (tempered stable processes), NIG (normal inverse Gaussian) Lévy processes, and hyperbolic Lévy processes. We consider here a dense class of Lévy processes: compound Poisson processes with phase-type jumps in both directions and an added Brownian component. Within this class, we survey how to explicitly compute a number of quantities that are traditionally studied in the area of Lévy processes, in particular two-sided exit probabilities and associated Laplace transforms, the closely related scale function, one-sided exit probabilities and associated Laplace transforms coming up in queueing problems, and similar quantities for a Lévy process with reflection at 0. The solutions are in terms of roots of polynomials, and the basic equations are derived by purely probabilistic arguments using martingale optional stopping; a particularly useful martingale is the so-called Kella-Whitt martingale. The relation to fluid models with a Brownian component is also discussed.
-------------------------
Supercanonical Convergence Rates in Quasi-Monte Carlo Simulation of Markov Chains
Pierre L'Ecuyer
University of Montreal
Wednesday, January 18, 2017
4:15 - 5:15 PM
Location: Spilker 232
Abstract:
Consider a discrete-event model (or a Markov chain) that evolves over several time steps, with a state-dependent cost at each step. The goal is to estimate the expected cost (or perhaps the entire distribution of the cost), either at a given step number or on average over all steps. For the majority of real-life applications, these quantities are too difficult to compute exactly and must be estimated by simulation. The standard Monte Carlo approach is to simulate n independent sample paths of the chain and take the average over those paths. For the mean cost, this gives an unbiased estimator whose error converges as O(n^-1/2), which means that one must multiply n by 100 to get one additional decimal digit of accuracy. When estimating the density of the cost instead of just its mean, the convergence rate is even slower. In this talk, we discuss an approach named Array-RQMC which can provide faster convergence rates. The main idea is to simulate the n copies of the chain together and move them forward by using a randomized quasi-Monte Carlo (RQMC) point set of cardinality n at each step, after sorting the n copies in a particular (multidimensional) order. We review Array-RQMC, its variants, and sorting strategies; summarize known convergence rate results; and show empirical results that suggest much better convergence rates than those that have been proved. We also compare different types of multivariate sorts to match the chains with the RQMC points.
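As a point of reference, the plain Monte Carlo baseline that Array-RQMC improves upon can be sketched in a few lines (the toy chain and cost below are illustrative, not from the talk):

```python
import random

def simulate_cost(rng, steps=10):
    """Simulate one path of a toy Markov chain X_{k+1} = 0.5*X_k + U_k,
    U_k ~ Uniform(0,1), and return the terminal state as the 'cost'."""
    x = 0.0
    for _ in range(steps):
        x = 0.5 * x + rng.random()
    return x

def mc_estimate(n, seed=0):
    """Standard Monte Carlo: average the cost over n independent paths.
    The estimator is unbiased and its error shrinks as O(n^-1/2)."""
    rng = random.Random(seed)
    return sum(simulate_cost(rng) for _ in range(n)) / n

# For this toy chain the exact mean cost is
# sum_{j=0}^{9} 0.5^j * E[U] = 1 - 0.5**10.
```

Array-RQMC replaces the n independent paths with n coupled copies advanced together by an RQMC point set after a sort at each step; the sketch above is only the baseline against which its faster rates are measured.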
-------------------------
Multi-Agent Online Learning under Imperfect Information: Algorithms, Theory, and Applications
Zhengyuan Zhou
Stanford University
Friday, October 6, 2017
1:15 - 2:15 PM
Location: Packard 202
Abstract:
We consider a model of multi-agent online learning under imperfect information, where the reward structures of agents are given by a general continuous game. After introducing a general equilibrium stability notion for continuous games, called variational stability, we examine the well-known online mirror descent (OMD) learning algorithm and show that the "last iterate" (that is, the actual sequence of actions) of OMD converges to variationally stable Nash equilibria provided that the feedback delays faced by the agents are synchronous and bounded. We then extend the result to almost sure convergence to variationally stable Nash equilibria under both unbiased noise and synchronous and bounded delays. Subsequently, to tackle fully decentralized, asynchronous environments with unbounded feedback delays, we propose a variant of OMD which we call delayed mirror descent (DMD), and which relies on the repeated leveraging of past information. With this modification, the algorithm converges to variationally stable Nash equilibria, with no feedback synchronicity assumptions, and even when the delays grow super-linearly relative to the game's horizon. We then again extend it to the case where there are both delays and noise.
In the second part of the talk, we present two applications of the multi-agent online learning framework. The first application concerns non-convex stochastic optimization, where we characterize almost sure convergence of the well-known stochastic mirror descent algorithm to global optima for a large class of non-convex stochastic optimization problems (strictly including convex, quasi-convex and star-convex problems). Going a step further, our results also include as a special case the large-scale stochastic optimization problem, where stochastic mirror descent is applied in a distributed, asynchronous manner across multiple machines/processors. Time permitting, we will discuss how these results help (at least in part) clarify and affirm the recent successes of mirror-descent type algorithms in large-scale machine learning. The second application concerns power management on random wireless networks, where we use a game-design approach to derive robust power control algorithms that converge (almost surely) to the optimal power allocation in the presence of randomly fluctuating networks.
This is joint work with Nick Bambos, Stephen Boyd, Peter Glynn, Panayotis Mertikopoulos and Claire Tomlin.
-------------------------
Beyond Gaussian universality class: traffic, growth, matrices and their universal fluctuations
Ivan Corwin
Columbia University
Friday, October 11, 2017
4:30 - 5:30 PM
Location: Y2E2 300
This is a joint OR/ICME seminar.
Abstract:
The Gaussian distribution is ubiquitous across science and society. Yet, there are many complex random systems which fail to be well described by Gaussian processes. In this talk, we will consider certain models of traffic flow, interface growth and random matrices. In their large-scale limits, they surprisingly all display the same limiting behavior, described by the so-called Kardar-Parisi-Zhang universality class. This represents a rich universality class beyond that of the Gaussian which is widely applicable to many other types of spatial systems.
-------------------------
Polynomial Optimization and Dynamical Systems
Amir Ali Ahmadi
Princeton University
Monday, November 6, 2017
4:30 - 5:30 PM
Location: Spilker 317
Abstract:
In recent years, there has been a surge of exciting research activity at the interface of optimization (in particular polynomial, semidefinite, and sum of squares optimization) and the theory of dynamical systems. In this talk, we focus on two of our current research directions that are at this interface. In part (i), we propose more scalable alternatives to sum of squares optimization and show how they impact verification problems in control and robotics, as well as some classic questions in polynomial optimization and statistics. Our new algorithms do not rely on semidefinite programming, but instead use linear programming, or second-order cone programming, or are altogether free of optimization. In particular, we present the first Positivstellensatz that certifies infeasibility of a set of polynomial inequalities simply by multiplying certain fixed polynomials together and checking nonnegativity of the coefficients of the resulting product.
In part (ii), we introduce a new class of optimization problems whose constraints are imposed by trajectories of a dynamical system. As a concrete example, we consider the problem of optimizing a linear function over the set of initial conditions that forever remain inside a given polyhedron under the action of a linear, or a switched linear, dynamical system. We present a hierarchy of linear and semidefinite programs that respectively lower and upper bound the optimal value of such problems to arbitrary accuracy.
-------------------------
Scalable Gaussian Processes with Markovian Covariances
Xiaowei Zhang
Hong Kong University of Science and Technology
Friday, December 1, 2017
9:30 - 10:30 AM
Location: Shriram 108
Abstract:
Gaussian processes (GPs) are used in a wide variety of areas, including stochastic simulation, geostatistics, and machine learning. At the core of the related computation is inverting a covariance matrix, which takes O(n^3) time in general and becomes computationally prohibitive for large n, where n is the data size. In addition, the covariance matrix is often poorly conditioned, so the inversion is prone to numerical instability, resulting in inaccurate parameter estimation and prediction. These two numerical issues preclude the use of GPs at large scale. We provide a novel perspective to address them in this talk. We construct a class of covariance functions having two properties: (i) the covariance matrices they induce can be inverted analytically, and (ii) the inverse matrices are sparse. The inversion-related computational time can then be reduced to O(n^2), without resorting to any approximation schemes. Further, if the observation noise is negligible, then numerical inversion is unnecessary and the computational time can be reduced to O(n). The key to our approach is that we establish an explicit connection between covariance functions that induce sparse precision matrices and certain ordinary differential equations. This connection yields a wide class of covariance functions that substantially improve both the speed and the numerical stability of GPs, thereby permitting their use in large-scale problems.
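The abstract does not spell out the covariance class, but the exponential (Ornstein-Uhlenbeck) kernel is the textbook example of a covariance whose induced GP is Markovian, which is exactly what makes the precision matrix sparse (tridiagonal) and permits linear-time algorithms; a minimal sketch under that assumption:

```python
import math
import random

def k_exp(s, t):
    """Exponential (Ornstein-Uhlenbeck) covariance: Cov(X_s, X_t) = exp(-|s - t|)."""
    return math.exp(-abs(s - t))

# Markov "screening" identity: for s < u < t, k(s, t) = k(s, u) * k(u, t).
# This is the property behind the sparse (tridiagonal) precision matrix,
# turning O(n^3) dense inversion into O(n) recursions.

def sample_ou_path(times, seed=0):
    """O(n) sampling of the GP at sorted times via the AR(1) recursion
    X_t = rho * X_s + sqrt(1 - rho^2) * Z, with rho = exp(-(t - s))."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)
    path = [x]
    for s, t in zip(times, times[1:]):
        rho = math.exp(-(t - s))
        x = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path
```

The talk's contribution is a much wider class of such covariances, obtained via ordinary differential equations; the exponential kernel is only the simplest instance.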
-------------------------
Smart Factory: Manufacturing Execution Optimization
Leyuan Shi
University of Wisconsin-Madison
Wednesday, May 2, 2018
4:15 - 5:15 PM
Location: Shriram 108
Abstract:
Many manufacturing firms use aggregated data to provide scheduling/decision solutions for handling their daily operations. Given that the shop floor operates in real time, these average-based scheduling systems cannot be fully executed, since unexpected events will almost always occur, such as rush orders, design changes, machine breakdowns, defective parts, and delivery delays. Currently, shop floors respond to unexpected events via manual scheduling or Excel, which leads to poor predictability and visibility of performance, slow response to uncertainties and market changes, and low efficiency of their production and supply chain systems.
In this talk, Manufacturing Execution Optimization (MEO) technologies developed by Dr. Shi and her team will be presented. MEO makes the production system smart: by establishing top-floor-to-shop-floor communication in real time, manufacturing firms can significantly improve their production and supply chain efficiency while responding to changes and disturbances in a time-optimal manner.
MEO is built on the Nested Partitions (NP) optimization framework, whose coordination structure provides an efficient and effective platform for real-time information sharing and exchange. In this talk, two new simulation optimization methods will also be discussed and a case study will be presented.
-------------------------
Acceleration, Smoothing, and Variance Reduction for Convex Optimization and Nash games under Uncertainty
Uday Shanbhag
Pennsylvania State University
Wednesday, May 16, 2018
4:15 - 5:15 PM
Location: Shriram 368
Abstract:
We consider a stochastic convex problem of the form E[f(x,ω)] + g(x), where g may be nonsmooth but deterministic. Traditional stochastic approximation (SA) schemes employ a single sampled gradient (or a fixed batch) of f(x,ω) and a prox evaluation with respect to g in computing a new iterate. As a consequence, practical implementations may require a large number of prox evaluations, often rendering such schemes impractical. We present a variable sample-size stochastic approximation (VSSA) scheme where the batch of noisy gradients increases at a suitable rate at each iteration. We proceed to show that such schemes achieve deterministic rates of convergence for strongly convex as well as convex nonsmooth problems (by utilizing Nesterov acceleration and iterative smoothing). We further extend these avenues to gradient and inexact best-response schemes for nonsmooth stochastic Nash games. Time permitting, we demonstrate how similar avenues allow for improving convergence rates of stochastic quasi-Newton methods for stochastic convex problems.
-------------------------
Making Robust Decisions from Data
Anil Aswani
University of California at Berkeley
Wednesday, May 9, 2018
4:15 - 5:15 PM
Location: 200-030
Abstract:
Though machine learning has found success in decision-making contexts, these methods are fragile to model mismatch and malicious interference. This is a major impediment to the deployment of automated decision-making in safety-critical systems like those found in healthcare or physical infrastructure. This talk describes three methods we have developed for robust decision-making in different scenarios. The first is a framework for combining robust control with machine learning, and applications to energy efficiency and robotics are highlighted. The second is algorithms to solve inverse optimization (and inverse reinforcement learning) with noisy data. This problem arises when estimating utility functions or modeling human-automation systems, and we show it is NP-hard and that existing approaches are statistically inconsistent. We develop a polynomial time algorithm that is asymptotically optimal as more data is collected. Then we discuss applications of our inverse optimization approach to a clinical trial on personalized goal-setting through smartphone apps to increase physical activity, and to studying an incentive design problem in the Medicare Shared Savings Program where we show that an investment sharing plan could potentially save Medicare an additional $85 million per year. The third is an approach for bandit models where repeated application of an action causes habituation and a decrease of that action's rewards, while refraining from an action causes recovery and an increase of that action's rewards. Though such problems are PSPACE-complete, we define a class of models called ROGUE bandits for which we can construct policies that achieve logarithmic regret. We describe an application of ROGUE bandits to a personalized healthcare problem of choosing an optimal sequence of daily messages to encourage an individual to increase their physical activity.
-------------------------
Block arrivals in the Bitcoin blockchain
Peter Taylor
University of Melbourne
Friday, June 1, 2018
3:15 - 4:15 PM
Location: Huang 305
This is a joint OR/AFTLab seminar.
Coauthors:
Rhys Bowden, Paul Keeler, Tony Krzesinski and Mirte van Weert
Abstract:
Bitcoin is an electronic payment system where payment transactions are verified and stored in a data structure called the blockchain. Bitcoin miners work individually to solve a computationally intensive problem, and with each solution a Bitcoin block is generated, resulting in a new arrival to the blockchain.
In order to control the rate at which blocks are generated, the difficulty of the computational problem is updated every 2,016 blocks. In the original Bitcoin paper, it was suggested that blockchain arrivals occur according to a homogeneous Poisson process.
Based on stochastic analysis of the block arrival process as well as blockchain block arrival data, we demonstrate that this is not the case. We present a refined mathematical model for block arrivals, focusing on both the block arrivals during a period of constant difficulty and how the difficulty level evolves over time.
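The difficulty-update rule described above can be sketched as follows. This is a simplified model: the real protocol works with integer targets and clamps the adjustment to a factor of 4 in either direction, which the sketch mimics with a simple ratio clamp.

```python
TARGET_SPACING = 10 * 60   # target seconds per block
RETARGET_WINDOW = 2016     # blocks between difficulty updates

def retarget(difficulty, actual_seconds):
    """Scale difficulty so that, at the observed hash rate, the next 2016
    blocks would take about two weeks. If the window was mined too fast
    (actual < expected), difficulty rises; too slow, it falls."""
    expected = RETARGET_WINDOW * TARGET_SPACING
    ratio = max(0.25, min(4.0, expected / actual_seconds))
    return difficulty * ratio
```

A window mined at exactly the target spacing leaves difficulty unchanged, while a window mined twice as fast doubles it; this coupling of difficulty to past interarrival times is one reason the block arrival process deviates from a homogeneous Poisson process.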
-------------------------
Statistical Inference for Model Parameters with Stochastic Gradient Descent
Xi Chen
New York University
Wednesday, May 23, 2018
4:15 - 5:15 PM
Location: 370-370
Abstract:
In this talk, we investigate the problem of statistical inference of the true model parameters based on stochastic gradient descent (SGD) with Ruppert-Polyak averaging. To this end, based on the batch-means approach (Glynn & Whitt, 1991), we propose a consistent estimator of the asymptotic covariance of the average iterate from SGD, which only uses the iterates from SGD. As the SGD process forms a time-inhomogeneous Markov chain, our batch-means estimator with carefully chosen increasing batch sizes generalizes the classical batch-means estimator designed for time-homogeneous Markov chains. The proposed batch-means estimator allows us to construct asymptotically exact confidence intervals and hypothesis tests. We further discuss an extension to conducting inference based on SGD for high-dimensional linear regression. This is joint work with Jason D. Lee, Xin T. Tong and Yichen Zhang.
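For intuition, the classical fixed-batch version of the batch-means estimator (the setting that the talk's increasing-batch-size variant generalizes) can be sketched as:

```python
def batch_means_variance(iterates, num_batches):
    """Classical batch-means estimate of the asymptotic variance of the
    average of a (possibly correlated) scalar sequence: split the sequence
    into batches, average within each batch, and scale the sample variance
    of the batch means by the batch size."""
    n = len(iterates)
    b = n // num_batches  # batch size
    means = [sum(iterates[i * b:(i + 1) * b]) / b for i in range(num_batches)]
    grand = sum(means) / num_batches
    return b * sum((m - grand) ** 2 for m in means) / (num_batches - 1)
```

For i.i.d. data this recovers the ordinary variance; the point of the construction is that it remains consistent under the serial correlation present in SGD iterates, provided batch sizes are chosen appropriately.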
-------------------------
Recent advances in dynamic decision making for renewable energy integration and flexible load management
Andy Sun
Georgia Institute of Technology
Thursday, May 31, 2018
3:15 - 4:15 PM
Location: Y2E2 101
This is a joint OR/ENERGY seminar.
Abstract:
Renewable energy integration and flexible load through demand-side response have dramatically increased uncertainty in electric power systems. Traditional operation schemes of power systems rely heavily on static and deterministic decision models, and are therefore inevitably fragile to forecast errors and unable to fully realize the potential of renewable energy and demand-side responsiveness. We propose a suite of new dynamic decision-making models and algorithms for solving multistage stochastic and robust optimization problems with discrete decisions, continuous nonconvexity such as AC power flow, and temporally and spatially correlated uncertainty in renewable energy sources and demand response portfolios. We demonstrate that the new decision tools significantly outperform current practice in real-world large-scale power systems. This points to a new direction for the modernization of the core operational principles of modern electric power systems.
-------------------------
Taming the Devil of Gradient-based Optimization Methods with the Angel of Differential Equations
Weijie Su
University of Pennsylvania
Thursday, Nov 1, 2018
4:15 - 5:15 PM
Location: Packard 101
This is a joint OR/ISL seminar.
Abstract:
In this talk, we use ordinary differential equations to model, analyze, and interpret gradient-based optimization methods. In the first part of the talk, we derive a second-order ODE that is the limit of Nesterov's accelerated gradient method for non-strongly convex objectives (NAG-C). The continuous-time ODE is shown to allow for a better understanding of NAG-C and, as a byproduct, we obtain a family of accelerated methods with similar convergence rates. In the second part, we begin by recognizing that existing ODEs in the literature are inadequate to distinguish between two fundamentally different methods, Nesterov's accelerated gradient method for strongly convex functions (NAG-SC) and Polyak's heavy-ball method. In response, we derive high-resolution ODEs as more accurate surrogates for the three aforementioned methods. These novel ODEs can be integrated into a general framework that allows for a fine-grained analysis of the discrete optimization algorithms through translating properties of the amenable ODEs into those of their discrete counterparts. As the first application of this framework, we identify the effect of a term referred to as 'gradient correction' that is present in NAG-SC but not in the heavy-ball method, shedding light on why the former achieves acceleration while the latter does not. Moreover, in this high-resolution ODE framework, NAG-C is shown to minimize the squared gradient norm at the inverse cubic rate, which is the sharpest known rate concerning NAG-C itself. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that are shown to maintain the same accelerated convergence rates as NAG-C for smooth convex functions. This is based on joint work with Stephen Boyd, Emmanuel Candes, Simon Du, Michael Jordan, and Bin Shi.
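For reference, the second-order ODE limit of NAG-C from the speaker's earlier work with Boyd and Candes (quoted here from memory) reads

```latex
\ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f\bigl(X(t)\bigr) = 0,
\qquad f(X(t)) - f^\star = O(1/t^2),
```

where the vanishing damping coefficient 3/t is what distinguishes this low-resolution limit from the constant-damping ODEs associated with NAG-SC and the heavy-ball method.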
-------------------------
Quantile-based risk sharing, market equilibria, and belief heterogeneity
Ruodu Wang
University of Waterloo
Wednesday, Nov 14, 2018
4:15 - 5:15 PM
Location: Huang 305
Abstract:
We establish a general framework of risk sharing games for agents using quantile-based risk measures as their preferences. The family of quantile-based risk measures includes the Value-at-Risk (VaR) and the Expected Shortfall (ES), the two popular and competing regulatory risk measures, as special cases. Motivated by the extensive use of internal models in current banking and insurance regulation, we further allow for belief heterogeneity in the market. The Pareto-optimal risk sharing game is solved through explicit construction. Competitive equilibria are established for some simple, yet natural settings. Results in the new framework are in sharp contrast to the classic utility-based risk sharing framework, and this possibly explains some financial phenomena observed during the 2007 - 09 financial crisis. Further, we investigate the issue of model uncertainty in risk sharing, and show that, generally, a robust optimal allocation exists if and only if none of the underlying risk measures is a VaR. Practical implications of our main results for risk management and policy makers are discussed, and several novel advantages of ES over VaR from the perspective of a regulator are thereby revealed.
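For readers unfamiliar with the two regulatory measures, the standard definitions (sign and level conventions vary across the literature) are

```latex
\mathrm{VaR}_\alpha(X) = \inf\{\, x \in \mathbb{R} : \mathbb{P}(X \le x) \ge \alpha \,\},
\qquad
\mathrm{ES}_\alpha(X) = \frac{1}{1-\alpha} \int_\alpha^1 \mathrm{VaR}_u(X)\, du,
```

so ES averages VaR over all levels beyond α; both are determined by quantiles of the loss distribution, which is what places them in the quantile-based family studied in the talk.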
-------------------------
Under the Hood of Bike Sharing
Shane Henderson
Cornell University
Wednesday, Oct 2, 2019
4:30 - 5:30 PM
Location: Shriram 262
Abstract:
Cornell's work on bike sharing with Citi Bike and its parent company Motivate relies on a combination of data analysis, stochastic modeling and optimization to help inform both the design and operation of the largest bike-sharing operations in North America. I'll discuss our work and its impact, but focus on some of the inner workings of the stochastic modeling. This includes the use of (one of) the Poisson equation(s) in the computation of a central performance measure, a proof that the resulting objective function has important structural properties, and a heuristic underlying a simulation-optimization principle that is likely useful in many other contexts.
Joint work with Daniel Freund, Nanjing Jian, Eoin O'Mahony, David Shmoys.
-------------------------
On the Heavy-Tail Behavior of the Distributionally Robust Newsvendor
Karthik Natarajan
Singapore University of Technology and Design
Friday, Oct 18, 2019
3:15 - 4:15 PM
Location: Spilker 143
Abstract:
Since the seminal work of Scarf (1958) on the newsvendor problem with ambiguity in the demand distribution, there has been a growing interest in the study of the distributionally robust newsvendor problem. The model is criticized at times for being overly conservative, since the worst-case distribution is discrete with a few support points. However, it is the order quantity prescribed by the model that is of practical relevance. A simple calculation shows that the optimal order quantity in Scarf's model with known first and second moments is also optimal for a censored Student-t distribution with parameter 2. In this paper, we generalize this "heavy-tail optimality" property of the distributionally robust newsvendor to an ambiguity set where information on the first and the αth moment is known, for any real number α > 1. We show that the optimal order quantity for the distributionally robust newsvendor problem is also optimal for a regularly varying distribution with roughly a power law tail with tail index α.
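Scarf's order quantity for known mean and standard deviation admits a closed form; the sketch below uses the Gallego-Moon statement of the rule, quoted from memory, with unit cost, selling price, and moment parameters as the obvious model inputs:

```python
import math

def scarf_order_quantity(mu, sigma, price, cost):
    """Scarf's distributionally robust order quantity for demand with known
    mean mu and standard deviation sigma: order above the mean when the
    profit margin (price - cost)/cost is high, below it when low."""
    margin = (price - cost) / cost
    return mu + (sigma / 2.0) * (math.sqrt(margin) - 1.0 / math.sqrt(margin))
```

Note the sanity check built into the formula: when the price is exactly twice the cost, the two square-root terms cancel and the rule orders exactly the mean demand.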
-------------------------
Click-based MNL: Algorithmic Frameworks for Modeling Click Data in Assortment Optimization
Ali Aouad
London Business School
Wednesday, Nov 20, 2019
4:30 - 5:30 PM
Location: Shriram 262
Abstract:
In this paper, we introduce the click-based MNL choice model, a novel framework for modeling customer purchasing decisions in e-commerce settings. Our main modeling idea is to assume that the click behavior within product recommendation or search results pages provides an exact signal regarding the alternatives considered by each customer. We study the resulting assortment optimization problem, where the objective is to select a subset of products, made available for purchase, to maximize the expected revenue. Our main algorithmic contribution comes in the form of a polynomial-time approximation scheme (PTAS) for this problem, showing that the optimal expected revenue can be efficiently approached within any degree of accuracy. In the course of establishing this result, we develop novel technical ideas, including enumeration schemes and stochastic inequalities, which may be of broader interest for related stochastic knapsack problems. Experiments on data acquired in collaboration with the retailer Alibaba demonstrate the practical significance of the proposed choice model. We generate realistic assortment optimization instances that mirror Alibaba's display customization problem, and implement practical variants of our approximation scheme to compute assortment recommendations in these settings. We find that the recommended assortments have the potential to be at least 9% more profitable than those resulting from a standard MNL model.
Joint work with Jacob Feldman, Danny Segev, and Dennis Zhang.
-------------------------
From Reinforcement Learning to Stochastic Optimization: A Universal Framework for Sequential Decision Analytics
Warren Powell
Princeton University
Wednesday, Jan 29, 2020
4:30 - 5:30 PM
Location: Shriram 262
Abstract:
Reinforcement Learning has attracted considerable attention with its successes in mastering advanced games such as chess and Go. This attention has ignored major successes such as landing SpaceX rockets using tools of optimal control, or optimizing large fleets of trucks and trains using tools from operations research and approximate dynamic programming. In fact, there are 15 different communities all contributing to the vast range of sequential problems that arise in energy, finance, transportation, health, engineering and the sciences. As each community has evolved to address a broader range of problems, there has been a consistent pattern of discovery of tools that sometimes differ in name only, or modest implementation details.
I will represent all of these communities using a single, canonical framework that mirrors the widely used modeling style from deterministic math programming. The key difference when introducing uncertainty is the need to optimize over policies. I will show that all the solution strategies (that is, policies) suggested by the research literature, in addition to some that are widely used in practice, can be organized into four fundamental classes. One of these classes, which we call "parametric cost function approximations," is widely used in practice but has been largely overlooked by the academic community, with the notable exception of the bandit community. These ideas will be illustrated using applications drawn from transportation, energy, emergency response and materials science.
-------------------------
"Theompirical" Research of Service Systems
Avishai Mandelbaum
Technion
Wednesday, Feb 12, 2020
4:30 - 5:30 PM
Location: Shriram 262
Abstract:
I shall describe my personal research journey through service systems (e.g. hospitals, telephone and chat centers, banks, courts). I view these systems through MS/IE/OM lenses, often more specifically as a queueing scientist (e.g. "enjoying" congestion and flows), and sometimes using operational characteristics as surrogates for other performance indicators (e.g. clinical, psychological, financial). The goal of the research is to create principles and tools that support the above viewpoints; and the means for achieving this goal is the marriage of theory with data.
To be more concrete, I am modeling complex service systems as relatively simple (robust) processing networks. My theoretical framework is (asymptotic) queueing theory, specifically parsimonious fluid models and their diffusion refinements: queueing theory is ideally suitable for capturing the operational tradeoff that is at the core of any service, namely quality vs. efficiency (possibly augmented with fairness or profitability); and asymptotic analysis accommodates complex service characteristics that are otherwise mathematically intractable, for example transience, scale and scope. My data/empirical framework builds on an extensive data repository of service event-logs, at the level of the individual customer-server transactions (e.g. patient-physician or customer-agent).
Marrying theory with data, as I see it, will culminate in the creation of models directly from data - automatically and in real time - and consequently the validation of the models' value against actual service systems. (This is in contrast to still-prevalent OR/MS/IE/OM practice, where models are too often too remote from data, and approximations are validated merely for accuracy relative to their originating models.) More specifically, data-based models - simulation, empirical, statistical and mathematical - will be created via process-mining of their primitives, structure and protocols. Simulation models will then serve as a validation ground for the other models, and all will be universally accessible for applications by researchers, students and, in the longer run, practitioners.
The above research agenda has been advanced, over 15 years or so, at the Technion SEE Laboratory (SEE = Service Enterprise Engineering); SEELab data will hence be used, throughout my lecture, to make it concrete.
-------------------------
From Data to Decisions: Distributionally Robust Optimization is Optimal
Daniel Kuhn
EPFL
Tuesday, March 3, 2020
3:00 - 4:00 PM
Location: Hewlett 103
Abstract:
We study stochastic programs where the decision-maker cannot observe the distribution of the exogenous uncertainties but has access to a finite set of independent samples from this distribution. In this setting, the goal is to find a procedure that transforms the data to an estimate of the expected cost function under the unknown data-generating distribution, i.e., a predictor, and an optimizer of the estimated cost function that serves as a near-optimal candidate decision, i.e., a prescriptor. As functions of the data, predictors and prescriptors constitute statistical estimators. We propose a meta-optimization problem to find the least conservative predictors and prescriptors subject to constraints on their out-of-sample disappointment. The out-of-sample disappointment quantifies the probability that the actual expected cost of the candidate decision under the unknown true distribution exceeds its predicted cost. Leveraging tools from large deviations theory, we prove that this meta-optimization problem admits a unique solution: The best predictor-prescriptor pair is obtained by solving a distributionally robust optimization problem over all distributions within a given relative entropy distance from the empirical distribution of the data.
-------------------------
Optimal Stochastic Trace Estimation
Christopher Musco
New York University
Wednesday, Oct 28, 2020
4:30 - 5:30 PM
Abstract:
I will discuss algorithms for an important computational primitive in linear algebra: approximately computing the trace of an implicit matrix A that can only be accessed through matrix-vector multiplications. Trace approximation finds applications across machine learning and scientific computing, where it is used to compute matrix norms, spectral densities, log-determinants, triangle counts in graphs, and much more. In 1990, Hutchinson introduced an elegant randomized algorithm for the trace approximation problem that has become ubiquitous in practice. I will introduce a simple modified version of this algorithm that provides the same theoretical guarantees, but requires quadratically fewer matrix-vector multiplications, and performs far better in experiments. We pair this result with matching lower bounds based on reductions to communication complexity and hypothesis testing for spiked-covariance matrices. Our lower bounds fall into a broader research agenda of better understanding the computational complexity of basic linear algebra problems in the restricted "matrix-vector query" model of computation, which generalizes common algorithmic frameworks like linear sketching and Krylov subspace methods.
Joint work with Raphael A. Meyer, Cameron Musco, and David P. Woodruff. Paper is available at: https://arxiv.org/abs/2010.09649
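For reference, Hutchinson's classical estimator mentioned above is short enough to state in full (a textbook sketch, not code from the paper): with i.i.d. Rademacher probe vectors g, each quadratic form g^T A g is an unbiased estimate of tr(A), and only matrix-vector products with A are needed.

```python
import numpy as np

def hutchinson_trace(matvec, n, num_queries, rng=None):
    """Hutchinson's estimator: tr(A) ~ (1/m) * sum_i g_i^T (A g_i),
    with i.i.d. Rademacher (+/-1) probe vectors g_i.
    A is accessed only through the callable matvec(v) = A @ v."""
    rng = np.random.default_rng(rng)
    total = 0.0
    for _ in range(num_queries):
        g = rng.choice([-1.0, 1.0], size=n)
        total += g @ matvec(g)
    return total / num_queries
```

The improved algorithm discussed in the talk (see the linked paper) augments this scheme with a low-rank projection step so that far fewer matrix-vector queries achieve the same accuracy guarantee.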
-------------------------
Gradient Flows in the Wasserstein Metric: From Discrete to Continuum via Regularization
Katy Craig
UCSB
Wednesday, Nov 4, 2020
4:30 - 5:30 PM
Abstract:
Over the past ten years, optimal transport has become a fundamental tool in statistics and machine learning: the Wasserstein metric provides a new notion of distance for classifying distributions and a rich geometry for interpolating between them. In parallel, optimal transport has led to new theoretical results on the stability and long time behavior of partial differential equations through the theory of Wasserstein gradient flows. These two lines of research recently intersected in a series of works that characterized the dynamics of training neural networks with a single hidden layer as a Wasserstein gradient flow. In this talk, I will briefly introduce the mathematical theory of Wasserstein gradient flows and describe recent results on discrete to continuum limits. In particular, I will show how passing from the discrete to continuum limit by introducing an appropriate regularization can lead to faster rates of convergence, as well as novel, deterministic particle methods for diffusive processes.
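To give a concrete feel for the deterministic particle methods mentioned at the end (my own one-dimensional illustration, not code from the talk): the heat equation is the Wasserstein gradient flow of the entropy, and regularizing the density with a Gaussian mollifier yields a deterministic update in which each particle moves with velocity -grad log rho_eps.

```python
import numpy as np

def heat_flow_step(particles, eps, dt):
    """One explicit step of a regularized deterministic particle scheme
    for the heat equation: each particle moves with velocity
    -d/dx log rho_eps(x_i), where rho_eps is a Gaussian-mollified
    empirical density (the kernel's normalization cancels in the ratio)."""
    d = particles[:, None] - particles[None, :]
    w = np.exp(-d**2 / (2 * eps**2))           # kernel weights K_eps(x_i - x_j)
    rho = w.sum(axis=1)                        # unnormalized rho_eps(x_i)
    grad_rho = (-d / eps**2 * w).sum(axis=1)   # unnormalized rho_eps'(x_i)
    return particles + dt * (-grad_rho / rho)
```

Under this update the particle cloud spreads deterministically, mimicking diffusion without any random sampling; the choice of mollification scale eps relative to the number of particles is exactly the kind of regularization whose discrete-to-continuum behavior the talk addresses.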
-------------------------
Displacement Effects in a Hot Spot Policing Intervention in Medellin: Inference and Pitfalls
Guillaume Basse
Stanford University
Wednesday, Nov 18, 2020
8:30 - 9:30 AM
Abstract:
In hot spot policing, resources are targeted at specific locations predicted to be at high risk of crime, so-called "hot spots." Rather than reduce overall crime, however, there is a concern that these interventions simply displace crime from the targeted locations to nearby non-hot spots. We address this question in the context of a large-scale randomized experiment in Medellin, Colombia, in which police were randomly assigned to increase patrols at a subset of possible hot spots. Estimating the displacement effects on control locations is difficult because the probability that a nearby hot spot is treated is a complex function of the underlying geography. While existing methods developed for this "general interference" setting, especially Horvitz-Thompson (HT) estimators, have attractive theoretical properties, they can perform poorly in practice and mislead practitioners. In this talk, I explore the key pitfalls that practitioners should watch out for when conducting this type of analysis, and propose some ways to partially remedy them.
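For readers unfamiliar with the HT estimators discussed above, a minimal sketch (function and variable names are illustrative, not from the talk): each unit's outcome is weighted by the inverse of its probability of receiving a given exposure, with the exposure probabilities computed from the known randomization design.

```python
import numpy as np

def ht_mean(outcomes, exposed, pi):
    """Horvitz-Thompson estimate of the mean outcome under one exposure
    condition (e.g. 'near a treated hot spot'): an inverse-probability-
    weighted sum over the units that actually received that exposure.
    pi[i] = P(unit i receives this exposure) under the design."""
    outcomes = np.asarray(outcomes, dtype=float)
    exposed = np.asarray(exposed, dtype=bool)
    pi = np.asarray(pi, dtype=float)
    return np.sum(outcomes[exposed] / pi[exposed]) / len(outcomes)
```

A displacement effect would then be estimated as the difference of two such means, e.g. ht_mean under "near a treated hot spot" minus ht_mean under "not near any treated hot spot." The estimator is unbiased when the pi are correct, but small exposure probabilities produce huge weights, which is one source of the practical instability the talk examines.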
-------------------------
Monte Carlo Methods in Stochastic Convex Optimization
Daniel Bartl
University of Vienna
Wednesday, Nov 10, 2020
12:00 - 1:00 PM
Abstract:
We develop a novel procedure for estimating the optimizer of general convex stochastic optimization problems from an i.i.d. sample. This procedure is the first to exhibit optimal statistical performance in heavy-tailed situations while also applying in high-dimensional settings. We illustrate the results using the portfolio optimization problem. Joint work with Shahar Mendelson.
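A standard building block for estimation with heavy-tailed data, illustrative of the statistical tools in this line of work (a generic sketch, not the speakers' procedure), is the median-of-means estimator: split the sample into blocks, average each block, and take the median of the block means, which is far more robust to outliers than the plain sample mean.

```python
import numpy as np

def median_of_means(samples, k):
    """Median-of-means estimate of a mean: split the sample into k
    blocks, average each block, and return the median of the block
    means. Robust to heavy tails, unlike the plain sample mean."""
    samples = np.asarray(samples, dtype=float)
    blocks = np.array_split(samples, k)
    return np.median([b.mean() for b in blocks])
```

A single extreme observation corrupts at most one block, so the median of the block means is barely affected, whereas the ordinary sample mean can be moved arbitrarily far.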
Operations Research Colloquia: http://or.stanford.edu/oras_seminars.html