Frank Donald Wood
Relevant Thesis-Based Degree Programs
Affiliations to Research Centres, Institutes & Clusters
Graduate Student Supervision
Doctoral Student Supervision
Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.
This thesis focuses on Bayesian inference and its practical application to real-world scenarios, such as scientific stochastic simulators, via probabilistic programming. Executing a probabilistic program (the reference program) is synonymous with performing probabilistic inference, and requires "only" that the random variables and their distributions be denoted explicitly. The inference procedure itself is carried out at runtime by a general-purpose inference backend (or engine). The efficacy of many inference backends depends on (1) how well the engine captures complex dependency structures between latent variables and (2) how fast the reference program runs, i.e. how quickly its joint probability density can be evaluated. Improving either attribute yields more efficient inference. Furthermore, because these inference procedures, and Bayesian inference in general, require exact conditioning, it is not immediately obvious how to carry out inference when the conditioning observations are themselves uncertain.
The main contributions of this thesis are the improvement of existing inference approaches in probabilistic programming, by adding an attention mechanism to the inference backend known as inference compilation, and the extension of probabilistic programming to facilitate automated surrogate modeling. Additionally, this thesis makes theoretical contributions to the problem of performing inference when observations are associated with uncertainty.
In summary, this thesis aims to further advance the applicability of probabilistic programming to scientific simulators and Bayesian inference, with contributions focused on improving existing inference approaches, extending the functionality of probabilistic programming, and providing theoretical solutions to the challenge of uncertain observations.
The results of this thesis have the potential to benefit researchers across many fields, ranging from physics to finance, by providing a more efficient and practical approach to simulator inversion and Bayesian inference.
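As an illustration of the paradigm this abstract describes (a sketch, not the library developed in the thesis), the following toy probabilistic program denotes its random variables explicitly and delegates inference to a generic importance-sampling backend; all names here are illustrative:

```python
import math
import random

random.seed(0)

def normal_logpdf(x, mu, sigma):
    """Log-density of Normal(mu, sigma) evaluated at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def model(trace):
    """The reference program: the user only denotes random variables
    (sample) and conditioning points (observe)."""
    mu = trace.sample("mu", mean=0.0, std=1.0)       # latent, prior N(0, 1)
    trace.observe("y", value=1.2, mean=mu, std=0.5)  # condition on datum y = 1.2
    return mu

class ImportanceTrace:
    """A general-purpose backend: draws latents from the prior and
    accumulates the log-likelihood of observations as a log-weight."""
    def __init__(self):
        self.log_weight = 0.0

    def sample(self, name, mean, std):
        return random.gauss(mean, std)

    def observe(self, name, value, mean, std):
        self.log_weight += normal_logpdf(value, mean, std)

def posterior_mean(num_particles=20000):
    """Self-normalized importance sampling estimate of E[mu | y]."""
    xs, ws = [], []
    for _ in range(num_particles):
        t = ImportanceTrace()
        xs.append(model(t))
        ws.append(math.exp(t.log_weight))
    total = sum(ws)
    return sum(w * x for w, x in zip(ws, xs)) / total
```

Because this model is conjugate Gaussian, the exact posterior mean is 0.96, so the estimate should land nearby; inference compilation, as discussed in the abstract, replaces the prior proposal here with a learned, amortized one.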
Variational inference (VI) is a popular method used within statistics and machine learning to approximate intractable probability distributions via optimization. Central to VI is the Evidence Lower Bound (ELBO), a variational objective function which lower bounds the log marginal likelihood, and can be used to jointly perform maximum likelihood parameter estimation and approximate posterior inference using stochastic gradient ascent. The core contribution of this thesis is the Thermodynamic Variational Objective (TVO), a novel variational objective derived from a key connection we make between variational inference and thermodynamic integration. The TVO both tightens and generalizes the ubiquitous ELBO, and empirically leads to improvements in model and inference network learning in both discrete and continuous deep generative models. Using a novel exponential family interpretation of the geometric mixture curve underlying the TVO, we characterize the divergence bound gap left by the TVO as a sum of KL divergences between adjacent distributions, with the forward and reverse KLs corresponding to the lower- and upper-bound TVO variants. To enable the TVO to be used in gradient-based optimization algorithms, we provide two computationally efficient gradient estimators, one score-function based and one doubly reparameterized, as well as two adaptive “schedulers” which choose the discretization locations of a one-dimensional Riemann integral approximation, a key hyperparameter of the TVO. Additionally, we show that the objective functions used in Variational Inference, Variational AutoEncoders, Wake-Sleep, Inference Compilation, and Rényi Divergence Variational Inference are all special cases of the TVO.
Finally, we evaluate the TVO in two real-world settings: a stochastic control-flow model with discrete latent variables, and multi-agent trajectory prediction with continuous latent variables built on top of a differentiable driving simulator. We find that the TVO improves upon baseline objectives in both cases.
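The connection between the ELBO and thermodynamic integration that the abstract describes can be sketched as follows (notation illustrative; this is the standard geometric-path construction, not a reproduction of the thesis's exact statements):

```latex
% Geometric mixture path between the inference network and the model:
\pi_\beta(z \mid x) \;\propto\; q_\phi(z \mid x)^{1-\beta}\, p_\theta(x, z)^{\beta},
\qquad \beta \in [0, 1].
% Thermodynamic integration identity for the log marginal likelihood:
\log p_\theta(x) \;=\; \int_0^1
  \mathbb{E}_{\pi_\beta}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right] d\beta .
% A left-Riemann approximation over 0 = \beta_0 < \beta_1 < \dots < \beta_K = 1
% underestimates the integral (the integrand is nondecreasing in \beta),
% giving the lower-bound TVO:
\mathrm{TVO}(\theta, \phi, x) \;=\;
  \sum_{k=0}^{K-1} (\beta_{k+1} - \beta_k)\,
  \mathbb{E}_{\pi_{\beta_k}}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]
  \;\le\; \log p_\theta(x).
```

With K = 1 and β₀ = 0, the path distribution is q_φ(z | x) itself and the single Riemann term reduces to the ELBO, which is the sense in which the TVO tightens and generalizes it.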
Master's Student Supervision
Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.
Modern deep learning requires large-scale, extensively labelled datasets for training. Few-shot learning aims to alleviate this issue by learning effectively from few labelled examples. In previously proposed few-shot visual classifiers, it is assumed that the feature manifold arriving at the classifier has uncorrelated feature dimensions and uniform feature variance. In this work, we focus on addressing the limitations arising from this assumption by proposing a variance-sensitive class of models that operates in a low-label regime. The first method, Simple CNAPS, employs a hierarchically regularized Mahalanobis-distance based classifier combined with a state-of-the-art neural adaptive feature extractor to achieve strong performance on the Meta-Dataset, mini-ImageNet and tiered-ImageNet benchmarks. We further extend this approach to a transductive learning setting, proposing Transductive CNAPS. This transductive method combines a soft k-means parameter refinement procedure with a two-step task encoder to achieve improved test-time classification accuracy using unlabelled data. Transductive CNAPS achieves state-of-the-art performance across all major few-shot learning benchmarks. Finally, we explore the use of our methods (Simple and Transductive) for “out of the box” continual and active learning. Extensive experiments on large-scale benchmarks illustrate the robustness and versatility of this comparatively simple class of models. All trained model checkpoints and corresponding source codes are made publicly available at github.com/plai-group/simple-cnaps.
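A minimal sketch of the core classification rule described above, assuming a simple blend of class-level and task-level covariances (the exact hierarchical regularization schedule and the neural adaptive feature extractor of Simple CNAPS are omitted; the blending weight and identity term below are illustrative choices):

```python
import numpy as np

def regularized_covariance(feats, task_cov, lam):
    """Blend the per-class covariance with a task-level covariance
    (hierarchical regularization), plus a small identity term so the
    matrix is invertible even with very few shots."""
    d = feats.shape[1]
    class_cov = np.cov(feats, rowvar=False) if len(feats) > 1 else np.zeros((d, d))
    return lam * class_cov + (1.0 - lam) * task_cov + 0.1 * np.eye(d)

def mahalanobis_classify(queries, support, labels):
    """Assign each query feature to the class whose mean is nearest in
    squared Mahalanobis distance under the regularized covariance."""
    labels = np.asarray(labels)
    task_cov = np.cov(support, rowvar=False)
    classes = np.unique(labels)
    dists = []
    for c in classes:
        feats = support[labels == c]
        mu = feats.mean(axis=0)
        # More shots -> trust class statistics more (illustrative weighting).
        lam = len(feats) / (len(feats) + 1.0)
        prec = np.linalg.inv(regularized_covariance(feats, task_cov, lam))
        diff = queries - mu
        # Squared Mahalanobis distance for every query at once.
        dists.append(np.einsum('ij,jk,ik->i', diff, prec, diff))
    return classes[np.argmin(np.stack(dists), axis=0)]
```

In a real episode the features would come from the adapted extractor; here any (n, d) array of support features with labels and an (m, d) array of queries suffices to exercise the rule.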
Motivated by the problem of amortized inference in large-scale simulators, we introduce a probabilistic programming library that brings us closer to this goal. This library enables us to perform Bayesian inference on any simulator written in a wide variety of programming languages, with minimal modification to the simulator's source code. Achieving this goal in full generality, however, poses challenges; in particular, we address the obstacles caused by unbounded loops. Existing approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance. An instance of this is importance sampling inference in programs that explicitly include rejection sampling as part of the user-programmed generative procedure. We develop a new and efficient amortized importance sampling estimator. We prove finite variance of our estimator, empirically demonstrate our method's correctness and efficiency compared to existing alternatives on generative programs containing rejection sampling loops, and discuss how to implement our method in a generic probabilistic programming framework.
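The flavour of the problem can be illustrated with a sketch (not the thesis's estimator): a rejection-sampling loop inside a generative program, here a truncated normal, can be collapsed analytically into the density of its accepted output, so that importance weights score only the accepted sample and involve no per-iteration loop terms:

```python
import math
import random

random.seed(1)

def normal_logpdf(x, mu, sigma):
    """Log-density of Normal(mu, sigma) evaluated at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def truncated_prior():
    """A rejection-sampling loop as the user would write it in the
    generative program: draw from N(0, 1) until the sample is positive."""
    while True:
        x = random.gauss(0.0, 1.0)
        if x > 0.0:
            return x

def log_truncated_pdf(x):
    """The same loop collapsed analytically: N(0, 1) truncated to x > 0
    has normalizing constant 1/2, so its density is 2 * phi(x)."""
    return normal_logpdf(x, 0.0, 1.0) - math.log(0.5) if x > 0.0 else float("-inf")

def posterior_mean(y=1.0, sigma_y=0.5, num_particles=20000):
    """Self-normalized importance sampling for E[x | y] with an
    Exponential(1) proposal; each weight scores the accepted sample
    under the collapsed density instead of weighting loop iterations."""
    xs, ws = [], []
    for _ in range(num_particles):
        x = random.expovariate(1.0)              # proposal q(x) = exp(-x), x > 0
        log_w = (log_truncated_pdf(x)            # collapsed rejection loop
                 + normal_logpdf(y, x, sigma_y)  # likelihood of observation y
                 - (-x))                         # minus log q(x) = -x
        xs.append(x)
        ws.append(math.exp(log_w))
    total = sum(ws)
    return sum(w * x for w, x in zip(ws, xs)) / total
```

Here `truncated_prior` shows the loop as written in the program, while `log_truncated_pdf` is what a finite-variance estimator scores in its place; the abstract's contribution concerns doing this correctly and in an amortized fashion for general rejection-sampling loops, where no closed-form collapsed density is available.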