David Poole


Research Classification

Artificial Intelligence

Research Interests

Artificial Intelligence
Machine Learning
Knowledge Representation
Decision Analysis
Preference Elicitation
Reasoning under Uncertainty
Probabilistic Graphical Models
Relational Learning

Relevant Degree Programs


Research Methodology

machine learning
Probabilistic Inference
Multiattribute Uitility


Master's students
Doctoral students
Any time / year round
I support public scholarship, e.g. through the Public Scholars Initiative, and am available to supervise students and Postdocs interested in collaborating with external partners as part of their research.
I support experiential learning experiences, such as internships and work placements, for my graduate students and Postdocs.

Complete these steps before you reach out to a faculty member!

Check requirements
  • Familiarize yourself with program requirements. You want to learn as much as possible from the information available to you before you reach out to a faculty member. Be sure to visit the graduate degree program listing and program-specific websites.
  • Check whether the program requires you to seek commitment from a supervisor prior to submitting an application. For some programs this is an essential step while others match successful applicants with faculty members within the first year of study. This is either indicated in the program profile under "Admission Information & Requirements" - "Prepare Application" - "Supervision" or on the program website.
Focus your search
  • Identify specific faculty members who are conducting research in your specific area of interest.
  • Establish that your research interests align with the faculty member’s research interests.
    • Read up on the faculty members in the program and the research being conducted in the department.
    • Familiarize yourself with their work, read their recent publications and past theses/dissertations that they supervised. Be certain that their research is indeed what you are hoping to study.
Make a good impression
  • Compose an error-free and grammatically correct email addressed to your specifically targeted faculty member, and remember to use their correct titles.
    • Do not send non-specific, mass emails to everyone in the department hoping for a match.
    • Address the faculty members by name. Your contact should be genuine rather than generic.
  • Include a brief outline of your academic background, why you are interested in working with the faculty member, and what experience you could bring to the department. The supervision enquiry form guides you with targeted questions. Ensure to craft compelling answers to these questions.
  • Highlight your achievements and why you are a top student. Faculty members receive dozens of requests from prospective students and you may have less than 30 seconds to pique someone’s interest.
  • Demonstrate that you are familiar with their research:
    • Convey the specific ways you are a good fit for the program.
    • Convey the specific ways the program/lab/faculty member is a good fit for the research you are interested in/already conducting.
  • Be enthusiastic, but don’t overdo it.
Attend an information session

G+PS regularly provides virtual sessions that focus on admission requirements and procedures and tips how to improve your application.


Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - Nov 2019)
Representing and learning relations and properties under uncertainty (2019)

The world around us is composed of entities, each having various properties and participating in relationships with other entities. Consequently, data is often inherently relational. This dissertation studies probabilistic relational representations, reasoning and learning with a focus on three common prediction problems for relational data: link prediction, property prediction, and joint prediction. For link prediction, we develop a tensor factorization model called SimplE which is simple, interpretable, fully-expressive, and integratable with certain types of domain expert knowledge. On two standard benchmarks for knowledge graph completion, we show how SimplE outperforms the state-of-the-art models. For property prediction, first we study the limitations of the existing StaRAI models when being used for property prediction. Based on this study, we develop relational neural networks which combine ideas from lifted relational models with deep learning and perform well empirically. We base the joint prediction on lifted relational models for which parameter learning typically requires inference over a highly-connected graphical model. The inference step is usually the bottleneck for learning. We study a class of inference algorithms known as lifted inference which makes inference tractable by exploiting both conditional independence and symmetries. We study two ways of speeding up lifted inference algorithms: 1- through proposing heuristics for elimination ordering and 2- through compiling the lifted operations to low-level languages. We also expand the largest known class of models for which we know how to do efficient lifted inference. Thus, structure learning algorithms for lifted relational models that restrict the search space to models for which efficient inference algorithms exist can perform their search over a larger space.

View record

Equilibrium policy for gradients for spatiotemporal planning (2012)

In spatiotemporal planning, agents choose actions at multiple locations in spaceover some planning horizon to maximize their utility and satisfy various constraints. In forestry planning, for example, the problem is to choose actionsfor thousands of locations in the forest each year. The actions at each locationcould include harvesting trees, treating trees against disease and pests, ordoing nothing. A utility model could place value on sale of forest products,ecosystem sustainability or employment levels, and could incorporate legal andlogistical constraints such as avoiding large contiguous areas of clearcuttingand managing road access. Planning requires a model of the dynamics. Existingsimulators developed by forestry researchers can provide detailed models of thedynamics of a forest over time, but these simulators are often not designed foruse in automated planning.This thesis presents spatiotemoral planning in terms of factored Markov decision processes. A policy gradient planning algorithm optimizes a stochastic spatial policy using existing simulators for dynamics.When a planning problem includes spatial interaction between locations, deciding on an action to carry out at one location requires considering the actions performed at other locations. This spatial interdependence is common in forestry and other environmental planning problems and makes policy representation and planning challenging. We define a spatial policy in terms of local policies defined as distributions over actions at one location conditioned upon actions at other locations.A policy gradient planning algorithm using this spatial policy is presented whichuses Markov Chain Monte Carlo simulation to sample the landscape policy, estimate its gradient and use this gradient to guide policy improvement. Evaluation is carried out on a forestry planning problem with 1880 locations using a variety of value models and constraints.The distribution over joint actions at all locations can be seen as the equilibrium of a cyclic causal model. This equilibrium semantics is compared to Structural Equation Models. We also define an algorithm for approximating the equilibrium distribution for cyclic causal networks which exploits graphical structure and analyse when the algorithm is exact.

View record

Learning Latent Theories of Relations and Individuals (2011)

Inductive learning of statistical models from relational data is a key problem in artificial intelligence. Two main approaches exist for learning with relational data, and this thesis shows how they can be combined in a uniform framework. The first approach aims to learn dependencies amongst features (relations and properties), e.g. how users' purchases of products depend on users' preferences of the products and associated properties of users and products. Such models abstract over individuals, and are compact and easy to interpret.The second approach learns latent properties of individuals that explain the observed features, without modelling interdependencies amongst features. Latent-property models have demonstrated good predictive accuracy in practise, and are especially useful when few properties and relations are observed. Interesting latent groupings of individuals can be discovered.Our approach aims to learn a unified representation for dependency structures for both observed features and latent properties. We develop a simple approximate EM algorithm for learning the unified representation, and experiments demonstrate cases when our algorithm can generate models that predicts better than dependency-based models of observed features as well as a state-of-the-art latent-property model. We extend our approximate EM algorithm to handle uncertainty about the number of values for latent properties. We search over the number of values and return error bounds, as an alternative to existing proposals based on sampling in the posterior distribution over the number of values.We also solve a specific case where dependencies involve functional relations, which induces a verbose model with many parameters. In comparison, the standard solution of aggregating over all values of the function yields a simple model that predicts poorly. We show how to learn an optimal intermediate-size representation efficiently by clustering the values of the function. The proposed method generates models that capture interesting clusters of function values, dominates the simple model in prediction, and can surpass the verbose model using much fewer parameters.

View record

Aggregation and constraint processing in lifted probabilistic inference (2010)

Representations that mix graphical models and first-order logic - called either first-orderor relational probabilistic models — were proposed nearly twenty years ago and many more have since emerged. In these models, random variables are parameterized by logical variables.One way to perform inference in first-order models is to propositionalize the model, that is, to explicitly consider every element from the domains of logical variables. This approach might be intractable even for simple first-order models.The idea behind lifted inference is to carry out as much inference as possible without propositionalizing.An exact lifted inference procedure for first-order probabilistic models was developed by Poole [2003] and later extended to a broader range of problems by de Salvo Braz et al. [2007]. The C-FOVE algorithm by Milch et al. [2008] expanded the scope of lifted inference and is currently the state of the art in exact lifted inference.In this thesis we address two problems related to lifted inference: aggregation in directed first-order probabilistic models and constraint processing during lifted inference.Recent work on exact lifted inference focused on undirected models. Directed first-order probabilistic models require an aggregation operator when a parent random variable is parameterized by logical variables that are not present in a child random variable. We introduce a new data structure, aggregation parfactors, todescribe aggregation in directed first-order models. We show how to extend the C-FOVE algorithm to perform lifted inference in the presence of aggregation parfactors. There are cases where the polynomial time complexity (in the domain size of logical variables) of the C-FOVE algorithm can be reduced to logarithmic time complexity using aggregation parfactors.First-order models typically contain constraints on logical variables. Constraints are important for capturing knowledge regarding particular individuals. However, the impact of constraint processing on computational efficiency of liftedinference has been largely overlooked. In this thesis we develop an efficient algorithmfor counting the number of solutions to the constraint satisfaction problems encountered during lifted inference. We also compare, both theoretically and empirically, different ways of handling constraints during lifted inference.

View record

Master's Student Supervision (2010 - 2018)
Finding a record in a database (2017)

Consider the following problem: given a database of records indexed by names (e.g., of companies or restaurants) and a new name, determine whether the new name is in the database, and if so, which record it refers to. This problem is called record linkage. Record linkage is a challenging problem because people do not consistently use the official title of a company, but use abbreviations, synonyms, different orders of terms, and the title can contain typos. We provide a probabilistic model using relational logistic regression to find the probability of each record in the database being the desired record for a given query, and find the best record(s). Our model addresses many of challenges of the record linkage problem and provides good results when exact term matching search algorithms fail. We evaluate our model on a large real-world data set. Obtained results show that the model is a promising probabilistic record linkage model.

View record

Interactive Visualization for Group Decision-Making (2014)

In infrastructure planning, identifying ‘the best solution’ out of a given set of alternatives is a context-dependent multi-dimensional multi-stakeholder challenge in which competing criteria must be identified and trade-offs made. In a recent study, colleagues from Institute of Resources, Sustainability and Environment found that there is a need for a visualization tool that enables planners and decision makers to collectively explore individual preferences among those involved in the decision. This thesis concerns designing and evaluating an interactive visualization toolthat facilitates group decisions by making the problem analysis more participatory, transparent, and comprehensible. To do so, we extend the interactive visualization tool ValueCharts to create Group ValueCharts. We conducted studies withtwo different groups to evaluate the effectiveness of Group ValueCharts in group decision-making. The first group was university staff in leading positions in different departments, presently engaged in and responsible for water infrastructureplanning. The second group was employees of an analytics company who are involved in buying scientific software licenses. Each group was instructed on how to use the tool in application to their current decision problem. The discussions were audio recorded and the participants were surveyed to evaluate usability. The results indicate that participants felt the tool improved group interaction and information exchange, and made the discussion more participatory. Additionally, the participants strongly concur that the tool reveals disagreements and agreements within the group. These results suggest that Group ValueCharts has the ability to enhance transparency and comprehensibility in group decision-making.

View record

Modeling Ordinal Data for Recommendation System (2014)

In this work we investigate the problem of making personalized recommendations by creating models for predicting user-item rating, such as in movie recommendations. The study is based on the Movielens data set which has ratings on an ordinal scale. In the past, partly due to motivation gained by the Netflix challenge, researchers have constructed models that make point predictions to minimize the root mean square error (RMSE) on test sets, typically by learning latent user and movie feature structure. In such models, the difference between ratings of 2 and 3 stars is the same as the difference between ratings of 4 and 5 stars, etc., which is a strong prior assumption. We construct probabilistic models which also learn latent user and movie feature structure but do not make this assumption. These models interpret the ratings as categories (nominal and ordinal) and return a probability distribution over the ratings for each user-movie pair instead of making a point prediction. We evaluate and compare our models with other models for making personalized recommendations for the top-n task and comparing the precision vsrecall, receiver operating characteristic and cost curves. Our results show that our ordinal data model performs better than a nominal data model, a state-of-the-art point prediction model, and other baselines.

View record

Sensing and Sorting Ore Using a Relational Influence Diagram (2014)

Mining companies typically process all the material extracted from a mine site using processes which are extremely consumptive of energy and chemicals. Sorting this material more effectively would reduce the resources required. A high-throughput rock-sorting machine developed by MineSense™ Technologies Ltd. provides the sensors and diverting equipment. After receiving noisy sensor data, the sorting system has 400 ms to decide whether to activate the diverters which will divert the rocks into either a keep or a discard bin. The problem tackled in this thesis is to sort an unknown number of rocks by sensing their mineralogy, position, and size using electromagnetic sensors and diverting them according to how valuable the mineral is to the mine. In real-time we must interpret the sensor data and compute the best action to take. We model the problem with a relational influence diagram which shows relations between random variables, decision variables, and utility nodes. We learn the model offline and do online inference. Inference is achieved using a combination of exhaustive and random search. The model parameters are learned using Sequential Model-based Algorithm Configuration (SMAC). We simulate the diverters for offline evaluation and evaluate our solution on recorded sensor data. Our result improves over the current state-of-the-art across the entire range of utility.

View record

Probabilistic reasoning with undefined properties in ontologically-based belief networks (2013)

This thesis concerns building probabilistic models with an underlying ontology that defines the classes and properties used in the model. In particular, it considers the problem of reasoning with properties that may not always be defined. Furthermore, we may even be uncertain about whether a property is defined for a given individual. One approach is to explicitly add a value "undefined" to the range of random variables, forming (what we call) extended belief networks. Adding an extra value to a random variable's range, however, has a large computational overhead. In this work, we propose an alternative, ontologically-based belief networks, where properties are only used when they are defined. We show how probabilistic reasoning can be carried out without explicitly using the value "undefined" during inference. This, in general, requires that we perform two probabilistic queries to determine (1) the probability that the hypothesis is defined and (2) the probabilities of the hypothesis given it is defined. We prove this is equivalent to reasoning with the corresponding extended belief network and empirically demonstrate on synthetic models that inference becomes more efficient.

View record


If this is your researcher profile you can log in to the Faculty & Staff portal to update your details and provide recruitment preferences.


Learn about our faculties, research, and more than 300 programs in our 2021 Graduate Viewbook!