Nancy Heckman

Prospective Graduate Students / Postdocs

This faculty member is currently not actively recruiting graduate students or Postdoctoral Fellows, but might consider co-supervision together with another faculty member.


Research Interests

Statistics and Probabilities
functional data analysis

Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - April 2022)
Clustering and modelling of phase variation for functional data (2019)

Our work is motivated by an analysis of elephant seal dive profiles, which we view as functional data: depth as a function of time, recorded almost continuously by sensors attached to the animal. The objective is to group profiles by shape to better understand the corresponding behavioural states of the seals. Most existing approaches rely on multivariate clustering methods applied to ad hoc summaries of the dive profile. Instead, we view each profile as arising from a function that is a deformation of a base shape. The deformation is regarded as phase variation and is represented by a latent warping function with a finite mixture distribution.
We first propose a curve registration model to explicitly model amplitude and phase variations of functional data, with phase variation represented by smooth time transformations called warping functions. Inference is conducted via the stochastic approximation expectation-maximization (SAEM) algorithm. Our simulation study shows that the SAEM algorithm is computationally more stable and efficient than existing approaches in the literature for inference of this class of curve registration model with flexible warping.
We then propose two clustering approaches based on our curve registration model for functional data: 1) a simultaneous approach that smooths the noisy raw profiles and estimates the base shape, the warping functions and the cluster membership via SAEM algorithms; and 2) a two-step approach that applies clustering algorithms to the estimated warping functions. In contrast to generic clustering algorithms in the literature, our methods treat the clustering structure as heterogeneity in phase variation. The proposed method is applied to the analysis of elephant seal dive profiles and an analysis of human growth curves. We are able to obtain more intuitive clusters by focusing the clustering effort on phase variation.

Switching nonparametric regression models (2013)

In this thesis, we propose a methodology to analyze data arising from a curve that, over its domain, switches among J states. We consider a sequence of response variables, where each response y depends on a covariate x according to an unobserved state z, also called a hidden or latent state. The states form a stochastic process and their possible values are j=1,...,J. If z equals j, the expected value of y is one of J unknown smooth functions evaluated at x. We call this model a switching nonparametric regression model. In a Bayesian switching nonparametric regression model, the uncertainty about the functions is formulated by modeling the functions as realizations of stochastic processes. In a frequentist switching nonparametric regression model, the functions are merely assumed to be smooth. We consider two different data structures: one with N replicates and the other with a single realization. For the hidden states, we consider those that are independent and identically distributed and those that follow a Markov structure. We develop an EM algorithm to estimate the parameters of the latent state process and the functions corresponding to the J states. Standard errors for the parameter estimates of the state process are also obtained. We investigate the frequentist properties of the proposed estimates via simulation studies. Two different applications of the proposed methodology are presented. In the first application, we analyze the well-known motorcycle data in an innovative way: treating the data as coming from J>1 simulated accident runs with unobserved run labels. In the second application, we analyze daytime power usage on business days in a building, treating each day as a replicate and modeling power usage as arising from two functions: one giving power usage when the cooling system of the building is off, the other giving power usage when the cooling system is on.
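As a rough illustration of the latent-state idea described above, the following sketch computes the E-step of an EM algorithm for the simplest case of independent and identically distributed states. It is not the thesis's implementation: the Gaussian error distribution, the common standard deviation `sd`, and the names `e_step`, `funcs` and `probs` are assumptions made here for illustration, and the state functions are taken as given rather than estimated by smoothing.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution (Gaussian errors are an assumption)."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def e_step(xs, ys, funcs, probs, sd):
    """Return resp[i][j] = P(z_i = j | x_i, y_i) for iid hidden states.

    funcs: current estimates of the J state functions (hypothetical inputs)
    probs: current estimates of the state probabilities, summing to 1
    """
    resp = []
    for x, y in zip(xs, ys):
        # Prior state probability times the likelihood of y under that state.
        weights = [p * normal_pdf(y, f(x), sd) for f, p in zip(funcs, probs)]
        total = sum(weights)
        resp.append([w / total for w in weights])
    return resp


# Two hypothetical states: a flat curve at 0 and a flat curve at 10.
funcs = [lambda x: 0.0, lambda x: 10.0]
resp = e_step(xs=[1.0, 2.0], ys=[0.0, 5.0], funcs=funcs, probs=[0.5, 0.5], sd=1.0)
```

In a full EM iteration, these posterior probabilities would then serve as weights in the M-step when re-estimating the state probabilities and the smooth functions.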

Master's Student Supervision (2010 - 2021)
Improving dive phase definitions in northern resident killer whales (2021)

In contrast with the endangered southern resident killer whales (SRKWs), the northern resident killer whales (NRKWs) have been thriving in their habitats. The main hypotheses proposed to explain the differences in survival of these populations are associated with differential reproductive output, body compositions, and feeding rates. Testing some of these hypotheses requires researchers to identify prey captures for these animals. As these events are difficult to directly observe through field operations, researchers equip whales with suction-cup attached biologgers and use kinematic variables during the bottom phase of a dive to predict prey captures. However, universal definitions of the bottom phase have not been established and often appear arbitrarily chosen, potentially leading to over- or underestimation of foraging events. Using the diving and kinematic data collected from three NRKWs, I show that modifying the bottom phase greatly impacts existing methods used to predict prey capture events. To investigate bottom phase definition variability, I then asked several whale researchers to identify the bottom phase of various dives via an interactive study. Linear mixed-effects model analyses showed that there exists substantial variation in bottom phase definitions across different researchers and across different dive types. I compared several statistical models of the start and end of the bottom phase of a dive, including modifications to existing methods, linear regression models, and functional linear regression models. Compared to the currently used bottom phase definitions, using the model-based definitions resulted in significant improvements when predicting prey capture dives. Furthermore, these proposed models offer substantial increases in prediction accuracy of the bottom phase of a dive when comparing these model predictions and the currently used methods to the user-provided bottom phases.
Finally, I formulated two methods to determine an adequate sample size for fitting these statistical models. The results of both methods show that an adequate sample size of approximately 50-100 dives can be used to obtain satisfactory model predictions for this data. This work shows that dive phase definitions may impact the results of many existing studies and should be emphasized as an important part of analyzing diving data.

Instantaneous Dynamics of Functional Data (2016)

Time dynamic systems can be used in many applications to data modeling. In the case of longitudinal data, the dynamics of the underlying differential equation can often be inferred under minimal assumptions via smoothing-based procedures. This is in contrast to the common technique of assuming a prespecified differential equation and estimating its parameters. In many cases, one wants to learn the dynamics of a differential equation that incorporates more than just one stochastic process. In the following, we propose extensions to existing two-step smoothing methods that allow for the presence of additional functional data arising from a second stochastic process. We further introduce model comparison techniques to assess the hypothesis that there is a significant change in fit provided by this additional process. These techniques are applied to the instantaneous dynamics of mouse growth data and allow us to make comparisons between mice that have been assigned different genetic and physical conditions. Finally, to study the statistical properties of our proposed techniques, we carry out a simulation study based on the mouse growth data.

Kernel Estimation of the Drift Coefficient of a Diffusion Process in the Presence of Measurement Error (2014)

Diffusion processes, a class of continuous-time stochastic processes, can be used to model time-series data observed at discrete time points. A diffusion process can be completely characterized by two functions, called the drift coefficient and the diffusion coefficient. For the nonparametric estimation of these two functions, Bandi and Phillips (2003) proved consistency and asymptotic normality of Nadaraya-Watson kernel estimators of the drift and the diffusion coefficient.
In some cases, we observe the time-series data with measurement error. For instance, it is well known that financial time-series data are observed with measurement error (Zhou, 1996). For nonparametric estimation in the presence of measurement error, some work has been done on the estimation of integrated volatility, which is the integral of the diffusion coefficient over a fixed period of time, but little work exists on the estimation of the drift and the diffusion coefficients themselves. In this thesis, we focus on the estimation of the drift coefficient, and we propose a consistent and asymptotically normal Nadaraya-Watson type kernel estimator of the drift coefficient in the presence of measurement error.
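To make the baseline idea concrete, here is a minimal sketch of a Nadaraya-Watson type drift estimate for data observed without measurement error: a kernel-weighted average of scaled increments near a point x. It is not the estimator proposed in the thesis, which corrects for measurement error. The Gaussian kernel, the function name `nw_drift`, and the noise-free example path are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_drift(path, delta, x, bandwidth):
    """Kernel-weighted average of (X[i+1] - X[i]) / delta over points X[i] near x.

    path: observations at equally spaced times, delta apart
    """
    num = 0.0
    den = 0.0
    for current, nxt in zip(path[:-1], path[1:]):
        w = gaussian_kernel((current - x) / bandwidth)
        num += w * (nxt - current) / delta
        den += w
    if den == 0.0:
        raise ValueError("no observations near x; increase the bandwidth")
    return num / den


# Noise-free illustrative path with drift mu(x) = -x (Euler steps, no diffusion term).
delta = 0.001
path = [1.0]
for _ in range(5000):
    path.append(path[-1] - path[-1] * delta)

estimate = nw_drift(path, delta, x=0.5, bandwidth=0.05)
```

On this noise-free path the estimate at x = 0.5 should be close to the true drift of -0.5, with a small smoothing bias that depends on the bandwidth. With measurement error added to the observations, the scaled increments become much noisier as delta shrinks, which is the difficulty the thesis addresses.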

Dynamic Duration of Load Models (2011)

The duration of load effect is a distinctive and important characteristic of wood strength. It refers to the fact that wood products can usually sustain a high load for a short time but may deteriorate and break in the long run. Modelling the duration of load effect and testing wood for specific properties of this effect are important in formulating wood construction standards.
Damage accumulation models have been proposed to model the duration of load effect. The models assume that damage is accumulated over time according to the load history, and once the accumulated damage reaches a threshold value, the board will break. Different authors have designed different experiments and proposed different methods for estimating the model parameters. In this work, we consider several damage accumulation models, with a focus on the U.S. model. We investigate the effects of the distributional assumptions for the models, and propose several methods to estimate parameters in the models. Our proposed methods are evaluated via simulation studies. Two real datasets are presented for illustration.
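The generic mechanism described above (damage accumulates along the load history until it crosses a threshold) can be sketched as a simple numerical integration. This is an illustration only: the exponential rate function, its parameters `a` and `b`, the threshold of 1, and the name `time_to_failure` are hypothetical choices made here and are not claimed to match the U.S. model or any other model studied in the thesis.

```python
import math


def time_to_failure(load_at, rate, dt, max_time):
    """Accumulate damage with forward Euler steps; return the first time
    the damage reaches 1 (the assumed threshold), or None if it never does.

    load_at: function giving the applied load at time t (the load history)
    rate:    function giving the damage rate for a given load (hypothetical form)
    """
    damage = 0.0
    steps = int(round(max_time / dt))
    for k in range(steps):
        t = k * dt
        damage += rate(load_at(t)) * dt
        if damage >= 1.0:
            return t + dt
    return None


def example_rate(load, a=5.0, b=4.0):
    """Hypothetical rate: damage accumulates exponentially faster under higher load."""
    return math.exp(-a + b * load)


# Constant load histories at two levels (load expressed as a stress ratio, an assumption).
t_high = time_to_failure(lambda t: 1.0, example_rate, dt=0.01, max_time=100.0)
t_low = time_to_failure(lambda t: 0.5, example_rate, dt=0.01, max_time=100.0)
```

Under this illustrative rate, the board under the higher constant load fails sooner than the one under the lower load, which mirrors the duration of load effect: a high load can be sustained only briefly.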


Membership Status

Member of G+PS