Danica Sutherland
Graduate Student Supervision
Master's Student Supervision
Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.
Studying neural networks in the infinite-width limit has provided numerous valuable theoretical and practical insights about the initialization of NNs, their training dynamics, and the properties of the learnt functions. One of the theoretical tools that emerged from this study is the empirical Neural Tangent Kernel (eNTK). eNTKs can provide a good understanding of a given network’s representation: they are often far less expensive to compute, and more broadly applicable, than infinite-width NTKs. In this work, we use eNTKs to predict the local dynamics of neural networks in an active learning setup, and propose a new method for approximating active learning acquisition strategies that are based on retraining with hypothetically-labeled candidate data points. Furthermore, to tackle the notorious space and computational complexity of calculating eNTKs, we propose a fast approximation of the eNTK using a block-diagonal kernel resulting from the eNTK with respect to only one (or the average) of the output neurons of a network. We further use this approximation in our proposed “look-ahead” strategies in deep active learning. Finally, we present empirical evidence that our querying strategy beats other look-ahead strategies by large margins, and achieves equal or better performance compared to state-of-the-art methods on several benchmark datasets in pool-based active learning.
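As a rough illustration of the kernels mentioned in this abstract, the sketch below computes the full eNTK block and a single-output eNTK entry for a toy two-layer network f(x) = W2 tanh(W1 x). The network, the function names, and the choice of output index 0 are all assumptions made for illustration; this is not the thesis's implementation.

```python
import math
import random


def init_params(d_in, d_hidden, d_out, seed=0):
    """Random weights for the toy network f(x) = W2 tanh(W1 x) (hypothetical model)."""
    rng = random.Random(seed)
    W1 = [[rng.gauss(0, 1 / math.sqrt(d_in)) for _ in range(d_in)]
          for _ in range(d_hidden)]
    W2 = [[rng.gauss(0, 1 / math.sqrt(d_hidden)) for _ in range(d_hidden)]
          for _ in range(d_out)]
    return W1, W2


def output_gradients(params, x):
    """Return one flat parameter-gradient vector per output neuron."""
    W1, W2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    grads = []
    for a in range(len(W2)):
        # d f_a / d W1[j][d] = W2[a][j] * (1 - h_j^2) * x_d
        g_w1 = [W2[a][j] * (1 - h[j] ** 2) * xd
                for j in range(len(W1)) for xd in x]
        # d f_a / d W2[b][j] = h_j if b == a, else 0
        g_w2 = [h[j] if b == a else 0.0
                for b in range(len(W2)) for j in range(len(W1))]
        grads.append(g_w1 + g_w2)
    return grads


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def full_entk(params, x1, x2):
    """Full eNTK block: a (d_out x d_out) matrix of gradient inner products."""
    g1, g2 = output_gradients(params, x1), output_gradients(params, x2)
    return [[dot(ga, gb) for gb in g2] for ga in g1]


def single_output_entk(params, x1, x2, output_index=0):
    """Scalar kernel that uses the gradient of one output neuron only."""
    g1 = output_gradients(params, x1)[output_index]
    g2 = output_gradients(params, x2)[output_index]
    return dot(g1, g2)
```

For N inputs and k outputs, the full kernel has (N k)^2 entries while the single-output kernel has only N^2, which is where the savings in memory and computation come from. One plausible reading of the block-diagonal construction is that each k-by-k block is replaced by the scalar kernel times the identity; that reading is an assumption here, not something the abstract spells out.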
The gold standard privacy notion, differential privacy (DP), has gained widespread adoption in academic research, industry products, and government databases due to its mathematically provable privacy guarantee. However, the composability property of DP leads to privacy degradation with multiple accesses to the same data. Differentially private data generation has emerged as a solution, creating synthetic datasets resembling private data while allowing repeated access without additional privacy loss. Existing methods often assume specific use cases for synthetic data, limiting flexibility. This thesis addresses the challenge of producing flexible synthetic data by leveraging deep generative modeling and addressing privacy loss in other methods such as generative adversarial networks (GANs). We propose utilizing public data to learn perceptual features (PFs) for comparing real and synthetic data distributions, employing a non-adversarial generator training scheme based on Maximum Mean Discrepancy (MMD) to mitigate privacy loss. Experimental results reveal the efficacy of our method: it successfully generates samples for CIFAR-10, CelebA, MNIST, and FashionMNIST. Theoretical analysis of our privacy-preserving loss function clarifies the privacy-accuracy trade-offs.
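To make the MMD comparison concrete: when the kernel is an inner product of explicit features, the (biased) squared MMD estimate reduces to the squared distance between the two mean feature embeddings. The sketch below shows that identity with a hypothetical `toy_feature_map` standing in for learned perceptual features. It includes no privacy mechanism and is not the thesis's training code.

```python
def toy_feature_map(x):
    """Hypothetical stand-in for perceptual features: raw coordinates and their squares."""
    return list(x) + [v * v for v in x]


def mean_embedding(samples, feature_map):
    """Average the feature vectors of a sample set."""
    feats = [feature_map(s) for s in samples]
    n = len(feats)
    return [sum(col) / n for col in zip(*feats)]


def mmd2_from_features(real, synthetic, feature_map=toy_feature_map):
    """Biased MMD^2 estimate for the kernel k(x, y) = <phi(x), phi(y)>."""
    mu_real = mean_embedding(real, feature_map)
    mu_syn = mean_embedding(synthetic, feature_map)
    return sum((a - b) ** 2 for a, b in zip(mu_real, mu_syn))
```

In this form the real data enters the loss only through its mean embedding, so a generator can be trained against a single fixed summary of the private data rather than against a discriminator that is queried repeatedly.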
This work introduces two novel kernel-based measures to enforce certain invariance properties in the learned representation space of a deep neural network. The first method, MMD-B-Fair, learns fair representations of data via kernel two-sample testing. It finds neural features of data where a maximum mean discrepancy (MMD) test cannot distinguish between the representations of different sensitive groups, while preserving information about the target variable to be predicted. To minimize the power of an MMD test, this method exploits the simple asymptotics of a block testing scheme to address challenges presented by the complex dependency of the test threshold on the estimated MMD. Compared to existing methods for fair representation learning, MMD-B-Fair does not require generative modeling or discriminative architectural tuning, and is able to achieve competitive results on fairness benchmarks and downstream transfer. The second method, CIRCE, introduces a measure of conditional independence for multivariate continuous-valued variables that can be efficiently used as a regularizer to learn deep neural features that are conditionally independent of a known distractor Z given a target label Y. CIRCE requires just a single ridge regression from Y to kernelized features of Z, which can be done in advance. It is then only necessary to enforce independence of the learned neural features from the residuals of this regression. By contrast, earlier measures of conditional dependence require multiple regressions for each step of feature learning, resulting in severe bias and variance, and greater computational cost. CIRCE has superior performance to previous methods on challenging benchmarks, including learning conditionally invariant image features. Python implementations of both methods are made publicly available at github.com/namratadeka/mmd-b-fair and github.com/namratadeka/circe.
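The "simple asymptotics" of a block testing scheme can be illustrated with a generic block MMD statistic: split the samples into blocks, compute an unbiased MMD^2 estimate per block, and average. Because the block estimates are independent, their mean is approximately normal, and a standard error follows directly from their spread. The sketch below assumes a Gaussian kernel with bandwidth 1 and uses hypothetical function names. It does not reproduce the MMD-B-Fair objective or the code in the linked repositories.

```python
import math
import statistics


def gaussian_kernel(x, y, bandwidth=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, kernel=gaussian_kernel):
    """Unbiased MMD^2 estimate (U-statistic within each sample, full cross term)."""
    m, n = len(xs), len(ys)
    kxx = sum(kernel(xs[i], xs[j])
              for i in range(m) for j in range(m) if i != j) / (m * (m - 1))
    kyy = sum(kernel(ys[i], ys[j])
              for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    kxy = sum(kernel(x, y) for x in xs for y in ys) / (m * n)
    return kxx + kyy - 2 * kxy


def block_mmd_statistic(xs, ys, block_size, kernel=gaussian_kernel):
    """Average per-block MMD^2 estimates; return (mean, standard error, z-score)."""
    if block_size < 2:
        raise ValueError("block_size must be at least 2")
    n_blocks = min(len(xs), len(ys)) // block_size
    if n_blocks < 2:
        raise ValueError("need at least two blocks to estimate the variance")
    values = [
        mmd2_unbiased(xs[b * block_size:(b + 1) * block_size],
                      ys[b * block_size:(b + 1) * block_size],
                      kernel)
        for b in range(n_blocks)
    ]
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / math.sqrt(n_blocks)
    return mean, std_err, mean / std_err
```

A large positive z-score suggests the two samples differ. A representation learner that wants the test to fail would push this quantity down for features grouped by the sensitive attribute, which is the general idea the abstract describes.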