Mark Schmidt

Assistant Professor

Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - May 2019)
Structured Bandits and Applications (2019)

No abstract available.

Greed is good: greedy optimization methods for large-scale structured problems (2018)

This work looks at large-scale machine learning, with a particular focus on greedy methods. A recent trend caused by big datasets is to use optimization methods that have a cheap iteration cost. In this category are (block) coordinate descent and Kaczmarz methods, as the updates of these methods only rely on a reduced subspace of the problem at each iteration. Prior to our work, the literature cast greedy variations of these methods as computationally expensive with comparable convergence rates to randomized versions. In this dissertation, we show that greed is good. Specifically, we show that greedy coordinate descent and Kaczmarz methods have efficient implementations and can be faster than their randomized counterparts for certain common problem structures in machine learning. We show linear convergence for greedy (block) coordinate descent methods under a revived relaxation of strong convexity from 1963, which we call the Polyak-Lojasiewicz (PL) inequality. Of the proposed relaxations of strong convexity in the recent literature, we show that the PL inequality is the weakest condition that still ensures a global minimum. Further, we highlight the exploitable flexibility in block coordinate descent methods, not only in the different types of selection rules possible, but also in the types of updates we can use. We show that using second-order or exact updates with greedy block coordinate descent methods can lead to superlinear or finite convergence (respectively) for popular machine learning problems. Finally, we introduce the notion of “active-set complexity”, which we define as the number of iterations required before an algorithm is guaranteed to reach the optimal active manifold, and show explicit bounds for two common problem instances when using the proximal gradient or the proximal coordinate descent method.
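As a rough illustration of the selection rules the abstract contrasts, the sketch below runs coordinate descent on a small strongly convex quadratic, choosing the coordinate either greedily (largest absolute gradient entry, often called the Gauss-Southwell rule) or uniformly at random. It is not code from the dissertation: the function names, the quadratic objective, and the exact coordinate update are assumptions chosen to keep the example short.

```python
import random

def coordinate_descent(A, b, rule="greedy", iters=50, seed=0):
    """Minimize f(x) = 0.5*x^T A x - b^T x for a symmetric positive definite A.

    Hypothetical helper for illustration; each step exactly minimizes f
    along one coordinate.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    grad = [-bi for bi in b]  # gradient A x - b evaluated at x = 0
    for _ in range(iters):
        if rule == "greedy":
            # Greedy (Gauss-Southwell) rule: largest absolute gradient entry.
            i = max(range(n), key=lambda j: abs(grad[j]))
        else:
            # Randomized rule: uniform choice of coordinate.
            i = rng.randrange(n)
        step = -grad[i] / A[i][i]  # exact minimizer along coordinate i
        x[i] += step
        # Only column i of A is needed to keep the gradient up to date.
        for j in range(n):
            grad[j] += step * A[j][i]
    return x, grad

def objective(A, b, x):
    """Evaluate f(x) = 0.5*x^T A x - b^T x."""
    n = len(b)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(b[i] * x[i] for i in range(n))
```

Calling `coordinate_descent(A, b, rule="greedy")` and `coordinate_descent(A, b, rule="random")` on the same problem and comparing `objective` values after a fixed number of iterations gives a feel for the comparison. Finding the greedy coordinate here costs a full pass over the gradient; the efficient implementations the abstract refers to depend on problem structure that this toy example does not model.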


Master's Student Supervision (2010 - 2018)
Deep kernel mean embeddings for generative modeling and feedforward style transfer (2017)

The generation of data has traditionally been specified using hand-crafted algorithms. However, oftentimes the exact generative process is unknown while only a limited number of samples are observed. One such case is generating images that look visually similar to an exemplar image or as if coming from a distribution of images. We look into learning the generating process by constructing a similarity function that measures how close the generated image is to the target image. We discuss a framework in which the similarity function is specified by a pre-trained neural network without fine-tuning, as is the case for neural texture synthesis, and a framework where the similarity function is learned along with the generative process in an adversarial setting, as is the case for generative adversarial networks. The main point of discussion is the combined use of neural networks and maximum mean discrepancy as a versatile similarity function. Additionally, we describe an improvement to state-of-the-art style transfer that allows faster computations while maintaining generality of the generating process. The proposed objective has desirable properties such as a simpler optimization landscape, intuitive parameter tuning, and consistent frame-by-frame performance on video. We use 80,000 natural images and 80,000 paintings to train a procedure for artistic style transfer that is efficient but also allows arbitrary content and style images.
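For readers unfamiliar with maximum mean discrepancy (MMD), the sketch below computes the standard biased estimate of squared MMD between two sets of sample vectors using a Gaussian kernel. It is only a minimal illustration: the thesis applies MMD to neural network features, whereas this example works on raw vectors, and the function names and the `bandwidth` parameter are assumptions.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys.

    Hypothetical helper for illustration: mean kernel value within each
    sample set, minus twice the mean kernel value across the two sets.
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

The estimate is zero when the two sample sets coincide and grows as they move apart, which is what lets it serve as a similarity function between generated and target samples.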


Membership Status

Member of G+PS
