Keith Maillard

Professor

Relevant Degree Programs

 

Graduate Student Supervision

Master's Student Supervision (2010 - 2018)

Blue Jackal: go (no more) further beyond (2018)

No abstract available.

The Backroads (2017)

No abstract available.

Songs for a Lost Pod (2016)

No abstract available.

Dangerous Prey (2015)

No abstract available.

An empirical study of practical, theoretical and online variants of random forests (2014)

Random forests are ensembles of randomized decision trees where diversity is created by injecting randomness into the fitting of each tree. The combination of their accuracy and their simplicity has resulted in their adoption in many applications. Different variants have been developed with different goals in mind: improving predictive accuracy, extending the range of application to online and structured domains, and introducing simplifications for theoretical amenability. While there are many subtle differences among the variants, the core difference is the method of selecting candidate split points. In our work, we examine eight different strategies for selecting candidate split points and study their effect on predictive accuracy, individual strength, diversity, computation time and model complexity. We also examine the effect of different parameter settings and several other design choices, including bagging, subsampling data points at each node, taking linear combinations of features, splitting data points into structure and estimation streams, and using a fixed frontier for online variants. Our empirical study finds several trends, some of which are in contrast to commonly held beliefs, that have value to practitioners and theoreticians. For variants used by practitioners, the most important discoveries include: bagging almost never improves predictive accuracy, selecting candidate split points at all midpoints can achieve lower error than selecting them uniformly at random, and subsampling data points at each node decreases training time without affecting predictive accuracy. We also show that the gap between variants with proofs of consistency and those used in practice can be accounted for by the requirement to split data points into structure and estimation streams. Our work with online forests demonstrates the potential improvement that is possible by selecting candidate split points at data points, constraining memory with a fixed frontier and training with multiple passes through the data.
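
The abstract contrasts two ways of choosing candidate split points for a single feature: taking all midpoints between neighbouring values, and drawing thresholds uniformly at random. The following is a minimal sketch of that contrast only, not the thesis's implementation; the function names `midpoint_candidates` and `uniform_candidates`, the parameter `k`, and the sample data are all hypothetical and chosen for illustration.

```python
import random


def midpoint_candidates(values):
    """Return every midpoint between consecutive distinct feature values.

    Assumption: duplicates are collapsed first, so no two candidates coincide.
    """
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def uniform_candidates(values, k, rng):
    """Return k thresholds drawn uniformly from the observed feature range.

    Assumption: the range is [min(values), max(values)]; other variants may
    sample from a different interval.
    """
    lo, hi = min(values), max(values)
    return [rng.uniform(lo, hi) for _ in range(k)]


# Hypothetical feature column at one tree node.
feature = [2.0, 4.0, 4.0, 7.0, 10.0]

print(midpoint_candidates(feature))  # [3.0, 5.5, 8.5]
print(uniform_candidates(feature, 3, random.Random(0)))
```

The midpoint strategy is deterministic and grows with the number of distinct values at the node, while the uniform strategy has a fixed budget of `k` candidates regardless of how many points are present.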

The Brothers Car (2014)

No abstract available.

The Cure for Birds (2013)

No abstract available.

Canoodlers (2012)

No abstract available.

The Peter Stories (2011)

No abstract available.

Do No Harm (2010)

No abstract available.

 

Membership Status

Member of G+PS

Program Affiliations

Department(s)

 
