Michiel Van de Panne

Professor

Research Classification

Computer Science and Statistics
Computer Sciences and Mathematical Tools
Robotics and Automation

Research Interests

simulation of human movement
computer animation
robotics
deep reinforcement learning
motor control
computer graphics

Relevant Thesis-Based Degree Programs


Recruitment

Master's students
Doctoral students
Postdoctoral Fellows
Any time / year round

simulation of agile human motion, robotics, motion planning, sensorimotor control, dexterous manipulation, computer animation, machine learning, computer graphics, novel interfaces

I support experiential learning opportunities, such as internships and work placements, for my graduate students and postdocs.

Complete these steps before you reach out to a faculty member!

Check requirements
  • Familiarize yourself with program requirements. You want to learn as much as possible from the information available to you before you reach out to a faculty member. Be sure to visit the graduate degree program listing and program-specific websites.
  • Check whether the program requires you to seek commitment from a supervisor prior to submitting an application. For some programs this is an essential step while others match successful applicants with faculty members within the first year of study. This is either indicated in the program profile under "Admission Information & Requirements" - "Prepare Application" - "Supervision" or on the program website.
Focus your search
  • Identify faculty members who are conducting research in your specific area of interest.
  • Establish that your research interests align with the faculty member’s research interests.
    • Read up on the faculty members in the program and the research being conducted in the department.
    • Familiarize yourself with their work, read their recent publications and past theses/dissertations that they supervised. Be certain that their research is indeed what you are hoping to study.
Make a good impression
  • Compose an error-free and grammatically correct email addressed to your specifically targeted faculty member, and remember to use their correct titles.
    • Do not send non-specific, mass emails to everyone in the department hoping for a match.
    • Address the faculty members by name. Your contact should be genuine rather than generic.
  • Include a brief outline of your academic background, why you are interested in working with the faculty member, and what experience you could bring to the department. The supervision enquiry form guides you with targeted questions. Be sure to craft compelling answers to these questions.
  • Highlight your achievements and why you are a top student. Faculty members receive dozens of requests from prospective students and you may have less than 30 seconds to pique someone’s interest.
  • Demonstrate that you are familiar with their research:
    • Convey the specific ways you are a good fit for the program.
    • Convey the specific ways the program/lab/faculty member is a good fit for the research you are interested in/already conducting.
  • Be enthusiastic, but don’t overdo it.
Attend an information session

G+PS regularly provides virtual sessions that focus on admission requirements and procedures, as well as tips on how to improve your application.


ADVICE AND INSIGHTS FROM UBC FACULTY ON REACHING OUT TO SUPERVISORS

These videos contain some general advice from faculty across UBC on finding and reaching out to a potential thesis supervisor.

Graduate Student Supervision

Doctoral Student Supervision

Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.

Reinforcement learning for legged robot locomotion (2022)

Deep reinforcement learning (DRL) offers a promising approach for the synthesis of control policies for legged robot locomotion. However, it remains challenging to learn policies that are robust enough to real-world uncertainty to deploy on physical robots, or that can handle complicated environments. In this thesis, we take several significant steps towards efficiently learning legged locomotion skills with DRL. First, we present a framework to learn feedback policies for a bipedal robot, Cassie, utilizing rough motion sketches. An iterative design process is then proposed to refine, compress, and combine policies for effective sim-to-real transfer. Second, we explore the role of dynamics randomization on a quadrupedal robot, Laikago. We demonstrate that with appropriate design choices, dynamics randomization is often not necessary for sim-to-real transfer, and we further analyze situations in which randomization does become necessary. Third, we propose and analyze multiple curriculum learning approaches to solve challenging stepping-stone tasks for bipedal locomotion. We demonstrate that gradually increasing task difficulty can reliably train policies that solve challenging stepping-stone sequences. Finally, we investigate the combination of reinforcement learning and model-based control by training quadrupedal policies using a centroidal model.
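The curriculum idea described in this abstract — advancing to harder tasks only once the policy succeeds at easier ones — can be sketched in a few lines. This is a minimal illustration, not the thesis code; the training/evaluation interface, the success-rate threshold, and the iteration counts are all assumptions:

```python
def train_with_curriculum(train_step, evaluate, levels,
                          threshold=0.8, iters_per_level=1000, max_rounds=100):
    """Train at the easiest difficulty level first; advance to the next,
    harder level once the evaluated success rate exceeds the threshold."""
    level = 0
    history = []
    for _ in range(max_rounds):
        if level >= len(levels):
            break  # all difficulty levels mastered
        for _ in range(iters_per_level):
            train_step(levels[level])
        success = evaluate(levels[level])
        history.append((levels[level], success))
        if success >= threshold:
            level += 1  # graduate to a harder stepping-stone task
        # otherwise, keep training at the current difficulty
    return history
```

The same loop structure applies whether "difficulty" means stepping-stone gap width, terrain irregularity, or any other scalar task parameter.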

Techniques in learning-based approaches for character animation (2022)

Contemporary computer animation research has benefited substantially from the advancement of deep learning and deep reinforcement learning methods in the past decade. Despite the performance and flexibility of learning-based methods, significant manual effort is still required to tune the training data, algorithms, and environments, especially when computing resources are limited. In this thesis, we develop and evaluate a variety of learning-based methods that enable new skills, and that improve the stability of the learning algorithms and the quality of the synthesized motions. First, we present a framework for learning autoregressive kinematic motion generators and controllers from motion capture data. By disentangling the motion modelling and control tasks, our framework can efficiently synthesize controllable virtual characters by leveraging the strengths of supervised and reinforcement learning. Second, we study the effects of symmetry in learning physics-based locomotion controllers for bipedal characters. We evaluate four possible methods to impose symmetry and show that enforcing symmetry improves the naturalness and task performance of the trained controllers. Third, we explore the role of learning curricula in solving challenging physics-based stepping-stone tasks. The learning is significantly more robust and efficient under a learning curriculum that gradually increases the task difficulty. Finally, we combine simplified models and imitation learning to train brachiation controllers. We show that a sparse task objective alone is sufficient for training a controller in an abstracted point-mass brachiation environment. Then, using the point-mass trajectory as a reference for the centre of mass, we can learn a control policy for a physics-based 14-link planar articulated gibbon model.

Scalable deep reinforcement learning for physics-based motion control (2019)

This thesis studies the broad problem of learning robust control policies for difficult physics-based motion control tasks such as locomotion and navigation. A number of avenues are explored to assist in learning such control. In particular, are there underlying structures in the motor-learning system that enable learning solutions to complex tasks? How are animals able to learn new skills so efficiently? Animals may be learning and using implicit models of their environment to assist in planning and exploration. These potential structures motivate the design of learning systems, and in this thesis we study their effectiveness on physically simulated and robotic motor-control tasks. Five contributions that build on motion control using deep reinforcement learning are presented. First, a case study on the motion control problem of brachiation, the movement of gibbons through trees, is presented. This work compares parametric and non-parametric models for reinforcement learning. The difficulty of this motion control problem motivates separating the control problem into multiple levels. Second, a hierarchical decomposition is presented that enables efficient learning by operating across multiple time scales for a complex locomotion and navigation task. First, reinforcement learning is used to acquire a low-level, high-frequency policy for joint actuation, used for bipedal footstep-directed walking. Subsequently, an additional policy is learned that provides directed footstep plans to the first level of control in order to navigate through the environment. Third, improved action exploration methods are investigated. An explicit action-value function is constructed using the learned model. Using this action-value function, we can compute actions that increase the value of future states. Fourth, a new algorithm is designed to progressively learn and integrate new skills, producing a robust and multi-skilled physics-based controller. This algorithm combines the skills of experts and then applies transfer learning methods to initialize and accelerate the learning of new skills. In the last chapter, the importance of good benchmarks for improving reinforcement learning research is discussed. The computer vision community has benefited from large, carefully processed collections of data; similarly, reinforcement learning needs well-constructed and interesting environments to drive progress.

Topological modeling for vector graphics (2017)

In recent years, with the development of mobile phones, tablets, and web technologies, we have seen an ever-increasing need to generate vector graphics content, that is, resolution-independent images that support sharp rendering across all devices, as well as interactivity and animation. However, the tools and standards currently available to artists for authoring and distributing such vector graphics content have many limitations. Importantly, basic topological modeling, such as the ability to have several faces share a common edge, is largely absent from current vector graphics technologies. In this thesis, we address this issue with three major contributions. First, we develop theoretical foundations of vector graphics topology, grounded in algebraic topology. More specifically, we introduce the concept of the Point-Curve-Surface complex (PCS complex) as a formal tool that allows us to interpret vector graphics illustrations as non-manifold, non-planar, non-orientable topological spaces immersed in R², unlike planar maps, which can only represent embeddings. Second, based on this theoretical understanding, we introduce the vector graphics complex (VGC) as a simple data structure that supports fundamental topological modeling operations for vector graphics illustrations. It allows for the direct representation of incidence relationships between objects, while at the same time keeping the geometric flexibility of stacking-based systems, such as the ability to have edges and faces overlap each other. Third and last, based on the VGC, we introduce the vector animation complex (VAC), a data structure for vector graphics animation, designed to support the modeling of time-continuous topological events, which are common in 2D hand-drawn animation. This allows features of a connected drawing to merge, split, appear, or disappear at desired times via keyframes that introduce the desired topological change. Because the resulting space-time complex directly captures the time-varying topological structure, features are readily edited in both space and time in a way that reflects the intent of the drawing.

Style Exploration and Generalization for Character Animation (2016)

Believable character animation arises from a well-orchestrated performance by a digital character. Various techniques have been developed to help drive this performance in an effort to create believable character animations. However, automatic style exploration and generalization from motion data are still open problems. We tackle several different aspects of the motion generation problem which aim to advance the state of the art in the areas of style exploration and generalization. First, we describe a novel optimization framework that produces a diverse range of motions for physics-based characters for tasks such as jumps, flips, and walks. This stands in contrast to the more common use of optimization to produce a single optimal motion. The solutions can be optimized to achieve motion diversity or diversity in the proportions of the simulated characters. Exploration of the style of task achievement for physics-based character animation can be performed automatically by exploiting "null spaces" defined by the task. Second, we perform automatic style generalization by generalizing a controller for varying degrees of task achievement for a specified task. We describe an exploratory approach which explores trade-offs between competing objectives for a specified task. Pareto-optimality can be used to explore various degrees of task achievement for a given style of physics-based character animation. We describe our algorithms for computing a set of controllers that span the Pareto-optimal front for jumping motions, exploring the trade-off between effort and jump height. We also develop supernatural jump controllers through the optimized introduction of external forces. Third, we develop a data-driven approach to model sub-steps, such as sliding foot pivots and foot shuffling. These sub-steps are often an integral component of the style observed in task-specific locomotion.
We present a model for generating these sub-steps via a foot step planning algorithm which is then used to generate full body motion. The system is able to generalize the style observed in task-specific locomotion to novel scenarios.

Real-Time Planning and Control for Simulated Bipedal Locomotion (2011)

Understanding and reproducing the processes that give rise to purposeful human and animal motions has long been of interest in the fields of character animation, robotics, and biomechanics. However, despite the grace and agility with which many living creatures effortlessly perform skilled motions, modeling motor control has proven to be a difficult problem. Building on recent advances, this thesis presents several approaches to creating control policies that allow physically simulated characters to demonstrate skill and purpose as they interact with their virtual environments. We begin by introducing a synthesis-analysis-synthesis framework that enables physically simulated characters to navigate environments with significant stepping constraints. First, an offline optimization method is used to compute control solutions for randomly generated example problems. Second, the example motions and their underlying control patterns are analyzed to build a low-dimensional step-to-step model of the dynamics. Third, the dynamics model is exploited by a planner to solve new instances of the task in real time. We then present a method for precomputing robust task-based control policies for physically simulated characters. This allows our characters to complete higher-level locomotion tasks, such as walking in a user-specified direction, while interacting with the environment in significant ways. As input, the method assumes an abstract action vocabulary consisting of balance-aware locomotion controllers. A constrained state-exploration phase is first used to define a dynamics model as well as a finite volume of character states over which the control policy will be defined. An optimized control policy is then computed using reinforcement learning. Lastly, we describe a control strategy for walking that generalizes well across gait parameters, motion styles, character proportions, and a variety of skills.
The control requires no character-specific or motion-specific tuning, is robust to disturbances, and is simple to compute. The method integrates tracking using proportional-derivative control, foot-placement adjustments using an inverted pendulum model, and Jacobian-transpose control for gravity compensation and fine-level velocity tuning. We demonstrate a variety of walking-related skills, such as picking up objects placed at any height; lifting, pulling, pushing, and walking with heavy crates; stepping over and ducking under obstacles; and climbing stairs.

Master's Student Supervision

Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.

Imitating optimized trajectories for dynamic quadruped behaviors (2023)

Reinforcement Learning (RL) has seen many recent successes in quadruped robot control. The imitation of reference motions provides a simple and powerful prior that guides learning toward desired solutions without the need for meticulous reward design. While much work uses motion capture data or hand-crafted trajectories as the reference motion, relatively little work has explored the use of reference motions produced by model-based trajectory optimization. This may be advantageous because high-quality reference motions from trajectory optimization could alleviate the need to tune RL environments specifically for every task, thus shortening the time needed to design robot controllers through RL. In this work, we investigate several design considerations that arise with such a framework, as demonstrated through four dynamic behaviours: trot, front hop, 180 backflip, and biped stepping. These are trained in simulation and transferred to a physical Solo 8 quadruped robot without further adaptation. In particular, we explore the space of feed-forward designs afforded by the trajectory optimizer to understand its impact on RL learning efficiency and sim-to-real transfer. These findings contribute to the long-standing goal of producing robot controllers that combine the interpretability and fast optimization of model-based methods with the robustness that model-free RL-based controllers offer.

Beyond learning curves: understanding stochasticity and learned solution modes in reinforcement learning (2022)

While deep reinforcement learning (deep RL) algorithms have been used to successfully solve challenging decision-making and control tasks, their behavior often remains poorly understood. Studies and comparisons between algorithms are often done through impoverished and partial signals such as learning curves and individual rollout videos. In this work, we follow a tradition of work that digs deeper into why exactly algorithms produce different rewards from run to run and from task to task. We aim to go beyond learning curves and develop a more holistic view of both the optimization landscape of particular environments and the multimodal behaviors that algorithms produce for given environments. To this end, we develop a set of tools for comparing many runs of deep reinforcement learning algorithms, as well as rollouts from a single policy. We use these to answer a broad range of questions about RL.

Learning to get up with deep reinforcement learning (2022)

Getting up from an arbitrary fallen state is a basic human skill. Existing methods for learning this skill often generate highly dynamic and erratic get-up motions that do not resemble human get-up strategies, or are based on tracking recorded human get-up motions. In this paper, we present a staged approach using reinforcement learning, without recourse to motion capture data. The method first takes advantage of a strong character model, which facilitates the discovery of solution modes. A second stage then learns to adapt the control policy to work with progressively weaker versions of the character. Finally, a third stage learns control policies that can reproduce the weaker get-up motions at much slower speeds. We show that across multiple runs, the method can discover a diverse variety of get-up strategies and execute them at a variety of speeds. The results usually produce policies that use a final stand-up strategy common to the recovery motions seen from all initial states; however, we also find policies for which different strategies are seen for prone and supine initial fallen states. The learned get-up control strategies have significant static stability, i.e., they can be paused at a variety of points during the get-up motion. We further test our method on novel constrained scenarios, such as having a leg and an arm in a cast.

Reinforcement learning in the presence of sensing costs (2022)

In recent years, reinforcement learning (RL) has become an increasingly popular framework for formalizing decision-making problems. Despite its popularity, the use of RL has remained relatively limited in challenging real-world scenarios, due to various unrealistic assumptions made about the environment, such as assuming sufficiently accurate models to train on in simulation, or assuming no significant delay between the execution of an action and the receipt of the next observation. Such assumptions unavoidably make RL algorithms suffer from poor generalization. In this work, we aim to take a closer look at how incorporating realistic constraints impacts the behaviour of RL agents. In particular, we consider the cost in time and energy of making observations and taking a decision, an important aspect of natural environments that is typically overlooked in a traditional RL setup. As a first attempt, we propose to explicitly incorporate the cost of sensing the environment into the RL training loop, and we analyze the emergent behaviours of the agent on a suite of simulated gridworld environments.
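The core modification described in this abstract — charging the agent whenever it chooses to sense the environment — can be illustrated with a simple environment wrapper. This is an illustrative sketch only; the environment interface, the `(action, sense)` split, and the cost value are assumptions, not the thesis implementation:

```python
class SensingCostWrapper:
    """Wrap an environment so that each step's reward is reduced by a
    fixed cost whenever the agent chooses to sense (observe) the state.

    When sense=False, the agent receives its stale last observation and
    must act on it, but avoids paying the sensing cost.
    """

    def __init__(self, env, sensing_cost=0.05):
        self.env = env
        self.sensing_cost = sensing_cost
        self._last_obs = None

    def reset(self):
        self._last_obs = self.env.reset()
        return self._last_obs

    def step(self, action, sense=True):
        obs, reward, done = self.env.step(action)
        if sense:
            self._last_obs = obs        # fresh observation...
            reward -= self.sensing_cost  # ...paid for out of the reward
        return self._last_obs, reward, done
```

Under such a wrapper, a trained agent faces a trade-off between acting on up-to-date information and conserving reward by sensing less often.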

Directable physics-based character animation (2021)

Animated motions should be simple to direct while also being plausible. The work presented in this thesis develops a trajectory optimization system that takes as input a sequence of crudely-specified keyframes as motion sketches and produces a physically simulated, as-plausible-as-possible motion as output. We propose a novel control parameterization scheme for trajectory optimization, compactly incorporating keyframe timing, internal actions, and external assistive force modulations as auxiliary variables to realize desired motion sketches. Our method allows for emergent behaviours between keyframes, does not require advance knowledge of contacts or exact motion timing, supports the creation of physically-impossible motions, and allows for near-interactive motion creation. The use of a shooting method allows for the use of any black-box simulator. We present results for a variety of 2D and 3D motions, motion sketches that have sparse and dense keyframes, and both physically-feasible and physically-infeasible motions. We evaluate our control parameterization scheme against other recent methods that incorporate external assistive forces.

Learning locomotion: symmetry and torque limit considerations (2019)

Deep reinforcement learning offers a flexible approach to learning physics-based locomotion. However, these methods are sample-inefficient, and the resulting motions usually have poor quality when learned without the help of motion capture data. This work investigates two approaches that can make motions more realistic while achieving equal or higher learning efficiency. First, we propose a way of enforcing torque limits on the simulated character without degrading performance. Torque limits indicate how strong a character is and therefore have implications for how realistic the resulting motion looks. We show that using realistic limits from the beginning can hinder training performance. Our method uses a curriculum learning approach in which the agent is gradually faced with more difficult tasks. This way, the resulting motion becomes more realistic without sacrificing performance. Second, we explore methods that incorporate left-right symmetry into the learning process, which greatly improves motion quality. Gait symmetry is an indicator of health, and asymmetric motion is easily noticeable by human observers. We compare two novel approaches as well as two existing methods of incorporating symmetry into the reinforcement learning framework. We also introduce a new metric for evaluating gait symmetry and confirm that the resulting motion has higher quality.
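One common way to impose left-right symmetry in this setting is an auxiliary mirror-symmetry loss: the policy is penalized when its action for a mirrored state differs from the mirror of its action for the original state. A minimal numpy sketch, where the mirror operators and the policy interface are illustrative stand-ins rather than the thesis code:

```python
import numpy as np

def mirror_symmetry_loss(policy, states, mirror_state, mirror_action):
    """Penalize asymmetric behaviour: for each state s, require
    policy(M_s(s)) ≈ M_a(policy(s)), where M_s mirrors states and
    M_a mirrors actions about the character's sagittal plane."""
    actions = policy(states)
    mirrored_actions = policy(mirror_state(states))
    diff = mirrored_actions - mirror_action(actions)
    return float(np.mean(diff ** 2))
```

For a policy that is exactly equivariant under mirroring, this loss is zero; during training it would typically be added to the RL objective with a small weight.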

Data driven auto-completion for keyframe animation (2018)

Keyframing is the main method used by animators to choreograph appealing motions, but the process is tedious and labor-intensive. In this thesis, we present a data-driven autocompletion method for synthesizing animated motions from input keyframes. Our model uses an autoregressive two-layer recurrent neural network that is conditioned on target keyframes. Given a set of desired keys, the trained model is capable of generating an interpolating motion sequence that follows the style of the examples observed in the training corpus. We apply our approach to the task of animating a hopping lamp character and produce a rich and varied set of novel hopping motions, using a diverse set of hops from a physics-based model as training data. We discuss the strengths and weaknesses of this type of approach in some detail.
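The autoregressive, keyframe-conditioned generation described above boils down to a loop that feeds each predicted frame back as input while conditioning on the upcoming target key. A minimal sketch, where the `model(frame, key)` interface is an assumed stand-in for the trained recurrent network:

```python
def autoregressive_fill(model, start_frame, keyframes, steps_between):
    """Generate a motion sequence by repeatedly predicting the next
    frame from the current frame, conditioned on the next target key."""
    frames = [start_frame]
    for key in keyframes:
        for _ in range(steps_between):
            # the model sees its own previous output, not ground truth
            frames.append(model(frames[-1], key))
    return frames
```

With a trained network in place of `model`, the loop fills in the in-between frames while being pulled toward each keyframe in turn.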

Developing locomotion skills with deep reinforcement learning (2017)

While physics-based models for passive phenomena such as cloth and fluids have been widely adopted in computer animation, physics-based character simulation remains a challenging problem. One of the major hurdles for character simulation is control: modeling a character's behaviour in response to its goals and environment. This challenge is further compounded by the high-dimensional and complex dynamics that often arise in these systems. A popular approach to mitigating these challenges is to build reduced models that capture important properties for a particular task. These models often leverage significant human insight, yet may nonetheless overlook important information. In this thesis, we explore the application of deep reinforcement learning (DeepRL) to develop control policies that operate directly on high-dimensional low-level representations, thereby reducing the need for manual feature engineering and enabling characters to perform more challenging tasks in complex environments. We start by presenting a DeepRL framework for developing policies that allow characters to agilely traverse irregular terrain. The policies are represented using a mixture-of-experts model, which selects from a small collection of parameterized controllers. Our method is demonstrated on planar characters of varying morphologies and different classes of terrain. Through the learning process, the networks develop appropriate strategies for traveling across various irregular environments without requiring extensive feature engineering. Next, we explore the effects of different action parameterizations on the performance of RL policies. We compare policies trained using low-level actions, such as torques, target velocities, target angles, and muscle activations. Performance is evaluated using a motion imitation benchmark.
For our particular task, the choice of higher-level actions that incorporate local feedback, such as target angles, leads to significant improvements in performance and learning speed. Finally, we describe a hierarchical reinforcement learning framework for controlling the motion of a simulated 3D biped. By training each level of the hierarchy to operate at different spatial and temporal scales, the character is able to perform a variety of locomotion tasks that require a balance between short-term and long-term planning. Some of the tasks include soccer dribbling, path following, and navigation across dynamic obstacles.

Embodied perception during walking using Deep Recurrent Neural Networks (2017)

Movements such as walking require knowledge of the environment in order to be robust. This knowledge can be gleaned via embodied perception. While information about the upcoming terrain such as compliance, friction, or slope may be difficult to directly estimate, using the walking motion itself allows for these properties to be implicitly observed over time from the stream of movement data. However, the relationship between a parameter such as ground compliance and the movement data may be complex and difficult to discover. In this thesis, we demonstrate the use of a Deep LSTM Network to estimate slope and ground compliance of terrain by observing a stream of sensory information that includes the character state and foot pressure information.

Design and Integration of Controllers for Simulated Characters (2015)

Developing motions for simulated humanoids remains a challenging problem. While there exists a multitude of approaches, few of these are reimplemented or reused by others. The predominant focus of papers in the area remains algorithmic novelty, due to the difficulty of, and the lack of incentive for, more fully exploring what can be accomplished within the scope of existing methodologies. We develop a language, based on common features found across physics-based character animation research, that facilitates the controller authoring process. By specifying motion primitives over a number of phases, our language has been used to design over 25 controllers for motions ranging from simple static balanced poses to highly dynamic stunts. Controller sequencing is supported in two ways. Naive integration of controllers is achieved by using highly stable pose controllers (such as standing or squatting) as intermediate transitions. More complex controller connections are automatically learned through an optimization process. The robustness of our system is demonstrated via random walkthroughs of our integrated set of controllers.

Design and Optimization of Control Primitives for Simulated Characters (2014)

Physics-based character motion has the potential of achieving realistic motions without laborious work from artists and without needing to use motion capture data. It has potential applications in film, games, and humanoid robotics. However, designing a controller for physics-based motions is a difficult task. It requires expertise in software engineering and an understanding of control methods. Researchers typically develop their own dedicated software frameworks and invent their own sets of control rules to control physics-based characters. This creates an impediment to the non-expert who wants to create interesting motions, and to others who want to share and revise motions. In this thesis, we demonstrate that a set of motion primitives developed in recent years constitutes an effective set of building blocks for authoring physics-based character motions. These motion primitives are made accessible using an expressive and flexible motion scripting language. The motion language allows a motion designer to create controllers in a text file that can be loaded at runtime. This is intended to simplify motion design, debugging, understanding, and sharing. We use this framework to create several interesting 2D planar motions. An optimization framework is integrated that allows the hand-designed motion controller to be optimized for more interesting behaviors, such as a fast prone-to-standing motion. We also develop a state-action compatibility model for adapting controllers to new situations. The state-action compatibility model maintains a hypervolume of compatible states ("situations") and actions (controllers). It allows queries for compatible actions given a state.

Exploring structured predictions from sensorimotor data during non-prehensile manipulation using both simulations and robots (2014)

Robots are equipped with an increasingly wide array of sensors in order to enable advanced sensorimotor capabilities. However, the efficient exploitation of the resulting data streams remains an open problem. We present a framework for learning when and where to attend in a sensorimotor stream in order to estimate specific task properties, such as the mass of an object. We also identify the qualitative similarity of this ability between a simulation and a robotic system. The framework is evaluated for a non-prehensile "topple-and-slide" task, where the data from a set of sensorimotor streams are used to predict task properties such as the mass, friction coefficient, and compliance of the block being manipulated. Given the collected data streams for situations where the block properties are known, the method combines variance-based feature selection and partial least-squares estimation in order to build a robust predictive model for the block properties. This model can then be used to make accurate predictions during a new manipulation. We demonstrate results for both a simulation and a robotic system using up to 110 sensorimotor data streams, which include joint torques, wrist forces/torques, and tactile information. The results show that task properties such as object mass, friction coefficient, and compliance can be estimated with good accuracy from the sensorimotor streams observed during a manipulation.

View record

Real-Time Predictions from Unlabeled High-Dimensional Sensory Data During Non-Prehensile Manipulation (2014)

Robots can be readily equipped with sensors that span a growing range of modalities and price-points. However, as sensors increase in number and variety, making the best use of the rich multi-modal sensory streams becomes increasingly challenging. In this thesis, we demonstrate the ability to make efficient and accurate task-relevant predictions from unlabeled streams of sensory data for a non-prehensile manipulation task. Specifically, we address the problem of making real-time predictions of the mass, friction coefficient, and compliance of a block during a topple-slide task, using an unlabeled mix of 1650 features composed of pose, velocity, force, torque, and tactile sensor data samples taken during the motion. Our framework employs a partial least squares (PLS) estimator computed from training data. Importantly, we show that the PLS predictions can be made significantly more accurate and robust to noise with the use of a feature selection heuristic, the task variance ratio, while using as few as 5% of the original sensory features. This aggressive feature selection further allows for reduced bandwidth when streaming sensory data and reduced computational cost of the predictions. We also demonstrate the ability to make online predictions based on the sensory information received to date. We compare PLS to other regression methods, such as principal components regression. Our methods are tested on a WAM manipulator equipped with either a spherical probe or a BarrettHand with arrays of tactile sensors.
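As an illustration of the variance-based feature selection idea described in this abstract, here is a minimal sketch on synthetic data. The `task_variance_ratio` implementation, the data, and the plain least-squares estimator (standing in for PLS, for brevity) are all assumptions for illustration, not the thesis's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for sensorimotor streams: 200 trials x 40 features.
# Features 0-4 vary with a hypothetical block mass; the rest are pure noise.
n_trials, n_features = 200, 40
mass = rng.uniform(1.0, 5.0, n_trials)
X = rng.normal(0.0, 1.0, (n_trials, n_features))
X[:, :5] += np.outer(mass, rng.uniform(0.5, 1.5, 5))

def task_variance_ratio(X, y):
    """Illustrative heuristic: fraction of each feature's variance
    explained by a linear fit to the task property y."""
    y_c = y - y.mean()
    X_c = X - X.mean(axis=0)
    slope = (X_c.T @ y_c) / (y_c @ y_c)      # per-feature regression slope
    explained = slope**2 * (y_c @ y_c)       # variance along the task axis
    total = (X_c**2).sum(axis=0)
    return explained / total

ratios = task_variance_ratio(X, mass)
keep = np.argsort(ratios)[::-1][:5]          # keep the top few features

# Ordinary least squares on the selected features (PLS stand-in).
A = np.column_stack([X[:, keep], np.ones(n_trials)])
coef, *_ = np.linalg.lstsq(A, mass, rcond=None)
pred = A @ coef
# With this synthetic data, the selected features are the informative ones.
print(sorted(int(i) for i in keep))
```

The heuristic discards the noise features before regression, which is the mechanism the abstract credits for the improved robustness at a small fraction of the original feature count.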

View record

Reinforcement learning using sensorimotor traces (2014)

The skilled motions of humans and animals are the result of learning good solutions to difficult sensorimotor control problems. This thesis explores new models for using reinforcement learning to acquire motion skills, with potential applications to computer animation and robotics. Reinforcement learning offers a principled methodology for tackling control problems. However, it is difficult to apply in high-dimensional settings, such as the ones that we wish to explore, where the body can have many degrees of freedom, the environment can have significant complexity, and there can be further redundancies that exist in the sensory representations that are available to perceive the state of the body and the environment. In this context, challenges to overcome include: a state space that cannot be fully explored; the need to model how the state of the body and the perceived state of the environment evolve together over time; and solutions that can work with only a small number of sensorimotor experiences.

Our contribution is a reinforcement learning method that implicitly represents the current state of the body and the environment using sensorimotor traces. A distance metric is defined between the ongoing sensorimotor trace and previously experienced sensorimotor traces, and this is used to model the current state as a weighted mixture of past experiences. Sensorimotor traces play multiple roles in our method: they provide an embodied representation of the state (and therefore also the value function and the optimal actions), and they provide an embodied model of the system dynamics.

In our implementation, we focus specifically on learning steering behaviors for a vehicle driving along straight roads, winding roads, and through intersections. The vehicle is equipped with a set of distance sensors. We apply value iteration using off-policy experiences in order to produce control policies capable of steering the vehicle in a wide range of circumstances. An experimental analysis is provided of the effect of various design choices. In the future, we expect that similar ideas can be applied to other high-dimensional systems, such as bipedal systems that are capable of walking over variable terrain, also driven by control policies based on sensorimotor traces.
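The "weighted mixture of past experiences" idea can be sketched in a few lines. This is an illustrative toy, not the thesis's implementation: the trace descriptors, their attached values, the Euclidean metric, and the exponential weighting are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# A store of 50 past sensorimotor traces, each summarized by an 8-D
# descriptor, with a toy scalar value attached to each experience.
stored = rng.normal(size=(50, 8))
values = stored.sum(axis=1)

def mixture_value(trace, temperature=0.5):
    """Estimate the value of the current trace as a distance-weighted
    mixture over stored experiences (nearer traces weigh more)."""
    d = np.linalg.norm(stored - np.asarray(trace), axis=1)
    w = np.exp(-d / temperature)
    w /= w.sum()                     # normalized mixture weights
    return float(w @ values)
```

Querying with a stored trace itself returns (approximately) that trace's own value, since its weight dominates; novel traces blend the values of their nearest past experiences.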

View record

Modeling standing, walking and rolling skills for physics-based character animation (2013)

Physics-based character simulation is an important open problem with potential applications in robotics, biomechanics, and computer animation for films and games. In this thesis we develop controllers for the real-time simulation of several motion skills, including standing balance, walking, forward rolling, and lateral rolling on the ground. These controllers are constructed from a common set of components. We demonstrate that the combination of a suitable vocabulary of components and optimization has the potential to model a variety of skills.

View record

Improvisational interfaces for visualization construction and scalar function sketching (2012)

Presentations are an important aspect of daily communication in most organizations. As sketch- and gesture-capable interfaces such as tablets and smart boards become increasingly common, they open up new possibilities for interacting with presentations. This thesis explores two new interface prototypes that improve upon otherwise tedious presentation tasks, such as demonstrating models based on scalar functions and visualizing data. We combine a spreadsheet-style interface with sketching of scalar mathematical functions to develop and demonstrate intuitive mathematical models without the need for coding or complex equations. We also explore sketch- and gesture-based creation of data visualizations.

View record

The animation canvas: a sketch-based visual language for motion editing (2012)

We propose the Animation Canvas, a system for working with character animation. The canvas is an interactive two-dimensional environment similar to a sketch editor. Abstract interaction modes and controls are provided to support editing tasks. Consistent motion-as-curve and pose-as-point metaphors unify different features of the system. The metaphors and interactive elements of the system define a visual language allowing users to explore, manipulate, and create motions.

The canvas also serves as a framework for presenting interactive motion editing techniques. We have developed two techniques in order to explore possibilities for motion editing while demonstrating the flexibility of the system. The first technique is a method for interacting with motion graphs in order to explore motion connectivity and construct new blended motions from shorter clips. The second is a real-time spatial interpolation system that enables users to construct new motions or control an animated character.

View record

Learning reduced order linear feedback policies for motion skills (2011)

Skilled character motions need to adapt to their circumstances and this is typically accomplished with the use of feedback. However, good feedback strategies are difficult to author and this has been a major stumbling block in the development of physics-based animated characters. In this thesis we present a framework for the automated design of compact linear feedback strategies. We show that this can be an effective substitute for manually-designed abstract models such as the use of inverted pendulums for the control of simulated walking. Results are demonstrated for a variety of motion skills, including balancing, hopping, ball kicking, single-ball juggling, ball volleying, and bipedal walking. The framework uses policy search in the space of reduced-order linear feedback matrices as a means of developing an optimized linear feedback strategy. The generality of the method allows for the automated development of highly-effective unconventional feedback loops, such as the use of foot pressure feedback to achieve robust physics-based bipedal walking.
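The structure of a reduced-order linear feedback policy, as described in this abstract, can be sketched as follows. The dimensions, matrices, and names here are purely illustrative; in the thesis the feedback matrix is optimized via policy search rather than drawn at random.

```python
import numpy as np

# Reduced-order linear feedback: action = u_ff + M (s - s_ref), where
# M = B @ C is low-rank, so only a few linear combinations of the
# sensors actually drive the feedback. All values are illustrative.
n_sense, n_act, rank = 12, 6, 2       # hypothetical sensor/actuator counts

rng = np.random.default_rng(1)
B = rng.normal(size=(n_act, rank))    # reduced feedback -> actuator space
C = rng.normal(size=(rank, n_sense))  # sensors -> reduced feedback space
M = B @ C                             # rank-2 feedback matrix

u_ff = rng.normal(size=n_act)         # feedforward (open-loop) action
s_ref = rng.normal(size=n_sense)      # reference sensory state

def policy(s):
    """Linear feedback law: u = u_ff + M (s - s_ref)."""
    return u_ff + M @ (np.asarray(s) - s_ref)

print(np.linalg.matrix_rank(M))       # 2
```

The low-rank factorization is what makes the policy-search space compact: only the entries of `B` and `C` need to be optimized, rather than a full sensor-by-actuator gain matrix.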

View record

Physics-based animation of primate locomotion (2011)

Quadrupedal animals commonly appear in films and video games, and their locomotion is of interest to several research and industrial communities. Because of the difficulty of handling and motion-capturing these animals, physics-based animation is a promising method for synthesizing quadrupedal locomotion. In this thesis, we investigate control strategies for animating a gorilla model, as an example of primate quadrupeds.

We review the state of the art in quadrupedal animation and robotics, and in particular a control framework designed for a simulated dog. We investigate the essential control-strategy modifications necessitated by the unique characteristics of gorilla morphology and locomotion style. We generate controllers for physically realistic walking and trotting gaits for a 3D gorilla model. We also rig a 3D mesh model of a gorilla in Maya, a commercial animation package. Gorilla gait motions are synthesized in our simulation using the rigged skeleton, and the synthesized gaits are exported through a motion data pipeline back to Maya for rendering.

View record

Rising motion controllers for physically simulated characters (2011)

The control of physics-based simulated characters is an important open problem with potential applications in film, games, robotics, and biomechanics. While many methods have been developed for locomotion and quiescent stance, the problem of returning to a standing posture from a sitting or fallen posture has received much less attention. In this thesis, we develop controllers for biped sit-to-stand, quadruped getting-up, and biped prone-to-stand motions. These controllers are created from a shared set of simple components including pose tracking, root orientation correction, and virtual force based control. We also develop an optimization strategy that generates fast, dynamic rising motions from an initial statically stable motion. This strategy is also used to generalize controllers to sloped terrain and characters of varying size.

View record

Staged learning of agile motor skills (2011)

Motor learning lies at the heart of how humans and animals acquire their skills. Understanding this process enables many benefits in robotics, physics-based computer animation, and other areas of science and engineering. In this thesis, we develop a computational framework for the learning of agile, integrated motor skills. Our algorithm draws inspiration from the process by which humans and animals acquire their skills in nature. Specifically, all skills are learned through a process of staged, incremental learning, during which progressively more complex skills are acquired and subsequently integrated with prior abilities. Accordingly, our learning algorithm comprises three phases. In the first phase, a few seed motions that accomplish the goals of a skill are acquired. In the second phase, additional motions are collected through active exploration. Finally, the third phase generalizes from observations made in the second phase to yield a dynamics model that is relevant to the goals of the skill. We apply our learning algorithm to a simple, planar character in a physical simulation and learn a variety of integrated skills such as hopping, flipping, rolling, stopping, getting up, and continuous acrobatic maneuvers. Aspects of each skill, such as the length, height, and speed of the motion, can be interactively controlled through a user interface. Furthermore, we show that the algorithm can be used without modification to learn all skills for a whole family of parameterized characters of similar structure. Finally, we demonstrate that our approach also scales to a more complex quadruped character.

View record

Template-based sketch recognition using Hidden Markov Models (2011)

Sketch recognition is the process by which the objects in a hand-drawn diagram can be recognized and identified. We provide a method to recognize objects in sketches by casting the problem in terms of searching for known 2D template shapes in the sketch. The template is defined as an ordered polyline, and recognition requires searching for a similarly-shaped sequential path through the line segments that comprise the sketch. The search for the best-matching path can be modeled using a Hidden Markov Model (HMM). We use an efficient dynamic programming method to evaluate the HMM, with further optimizations based on the use of hand-drawn sketches. The technique we developed can cope with several issues that are common to sketches, such as small gaps and branching. We allow for objects with either open or closed boundaries by allowing backtracking over the templates. We demonstrate the algorithm for a variety of templates and scanned drawings. We show that a likelihood score produced by the results can provide a meaningful measure of similarity to a template. An example-based method is presented for setting a meaningful recognition threshold, which can allow further refinement of results when that template is used again. Limitations of the algorithm and directions for future work are discussed.
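The dynamic-programming evaluation of an HMM mentioned in this abstract is, in its generic form, the Viterbi algorithm. The sketch below shows that generic algorithm on a toy two-state model; the states, transition probabilities, and observations are made up and do not correspond to the thesis's sketch-specific model.

```python
import numpy as np

def viterbi(obs, start, trans, emit):
    """Most likely state path for a discrete-observation HMM,
    computed by dynamic programming in the log domain."""
    n_states = trans.shape[0]
    T = len(obs)
    logp = np.full((T, n_states), -np.inf)    # best log-prob ending in each state
    back = np.zeros((T, n_states), dtype=int) # backpointers for path recovery
    logp[0] = np.log(start) + np.log(emit[:, obs[0]])
    for t in range(1, T):
        for s in range(n_states):
            scores = logp[t - 1] + np.log(trans[:, s])
            back[t, s] = int(np.argmax(scores))
            logp[t, s] = scores[back[t, s]] + np.log(emit[s, obs[t]])
    path = [int(np.argmax(logp[-1]))]         # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(logp[-1].max())

# Toy model: state i mostly emits symbol i and tends to stay put.
start = np.array([0.9, 0.1])
trans = np.array([[0.8, 0.2], [0.2, 0.8]])
emit  = np.array([[0.9, 0.1], [0.1, 0.9]])
path, score = viterbi([0, 0, 1, 1, 1], start, trans, emit)
print(path)  # [0, 0, 1, 1, 1]
```

In the thesis's setting, the hidden states correspond to positions along the template polyline and the observations to line segments of the sketch, with the final log-likelihood serving as the similarity score.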

View record

