Renjie Liao

Assistant Professor

Relevant Thesis-Based Degree Programs

Graduate Student Supervision

Master's Student Supervision

Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay in adding the latest theses.

Diffusion models for visual content generation: challenges and insights (2025)

Significant advancements have recently been made in image and video generative models. Among these, diffusion models have demonstrated a strong capability for generating high-quality images and videos, thus inviting significant study within the field. However, despite these exciting achievements, diffusion models for visual content generation still face numerous challenges. In this thesis, we focus on two key challenges facing diffusion models and propose potential solutions to address them.

First, metrics for assessing generative models remain relatively underexplored, particularly in the domain of video generation. To bridge this research gap, we propose the Fréchet Video Motion Distance (FVMD) metric, which focuses on evaluating motion consistency in video generation. Specifically, we design explicit motion features based on key-point tracking and then measure the similarity between these features via the Fréchet distance. We conduct a sensitivity analysis by injecting noise into real videos to verify the effectiveness of FVMD. Further, we carry out a large-scale human study, demonstrating that our metric effectively detects temporal noise and aligns better with human perceptions of generated video quality than existing metrics.

Second, diffusion models face challenges in compositionality and interpretability. While humans understand images structurally, generative models typically generate all pixels simultaneously. Latent Diffusion Models, widely used in this domain, rely on continuous latent variables from Variational Autoencoders (VAEs), which lack interpretability and structure. To address this, we propose DiffuseDRAW, a novel framework incorporating structured latent variables with diffusion models. Our approach integrates non-parametric structured latent variables from NP-DRAW with discrete vector-quantized representations from VQ-GAN. Built upon VQ-GAN, our model transforms input images into combined discrete latent variables and applies a diffusion model in the discrete latent space. We model dependencies between structured and discrete latent variables using a Transformer backbone with cross-conditioning. Experiments on CIFAR-10 and LSUN datasets demonstrate that our model outperforms prior structured generative models and competes with state-of-the-art diffusion models. Moreover, its compositionality and interpretability offer significant advantages in zero-shot latent space editing.
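As an illustration of the Fréchet distance step mentioned in the abstract above, the following is a minimal sketch, not the thesis implementation. It assumes that each feature set is summarized by a Gaussian (mean and covariance), as is done in FID-style metrics; the abstract does not confirm this detail. The function name `frechet_distance` is hypothetical, and the key-point-tracking motion features are not shown: the inputs are simply arrays with one feature vector per row.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Assumption (not from the thesis): each set is an array of shape
    (num_samples, feature_dim) and is modeled by its mean and covariance.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^{1/2}) equals the sum of square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs. Clip tiny negative rounding errors.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

Under this Gaussian assumption, two identical feature sets give a distance of approximately zero, and shifting every feature vector by a constant offset increases the distance by the squared norm of that offset.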

Exploring Video Diffusion Models in Echocardiogram Generation: A Novel Approach to Data Augmentation in Cardiac Imaging (2025)

Ejection fraction (EF) serves as a critical indicator of cardiac function, traditionally assessed through expert clinicians' manual interpretation of echocardiograms. However, the labor-intensive nature of this process, along with inter-observer variability and data scarcity, highlights the need for automated and scalable solutions. This thesis explores the application of video diffusion models to generate synthetic echocardiograms as a means to augment limited datasets, thereby enhancing ejection fraction estimation models. By leveraging synthetic data generation, we aim to address data scarcity, enhance model performance, and validate the effectiveness of synthetic data in echocardiography (echo).

The proposed methodology integrates diffusion models with echo video data to create realistic cardiac echocardiograms tailored to each patient. We also develop a data augmentation framework aimed at improving EF estimation. Extensive experiments are conducted to evaluate the contribution of the synthetic data to EF prediction accuracy, focusing on scenarios with limited labeled data. Our results demonstrate that incorporating diffusion-augmented training data leads to improvements in both the accuracy and robustness of automated EF estimation models.

In addition, we investigate and present various strategies for the rapid generation of synthetic echocardiograms through model distillation. Our preliminary findings establish a foundation for future research in real-time echocardiogram synthesis, facilitating applications in clinical training and procedural guidance.

Ultimately, this thesis provides a novel approach to controlled synthetic data-driven augmentation, contributing to the broader field of cardiac imaging by enabling more efficient and precise diagnostic tools. This work advances the potential for scalable, Artificial Intelligence (AI)-driven cardiac assessments, offering enhanced accessibility to high-quality care in both high-resource and low-resource clinical environments.

Visual question answering with contextualized commonsense knowledge (2024)

Exploring algorithmic reasoning and memorization in transformers: challenges and insights (2023)

Renjie Liao's Profile

Membership Status

Member of G+PS

Program Affiliations

Electrical and Computer Engineering

Academic Unit(s)

Department of Electrical & Computer Engineering