Lang Wu


Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - May 2021)
Joint modelling of complex longitudinal and survival data, with applications to HIV studies (2019)

In HIV vaccine studies, a major research objective is to identify immune response biomarkers measured longitudinally that may be associated with risk of HIV infection. This objective can be assessed via joint modelling of longitudinal and survival data. Joint models for HIV vaccine data are complicated by the following issues: (i) left censoring of some longitudinal data due to lower limits of quantification; (ii) mixed types of longitudinal variables; (iii) measurement errors, missing values, and outliers in longitudinal data; and (iv) computational challenges associated with likelihood inference. In this thesis, we propose innovative joint models and methods for complex longitudinal and survival data to address the foregoing issues simultaneously. Specifically, we consider two approaches to handle left censored data and a robust method to address b-outliers and e-outliers in longitudinal data. For parameter estimation, we propose approximate likelihood estimation methods based on so-called h-likelihood, which are computationally much more efficient than “exact” or Monte Carlo methods such as Monte Carlo EM algorithms. We evaluate the performances of the models and methods via comprehensive simulation studies. Real data analyses are carried out in depth for an HIV vaccine study.
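
A common way to handle left censoring at a lower limit of quantification is to let each censored measurement contribute the probability of falling below the limit, rather than a density value. The sketch below illustrates that idea for a plain normal model with no random effects. It is not the joint model or the h-likelihood method of the thesis, and the function name and arguments are hypothetical.

```python
import math

def censored_normal_loglik(values, censored, mu, sigma):
    """Log-likelihood of independent normal data with left censoring.

    values[i] is the measurement, or the quantification limit
    when censored[i] is True (assumed convention for this sketch).
    """
    total = 0.0
    for y, is_censored in zip(values, censored):
        z = (y - mu) / sigma
        if is_censored:
            # Censored point: contributes log P(Y < limit) = log Phi(z).
            # Note: for extremely negative z the CDF can underflow to 0.
            cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
            total += math.log(cdf)
        else:
            # Observed point: contributes the normal log-density.
            total += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return total
```

Maximizing this function over `mu` and `sigma` uses the censored values as partial information instead of discarding them or substituting the limit itself.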


Multivariate one-sided tests for multivariate normal and mixed effects regression models with missing data, semi-continuous data and censored data (2018)

In many applications, statistical models for real data often have natural constraints or restrictions on some model parameters. For example, the growth rate of a child is expected to be positive, and patients receiving anti-HIV treatments are expected to exhibit a decline in their viral loads. Hypothesis testing for certain model parameters incorporating the natural constraints is expected to be more powerful than testing ignoring the constraints. Although constrained statistical inference, especially multi-parameter order-restricted hypothesis testing, has been studied in the literature for several decades, methods for models for complex longitudinal data are still very limited. In this thesis, we develop innovative multi-parameter order-restricted (or one-sided) hypothesis testing methods for modelling the following complex data: (1) multivariate normal data with non-ignorable missing values; (2) semi-continuous longitudinal data; and (3) left censored or truncated longitudinal data due to detection limits. We focus on testing mean parameters in the models, and the approaches are based on likelihood methods. Some asymptotic results are obtained, and some computational challenges are discussed. Simulation studies are conducted to evaluate the proposed methods. Several real datasets are analyzed to illustrate the power advantages of the proposed new tests.


Joint Inference of NLME and GLMM Models with Informative Censoring (2015)

Non-linear mixed effects models (NLME) and generalized linear mixed effects models (GLMM) are commonly used to model longitudinal processes. This thesis goes beyond single-process modelling and focuses on jointly modelling multiple longitudinal processes with different types of variables. In particular, we investigate methods for joint inference of NLME and GLMM models for the following three problems: (1) joint models of NLME and GLMM for complete data with NLME for the time-dependent mis-measured covariate and GLMM for the discrete longitudinal response; (2) joint models with the covariate subject to informative left censoring; and (3) joint models with informative right censoring with respect to both response and covariate. For each problem, we propose two joint modelling methods to obtain "exact" and approximate maximum likelihood estimates (MLEs) of all model parameters. Measurement errors and missing data are addressed simultaneously in a unified way. Some asymptotic results are also developed. The proposed methods are illustrated with an HIV dataset. Simulation results show that the joint modelling methods perform better than the commonly used naive method and two-step method.


Multivariate one-sided tests for multivariate normal and nonlinear mixed effects models with complete and incomplete data (2011)

Multivariate one-sided hypothesis testing problems arise frequently in practice. Various tests have been developed for multivariate normal data. However, only limited literature is available for multivariate one-sided testing problems in regression models. In particular, one-sided tests for nonlinear mixed effects (NLME) models, which are popular in many longitudinal studies, have not been studied yet, even in the case of complete data. In practice, there are often missing values in multivariate data and longitudinal data. In this case, standard testing procedures based on complete data may not be applicable or may perform poorly if the observations that contain missing data are discarded. In this thesis, we propose testing methods for multivariate one-sided testing problems in multivariate normal distributions with missing data and for NLME models with complete and incomplete data. In the missing data case, testing methods are based on multiple imputations. Some theoretical results are presented. The proposed methods are evaluated using simulations. Real data examples are presented to illustrate the methods.


Master's Student Supervision (2010 - 2020)
Jointly modeling longitudinal process with measurement errors, missing data, and outliers (2013)

In many longitudinal studies, several longitudinal processes may be associated. For example, a time-dependent covariate in a longitudinal model may be measured with errors or have missing data, so it needs to be modeled together with the response process in order to address the measurement errors and missing data. In such cases, a joint inference is appealing since it can incorporate information of all processes simultaneously. The joint inference is not only more efficient than separate inferences but it may also avoid possible biases. In addition, longitudinal data often contain outliers, so robust methods for the joint models are necessary. In this thesis, we discuss joint models for two correlated longitudinal processes with measurement errors, missing data, and outliers. We consider two-step methods and joint likelihood methods for joint inference, and propose robust methods based on M-estimators to address possible outliers for joint models. Simulation studies are conducted to evaluate the performances of the proposed methods, and a real AIDS dataset is analyzed using the proposed methods.


Two-Step and Joint Likelihood Methods for Joint Models (2012)

Survival data often arise in longitudinal studies, and the survival process and the longitudinal process may be related to each other. Thus, it is desirable to jointly model the survival process and the longitudinal process to avoid the possibly biased and inefficient results of separate inferences. We consider mixed effects models (LME, GLMM, and NLME models) for the longitudinal process, and Cox models and accelerated failure time (AFT) models for the survival process. The survival model and the longitudinal model are linked through shared parameters or unobserved variables. We consider a joint likelihood method and two-step methods to make joint inference for the survival model and the longitudinal model. We propose linear approximation methods for joint models with GLMM and NLME submodels to reduce the computational burden and allow the use of existing software. Simulation studies are conducted to evaluate the performances of the joint likelihood method and the two-step methods. It is concluded that the joint likelihood method outperforms the two-step methods.


Approximate methods for joint models in longitudinal studies (2010)

Longitudinal studies often involve several statistical issues, such as a longitudinal process and a time-to-event process, the association between which requires a joint modeling strategy. We first review recent research on joint modeling. After that, four popular inference methods are introduced for jointly analyzing longitudinal data and time-to-event data based on a combination of typical parametric models. However, some of them may suffer from non-ignorable bias of the estimators. Others may be computationally intensive or even lead to convergence problems. In this thesis, we propose an approximate likelihood-based simultaneous inference method for jointly modeling a longitudinal process and a time-to-event process with covariate measurement errors. By linearizing the joint model, we design a strategy for updating the random effects that connect the two processes, and propose two algorithm frameworks for different scenarios of the joint likelihood function. Both frameworks approximate the multidimensional integral in the observed-data joint likelihood by analytic expressions, which greatly reduces the computational intensity of the complex joint modeling problem. We apply this new method to a real dataset along with some available methods. The inference results provided by our new method agree with those from other popular methods and admit a sensible biological interpretation. We also conduct a simulation study to compare these methods. Our new method looks promising in terms of estimation precision, as well as computational efficiency, especially when more subjects are available. Conclusions and discussions for future research are given at the end.


Joint inference for longitudinal and survival data with incomplete time-dependent covariates (2010)

In many longitudinal studies, individual characteristics associated with their repeated measures may be covariates for the time to an event of interest. Thus, it is desirable to model both the survival process and the longitudinal process together. Statistical analysis may be complicated with missing data or measurement errors in the time-dependent covariates. This thesis considers a nonlinear mixed-effects model for the longitudinal process and the Cox proportional hazards model for the survival process. We provide a method based on the joint likelihood for nonignorable missing data, and we extend the method to the case of time-dependent covariates. We adapt a Monte Carlo EM algorithm to estimate the model parameters. We compare the method with the existing two-step method with some interesting findings. A real example from a recent HIV study is used as an illustration.


Wood Property Relationships and Survival Models in Reliability (2010)

It has been a topic of great interest in wood engineering to understand the relationships between the different strength properties of lumber and the relationships between the strength properties and covariates such as visual grading characteristics. In our mechanical wood strength tests, each piece fails (breaks) after surviving a continuously increasing load up to some level. The response of the test is the wood strength property, load-to-failure, which arises in a very different context from the standard time-to-failure data in biostatistics. This topic is also called reliability analysis in engineering. In order to describe the relationships among strength properties, we develop joint and conditional survival functions by both a parametric method and a nonparametric approach. However, each piece of lumber can only be tested to destruction with one method, which makes modeling these joint strength distributions challenging. In the past, this kind of problem has been solved by subjectively matching pieces of lumber, but the quality of this approach is then an issue. We apply the methodologies in survival analysis to the wood strength data collected in the FPInnovations (FPI) laboratory. The objective of the analysis is to build a predictive model that relates the strength properties to the recorded characteristics (i.e. a survival model in reliability). Our conclusion is that a type of wood defect (knot), a lumber grade status (off-grade: Yes/No) and a lumber's modulus of elasticity (moe) have statistically significant effects on wood strength. These significant covariates can be used to match pieces of lumber. This paper also supports use of the accelerated failure time (AFT) model as an alternative to the Cox proportional hazards (Cox PH) model in the analysis of survival data. Moreover, we conclude that the Weibull AFT model provides a much better fit than the Cox PH model in our data set with a satisfying predictive accuracy.
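
For readers unfamiliar with the Weibull AFT model, the sketch below evaluates its survival function when the response is load-to-failure rather than time. It assumes the standard log-linear form log(load) = intercept + coefs·covariates + scale·W, with W following the standard minimum extreme value distribution. The function name, arguments, and any parameter values are hypothetical and are not taken from the thesis.

```python
import math

def weibull_aft_survival(load, covariates, intercept, coefs, scale):
    """P(strength > load | covariates) under a Weibull AFT model.

    Assumed model: log(strength) = intercept + sum(coefs * covariates) + scale * W,
    where W has survival function exp(-exp(w)).
    """
    linear = intercept + sum(b * x for b, x in zip(coefs, covariates))
    w = (math.log(load) - linear) / scale
    return math.exp(-math.exp(w))
```

In this parameterization a positive coefficient shifts the whole strength distribution upward, so a piece with a larger value of that covariate is more likely to survive any given load.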



Membership Status

Member of G+PS