Jiahua Chen



Research Interests

empirical likelihood
finite mixture model
sample survey
asymptotic theory


Research Options

I am available and interested in collaborations (e.g. clusters, grants).
I am interested in and conduct interdisciplinary research.
I am interested in working with undergraduate students on research projects.


Master's students
Doctoral students

I have research problems related to empirical likelihood, mixture models, and survey sampling. It is up to students to explore them and take on the adventure.

To conduct research under my supervision, a strong technical foundation is essential. While many students may manage to solve challenging assignment problems through guesswork, this approach falls short of my expectations. What I value most is the ability to provide answers rooted in fundamental principles, rather than merely mimicking solutions to specific assignments.

Furthermore, I encourage a focus on the question, 'Is the statement true?' rather than 'Is my answer acceptable?' It's crucial to prioritize the objective truth of a statement over whether it aligns with pre-conceived notions.

Additionally, it's important to be prepared for rigorous questioning on every statement made and to substantiate each claim with logical reasoning rather than relying on emotional arguments. Your best effort does not validate any scientific proposition.

I support experiential learning, such as internships and work placements, for my graduate students and postdocs.
I am interested in hiring Co-op students for research placements.

Complete these steps before you reach out to a faculty member!

Check requirements
  • Familiarize yourself with program requirements. You want to learn as much as possible from the information available to you before you reach out to a faculty member. Be sure to visit the graduate degree program listing and program-specific websites.
  • Check whether the program requires you to seek commitment from a supervisor prior to submitting an application. For some programs this is an essential step while others match successful applicants with faculty members within the first year of study. This is either indicated in the program profile under "Admission Information & Requirements" - "Prepare Application" - "Supervision" or on the program website.
Focus your search
  • Identify specific faculty members who are conducting research in your specific area of interest.
  • Establish that your research interests align with the faculty member’s research interests.
    • Read up on the faculty members in the program and the research being conducted in the department.
    • Familiarize yourself with their work, read their recent publications and past theses/dissertations that they supervised. Be certain that their research is indeed what you are hoping to study.
Make a good impression
  • Compose an error-free and grammatically correct email addressed to your specifically targeted faculty member, and remember to use their correct titles.
    • Do not send non-specific, mass emails to everyone in the department hoping for a match.
    • Address the faculty members by name. Your contact should be genuine rather than generic.
  • Include a brief outline of your academic background, why you are interested in working with the faculty member, and what experience you could bring to the department. The supervision enquiry form guides you with targeted questions. Be sure to craft compelling answers to these questions.
  • Highlight your achievements and why you are a top student. Faculty members receive dozens of requests from prospective students and you may have less than 30 seconds to pique someone’s interest.
  • Demonstrate that you are familiar with their research:
    • Convey the specific ways you are a good fit for the program.
    • Convey the specific ways the program/lab/faculty member is a good fit for the research you are interested in/already conducting.
  • Be enthusiastic, but don’t overdo it.
Attend an information session

G+PS regularly provides virtual sessions that focus on admission requirements and procedures and offer tips on how to improve your application.




Graduate Student Supervision

Doctoral Student Supervision

Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.

Inference under finite mixture models: distributed learning and approximate inference (2022)

Finite mixture models are widely used to model data that exhibit heterogeneity. In machine learning, they are often used as probabilistic models for clustering analysis. In the application of finite mixtures to real datasets, the most fundamental task is to learn model parameters. In modern applications, datasets are often too large to be stored in a single facility and are distributed across data centres. To learn models on these distributed datasets, split-and-conquer (SC) approaches are often used. SC approaches consist of two steps: they first locally learn one model per data centre and then send these local results to a central machine to be aggregated. Since the parameter spaces of mixtures are non-Euclidean, existing aggregation methods are not appropriate. We develop a novel computationally efficient aggregation approach for SC learning of finite mixtures. We show that the resulting estimator is root-n-consistent under some general conditions. Experiments show the proposed approach has comparable statistical performance with the global estimator based on the full dataset, if the latter is feasible. It also has better statistical and computational performance than some existing methods for learning mixtures on large datasets.

Finite mixtures are also widely used to approximate density functions of various shapes. When mixtures are used in graphical models to approximate density functions, the order of the mixture increases exponentially due to recursive procedures and the inference becomes intractable. One way to make the inference tractable is to use mixture reduction, that is, to approximate the mixture by one with a lower order. We propose a novel reduction approach by minimizing the Composite Transportation Divergence (CTD) between two mixtures. The optimization problem can be solved by a majorization minimization algorithm. We show that many existing algorithms for reduction are special cases of our approach. This finding allows us to provide theoretical support for these existing algorithms. Our approach also allows flexible choices for cost functions in CTD. This flexibility allows our approach to have better performance than existing approaches. We also discuss other related learning issues under finite mixtures.


Semiparametric inferences under a density ratio model (2022)

In many applications, we collect independent samples from interconnected populations. These population distributions share some latent structures, so it is advantageous to jointly analyze the samples. Recently, many researchers have advocated the use of the semiparametric density ratio model (DRM) to account for the latent structures these distributions share and have developed more efficient data analysis procedures based on pooled data. The advantages and several asymptotic properties of the DRM-based inferences have been demonstrated in many fields and studies, and they show that the DRM helps to improve statistical efficiency. In this thesis, we investigate several inference problems related to the DRM. The first research problem we study is on the efficiency of the inference under a two-sample DRM. We consider a scenario where we have two samples whose sizes grow to infinity at different rates. The DRM-based inferences for the smaller-sized sample are studied. We find that some DRM-based estimators achieve the same asymptotic efficiency as the parametric estimators under some parametric model assumptions. Our simulation studies support our theoretical results. Our second work studies hypothesis test problems on population quantiles when we have multiple samples whose population distributions are connected via a DRM. We explore the use of the empirical likelihood ratio test for these hypotheses, which fills a gap in the literature in this context. Our major contribution is the derivation of the limiting chi-square distribution of the test statistic. Simulation experiments and a real-data example illustrate the efficacy of the proposed method. Finally, we solve an important open problem in the literature of DRM. The DRM postulates that the log density ratios are linear combinations of prespecified basis functions. The benefit of DRM relies on correctly specifying the basis functions. 
However, in applications, we do not have complete knowledge to enable a perfect choice of the basis functions. A data-adaptive choice can alleviate the risk of severe model misspecification. We propose a data-adaptive approach to the choice of basis functions based on functional principal component analysis. Our simulations and real-data analyses demonstrate that our proposed method leads to an efficiency gain.


Some research problems under finite mixture models (2021)

This thesis studies two types of research problems under finite mixture models. The first type is mixing distribution estimation. It is well-known that the maximum likelihood estimator (MLE) fails under some finite mixture models because their likelihood function is unbounded. This unboundedness occurs, for instance, under the finite normal mixture model, the finite gamma mixture model, and the finite location-scale mixture model. In the literature, different estimation methods have been developed by modifying the likelihood functions to restore consistency. The penalized MLE is one of the popular remedies. Though the consistency of penalized MLE has been studied extensively by many researchers, the results were often acquired under finite mixture models with some specific forms of kernel distribution. We provide a route to establish the consistency of penalized MLE under a unified framework that covers many finite mixture models. This route summarizes and helps to improve the existing results. We also study a novel way to modify the likelihood function based on data augmentation. This modified likelihood function produces a consistent estimator, the augmented MLE, under those finite mixture models with unbounded likelihood functions. In some circumstances, the augmented MLE is more efficient than its competitors in the literature.

The second type of research problem is hypothesis testing for homogeneity under finite mixture models. We develop two tests for this purpose. First, our work migrates the Expectation-Maximization (EM) test to a finite vector-parameter mixture model with structural parameters. The EM test is a numerically and analytically tractable likelihood-based test, which has been applied to many finite mixture models. The second test is Neyman’s C(α) test, a score test variant. We generalize this test to finite vector-parameter mixture models following the principles of Neyman. 
Our generalization aligns with some existing results in the literature, though they are motivated differently. Yet we find this generalized test can be asymptotically biased under the finite gamma mixture model. To overcome this deficiency, we develop another C(α) test that we conjecture to be asymptotically unbiased.


Sequential ED-design for binary dose-response experiments (2018)

Dose-response experiments and subsequent data analyses are often carried out according to optimal designs for the purpose of accurately determining a specific effective dose (ED) level. If the interest is the dose-response relationship over a range of ED levels, many existing optimal designs are not accurate. In this dissertation, we propose a new design procedure, called two-stage sequential ED-design, which directly and simultaneously targets several ED levels. We use a small number of trials to provide a tentative estimation of the model parameters. The doses of the subsequent trials are then selected sequentially, based on the latest model information, to maximize the efficiency of the ED estimation over several ED levels. Although the commonly used logistic and probit models are convenient summaries of the dose-response relationship, they can be too restrictive. We introduce and study a more flexible albeit slightly more complex three-parameter logistic dose-response model. We explore the effectiveness of the sequential ED-design and the D-optimal design under this model, and develop an effective model fitting strategy. We develop a two-step iterative algorithm to compute the maximum likelihood estimate of the model parameters. We prove that the algorithm iteration increases the likelihood value, and therefore will lead to at least a local maximum of the likelihood function. We also study the numerical solution to the D-optimal design for the three-parameter logistic model. Interestingly, all our numerical solutions to the D-optimal design are three-point-support distributions.

We also discuss the use of the ED-design when experimental subjects become available in groups. We introduce the group sequential ED-design, and demonstrate how to construct this design. The ED-design has a natural extension to more complex models and can satisfy a broad range of the demands that may arise in applications.


On Dual Empirical Likelihood Inference under Semiparametric Density Ratio Models in the Presence of Multiple Samples: With Applications to Long Term Monitoring of Lumber Quality (2014)

Maintaining a high quality of lumber products is of great social and economic importance. This thesis develops theories as part of a research program aimed at developing a long term program for monitoring change in the strength of lumber. These theories are motivated by two important tasks of the monitoring program: testing for change in strength populations of lumber produced over the years, and making statistical inference on strength populations based on Type I censored lumber samples. Statistical methods for these inference tasks should ideally be efficient and nonparametric. These desiderata lead us to adopt a semiparametric density ratio model to pool the information across multiple samples and use the nonparametric empirical likelihood as the tool for statistical inference.

We develop a dual empirical likelihood ratio test for composite hypotheses about the parameter of the density ratio model based on independent samples from different populations. This test encompasses testing differences in population distributions as a special case. We find the proposed test statistic to have a classical chi-square null limiting distribution. We also derive the power function of the test under a class of local alternatives. It reveals that the local power is often increased when strength is borrowed from additional samples even when their underlying distributions are unrelated to the hypothesis of interest. Simulation studies show that this test has better power properties than all potential competitors adapted to the multiple sample problem under investigation, and is robust to model misspecification. The proposed test is then applied to assess strength properties of lumber with intuitively reasonable implications for the forest industry.

We also establish a powerful inference framework for performing empirical likelihood inference under the density ratio model when Type I censored samples are present. 
This inference framework centers on the maximization of a concave dual partial empirical likelihood function, and features an easy computation. We study the properties of this dual partial empirical likelihood, and find its corresponding likelihood ratio test to have a simple chi-square limiting distribution under the null model and a non-central chi-square limiting distribution under local alternatives.


Applications of penalized likelihood methods for feature selection in statistical modeling (2012)

Feature selection plays a pivotal role in knowledge discovery and contemporary scientific research. Traditional best subset selection or stepwise regression can be computationally expensive or unstable in the selection process, and so various penalized likelihood methods (PLMs) have received much attention in recent decades. In this dissertation, we develop approaches based on PLMs to deal with the issues of feature selection arising from several application fields.

Motivated by genomic association studies, we first address feature selection in ultra-high-dimensional situations, where the number of candidate features can be huge. Reducing the dimension of the data is essential in such situations. We propose a novel screening approach via the sparsity-restricted maximum likelihood estimator that removes most of the irrelevant features before the formal selection. The model after screening serves as an excellent starting point for the use of PLMs. We establish the screening and selection consistency of the proposed method and develop efficient algorithms for its implementation.

We next turn our attention to the analysis of complex survey data, where the identification of influential factors for certain behavioral, social, and economic indices forms a variable selection problem. When data are collected through survey sampling from a finite population, they have an intrinsic dependence structure and may provide a biased representation of the target population. To avoid distorted conclusions, survey weights are usually adopted in these analyses. We use a pseudo-likelihood to account for the survey weights and propose a penalized pseudo-likelihood method for the variable selection of survey data. The consistency of the proposed approach is established for the joint randomization framework.

Lastly, we address order selection for finite mixture models, which provide a flexible tool for modeling data from a heterogeneous population. PLMs are attractive for such problems. 
However, this application requires maximizations over nonsmooth and nonconcave objective functions, which are computationally challenging. We transform the original multivariate objective function into a sum of univariate functions and design an iterative thresholding-based algorithm to efficiently solve the sparse maximization without ad hoc steps. We establish the convergence of the new algorithm and illustrate its efficiency through both simulations and real-data examples.


Master's Student Supervision

Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.

On the improvement of density ratio estimation via probabilistic classifier : theoretical study and its applications (2023)

Density ratio estimation has broad applications in machine learning and data science, especially in transfer learning and contrastive learning. This work mainly focuses on a particular type of density ratio estimation based on probabilistic classification, from the perspective of statistical inference. We show that such a density ratio estimation relates to a probabilistic classifier such as logistic regression. We analyze the potential cause for its inefficiency and inaccuracy when the two distributions are very different from each other. Contrary to the goal of probabilistic classification, a density ratio estimation task with a more efficient estimator indicates that the corresponding classification task is harder, which means it is more difficult to separate the two samples by a probabilistic classifier. We provide a theoretical explanation for this phenomenon from a mathematical and statistical standpoint. For the basic density ratio estimation by probabilistic classification, we give a necessary and sufficient condition for its existence at the sample level. We analyze the probability that such a condition holds asymptotically if the supports of the two densities are the same. In addition, we explore the asymptotic properties of a recently proposed approach to improving density ratio estimation by probabilistic classification, Telescoping Density Ratio Estimation (TDRE) (B. Rhodes, K. Xu, and M. U. Gutmann. Telescoping density-ratio estimation. Advances in neural information processing systems, 33:4905–4916, 2020). Numerically, we compare the asymptotic variance of basic density ratio estimation and TDRE. We also explore some generalizations of TDRE with unbalanced data and under some (partial) model misspecification through both theoretical discussion and empirical analysis. Some suggestions for future work on un-normalized model inference are also provided.


An R package for monitoring test under density ratio model and its applications (2018)

Quantiles and their functions are important population characteristics in many applications. In forestry, lower quantiles of the modulus of rupture and other mechanical properties of the wood products are important quality indices. It is important to ensure that the wood products in the market over the years meet the established industrial standards. Two well-known risk measures in finance and hydrology, value at risk (VaR) and median shortfall (MS), are quantiles of their corresponding marginal distributions. Developing effective statistical inference methods and tools on quantiles of interest is an important task in both theory and applications.

When samples from multiple similar natured populations are available, Chen et al. [2016] proposed to use a density ratio model (DRM) to characterize potential latent structures in these populations. The DRM enables us to fully utilize the information contained in the data from connected populations. They further proposed a composite empirical likelihood (CEL) to avoid a parametric model assumption that is subject to model misspecification risk and to accommodate clustered data structure. A cluster-based bootstrap procedure was also investigated for variance estimation, construction of confidence intervals, and tests of various hypotheses.

This thesis contains complementary developments to Chen et al. [2016]. First, a user-friendly R package is developed to make their methods easy to use for practitioners. We also include some diagnostic tools to allow users to investigate the goodness of fit of the density ratio model. Second, we use simulation to compare the performance of the DRM-CEL-based test and the famous Wilcoxon rank test for clustered data. Third, we study the performance of DRM-CEL-based inference when the data set contains observations with different cluster sizes. The simulation results show that the DRM-CEL method works well in common situations.


Small area quantile estimation under unit-level models (2017)

Sample surveys are widely used as a cost-effective way to collect information on variables of interest in target populations. In applications, we are generally interested in parameters such as population means, totals, and quantiles. Similar parameters for subpopulations or areas, formed by geographic areas and socio-demographic groups, are also of interest in applications. However, the sample size might be small or even zero in subpopulations due to the probability sampling and the budget limitation. There has been intensive research on how to produce reliable estimates for characteristics of interest for subpopulations for which the sample size is small or even zero. We call this line of research Small Area Estimation (SAE). In this thesis, we study the performance of a number of small area quantile estimators based on a popular unit-level model and its variations. When a finite population can be regarded as a sample from some model, we may use the whole sample from the finite population to determine the model structure with good precision. The information can then be used to produce more reliable estimates for small areas. However, if the model assumption is wrong, the resulting estimates can be misleading and their mean squared errors can be underestimated. Therefore, it is critical to check the robustness of estimators under various model misspecification scenarios. In this thesis, we first conduct simulation studies to investigate the performance of three small area quantile estimators in the literature. They are found not to be very robust in some likely situations. Based on these observations, we propose an approach to obtain more robust small area quantile estimators. Simulation results show that the proposed new methods have superior performance when either the error distribution in the model is non-normal or the data set contains many outliers.


Generalized method of moments - theoretical, econometric and simulation studies (2011)

The GMM estimator is widely used in the econometrics literature. This thesis mainly focuses on three aspects of the GMM technique. First, I derive proofs to study the asymptotic properties of the GMM estimator under certain conditions. To the best of my knowledge, the original complete proofs proposed by Hansen (1982) are not easily available. In this thesis, I provide complete proofs of consistency and asymptotic normality of the GMM estimator under some stronger assumptions than those in Hansen (1982). Second, I illustrate the application of the GMM estimator in linear models. Specifically, I emphasize the economic reasons underlying the linear statistical models where the GMM estimator (also referred to as the Instrumental Variable estimator) is widely used. Third, I perform several simulation studies to investigate the performance of the GMM estimator under different situations.


Properties of Empirical and Adjusted Empirical Likelihood (2010)

Likelihood based statistical inferences have been advocated by generations of statisticians. As an alternative to the traditional parametric likelihood, empirical likelihood (EL) is appealing for its nonparametric setting and desirable asymptotic properties.

In this thesis, we first review and investigate the asymptotic and finite-sample properties of the empirical likelihood, particularly its implication to constructing confidence regions for population mean. We then study the properties of the adjusted empirical likelihood (AEL) proposed by Chen et al. (2008). The adjusted empirical likelihood was introduced to overcome the shortcomings of the empirical likelihood when it is applied to statistical models specified through general estimating equations. The adjusted empirical likelihood preserves the first order asymptotic properties of the empirical likelihood and its numerical problem is substantially simplified.

A major application of the empirical likelihood or adjusted empirical likelihood is the construction of confidence regions for the population mean. In addition, we discover that adjusted empirical likelihood, like empirical likelihood, has an important monotonicity property.

One major discovery of this thesis is that the adjusted empirical likelihood ratio statistic is always smaller than the empirical likelihood ratio statistic. It implies that the AEL-based confidence regions always contain the corresponding EL-based confidence regions and hence have higher coverage probability. This result has been observed in many empirical studies, and we prove it rigorously.

We also find that the original adjusted empirical likelihood as specified by Chen et al. (2008) has a bounded likelihood ratio statistic. This may result in confidence regions of infinite size, particularly when the sample size is small. We further investigate approaches to modify the adjusted empirical likelihood so that the resulting confidence regions of population mean are always bounded.



  • A discussion of ‘A selective review on calibration information from similar studies based on parametric likelihood or empirical likelihood’ (2022)
    Statistical Theory and Related Fields
  • Consistency of the MLE under a two-parameter Gamma mixture model with a structural shape parameter (2022)
  • Strong consistency of the MLE under two-parameter Gamma mixture models with a structural scale parameter (2022)
    Advances in Data Analysis and Classification, 16 (1), 125--154
  • A three-parameter logistic regression model (2021)
    Statistical Theory and Related Fields, 5 (3), 265--274
  • Composite empirical likelihood for multisample clustered data (2021)
    Journal of Nonparametric Statistics, 1--22
  • Homogeneity testing under finite location‐scale mixtures (2020)
    Canadian Journal of Statistics
  • Small Area Quantile Estimation (2019)
    International Statistical Review
  • Consistency of the MLE under Mixture Models (2017)
    Statistical Science, 32 (1), 47--63
  • Hypothesis testing in the presence of multiple samples under density ratio models (2017)
    Statistica Sinica
  • Composite likelihood under hidden Markov model (2016)
    Statistica Sinica, 26 (4), 1569--1586
  • Consistency of the penalized MLE for two-parameter gamma mixture models (2016)
    Science China Mathematics, 59 (12), 2301--2318
  • Empirical Likelihood Inference Under Density Ratio Models Based on Type I Censored Samples: Hypothesis Testing and Quantile Estimation (2016)
    Advanced Statistical Methods in Data Science, 123--151
  • Monitoring test under nonparametric random effects model (2016)
    arXiv preprint arXiv:1610.05809
  • Package ‘MixtureInf’ (2016)
  • Regularization in Regime-Switching Gaussian Autoregressive Models (2016)
    Advanced Statistical Methods in Data Science, 13--34
  • Sample-size calculation for tests of homogeneity (2016)
    Canadian Journal of Statistics, 44 (1), 82--101
  • Sequential design for binary dose-response experiments (2016)
    Journal of Statistical Planning and Inference, 177, 64--73
  • Small area estimation under density ratio model (2016)
  • Testing the Order of a Normal Mixture in Mean (2016)
    Communications in Mathematics and Statistics, 4 (1), 21--38
  • A Thresholding Algorithm for Order Selection in Finite Mixture Models (2015)
    Communications in Statistics-Simulation and Computation, 44 (2), 433--453
  • Likelihood Ratio Test for Multi-Sample Mixture Model and Its Application to Genetic Imprinting (2015)
    Journal of the American Statistical Association, 110 (510), 867--877
  • Resampling calibrated adjusted empirical likelihood (2015)
    Canadian Journal of Statistics, 43 (1), 42--59
  • The joy of proofs in statistical research (2015)
    Canadian Journal of Statistics, 43 (4), 481--497
  • Building a Classification Model Based on miRNA Data (2014)
    Banff International Research Station for Mathematical Innovation and Discovery
  • Level-specific correction for nonparametric likelihoods (2014)
    Journal of Nonparametric Statistics, 26 (3), 433--449
  • The sparse MLE for ultrahigh-dimensional feature screening (2014)
    Journal of the American Statistical Association, 109 (507), 1257--1269
  • A Markov regime-switching model for crude-oil markets: Comparison of composite likelihood and full likelihood (2013)
    Canadian Journal of Statistics, 41 (2), 353--367
  • A partial order on uncertainty and information (2013)
    Journal of Theoretical Probability, 1--11
  • Finite-sample properties of the adjusted empirical likelihood (2013)
    Journal of Nonparametric Statistics, 25 (1), 147--159
  • Quantile and quantile-function estimations under density ratio model (2013)
    The Annals of Statistics, 41 (3), 1669--1692
  • Adjusted empirical likelihood with high-order one-sided coverage precision (2012)
    Statistics and its Interface, 5 (3), 281--292
  • Extended BIC for small-n-large-P sparse GLM (2012)
    Statistica Sinica, 22 (2), 9-9
  • Inference on the order of a normal mixture (2012)
    Journal of the American Statistical Association, 107 (499), 1096--1105
  • Order selection in finite mixture models with a nonsmooth penalty (2012)
    Journal of the American Statistical Association
  • Partial monotonicity of entropy measures (2012)
    Statistics & Probability Letters, 82 (11), 1935--1940
  • A pseudo-GEE approach to analyzing longitudinal surveys under imputation for missing responses (2011)
    Journal of Official Statistics, 27 (2), 255
  • Constructing nonparametric likelihood confidence regions with high order precisions (2011)
    Statistica Sinica, 1767--1783
  • The Limiting Distribution of the EM Test of the Order of a Finite Mixture (2011)
    Mixtures: Estimation and Applications, 55--75
  • Tuning the EM-test for finite mixture models (2011)
    Canadian Journal of Statistics, 39 (3), 389--404
  • Adjusted empirical likelihood with high-order precision (2010)
    The Annals of Statistics, 38 (3), 1341--1362
  • Confidence intervals for the mean of a population containing many zero values under unequal-probability sampling (2010)
    Canadian Journal of Statistics, 38 (4), 582--597
  • Empirical likelihood based variable selection (2010)
    Journal of Statistical Planning and Inference, 140 (4), 971--981
  • Feature selection in finite mixture of sparse normal linear models in high-dimensional feature space (2010)
    Biostatistics, 12 (1), 156--172
  • Testing the order of a finite mixture (2010)
    Journal of the American Statistical Association, 105 (491), 1084--1092
  • The pseudo-GEE approach to the analysis of longitudinal surveys (2010)
    Canadian Journal of Statistics, 38 (4), 540--554
  • Uncertainty and the conditional variance (2010)
    Statistics & Probability Letters, 80 (23), 1764--1770
  • Adjusted exponentially tilted likelihood with applications to brain morphology (2009)
    Biometrics, 65 (3), 919--927
  • Hypothesis test for normal mixture models: The EM approach (2009)
    The Annals of Statistics, 2523--2542
  • Inference for multivariate normal mixtures (2009)
    Journal of Multivariate Analysis, 100 (7), 1367--1383
  • Modified likelihood ratio test for homogeneity in a two-sample problem (2009)
    Statistica Sinica, 1603--1619
  • Non-finite Fisher information and homogeneity: an EM approach (2009)
    Biometrika, 96 (2), 411--426
  • Tournament screening cum EBIC for feature selection with high-dimensional feature spaces (2009)
    Science in China Series A: Mathematics, 52 (6), 1327--1341
  • Adjusted empirical likelihood and its properties (2008)
    Journal of Computational and Graphical Statistics, 17 (2), 426--443
  • Extended Bayesian information criteria for model selection with large model spaces (2008)
    Biometrika, 95 (3), 759--771
  • Inference for normal mixtures in mean and variance (2008)
    Statistica Sinica, 443--465
  • Modified likelihood ratio test for homogeneity in a mixture of von Mises distributions (2008)
    Journal of Statistical Planning and Inference, 138 (3), 667--681
  • Test for homogeneity in Hardy--Weinberg normal mixture model (2008)
    Journal of Statistical Planning and Inference, 138 (12), 3774--3788
  • Testing homogeneity in a mixture of von Mises distributions with a structural parameter (2008)
    Canadian Journal of Statistics, 36 (1), 129--142
  • U-statistic based modified information criterion for change point problems (2008)
    Communications in Statistics - Theory and Methods, 37 (17), 2687--2712
  • Asymptotic normality under two-phase sampling designs (2007)
    Statistica Sinica, 1047--1064
  • Asymptotic properties of likelihood ratio test statistics in affected-sib-pair analysis (2007)
    Canadian Journal of Statistics, 35 (3), 351--364
  • Consistency of the constrained maximum likelihood estimator in finite normal mixture models (2007)
    Proceedings of the American Statistical Association, American Statistical Association, Alexandria, VA, 2113--2119
  • Variable selection in finite mixture of regression models (2007)
    Journal of the American Statistical Association, 102 (479), 1025--1038
  • A tournament approach to the detection of multiple associations in genome-wide studies with pedigree data (2006)
  • Application of modified information criterion to multiple change point problems (2006)
    Journal of Multivariate Analysis, 97 (10), 2221--2241
  • Information criterion and change point problem for regular models (2006)
    Sankhyā: The Indian Journal of Statistics, 252--282
  • Sampling and Experimental Design (2006)
    Department of Statistics and Actuarial Science, University of Waterloo, Ontario
  • Testing for homogeneity in genetic linkage analysis (2006)
    Statistica Sinica, 805--823
  • A Bartlett type correction for Rao's score test in Cox regression model (2005)
    Sankhyā: The Indian Journal of Statistics, 722--735
  • Analysis of performance measures in experimental designs using jackknife (2005)
    Journal of Quality Technology, 37 (2), 91
  • Modified likelihood ratio test in finite mixture models with a structural parameter (2005)
    Journal of Statistical Planning and Inference, 129 (1), 93--107
  • The universal validity of the possible triangle constraint for affected sib pairs (2005)
    Canadian Journal of Statistics, 33 (2), 297--310
  • Estimation of fish abundance indices based on scientific research trawl surveys (2004)
    Biometrics, 60 (1), 116--123
  • Testing for a finite mixture model with two components (2004)
    Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66, 95--115
  • Empirical Bayes estimation and its superiority for two-way classification model (2003)
    Statistics & Probability Letters, 63 (2), 165--175
  • Empirical likelihood confidence intervals for the mean of a population containing many zero values (2003)
    Canadian Journal of Statistics, 31 (1), 53--68
  • Information-theoretic approach for detecting change in the parameters of a normal model (2003)
    Mathematical Methods of Statistics, 12 (1), 116
  • On the single item fill rate for a finite horizon (2003)
    Operations Research Letters, 31 (2), 119--123
  • Tests for homogeneity in normal mixtures in the presence of a structural parameter (2003)
    Statistica Sinica, 351--365
  • Tests for homogeneity in normal mixtures with presence of a structural parameter: technical details (2003)
    Preprint, Statistica Sinica
  • Estimation of distribution function and quantiles using the model-calibrated pseudo empirical likelihood method (2002)
    Statistica Sinica, 1223--1239
  • Statistical inference on comparing two distribution functions with a possible crossing point (2002)
    Statistics & Probability Letters, 60 (3), 329--341
  • Using empirical likelihood methods to obtain range restricted weights in regression estimators for surveys (2002)
    Biometrika, 89 (1), 230--237
  • A diagnostic tool for mixture models (2001)
    Journal of Statistical Computation and Simulation, 69 (4), 293--313
  • A modified likelihood ratio test for homogeneity in finite mixture models (2001)
    Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63 (1), 19--29
  • Jackknife variance estimation for nearest-neighbor imputation (2001)
    Journal of the American Statistical Association, 96 (453), 260--269
  • Large sample distribution of the likelihood ratio test for normal mixtures (2001)
    Statistics & Probability Letters, 52 (2), 125--133
  • The likelihood ratio test for homogeneity in finite mixture models (2001)
    Canadian Journal of Statistics, 29 (2), 201--215
  • Bahadur representations of the empirical likelihood quantile processes (2000)
    Journal of Nonparametric Statistics, 12 (5), 645--660
  • Efficient random imputation for missing data in complex surveys (2000)
    Statistica Sinica, 1153--1169
  • Empirical likelihood inference in the presence of measurement error (2000)
    Canadian Journal of Statistics, 28 (4), 841--852
  • Hybrid resampling methods for confidence intervals - Comment (2000)
    Statistica Sinica, 10 (1), 40--42
  • Nearest neighbor imputation for survey data (2000)
    Journal of Official Statistics, 16 (2), 113
  • A pseudo empirical likelihood approach to the effective use of auxiliary information in complex surveys (1999)
    Statistica Sinica, 385--406
  • Computational methods for mixture estimation (1999)
  • Constrained nonparametric maximum-likelihood estimation for mixture models (1998)
    Canadian Journal of Statistics, 26 (4), 601--617
  • Geometric quality inspection (1998)
    Statistica Sinica, 8 (1), 135--149
  • Intelligent search for 2^(13-6) and 2^(14-7) minimum aberration designs (1998)
    Statistica Sinica, 8 (4), 1265--1270
  • On the identifiability of a supersaturated design (1998)
    Journal of Statistical Planning and Inference, 72 (1), 99--107
  • Penalized likelihood-ratio test for finite mixture models with multinomial observations (1998)
    Canadian Journal of Statistics, 26 (4), 583--599
  • Fractional Resolution and Minimum Aberration in Blocked 2^(n-k) Designs (1997)
    Technometrics, 39 (4), 382--390
  • On testing the number of components in finite mixture models with known relevant component distributions (1997)
    Canadian Journal of Statistics, 25 (3), 389--400
  • On the conditional and mixture model approaches for matched pairs (1996)
    Journal of Statistical Planning and Inference, 55 (3), 319--329
  • Penalized minimum-distance estimates in finite mixture models (1996)
    Canadian Journal of Statistics, 24 (2), 167--175
    Annals of Statistics, 23 (1), 221--233
  • Generalized likelihood-ratio test of the number of components in finite mixture models (1994)
    Canadian Journal of Statistics, 22 (3), 387--399
  • Inverse problems in fractal construction: Hellinger distance method (1994)
    Journal of the Royal Statistical Society. Series B (Methodological), 687--700
  • A catalogue of two-level and three-level fractional factorial designs with small runs (1993)
    International Statistical Review/Revue Internationale de Statistique, 131--145
  • Edgeworth expansion and the bootstrap for stratified sampling without replacement from a finite population (1993)
    Canadian Journal of Statistics, 21 (4), 347--357
  • Empirical likelihood estimation for finite populations and the effective usage of auxiliary information (1993)
    Biometrika, 80 (1), 107--116
    Annals of Statistics, 21 (2), 1071--1092
  • Geometric quality assurance (1992)
    Institute for Improvement in Quality and Productivity Research Report RR-92-06, University of Waterloo, Ontario
    Annals of Statistics, 20 (4), 2124--2141
  • On the identity relationships of 2^(k-p) designs (1991)
    Journal of Statistical Planning and Inference, 28 (1), 95--98
    Annals of Statistics, 19 (2), 1028--1041
  • On minimum aberration fractional factorial designs (1990)
  • Weak and strong representations for quantile processes from finite populations with application to simulation size in resampling inference (1990)
    Canadian Journal of Statistics, 18 (2), 141--148
