Victoria Savalei




Graduate Student Supervision

Doctoral Student Supervision

Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.

Exploring the impact of different missing data mechanisms on the efficiency of parameter estimates (2022)

Modern missing data techniques, such as full information maximum likelihood (FIML) and multiple imputation (MI), have become more accessible and popular in recent years. When data are missing at random (MAR), these techniques produce consistent parameter estimates, correct standard errors, and valid statistical inferences, without the need for researchers to specify the details of the underlying missing data mechanisms. However, details of MAR mechanisms can affect the efficiency of parameter estimates. Under current practice, this efficiency loss is typically measured only by the rate of missing data; yet, even when the rate of missing data is held constant, variations in the MAR mechanism can lead to different degrees of efficiency loss. As a result, the statistical power of psychological studies can be affected by missing data in ways that are unexpected and unreported. The fraction of missing information (FMI) is a direct measure of efficiency loss due to missing data. In this dissertation, I first explored the impact of different variations of MAR under a wide range of scenarios using FMI. Efficiency loss was found to have a complex relationship with many factors, such as the conditioning relationship and the values of model parameters. These findings demonstrate the need to adopt FMI as a diagnostic measure in empirical studies when data are missing. Furthermore, these findings show the need to control for moderators of efficiency loss in simulation studies, and to use FMI as a guide for study design. Next, I conducted a series of simulation studies to evaluate the properties of several sample estimates of FMI, and found that accurate estimation of FMI required a sample size of at least 200. In the final part of the dissertation, I provided further reasons why it is important to study efficiency loss under MAR, by demonstrating the connection between MAR efficiency and sampling strategies in psychological studies. Using the extreme groups design (EGD) as an example, I showed that more efficient forms of MAR mechanisms can help improve planned missing data designs.
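The FMI discussed above has several sample estimators; one common version comes from Rubin's multiple-imputation combining rules. The sketch below (in Python, with made-up numbers) illustrates that version; the estimators studied in the dissertation itself may differ.

```python
import numpy as np

def fmi_rubin(estimates, variances):
    """Estimate the fraction of missing information (FMI) for one parameter
    from m multiply-imputed point estimates and their sampling variances,
    using Rubin's combining rules (large-m approximation)."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    W = variances.mean()           # within-imputation variance
    B = estimates.var(ddof=1)      # between-imputation variance
    T = W + (1 + 1 / m) * B        # total variance of the pooled estimate
    return ((1 + 1 / m) * B) / T   # FMI: share of total variance due to missingness

# Hypothetical values: five imputations with little between-imputation spread,
# so the FMI should be close to zero
print(fmi_rubin([0.52, 0.50, 0.51, 0.49, 0.50], [0.04] * 5))
```

Large between-imputation spread relative to the within-imputation variance drives the FMI toward 1, signalling substantial efficiency loss even if the raw missing-data rate is modest.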


Examining how missing data affect approximate fit indices in structural equation modelling under different estimation methods (2021)

The full-information maximum likelihood (FIML) is a popular estimation method for missing data in structural equation modeling (SEM). However, it is not commonly known that approximate fit indices (AFIs) can be distorted, relative to their complete data counterparts, when FIML is used to handle missing data. In the first part of the dissertation work, we show that the two most popular AFIs, the root mean square error of approximation (RMSEA) and the comparative fit index (CFI), often approach different population values under FIML estimation when missing data are present. By deriving the FIML fit function for incomplete data and showing that it is different from the usual maximum likelihood (ML) fit function for complete data, we provide a mathematical explanation for this phenomenon. We also present several analytic examples as well as the results of two large sample simulation studies to illustrate how AFIs change with missing data under FIML. In the second part of the dissertation work, we propose and examine an alternative approach for computing AFIs following the FIML estimation, which we refer to as the FIML-Corrected or FIML-C approach. We also examine another existing estimation method, the two-stage (TS) approach, for computing AFIs in the presence of missing data. For both FIML-C and TS approaches, we also propose a series of small sample corrections to improve the estimates of AFIs. In two simulation studies, we find that the FIML-C and TS approaches, when implemented with small sample corrections, can estimate the complete data population AFIs with little bias across a variety of conditions, although the FIML-C approach can fail in a small number of conditions with a high percentage of missing data and a high degree of model misspecification. In contrast, the FIML AFIs as currently computed often performed poorly. We recommend FIML-C and TS approaches for computing AFIs in SEM.
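For readers unfamiliar with the two AFIs named above, their standard complete-data sample formulas are simple functions of the model chi-square; a minimal sketch (the exact denominator, N versus N − 1, varies across software):

```python
import math

def rmsea(chisq, df, n):
    """Point estimate of RMSEA from a model chi-square.
    Uses N - 1 in the denominator, as in much SEM software."""
    return math.sqrt(max(chisq - df, 0) / (df * (n - 1)))

def cfi(chisq_t, df_t, chisq_b, df_b):
    """CFI from the target (t) and baseline (b) model chi-squares."""
    num = max(chisq_t - df_t, 0)
    den = max(chisq_t - df_t, chisq_b - df_b, 0)
    return 1 - num / den if den > 0 else 1.0

# Hypothetical chi-squares for a target and a baseline model, N = 500
print(round(rmsea(120.0, 60, 500), 3))        # → 0.045
print(round(cfi(120.0, 60, 800.0, 78), 3))    # → 0.917
```

Under FIML with incomplete data, the chi-square that feeds these formulas comes from a different fit function, which is why the resulting indices can drift away from their complete-data population values.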


Relaxed methods for evaluating measurement invariance within a multiple-group confirmatory factor analytic framework (2021)

Measurement Invariance (MI) refers to the equivalent functioning of psychometric instruments when applied across different groups. Violations of MI can lead to spurious between-group differences, or obscure true differences, on observed scores, means, and covariances. Chapter 1 introduces the multiple-group confirmatory factor analysis (MGCFA) approach to evaluating MI. The present research seeks to identify overly restrictive assumptions of the MGCFA approach, and to provide alternative recommendations. Chapter 2 notes that typical MGCFA MI models assume equivalent functioning of each item, while in practice, applied researchers are often primarily interested in equivalent functioning of composite scores. Chapter 2 introduces an approach to assessing MI of composite scores that does not assume MI of all items, by placing between-group equality constraints on measurement parameter totals. Invariance of parameter totals is referred to as “scale-level MI”, while the invariance of individual measurement parameters is referred to as “item-level MI.” Power analyses of tests of scale-level and item-level MI illustrate that, despite item-level MI models being nested within scale-level MI models, tests of scale-level MI are often more sensitive to violations of MI that affect the between-group comparability of composite scores. Chapter 3 introduces an approach to quantifying between-group differences in classification accuracy when critical composite scores are used for selection and a minimum of partial scalar MI – MI of some, but not all, loadings and intercepts – is retained. Chapter 3 illustrates that different patterns of violations of MI differentially affect classification accuracy ratios for different measures of classification accuracy. Between-group differences on multiple sets of measurement parameters can have compensatory or additive effects on classification accuracy ratios. 
Finite sample variability of classification accuracy ratios is discussed, and a Bollen-Stine bootstrapping approach for estimating confidence intervals around classification accuracy ratios is recommended. Chapter 4 addresses limitations of popular methods of assessing fit of nested MI models. Chapter 4 introduces a modified RMSEA, RMSEA_D, for comparing fit of nested MI models, which avoids the sensitivity to minor misspecifications of chi-square tests, as well as the differential interpretation of ΔGFIs depending on model degrees of freedom. Recommendations, limitations, and future research are discussed in Chapter 5.
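In its simplest single-group form, the RMSEA_D referred to above is an RMSEA computed from the chi-square difference test between two nested models; a sketch with hypothetical values (multi-group versions rescale by the number of groups):

```python
import math

def rmsea_d(chisq_restricted, df_restricted, chisq_free, df_free, n):
    """RMSEA associated with the chi-square difference test between a
    restricted model and a less restricted (free) model it is nested in."""
    d_chisq = chisq_restricted - chisq_free
    d_df = df_restricted - df_free
    return math.sqrt(max(d_chisq - d_df, 0) / (d_df * (n - 1)))

# Hypothetical values: e.g., a scalar-invariance model nested in a
# metric-invariance model, N = 400
print(round(rmsea_d(210.0, 95, 180.0, 88, 400), 3))  # → 0.091
```

Because the denominator uses the difference in degrees of freedom rather than the full model df, RMSEA_D avoids the diluted interpretation of ΔGFIs that depends on overall model size.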


Master's Student Supervision

Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.

Distinguishing the bifactor and higher-order factor models: a comparison of three RMSEA-related approaches under model misspecification (2023)

The bifactor model (BFM) is widely used in psychology, often compared with nested models such as the higher-order factor model (HFM) using fit indices. Previous simulation studies have shown that the BFM tends to outperform the HFM via fit indices, even when the HFM is the data-generating model (Greene et al., 2019; Morgan et al., 2015; Murray & Johnson, 2013). The superior model fit of the BFM has been described as a fit index “bias” rather than an indication of model correctness. Focusing on the root mean square error of approximation (RMSEA), the dominant approaches in the nested model comparison are the simple RMSEA approach (i.e., compare RMSEA of both models) and the ∆RMSEA approach (i.e., calculate the difference between two RMSEA values and use a non-zero cut-off to evaluate). An alternative approach, which uses an RMSEA associated with the chi-square difference test (i.e., RMSEA_D), has also been re-discovered and advocated (Brace, 2020; Savalei et al., 2023). In this study, I evaluated the performance of the three approaches when the true model is the HFM, containing varying degrees of misspecification. The results showed that the simple RMSEA approach was biased in favour of the BFM under minor misspecification, while the ∆RMSEA approach leaned towards the HFM, even with severe misspecification. The RMSEA_D approach quantified the misfit introduced by the HFM for each model misspecification condition without favouring either model. Additionally, I investigated how sample size and model size affect the equivalence test results of RMSEA_D. Recommendations are also made for future nested model comparisons.


Two-stage maximum likelihood approach for item-level missing data in regression (2017)

Psychologists often use scales composed of multiple items to measure underlying constructs, such as well-being, depression, and personality traits. Missing data often occur at the item level. For example, participants may skip items on a questionnaire for various reasons. If variables in the dataset can account for the missingness, the data are missing at random (MAR). Modern missing data approaches can deal with MAR missing data effectively, but existing analytical approaches cannot accommodate item-level missing data. A very common practice in psychology is to average all available items to produce scale means when there are missing data. This approach, called available-case maximum likelihood (ACML), may produce biased results in addition to incorrect standard errors. Another approach is scale-level full information maximum likelihood (SL-FIML), which treats the whole scale as missing if even one item is missing. SL-FIML is inefficient and prone to bias. A new analytical approach, called the two-stage maximum likelihood approach (TSML), was recently developed as an alternative (Savalei & Rhemtulla, 2017b). The original work showed that the method outperformed ACML and SL-FIML in structural equation models with parcels. The current simulation study examined the performance of ACML, SL-FIML, and TSML in the context of bivariate regression. It was shown that when item loadings or item means are unequal within the composite, ACML and SL-FIML produced biased estimates of regression coefficients under MAR. Outside of convergence issues when the sample size is small and the number of variables is large, TSML performed well in all simulated conditions, showing little bias, high efficiency, and good coverage. Additionally, the current study investigated how changing the strength of the MAR mechanism may lead to drastically different conclusions in simulation studies. A preliminary definition of MAR strength is provided in order to demonstrate its impact. Recommendations are made for future simulation studies on missing data.
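The contrast between the two conventional scoring approaches described above can be sketched with a toy item matrix (the data are invented; this only illustrates the data-handling step, not the full estimation methods):

```python
import numpy as np

# Item responses for 3 participants on a 3-item scale (np.nan = skipped item)
items = np.array([[4.0, 5.0, np.nan],
                  [3.0, 3.0, 4.0],
                  [2.0, np.nan, 1.0]])

# Available-case scoring (ACML-style input): average whatever items are present
acml_scores = np.nanmean(items, axis=1)

# Scale-level deletion (SL-FIML-style input): the scale score is missing
# whenever any single item is missing
sl_scores = np.where(np.isnan(items).any(axis=1), np.nan, items.mean(axis=1))

print(acml_scores)  # every participant keeps a score
print(sl_scores)    # only the complete row (participant 2) keeps a score
```

The available-case scores silently assume items are exchangeable (equal loadings and means), which is exactly the condition under which the abstract reports ACML becoming biased; scale-level deletion avoids that assumption at the cost of discarding all partially complete cases.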


Improving the Factor Structure of Psychological Scales: The Expanded Format as the Alternative to the Likert Scale Format (2015)

Many psychological scales written in the Likert format include reverse worded (RW) items in order to control acquiescence bias. However, studies have shown that RW items often contaminate the factor structure of the scale by creating one or more method factors. The present study examines an alternative scale format, called the Expanded format, which replaces each response option in the Likert scale with a full sentence. We hypothesized that this format would result in a cleaner factor structure as compared to the Likert format. We tested this hypothesis on three popular psychological scales: the Rosenberg Self-Esteem scale, the Conscientiousness subscale of the Big Five Inventory, and the Beck Depression Inventory II. Scales in both formats showed comparable reliabilities and convergent validities. However, scales in the Expanded format had better (i.e., lower and more theoretically defensible) dimensionalities than scales in the Likert format, as assessed by both exploratory factor analyses and confirmatory factor analyses. We encourage further study and wider use of the Expanded format, particularly when the dimensionality of a scale is of theoretical interest.


Type I Error Rates and Power of Robust Chi-Square Difference Tests in Investigations of Measurement Invariance (2015)

A Monte Carlo simulation study was conducted to investigate Type I error rates and power of several corrections for non-normality to the normal theory chi-square difference test in the context of evaluating measurement invariance via Structural Equation Modeling (SEM). Studied statistics include: 1) the uncorrected difference test, DML, 2) Satorra’s (2000) original computationally intensive correction, DS0, 3) Satorra and Bentler’s (2001) simplified correction, DSB1, 4) Satorra and Bentler’s (2010) strictly positive correction, DSB10, and 5) a hybrid procedure, DSBH (Asparouhov & Muthén, 2010), which is equal to DSB1 when DSB1 is positive, and DSB10 when DSB1 is negative. Multiple-group data were generated from confirmatory factor analytic models invariant on some but not all parameters. A series of six nested invariance models was fit to each generated dataset. Population parameter values had little influence on the relative performance of the scaled statistics, while level of invariance being tested did. DS0 was found to over-reject in many Type I error conditions, and it is suspected that high observed rejection rates in power conditions are due to a general positive bias. DSB1 generally performed well in Type I error conditions, but severely under-rejected in power conditions. DSB10 performed reasonably well and consistently in both Type I error and power conditions. We recommend that researchers use the strictly positive corrected difference test, DSB10, to evaluate measurement invariance when data are not normally distributed.
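The simplified correction DSB1 examined above (Satorra & Bentler, 2001) can be sketched as follows; the inputs are hypothetical, and the uncorrected ML chi-squares and each model's scaling correction factor are assumed to be available from the fitted models:

```python
def sb2001_diff(t0_ml, df0, c0, t1_ml, df1, c1):
    """Satorra-Bentler (2001) scaled chi-square difference test (DSB1).
    t0_ml, df0, c0: uncorrected ML chi-square, df, and scaling correction
    factor of the restricted model; t1_ml, df1, c1: the same for the less
    restricted model. Returns (scaled difference, df of the test).
    The scaling factor cd can be negative in small samples, which is what
    motivates the strictly positive 2010 correction (DSB10)."""
    d_df = df0 - df1
    cd = (df0 * c0 - df1 * c1) / d_df   # difference-test scaling factor
    return (t0_ml - t1_ml) / cd, d_df

# Hypothetical values for two nested invariance models
print(sb2001_diff(250.0, 90, 1.30, 210.0, 84, 1.25))  # → (20.0, 6)
```

The resulting statistic is referred to a chi-square distribution with the returned degrees of freedom; the computationally intensive DS0 and the strictly positive DSB10 differ only in how the scaling factor for the difference is obtained.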


Investigation of type I error rates of three versions of robust chi-square difference tests in structural equation modeling (2013)

A Monte Carlo simulation was conducted to investigate the Type I error rates of several versions of chi-square difference tests for nonnormal data in confirmatory factor analysis (CFA) models. The studied statistics include: 1) the original uncorrected difference test, D, obtained by taking the difference of the ML chi-squares for the respective models; 2) the original robust difference test, DR₁, due to Satorra and Bentler (2001); 3) the recent modification to this test, DR₂, which ensures that the statistic remains positive (Satorra & Bentler, 2010); and 4) a hybrid statistic, DH, proposed by Asparouhov and Muthén (2010), which is equal to DR₁ when DR₁ > 0, and otherwise is equal to DR₂. Types of constraints studied included constraining factor correlations to 0, constraining factor correlations to 1, and constraining factor loadings to equal each other within or across factors. An interesting finding was that the uncorrected test appeared to be robust to nonnormality when the constraint was setting factor correlations to zero. The robust tests performed well and similarly to each other in many conditions. The new strictly positive test, DR₂, exhibited slightly inflated rejection rates in conditions that involved constraining factor loadings, while DR₁ and DH exhibited rejection rates slightly below nominal in conditions that involved constraining factor correlations or factor loadings. While more research is needed on the new strictly positive test, the original robust difference test or the hybrid procedure are tentatively recommended.


The effects of misspecification type and nuisance variables on the behaviors of population fit indices used in structural equation modeling (2011)

The present study examined the performance of population fit indices used in structural equation modeling. Index performances were evaluated in multiple modeling situations that involved misspecification due to either omitted error covariances or to an incorrectly modeled latent structure. Additional nuisance parameters, including loading size, factor correlation size, model size, and model balance, were manipulated to determine which indices’ behaviors were influenced by changes in modeling situations over and above changes in the size and severity of misspecification. The study revealed that certain indices (CFI, NNFI) are more appropriate to use when models involve latent misspecification, while other indices (RMSEA, GFI, SRMR) are more appropriate in situations where models involve misspecification due to omitted error covariances. It was found that the performances of all indices were affected to some extent by additional nuisance parameters. In particular, higher loading sizes led to increased sensitivity to misspecification and model size affected index behavior differently depending on the source of the misspecification.



Membership Status

Member of G+PS
