Gabriela Cohen Freue
Graduate Student Supervision
Master's Student Supervision (2010 - 2018)
To predict the population of Indian reserves in Canada for the 2016 Census, we can construct a suitable model using data from the Indian Register and past censuses. Linear mixed effects models are a popular method for predicting responses in longitudinal data. However, linear mixed effects models require repeated measures in order to fit a model. Alternative methods such as linear regression only require data from a single time point in order to fit a model, but they do not directly account for within-individual correlation when predicting. Since we are predicting the responses of the same set of individuals, we can expect an individual's response at the next time point to be strongly correlated with that individual's past responses. We introduce a new method of prediction, temporal adjusted prediction (TAP), that addresses the issue of within-individual correlation in predictions and only requires data from a single time point to estimate model parameters. Each prediction starts from the last recorded response of an individual and is adjusted according to the changes in the values of that individual's covariates and the estimated regression coefficients that relate the response to the covariates. Predictions are made using a random intercept model rather than a linear regression model. It is shown that if the random intercept accounts for a larger proportion of the random variation in the data than the random error term, then temporal adjusted prediction achieves a lower mean squared prediction error than linear regression. TAP performs better than linear regression when predicting on the same set of individuals at different time points. It also shows prediction performance similar to that of linear mixed effects models estimated by maximum likelihood, despite only requiring data from one time point in order to fit a model.
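The idea behind temporal adjusted prediction can be illustrated with a small simulation. The following is a minimal sketch, not the thesis's implementation: it assumes a random intercept model y_it = a + x_it'β + b_i + e_it, estimates β by ordinary least squares on the first time point only (an assumption made here for simplicity), and compares the TAP-style prediction y_i1 + (x_i2 - x_i1)'β̂ with the plain regression prediction â + x_i2'β̂. All function names, parameter values, and the data-generating process are hypothetical.

```python
import numpy as np


def simulate(n=2000, sigma_b=2.0, sigma_e=1.0, seed=0):
    """Simulate two time points from a random intercept model (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.5, -0.7])                   # true regression coefficients
    b = rng.normal(0.0, sigma_b, size=n)           # random intercept, shared across time
    x1 = rng.normal(size=(n, 2))                   # covariates at time 1
    x2 = x1 + rng.normal(scale=0.5, size=(n, 2))   # covariates drift by time 2
    y1 = 3.0 + x1 @ beta + b + rng.normal(0.0, sigma_e, size=n)
    y2 = 3.0 + x2 @ beta + b + rng.normal(0.0, sigma_e, size=n)
    return x1, y1, x2, y2


def fit_ols(x, y):
    """Ordinary least squares with an intercept; returns (intercept, slopes)."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]


def compare_mspe(sigma_b=2.0, sigma_e=1.0, seed=0):
    """Return (MSPE of TAP-style prediction, MSPE of plain regression prediction)."""
    x1, y1, x2, y2 = simulate(sigma_b=sigma_b, sigma_e=sigma_e, seed=seed)
    intercept, slopes = fit_ols(x1, y1)            # fitted on time 1 data only
    pred_tap = y1 + (x2 - x1) @ slopes             # last response, adjusted for covariate change
    pred_lr = intercept + x2 @ slopes              # ignores the individual's past response
    mspe_tap = float(np.mean((y2 - pred_tap) ** 2))
    mspe_lr = float(np.mean((y2 - pred_lr) ** 2))
    return mspe_tap, mspe_lr


if __name__ == "__main__":
    print(compare_mspe(sigma_b=2.0, sigma_e=1.0))  # random intercept dominates
    print(compare_mspe(sigma_b=0.5, sigma_e=2.0))  # random error dominates
```

Under these assumptions, and ignoring the error in β̂, the TAP-style prediction error is e_i2 - e_i1 with variance 2σ_e², while the regression prediction error is b_i + e_i2 with variance σ_b² + σ_e². This is consistent with the condition stated in the abstract: the adjusted prediction does better when the random intercept variance exceeds the random error variance.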
Instrumental variables are commonly used in statistics, econometrics, and epidemiology to obtain consistent parameter estimates in regression models when some of the predictors are correlated with the error term. However, the properties of these estimators are sensitive to the choice of valid instruments. Since in many applications valid instruments come in a larger set that also includes weak and possibly irrelevant instruments, the researcher needs to select a smaller subset of variables that are relevant and strongly correlated with the predictors in the model. This thesis reviews part of the instrumental variables literature, examines the problems caused by having many potential instruments, and uses different variable selection methods to identify the relevant instruments. Specifically, the performance of the different techniques is compared by looking at the number of relevant variables correctly detected and at the root mean square error of the estimated regression coefficients. Simulation studies are conducted to evaluate the performance of the described methods.
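To make the role of instruments concrete, here is a minimal sketch of two-stage least squares (2SLS), a standard instrumental variables estimator. It is not taken from the thesis and does not implement any of its variable selection methods; it only shows, on simulated data, how a predictor correlated with the error term biases ordinary least squares and how instruments can correct this. The data-generating process, function names, and parameter values are all hypothetical, and one of the three candidate instruments is deliberately irrelevant.

```python
import numpy as np


def add_intercept(m):
    """Prepend a column of ones to a vector or matrix of regressors."""
    return np.column_stack([np.ones(len(m)), m])


def ols(design, y):
    """Least-squares coefficients for a given design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def two_stage_least_squares(x, y, z):
    """2SLS for one endogenous predictor x using instruments z.

    Stage 1 regresses x on the instruments; stage 2 regresses y on the
    fitted values from stage 1. Returns (intercept, slope).
    """
    z_design = add_intercept(z)
    x_hat = z_design @ ols(z_design, x)
    return ols(add_intercept(x_hat), y)


def simulate_endogenous(n=5000, seed=0):
    """Simulate data where x is correlated with the error term (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 3))                 # candidate instruments
    pi = np.array([1.0, 0.5, 0.0])              # third instrument is irrelevant
    u = rng.normal(size=n)                      # shared shock causing endogeneity
    e = 0.8 * u + 0.6 * rng.normal(size=n)      # error term, correlated with x through u
    x = z @ pi + u
    y = 2.0 + 1.0 * x + e                       # true slope is 1.0
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate_endogenous()
    print("OLS slope: ", ols(add_intercept(x), y)[1])
    print("2SLS slope:", two_stage_least_squares(x, y, z)[1])
```

In this setup the OLS slope is pulled away from the true value of 1.0 because x and the error share the shock u, while the 2SLS slope should land close to it. With many weak or irrelevant candidates in z, the first stage becomes less reliable, which is the situation that motivates selecting a subset of instruments.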