This statistics webinar series is presented annually by the Applied Statistics and Data Science Group. Recordings and descriptions of each session are provided.
Fundamentals of Statistics Series
Exploratory Data Analysis (EDA) is essential for understanding your data and a necessary step prior to any testing or modeling. In this webinar, you will learn how to use insightful graphical and numerical techniques for investigating important aspects of your data such as relationships between variables and unusual observations.
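As a small illustration of the numerical side of EDA, the sketch below computes a few summary statistics and flags unusual observations with the common 1.5 x IQR rule. The data values and the helper name `iqr_outliers` are made up for this example and are not taken from the webinar; the rule itself is only one of several conventions for spotting outliers.

```python
import statistics

# Hypothetical measurements; the last value is deliberately unusual.
values = [12, 14, 15, 15, 16, 17, 18, 19, 20, 48]


def iqr_outliers(data):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in data if v < low or v > high]


# A basic numerical summary of the sample.
summary = {
    "min": min(values),
    "median": statistics.median(values),
    "mean": statistics.mean(values),
    "max": max(values),
}
outliers = iqr_outliers(values)

print(summary)
print("possible outliers:", outliers)
```

The gap between the mean and the median already hints at the unusual value, which the IQR rule then flags explicitly.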
This webinar focuses on the first two crucial steps in a statistical investigation: 1) identifying a question and 2) collecting the right data to answer it.
In this webinar, you will learn how to gain insight from a random and unbiased data sample of a population.
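To make the idea of a random sample concrete, here is a minimal sketch that draws a simple random sample without replacement and compares the sample mean with the population mean. The population (the integers 1 to 1000), the sample size, and the seed are all assumptions chosen for illustration, not material from the webinar.

```python
import random
import statistics

# Hypothetical population: the integers 1..1000.
population = list(range(1, 1001))

# A fixed seed (an arbitrary choice) makes the draw repeatable.
rng = random.Random(2024)

# Simple random sample without replacement: every unit is equally
# likely to be selected, which is what keeps the sample unbiased.
sample_size = 50
sample = rng.sample(population, sample_size)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print("population mean:", population_mean)
print("sample mean:    ", sample_mean)
```

The sample mean will usually be close to, but not equal to, the population mean; that difference is the sampling variability that inference has to account for.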
Understanding relationships is a key part of the scientific inquiry process. You will learn how to describe relationships between two numerical quantities through correlation measures and simple linear regression models. The webinar then extends these ideas to multiple linear regression, which includes additional predictor variables.
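The two ideas named above, correlation and simple linear regression, can be sketched in a few lines. The example below computes the Pearson correlation coefficient and an ordinary least squares line by hand; the data (`hours`, `score`) and the function names are hypothetical and serve only as an illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied and exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 55.0, 61.0, 64.0, 68.0]

r = pearson_r(hours, score)
slope, intercept = fit_line(hours, score)

print(f"r = {r:.3f}")
print(f"score = {intercept:.2f} + {slope:.2f} * hours")
```

The correlation summarizes the strength of the linear association, while the fitted slope gives the expected change in `score` per additional hour in this made-up data set.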
This webinar is critical for understanding when linear regression models aren’t applicable and how to model dependent data.
Reproducible Research Series
In this webinar we will introduce the importance of reproducible research and give a historical overview of its beginnings. After introducing the benefits and elements of reproducibility, we will discuss the most common pitfalls and suggest ways to avoid them.
In this webinar we will introduce the best practices and tools for creating reproducible workflows for your statistical analyses. We will start by introducing the concepts of automated version control with git, and then gradually build up the git tools, from working locally within a git repository to uploading and sharing your repository on GitLab. Visual diagrams will help you navigate these concepts.
In this webinar we will introduce the best practices and tools for sharing your reproducible workflows and collaborating with your colleagues in private networks or with any public GitLab users. We will start by introducing the concepts of branching and forking using diagrams.
Some types of analyses are meant to be repeated each time new data arrive, and these can be automated. The Shiny package in R makes it possible to build interactive dashboards for this purpose.
Advanced Statistical Modeling
In this webinar, we will introduce the basic concepts behind machine learning algorithms, how they differ from statistical methods, and the most common types of problems that can be tackled with machine learning algorithms.