# Statistics with R

In this two-day training we introduce you to exploratory data analysis, inferential statistics and the concept of regression. Learn how to implement these in the programming environment R.

**Exploratory analysis:** "*Each and every investigation should start with an explorative data analysis*" (Tukey 1977). This crucial step is often skipped when working with data, yet it is essential for getting a "feeling" for the data. This includes checking for coding and measurement errors and verifying that assumptions hold, e.g. the assumption of a normal distribution. In addition to an introduction to descriptive statistics, you’ll learn how to create professional graphics using the ggplot2 package. Our data science experts provide you with tips and tricks on how to present your results appropriately.
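
As a rough illustration of such a first look at the data, here is a minimal sketch. It assumes the built-in `mtcars` data set and the variable `mpg` as stand-ins for your own data, and it uses base R graphics rather than ggplot2 so that it runs without additional packages.

```r
# Exploratory look at a built-in data set (mtcars ships with base R).
# Assumption: mtcars$mpg stands in for a variable from your own data.
data(mtcars)

# Descriptive statistics: minimum, quartiles, median, mean, maximum.
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# Simple sanity check for coding errors: count missing values.
n_missing <- sum(is.na(mtcars$mpg))
print(n_missing)

# Visual checks of the normality assumption.
hist(mtcars$mpg, main = "Distribution of mpg", xlab = "Miles per gallon")
qqnorm(mtcars$mpg)
qqline(mtcars$mpg)

# Formal check: Shapiro-Wilk test (H0: the data are normally distributed).
normality <- shapiro.test(mtcars$mpg)
print(normality$p.value)
```

A small p-value from the Shapiro-Wilk test would speak against the normality assumption. With small samples, the plots are often more informative than the test.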

**Inferential statistics:** Even in the age of Big Data, most analyses are based on samples. However, the findings of an analysis are often generalized, i.e. they are applied to the population. This is the domain of inferential statistics. Our experts introduce you to the theoretical concepts of confidence intervals, hypothesis tests, and significance tests, and show you how to carry them out in R.
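
To give an idea of what this looks like in practice, the following sketch computes a confidence interval and runs a one-sample t-test with base R. The simulated sample, the true mean of 10, and the hypothesized mean of 9 are assumptions chosen purely for illustration.

```r
# Hypothetical sample: 50 draws from a population with true mean 10.
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)

# One-sample t-test of H0: the population mean equals 9.
result <- t.test(x, mu = 9, conf.level = 0.95)
ci <- result$conf.int     # 95% confidence interval for the mean
p_value <- result$p.value # p-value of the two-sided test

# The same interval computed by hand: mean +/- t-quantile * standard error.
n <- length(x)
se <- sd(x) / sqrt(n)
manual_ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se

print(ci)
print(manual_ci)
print(p_value)
```

Computing the interval by hand alongside `t.test()` shows how the sample mean, the standard error, and the t-distribution together produce the reported interval.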

**Regression:** Prediction is a very important part of data science. To approach this topic, we’ll discuss regression analysis. This broadly applicable model class uses information from several variables to predict the outcome of another variable of interest. Once you’ve completed this section, you’ll be able to make, evaluate, and interpret your own linear predictions in R.
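
A linear prediction of this kind might be sketched as follows. The example again assumes the built-in `mtcars` data set. The choice of `wt` and `hp` as predictors and the values for the hypothetical new car are illustrative only.

```r
# Fit a linear regression: fuel consumption explained by weight and horsepower.
# Assumption: mtcars and these predictors are illustrative choices.
data(mtcars)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Evaluate: coefficient table and share of explained variance.
print(summary(fit))
r_squared <- summary(fit)$r.squared

# Interpret: each coefficient is the expected change in mpg per unit
# change in that predictor, holding the other predictor constant.
print(coef(fit))

# Predict: point prediction with a 95% prediction interval for a
# hypothetical new car (weight 3000 lbs, 120 horsepower).
new_car <- data.frame(wt = 3, hp = 120)
pred <- predict(fit, newdata = new_car, interval = "prediction")
print(pred)
```

The prediction interval is wider than a confidence interval for the mean response, because it also accounts for the variation of individual observations around the regression line.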

**Knowledge:** This module requires basic knowledge of R, comparable to the contents of the module R Basics: object classes and their properties, basic operations, data import, and data management with the package dplyr. Fundamental knowledge of statistics is an advantage but not necessary, because the presented methods are explained from scratch.

**Hardware and software:** You will need a laptop with the current versions of R and RStudio. The statistical programming environment R can be downloaded from the website of the Comprehensive R Archive Network (CRAN). The free desktop version of RStudio is available on the RStudio website.

#### Marina Runge

Marina gained experience in statistical consulting and teaching while earning her degree in mathematics and statistics. Since 2014 she has been working at INWT in the field of data science. Marina’s work focuses on predictive analytics, online marketing, and training.

#### Dr. Steffen Wagner

Steffen is co-founder of INWT. He specializes in predictive analytics, online marketing, and customer relationship management. He holds a Ph.D. in physics and gives insights into his data science work as a lecturer in the joint master’s program in statistics offered by a consortium of Berlin universities.