Non-Degree / Dates: 15-24 July 2024
This is a 32-hour hands-on course for statistical data analysis using R. The main goal of this course is to empower participants to use R for data analysis and machine learning applications.
The course introduces students to the statistical programming language R and the use of RStudio. We will cover the concepts of data manipulation and data preparation as well as uni- and bivariate statistics in R.
We will cover the grammar of graphics in R, using the ggplot2 package to create publication-ready data visualizations. We will also discuss how to compute basic descriptive statistics (such as mean, standard deviation and correlation) in R to describe your data.
Our main focus will be a selection of machine learning algorithms and their implementation in R. For example, we will model the factors that influenced the survival of the Titanic passengers, predict customer churn for a telecommunications company, and classify traffic signs based on images.
The course is designed to give a robust theoretical understanding of the methods and allow students to use the algorithms with real-world data sets.
Aims of the curriculum:
- Introduction to the statistical programming language R and the use of RStudio
- Data manipulation and data preparation in R
- Uni- and bivariate statistics in R
- Grammar of graphics in R (with ggplot2) to produce publication-ready graphics
- Theoretical understanding of selected machine learning algorithms (e.g. logistic regression, decision trees and random forests, k-nearest neighbours, hierarchical cluster analysis)
- Practical application of selected machine learning algorithms in R
Teacher(s)
Dr. Daniel Hoppe is a Professor of Business Administration, with a focus on Retail Management and e-Commerce, at the Cooperative University Gera-Eisenach.
Prior to his career in academia, Dr. Hoppe held several positions of responsibility in the corporate sector, including data analytics business partner at ALDI Nord Germany, where he advised departments on data-driven problems.
His educational background includes a doctorate in Marketing from Philipps-University of Marburg and a Master of Arts in Business Administration from South Westphalia University of Applied Sciences.
Timetable
Classes take place on working days: 8:00-9:30, 9:45-11:15 (4 academic hours a day; total hours: 32).
Participants
Generally, anyone interested in learning the statistical programming language R for data analysis and the application of machine learning algorithms is welcome to apply. Specifically:
* aspiring Bachelor's students (after successfully passing the statistics course)
* Master's students and PhD students.
No previous knowledge of R is required. However, basic statistical knowledge (descriptive and analytical statistics) is recommended.
Students should bring their own laptop (Windows or Mac) with R and RStudio installed. Details on how to install R and RStudio will be provided.
Credit points
4 ECTS.
Assessment criteria: a written assignment (10-15 pages of text plus R code) in which uni- and bivariate statistics, graphical visualization and at least one machine learning algorithm are applied to a data set.
Course fee
- Early-Bird Course Fee (until 31 March 2024): 400€
- Regular Course Fee (after 31 March 2024): 450€
Accommodation, cultural programme and meals are not included in the price.