Agenda
From 8.30: Registration and welcome coffee
9.30: Start
9.30 - 9.45: Organization and environment
9.45 - 10.00: Brief introduction to Spark
10.00 - 11.00: Spark DataFrame API (hands-on training)
- Loading Data into Apache Spark
- Simple DataFrame Operations (Selects, ...)
11.00 - 11.15: Coffee break
11.15 - 12.30: Data Engineering
- Extracting records from different file types
- Storing data efficiently as files
12.30 - 13.30: Lunch break
13.30 - 14.00: Machine Learning Introduction
- Linear Regression
- Model Validation
14.00 - 15.30: ML Example
- Presenting the example
- Inspecting and preparing the data
- Building and Training ML Pipelines
- Prediction using Pipelines
15.30 - 15.45: Coffee break
15.45 - 16.30: Refining the example
- Integrating multiple sources
- Model Evaluation
Approx. 17.00: End