
Stream Processing with Apache Flink

As data processing becomes more real-time, stream processing is becoming more important. Apache Flink makes it easier to build and manage scalable stream processing applications.

In this workshop you will learn the basics of stream processing with Apache Flink. You will implement stream processing applications that ingest events from Apache Kafka, and you will submit these applications to a local, Docker-based Flink cluster for execution.

We will show the basic commands to manage a continuously running application and how to access its metrics. Later, we will introduce you to Flink's streaming SQL interface. You will submit SQL queries that are evaluated over unbounded data streams, producing results that are continuously updated as more and more data is ingested.
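To give a feel for what "continuously updated" results mean, here is a minimal sketch in plain Java. It does not use any Flink API; the class name, method name, and sample events are made up for illustration. It mimics a grouped count (roughly `SELECT key, COUNT(*) ... GROUP BY key`) over a stream by emitting a revised result for a key every time a new event for that key arrives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is NOT Flink code and uses no Flink API.
class ContinuousCountSketch {

    // Consumes events one at a time and emits an updated "key=count" row after
    // each event, instead of a single final result once the input has ended.
    static List<String> process(List<String> events) {
        Map<String, Long> counts = new HashMap<>();
        List<String> updates = new ArrayList<>();
        for (String key : events) {
            // Increment the running count for this key (starting at 1).
            long updated = counts.merge(key, 1L, Long::sum);
            updates.add(key + "=" + updated);
        }
        return updates;
    }

    public static void main(String[] args) {
        // Hypothetical sample events; a real stream would never end.
        List<String> events = Arrays.asList("berlin", "munich", "berlin");
        for (String update : process(events)) {
            System.out.println(update);
        }
    }
}
```

With the three sample events, the sketch prints `berlin=1`, `munich=1`, and then `berlin=2`: the second event for `berlin` revises the earlier result rather than waiting for the input to finish. In the workshop, Flink's SQL interface takes care of this kind of incremental maintenance for you.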

The workshop is designed to be held in English, but Fabian Hueske will present in German if the participants prefer.

Prerequisites

• Basics of Java and SQL
• Basic knowledge of distributed data processing (MapReduce/Spark/etc.) will be helpful but is not required
• You will need a laptop with at least 8 GB RAM and the following software installed: Docker, Java 8, a Java IDE (preferably IntelliJ IDEA), and Apache Maven

Learning objectives

Learn how to run SQL on streaming data

Agenda

from 8:30: Registration and welcome coffee
9:30: Start

Introduction: What is Apache Flink?
Hands-on: Setting up the workshop environment

11:00 - 11:15: Coffee break

Flink's DataStream API
Hands-on: Starting and operating a Flink stream processing application

12:30 - 13:30: Lunch break

Flink's approach to SQL on data streams
Hands-on: Starting the SQL environment
Hands-on: First SQL queries on data streams

15:30 - 15:45: Coffee break

Window aggregations and joins with SQL
Hands-on: Advanced SQL queries

approx. 17:00: End

Technical requirements

Participants carry out the hands-on part of the workshop in a Docker environment on their own computer. Each participant needs a laptop with at least 8 GB RAM on which Docker is installed (with 3 GB RAM allocated to Docker). We have experience with Windows, macOS, and Linux environments.

Preparation:
The following web page explains everything you need, including the required software and the preparation steps: https://github.com/fhueske/flink-intro-tutorial

Speakers

Fabian Hueske is a committer and PMC member of the Apache Flink project and has been contributing to Flink since its earliest days. Fabian is a co-founder of Ververica, a Berlin-based startup devoted to fostering Flink, where he works as a software engineer and contributes to Apache Flink. He holds a PhD in computer science from TU Berlin and is currently writing a book about stream processing.

Timo Walther is a committer of the Apache Flink project and a member of its project management committee (PMC). He studied computer science at TU Berlin. During his studies, he was part of the university's Database Systems and Information Management Group and worked at IBM Germany. Timo Walther works as a software engineer at Ververica. In Flink, he mainly works on the Table & SQL API.
