Data Lake scale SQL with Presto
At Scout24, we are building a Data Platform that allows Scouties to make data-driven decisions every day. We follow a Data Lake pattern to meet the demand for high volumes of data. Doing so allows us to leverage a wide range of tools atop our Data Lake.
Since most of our analytics team is well versed in SQL and uses MicroStrategy for creating reports and dashboards, a SQL access layer to our Data Lake is highly relevant. We have added Presto, an open source SQL engine from Facebook, to our toolset to complement our Data Lake offering.
Come join us to have a look at what our architecture looks like, why we chose Presto and how we tune Presto for our workloads.
Prerequisites
General knowledge of the Hadoop ecosystem will get you going. However, having some experience with AWS EMR will make it even more fun.
Learning objectives
The purpose is to share our findings about using Presto on EMR, the challenges we tackled along the way, and what our overall architecture looks like.