Data Lake scale SQL with Presto

At Scout24, we are building a Data Platform that allows Scouties to make data-driven decisions every day. We follow a Data Lake pattern to meet our high-volume data demands. Doing so lets us leverage a wide range of tools atop our Data Lake.

Since most of our analytics team is well versed in SQL and uses MicroStrategy for creating reports and dashboards, a SQL access layer to our Data Lake is highly relevant. We have added Presto, an open-source SQL engine from Facebook, to our toolset to complement our Data Lake offering.

Join us for a look at our architecture, why we chose Presto, and how we tune it for our workloads.


General know-how of the Hadoop ecosystem will get you going. Some experience with AWS EMR will make it even more fun.


The purpose is to share our findings about using Presto on EMR, the challenges we tackled along the way, and what our overall architecture looks like.




Muhammad Nouman Shahzad is an ever-curious software enthusiast, passionate about distributed systems, cloud computing and data analytics. He currently works with Scout24, helping to build a data platform that allows the organization to become data-driven.




