Apache: Big Data North America 2017 will be held at the Intercontinental Miami in Miami, Florida. 

Register now for the event taking place May 16-18, 2017. 
Tuesday, May 16 • 12:05pm - 12:55pm
Even Faster: When Presto Meets Parquet @ Uber - Zhenxiao Luo, Uber

As Uber continues to grow, our big data systems must grow in scalability, reliability, and performance to help Uber make business decisions, give user recommendations, and analyze experiments across all data sources. We have run Presto in production since 2016. Presto now serves ~100K queries per day at Uber and has become a key component for interactive SQL queries on big data. In this presentation, we share our experiences and engineering efforts. We start with a general introduction to Hadoop Infrastructure & Analytics @ Uber, followed by a brief introduction to Presto, the interactive SQL engine for big data. We then focus on how we built the new Parquet reader for Presto and on the techniques behind it: columnar reads, lazy reads, and nested column pruning. We will show performance improvements and Uber's use cases. Finally, we will share our ongoing work.


Zhenxiao Luo

Sr. Staff Engineer, Twitter
Zhenxiao Luo leads the Interactive Query Engines team at Twitter, where he focuses on Druid, Presto, and Spark. Before joining Twitter, Zhenxiao ran the Interactive Analytics team at Uber. He has big data experience at Netflix, Facebook, Cloudera, and Vertica. Zhenxiao is Committer...

Tuesday May 16, 2017 12:05pm - 12:55pm EDT
  • Experience Level: Any