Apache: Big Data North America 2017 will be held at the Intercontinental Miami in Miami, Florida. 

Register now for the event taking place May 16-18, 2017. 
Tuesday, May 16 • 12:05pm - 12:55pm
Even Faster: When Presto Meets Parquet @ Uber - Zhenxiao Luo, Uber


As Uber continues to grow, our big data systems need to grow in scalability, reliability, and performance to help Uber make business decisions, give user recommendations, and analyze experiments across all data sources. We have run Presto in production since 2016. Presto now serves ~100K queries per day at Uber and has become a key component for interactive SQL queries on big data. In this presentation, we talk about our experiences and engineering efforts. We start with a general introduction to Hadoop infrastructure and analytics at Uber, followed by a brief introduction to Presto, the interactive SQL engine for big data. We then focus on how we built the new Parquet reader for Presto and the techniques behind it: columnar reads, lazy reads, and nested column pruning. We show the performance improvements and Uber's use cases. Finally, we share our ongoing work.


Zhenxiao Luo

Zhenxiao Luo is a software engineer at Uber. He leads interactive SQL engine projects for Hadoop, specifically Presto and Parquet. Before joining Uber, he led the development and operations of Presto at Netflix. Zhenxiao has big data experience at Facebook, Cloudera, and Vertica...

Tuesday May 16, 2017 12:05pm - 12:55pm
  • Experience Level: Any
