Apache: Big Data North America 2017 will be held at the Intercontinental Miami in Miami, Florida. 

Register now for the event taking place May 16-18, 2017. 
Tuesday, May 16 • 4:40pm - 5:30pm
Efficient Columnar Storage with Apache Parquet - Ranganathan Balashanmugam, ThoughtWorks

Apache Parquet makes the advantages of compressed, efficient columnar data representation available to any project in the Hadoop ecosystem. Apache Parquet is built from the ground up with complex nested data structures in mind, and uses the record shredding and assembly algorithm described in the Dremel paper. We believe this approach is superior to simple flattening of nested namespaces. Apache Parquet is built to support very efficient compression and encoding schemes. Multiple projects have demonstrated the performance impact of applying the right compression and encoding scheme to the data. Apache Parquet allows compression schemes to be specified on a per-column level and is future-proofed to allow adding more encodings as they are invented and implemented. This talk highlights the internal implementation of Apache Parquet.
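
As a rough illustration of why per-column encoding pays off, the sketch below pivots a few flat records into columns and picks a different encoding for each one. It is a toy model in plain Python, not Parquet's actual file format or API: the helper names (to_columns, rle_encode, dict_encode) and the sample records are hypothetical, and it assumes every record has the same flat set of fields, so nested data and the Dremel shredding step are not covered.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot flat dict records into one list of values per column.

    Assumes every record has the same keys (no nesting, no missing fields).
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def rle_encode(values):
    """Run-length encode: collapse consecutive repeats into (value, count)."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


def dict_encode(values):
    """Dictionary encode: store each distinct value once, then small codes."""
    dictionary, index, codes = [], {}, []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dict_decode(dictionary, codes):
    """Map integer codes back to the values they stand for."""
    return [dictionary[code] for code in codes]


# Hypothetical sample records, laid out row by row.
rows = [
    {"country": "US", "status": "ok", "latency_ms": 12},
    {"country": "US", "status": "ok", "latency_ms": 15},
    {"country": "IN", "status": "ok", "latency_ms": 11},
    {"country": "IN", "status": "error", "latency_ms": 240},
]

columns = to_columns(rows)

# Each column gets the encoding that suits its own value distribution.
encoded = {
    "country": rle_encode(columns["country"]),   # long runs of repeats
    "status": dict_encode(columns["status"]),    # few distinct values
    "latency_ms": columns["latency_ms"],         # left plain in this sketch
}
```

Because values of the same type and similar distribution sit next to each other in a column, the choice can be made column by column, which a row-oriented layout does not allow.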

Speakers

Ranganathan Balashanmugam

Head of Engineering - India, Aconex
Ranganathan has nearly twelve years of experience developing awesome products and loves to work across the full stack, from front end to back end and scale. He is Head of Engineering - India at Aconex and prior to that was Technology Lead at ThoughtWorks. He is a Microsoft MVP for Dat...



Tuesday May 16, 2017 4:40pm - 5:30pm
Alhambra
