https://spark-summit.org/east-2017/events/spark-parquet-in-depth/

This is a talk I gave at Spark Summit East 2017 with my mentor Robbie Strickland.

Parquet is a big data storage format that was integral to our analytics workflow. This talk details the ins and outs of the format in connection with Apache Spark.