Talk: The Right Data Format for the Right Job
What if you could get blazing-fast queries on your data without having to be on call for a giant, expensive database?
By picking the right file format, you can keep your data as plain files in cloud storage and still get the performance you need for modern analytics.
We’ll discuss benchmarks of four different data storage formats: Parquet, ORC, Avro, and traditional character-separated files like CSV.
We’ll cover what they are, how they work at a bits-and-bytes level, and why you might choose each one for your use case.
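As a taste of the bits-and-bytes view, here is a minimal sketch of the difference between a row-oriented layout (the approach CSV and Avro take) and a column-oriented layout (the approach Parquet and ORC take). It is a toy illustration only, not any of those formats: the records, the fixed-width schema, and the function names are all made up for this example, and the real formats add metadata, encodings, and compression on top.

```python
import struct

# Hypothetical records for illustration: (user_id, age, score)
rows = [(1, 34, 9.5), (2, 27, 7.25), (3, 45, 8.0)]

# One fixed-width record: int32, int32, float64 (little-endian, no padding)
ROW = struct.Struct("<iid")

def encode_row_oriented(records):
    """Store each record's fields next to each other, one record after another."""
    return b"".join(ROW.pack(*r) for r in records)

def encode_column_oriented(records):
    """Store all ids, then all ages, then all scores."""
    n = len(records)
    ids, ages, scores = zip(*records)
    return (
        struct.pack(f"<{n}i", *ids)
        + struct.pack(f"<{n}i", *ages)
        + struct.pack(f"<{n}d", *scores)
    )

def sum_ages_row(buf):
    """Summing one field means stepping through every full record."""
    total = 0
    for offset in range(0, len(buf), ROW.size):
        _, age, _ = ROW.unpack_from(buf, offset)
        total += age
    return total

def sum_ages_column(buf, n):
    """The ages sit in one contiguous run that starts after n int32 ids."""
    return sum(struct.unpack_from(f"<{n}i", buf, 4 * n))

row_buf = encode_row_oriented(rows)
col_buf = encode_column_oriented(rows)

print(sum_ages_row(row_buf), "read from", len(row_buf), "bytes scanned")
print(sum_ages_column(col_buf, len(rows)), "read from", 4 * len(rows), "bytes scanned")
```

Both layouts hold the same bytes overall, but an analytic query that needs only one field can read a single contiguous run in the columnar layout, while the row layout forces it to pass over every record.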