Parquet is a columnar file format designed to speed up queries, and it is typically far more efficient than row-oriented text formats such as CSV or JSON. Spark SQL supports both reading and writing Parquet files, and it automatically preserves the schema of the original data.
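
As a rough illustration of the read and write support described above, here is a minimal PySpark sketch. It assumes `pyspark` is installed and a local Spark runtime is available. The helper names `parquet_round_trip` and `main`, the sample rows, and the `people.parquet` path are made up for this example; only `createDataFrame`, `DataFrame.write.mode(...).parquet(...)`, and `spark.read.parquet(...)` are actual Spark API calls.

```python
import tempfile
from pathlib import Path


def parquet_round_trip(spark, path):
    """Write a small DataFrame to Parquet at `path`, then read it back.

    `spark` is expected to be a SparkSession. The sample data below is
    illustrative only.
    """
    # Spark infers the column types from the Python values.
    df = spark.createDataFrame([("Alice", 34), ("Bob", 45)], ["name", "age"])

    # The schema is stored inside the Parquet files along with the data.
    df.write.mode("overwrite").parquet(path)

    # No schema has to be supplied on read; it is recovered from the files.
    return spark.read.parquet(path)


def main():
    # Imported here so the helper above can be loaded without pyspark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("parquet-sketch")  # hypothetical application name
        .getOrCreate()
    )
    try:
        with tempfile.TemporaryDirectory() as tmp:
            restored = parquet_round_trip(spark, str(Path(tmp) / "people.parquet"))
            restored.printSchema()  # shows the column names and types read from Parquet
            restored.show()
    finally:
        spark.stop()
```

Calling `main()` runs the round trip in a temporary directory. The point to notice is that the read side never declares column names or types: they come from the Parquet files themselves, which is what a CSV or JSON source cannot provide without inference or an explicit schema.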

