What is the difference between coalesce and repartition in Spark?
| coalesce | repartition |
| --- | --- |
| Can only decrease the number of partitions in a DataFrame. | Can either decrease or increase the number of partitions in a DataFrame. |
| Merges existing partitions, which minimizes the amount of data shuffled. | Performs a full shuffle and creates entirely new partitions. |
| The resulting partitions can vary in size. | The resulting partitions are roughly equal in size. |
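
To make the contrast concrete, here is a minimal pure-Python sketch of the two behaviours. It is a toy model, not Spark's implementation: the helpers `toy_coalesce` and `toy_repartition` are hypothetical names, partitions are plain lists, and the grouping rule for merging is an assumption chosen for simplicity.

```python
from typing import List

Partitions = List[List[int]]


def toy_coalesce(parts: Partitions, n: int) -> Partitions:
    """Toy model: merge whole existing partitions into fewer groups."""
    if n >= len(parts):
        # Asking for more partitions than exist leaves the count unchanged.
        return [list(p) for p in parts]
    merged: Partitions = [[] for _ in range(n)]
    for i, part in enumerate(parts):
        # Each existing partition moves intact into one target group,
        # so no partition is ever split (assumed grouping rule).
        merged[i * n // len(parts)].extend(part)
    return merged


def toy_repartition(parts: Partitions, n: int) -> Partitions:
    """Toy model: full shuffle, dealing every record round-robin."""
    fresh: Partitions = [[] for _ in range(n)]
    records = [r for part in parts for r in part]
    for i, record in enumerate(records):
        fresh[i % n].append(record)
    return fresh


# Four skewed partitions holding ten records in total.
skewed = [[1, 2, 3, 4, 5, 6], [7], [8, 9], [10]]

coalesced = toy_coalesce(skewed, 2)
repartitioned = toy_repartition(skewed, 2)

print([len(p) for p in coalesced])      # [7, 3]  uneven sizes
print([len(p) for p in repartitioned])  # [5, 5]  roughly equal sizes
```

In PySpark the corresponding calls are `df.coalesce(2)` and `df.repartition(2)`, and `df.rdd.getNumPartitions()` reports the resulting partition count.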
BY Best Interview Question ON 10 Jun 2020