This is one of the most commonly asked data science interview questions, and answering it well can improve your chances of getting hired. Analyzing an entire large dataset at once is often impractical. Instead, we take a sample of the data that represents the whole and perform the analysis on it. The sample must be drawn in a way that truly reflects the whole dataset. This process is known as sampling.

Categories of techniques used for sampling

  • Probability Sampling Techniques: Simple Random Sampling, Stratified Sampling, and Cluster Sampling.
  • Non-Probability Sampling Techniques: Convenience Sampling, Snowball Sampling, and Quota Sampling.
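
The three probability sampling techniques listed above can be illustrated with a minimal Python sketch that uses only the standard library. The function names, the `customers` dataset, and its `region` and `store` fields are all made up for illustration; real projects would more likely rely on a data analysis library.

```python
import random
from collections import defaultdict

def simple_random_sample(rows, k, seed=0):
    """Pick k rows uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, k)

def stratified_sample(rows, key, fraction, seed=0):
    """Sample the same fraction from every stratum (rows sharing a key value)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # at least one row per stratum
        sample.extend(rng.sample(group, k))
    return sample

def cluster_sample(rows, key, n_clusters, seed=0):
    """Pick whole clusters at random and keep every row in the chosen clusters."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for row in rows:
        clusters[key(row)].append(row)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [row for c in chosen for row in clusters[c]]

# Hypothetical dataset: 100 customers, 80 in "north" and 20 in "south",
# spread evenly across 10 stores.
customers = [
    {"id": i, "region": "north" if i < 80 else "south", "store": i % 10}
    for i in range(100)
]

srs = simple_random_sample(customers, 10)
strat = stratified_sample(customers, lambda r: r["region"], 0.1)
clus = cluster_sample(customers, lambda r: r["store"], 2)
```

In this sketch the stratified sample keeps the 80/20 regional split of the full dataset, while the cluster sample contains every customer from two randomly chosen stores.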
BY Best Interview Question ON 02 Nov 2022